Data Lake

Definition

A data lake is a centralized repository that stores raw, unprocessed data in its native format (structured, semi-structured, or unstructured) at any scale. Unlike a data warehouse, which enforces a schema when data is written (schema-on-write), a data lake imposes no schema at write time; structure is applied only when the data is read (schema-on-read).
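To make schema-on-read concrete, here is a minimal sketch in Python that uses a temporary local directory as a stand-in for the lake. The file names, the `user_id` and `amount` fields, and the helpers `write_raw`, `read_events`, and `demo` are all hypothetical, chosen only for illustration; a real lake would typically sit on object storage and be read through a query engine.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_raw(lake, name, payload):
    """Land a payload in the lake unchanged; nothing is validated at write time."""
    path = Path(lake) / name
    path.write_text(payload, encoding="utf-8")
    return path


def read_events(lake):
    """Apply a schema at read time: project raw records onto (user_id: int, amount: float)."""
    rows = []
    for path in sorted(Path(lake).iterdir()):
        if path.suffix == ".jsonl":
            text = path.read_text(encoding="utf-8")
            records = [json.loads(line) for line in text.splitlines() if line.strip()]
        elif path.suffix == ".csv":
            with path.open(newline="", encoding="utf-8") as f:
                records = list(csv.DictReader(f))
        else:
            continue  # unstructured files stay in the lake; this reader skips them
        for rec in records:
            rows.append({
                "user_id": int(rec["user_id"]),
                # Missing or empty amounts default to 0.0 (an assumption of this sketch).
                "amount": float(rec.get("amount") or 0.0),
            })
    return rows


def demo():
    """Write three differently shaped files, then read them through one schema."""
    with tempfile.TemporaryDirectory() as tmp:
        write_raw(tmp, "a_events.jsonl",
                  '{"user_id": 1, "amount": 9.5, "extra": "ignored"}\n{"user_id": 2}\n')
        write_raw(tmp, "b_events.csv", "user_id,amount\n3,4.25\n")
        write_raw(tmp, "c_notes.txt", "free-form text")
        return read_events(tmp)


if __name__ == "__main__":
    print(demo())
```

The writer accepts any payload, including the free-form text file, while the reader decides which files it understands and which fields it keeps. A different reader could apply a different schema to the same raw files without rewriting them.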

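Common implementations pair object storage such as Amazon S3 with a query engine such as Amazon Athena, often with an open table format such as Delta Lake or Apache Iceberg layered on top. Because the data is kept in raw form, data lakes support flexible analytics and machine learning on heterogeneous data.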

