szarnyasg 6 days ago

The YouTube video “Apache Iceberg: What It Is and Why Everyone’s Talking About It” by Tim Berglund explains data lakes really well in the opening minutes: https://www.youtube.com/watch?v=TsmhRZElPvM

adastra22 6 days ago

Thanks but I don’t have the time to watch YouTube.

dsp_person 5 days ago

He explains:

~40y ago, the data warehouse was invented: an overnight ETL process would collect data from smaller DBs into a central DB (the data warehouse).

~15y ago, the data lake (e.g. Hadoop) emerged to address scaling and other things. Same idea, but ELT instead of ETL: less focus on schema; collect the data into S3 and transform it later.
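
To make the ETL vs. ELT difference concrete, here is a rough sketch (not from the video). Everything in it is an assumption for illustration: the `RAW_ORDERS` records, the function names, an in-memory SQLite database standing in for the warehouse, and a local directory standing in for S3.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical raw records, as they might come out of a small source database.
RAW_ORDERS = [
    {"id": 1, "amount": "19.99", "country": "us"},
    {"id": 2, "amount": "5.00", "country": "DE"},
    {"id": 3, "amount": "oops", "country": "hu"},  # malformed amount
]


def transform(record):
    """Coerce a raw record to the fixed schema; return None if it does not fit."""
    try:
        return (int(record["id"]), float(record["amount"]), record["country"].upper())
    except (KeyError, ValueError):
        return None


def etl(records):
    """ETL: transform first, then load only schema-conforming rows."""
    warehouse = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    warehouse.execute("CREATE TABLE orders (id INTEGER, amount REAL, country TEXT)")
    rows = [row for row in map(transform, records) if row is not None]
    warehouse.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return warehouse


def elt_load(records, lake_dir):
    """ELT, step 1: land the raw records untouched (a local dir stands in for S3)."""
    path = Path(lake_dir) / "orders.jsonl"
    with path.open("w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return path


def elt_transform(path):
    """ELT, step 2: apply the schema later, when the data is read."""
    with path.open() as f:
        rows = [transform(json.loads(line)) for line in f]
    return [row for row in rows if row is not None]


if __name__ == "__main__":
    warehouse = etl(RAW_ORDERS)
    print(warehouse.execute("SELECT COUNT(*) FROM orders").fetchone()[0])

    with tempfile.TemporaryDirectory() as lake:
        raw_path = elt_load(RAW_ORDERS, lake)
        print(len(elt_transform(raw_path)))
```

In the ETL path the malformed record never reaches the warehouse. In the ELT path it is still sitting in the raw file, so a later transform with different rules could pick it up.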

adastra22 5 days ago

Thank you!

simlevesque 5 days ago

It's your DB, but on S3.