I love DuckDB and this looks absolutely brilliant!
One question for me: let's say I want to start using this today, and at work we are running Snowflake. I get that each analytics person would have to run DuckDB plus this extension on their local machine and point it at the blob store and at the database running the data lake extension, which for now would be, say, a VM running DuckDB. When I run the actual query, where does the computation happen? And what if I want a lot of computation?
Is the solution currently to host a huge DuckDB VM that everyone SSHes into to run their queries, or how does that part work?
The compute would happen on your machine if you were running DuckDB locally.
And yes, to get more compute, you'd want to spin up a VM.
What's cool is you can do both (run locally for small stuff, run on a VM for heavy workloads).
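To make that concrete, here's a rough sketch of what the client side could look like, assuming a DuckLake-style setup (catalog in a database, data files in blob storage); the extension name, catalog connection string, bucket, and table names are placeholders I've filled in, not the project's exact configuration:

```python
# Rough sketch (not the documented setup): the same script can run on a laptop
# or on a big VM; whichever machine executes it supplies the compute.
import duckdb

con = duckdb.connect()  # in-process DuckDB: queries use this machine's CPU/RAM

# Assumed DuckLake-style attach: catalog lives in a database, data in blob storage.
# Extension name, connection string, bucket, and table are placeholders.
con.sql("INSTALL ducklake")
con.sql("LOAD ducklake")
con.sql("""
    ATTACH 'ducklake:postgres:dbname=lake_catalog host=catalog-vm.internal'
    AS lake (DATA_PATH 's3://my-bucket/lake/')
""")
# (S3 credentials / httpfs configuration omitted for brevity.)

# The scan and aggregation below read Parquet from the blob store but execute
# entirely in this process, i.e. locally on a laptop or remotely on a beefy VM.
print(con.sql("SELECT count(*) FROM lake.events").fetchone())
```

The point is that the catalog and the data are shared, but the engine is wherever you happen to launch DuckDB, so "more compute" just means running the same thing on a bigger box.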
So, in other words, we could replace Snowflake/BigQuery with this solution and get pretty much the same performance?
It depends... but yes, you could likely set it up in a way that matches (or beats) Snowflake performance.