DuckDB interoperates with Polars dataframes easily. I see DuckDB as a SQL engine for dataframes.
Any DuckDB result is easily converted to Pandas (by appending .df()) or Polars (by appending .pl()).
The conversion to Polars is close to instantaneous because it’s zero-copy: everything goes through the Arrow in-memory format.
So I usually write complex queries in DuckDB SQL, but if I need to manipulate the result in Polars I just convert it midstream in my workflow (it only takes milliseconds) and then continue working with that dataframe in DuckDB. It’s seamless thanks to Apache Arrow.
Wow, what a cool workflow. It looks like the interop promise of Apache Arrow is real. It's a great thing when your computer works as fast as you think, as opposed to sitting around waiting for queries to finish.
https://duckdb.org/docs/guides/python/polars.html