-
Apache Arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
Feel free to download the dataset and the Jupyter notebook and run it in your own environment if you'd like to follow along. The notebook is available here: https://github.com/gaborschulz/learning-polars/blob/main/01-time-series-analysis/time-series-analysis.ipynb
One is related to the heritage of being built around the NumPy library, which is great for processing numerical data, but becomes an issue as soon as the data is anything else. Pandas 2.0 has started to bring in Arrow, but it's not yet the standard (you have to opt-in and according to the developers it's going to stay that way for the foreseeable future). Also, pandas's Arrow-based features are not yet entirely on par with its NumPy-based features. Polars was built around Arrow from the get go. This makes it very powerful when it comes to exchanging data with other languages and reducing the number of in-memory copying operations, thus leading to better performance.