CSV or Parquet File Format

delta

Mentions: 69 | Stars: 6,980 | Activity: 9.9 | Language: Scala

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)

I prefer Parquet (or Delta for larger datasets), and CSV for very small datasets or ones that will later be used or edited in Excel or Google Sheets.

Apache Arrow

Mentions: 76 | Stars: 13,698 | Activity: 10.0 | Language: C++

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing

In fact, I have asked on the Apache Arrow GitHub how to read selected columns of a particular row group of a Parquet file: https://github.com/apache/arrow/issues/35688

duckdb

Mentions: 52 | Stars: 17,924 | Activity: 10.0 | Language: C++

DuckDB is an in-process SQL OLAP Database Management System

The Parquet-Go library is very complex, and I have not yet succeeded in using it, so I asked whether DuckDB can provide an API: https://github.com/duckdb/duckdb/issues/7776

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

DataFusion Comet: Apache Spark Accelerator
4 projects | news.ycombinator.com | 31 May 2024

[D] Is there other better data format for LLM to generate structured data?
1 project | /r/MachineLearning | 10 Dec 2023

DuckDB performance improvements with the latest release
8 projects | news.ycombinator.com | 6 Nov 2023

Full-fledged APIs for slowly moving datasets without writing code
1 project | news.ycombinator.com | 25 Oct 2023

Delta vs Iceberg: make love not war
1 project | /r/MicrosoftFabric | 30 Jun 2023

This page summarizes the projects mentioned and recommended in the original post on /r/Python
Tags: Analytics, Spark, Arrow, SQL, Acid
Post date: 1 Jun 2023
