Pg_lakehouse: Query Any Data Lake from Postgres

Daft

9 1,792 9.8 Rust

Distributed DataFrame for Python designed for the cloud, powered by Rust

There’s a lot of interesting work happening in this area (see: XTable).
We are building a distributed query engine for Python and share a lot of the same frustrations. In fact, until quite recently most of the table formats only had JVM client libraries, so integrating them natively with Daft was really difficult.
We recently managed to get read integrations working across Iceberg, Delta Lake, and Hudi, as all three now have Python/Rust-facing APIs. Funnily enough, the only non-JVM implementation of Hudi was contributed by the Hudi team and currently still lives in our repo :D (https://github.com/Eventual-Inc/Daft/tree/main/daft/hudi/pyh...)
These libraries still lag behind their JVM counterparts, though, so it will be a while before we see support for the full feature set of each table format. But we’re definitely seeing a large appetite for working with table formats outside of the JVM ecosystem (e.g. in Python and Rust).

paradedb

20 4,314 9.8 Rust

Postgres for Search and Analytics
hydra

27 2,668 8.5 C

Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes. (by hydradatabase)

How does this compare to Hydra? https://www.hydra.so/

ClickBench

72 585 9.1 HTML

ClickBench: a Benchmark For Analytical Databases

You can see a performance comparison to Hydra on ClickBench (https://benchmark.clickhouse.com/) by selecting ParadeDB and Hydra. TL;DR: ParadeDB is much faster.
From a feature-set perspective, in addition to querying local disk, we can query remote object stores (S3, GCS, etc.) and table formats (Delta Lake, with Iceberg coming soon).
From a code perspective, we're written in Rust on top of open-source standards like OpenDAL and DataFusion, while Hydra has its own codebase, written in C and built from a fork of Citus columnar.
Hydra is a cool project. Hope this helps! :)
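
For readers wondering how a ClickBench-style comparison can boil many query timings down to one number per system, here is a minimal Python sketch. It assumes the summary is a geometric mean of per-query ratios against the fastest system, with a small constant shift (10 ms here) so that tiny queries do not dominate. The function names, system names, and timings are hypothetical and are not real results for ParadeDB, Hydra, or any other system.

```python
import math

# Assumption: a 10 ms shift is added to both sides of each ratio so that
# very fast queries do not dominate the summary.
SHIFT_SECONDS = 0.01


def relative_score(timings, baseline):
    """Geometric mean of (t + shift) / (best + shift) over all queries."""
    ratios = [
        (t + SHIFT_SECONDS) / (b + SHIFT_SECONDS)
        for t, b in zip(timings, baseline)
    ]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def compare(systems):
    """Score every system against the per-query best time across systems."""
    n_queries = len(next(iter(systems.values())))
    baseline = [
        min(times[i] for times in systems.values()) for i in range(n_queries)
    ]
    return {
        name: relative_score(times, baseline)
        for name, times in systems.items()
    }


# Hypothetical per-query timings in seconds (illustration only).
systems = {
    "system_a": [0.10, 1.20, 0.05],
    "system_b": [0.30, 1.00, 0.50],
}
scores = compare(systems)
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: {score:.2f}x relative to the per-query best")
```

A score of 1.0 would mean a system is the fastest on every query; larger values mean it is slower on average. The geometric mean is used so that one very slow query does not swamp the rest.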

pgrx

14 3,306 9.5 Rust

Build Postgres Extensions with Rust!

Yet another amazing Postgres plugin made possible by pgrx (https://github.com/pgcentralfoundation/pgrx).
It's really crazy how some projects just instantly enable a whole generation of new possibilities.
If you're impressed by this and want to build something like it, check out pgrx. It's a pretty great experience.

db-benchmark

12 124 8.0 R

reproducible benchmark of database-like ops (by duckdblabs)

It would be great to also see new pg_lakehouse and DataFusion benchmark results here: https://duckdblabs.github.io/db-benchmark/
Currently, DataFusion is either much slower than DuckDB or runs out of memory (OOM).

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

The evolution of Serverless Postgres
1 project | dev.to | 30 May 2024

How to ditch Neon
2 projects | dev.to | 1 May 2024

Squawk – A Linter for Postgres Migrations
1 project | news.ycombinator.com | 30 Apr 2024

Serverless Postgres with Neon - My first impression
1 project | dev.to | 24 Apr 2024

Building a Managed Postgres Service in Rust
5 projects | news.ycombinator.com | 8 Apr 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.
Tags: Postgres, PostgreSQL, Rust, Python, Big Data
Post date: 13 May 2024