Query Engines: Push vs. Pull

arroyo

13 3,326 9.6 Rust

Distributed stream processing engine in Rust

Interesting - I looked into your code a bit. I found your window aggregation library [1]. You may be interested in looking into the Rust implementation of some of the research work I've been a part of [2].
In Flink, I believe the reason they need to implement their own backpressure system is that they multiplex TCP connections. That is, they have multiple logical streams flowing through a single TCP connection. If that's the case, you need to do some work to 1) detect which logical stream is the one that's blocking, and 2) avoid blocking the whole connection, because other logical streams may still be able to use it.
Thinking it through, I think what Flink's approach buys is not necessarily better performance, but a more manageable number of connections. That is, imagine you have a process P1 with operators A, B and C, and a process P2 with operators D, E and F. Now imagine that this is a shuffle, where A, B and C are fully connected to D, E and F. In my old system, you would have 9 TCP connections. In Flink, you will have 1.
[1] https://github.com/ArroyoSystems/arroyo/blob/master/arroyo-w...
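
The two points in the comment above (connection counts in a full shuffle, and not letting one blocked logical stream stall a shared connection) can be illustrated with a small sketch. This is a toy model, not Flink's or Arroyo's code: `connections_for_shuffle`, `MultiplexedLink`, and the per-stream credit scheme are hypothetical names and assumptions chosen for illustration, with receiver-granted credits standing in for whatever backpressure signal a real engine uses.

```python
from collections import deque
from dataclasses import dataclass, field


def connections_for_shuffle(senders: int, receivers: int, multiplexed: bool) -> int:
    """TCP connections between two processes for a fully connected shuffle."""
    # Without multiplexing, every (sender, receiver) operator pair has its own socket.
    return 1 if multiplexed else senders * receivers


@dataclass
class MultiplexedLink:
    """Toy model of one shared connection carrying many logical streams.

    Each logical stream has its own credit count (assumed to be granted by
    the receiver). A stream with zero credit is skipped, not waited on.
    """
    credits: dict = field(default_factory=dict)  # stream id -> remaining credit
    pending: dict = field(default_factory=dict)  # stream id -> queued records
    wire: list = field(default_factory=list)     # (stream, record) pairs sent so far

    def grant(self, stream: str, n: int) -> None:
        self.credits[stream] = self.credits.get(stream, 0) + n

    def enqueue(self, stream: str, record) -> None:
        self.pending.setdefault(stream, deque()).append(record)

    def blocked_streams(self) -> list:
        # 1) Detect which logical streams are currently backpressured:
        #    they have queued records but no credit left.
        return sorted(s for s, q in self.pending.items()
                      if q and self.credits.get(s, 0) == 0)

    def flush(self) -> None:
        # 2) Send everything that has credit; blocked streams are skipped
        #    so they do not hold up the other streams on the same link.
        for stream, queue in self.pending.items():
            while queue and self.credits.get(stream, 0) > 0:
                self.wire.append((stream, queue.popleft()))
                self.credits[stream] -= 1
```

With three operators on each side, `connections_for_shuffle(3, 3, multiplexed=False)` gives the 9 connections mentioned above, and the multiplexed case gives 1. The trade-off the comment describes shows up in `flush`: once streams share a connection, the sender has to track readiness per logical stream itself instead of relying on each socket's own TCP flow control.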

ClickHouse

209 34,645 10.0 C++

ClickHouse® is a free analytics DBMS for big data

We initially had a pull-based query engine in ClickHouse, but then migrated to a dataflow graph query engine: https://github.com/ClickHouse/ClickHouse/blob/master/src/Pro...
It allows decoupling of the control flow and the data flow. The movement of the data in the query pipeline is controlled explicitly.
We did this migration a few years ago. Many database engines forked or influenced by ClickHouse still use pull-based query engines.
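
For readers unfamiliar with the distinction in the page title, here is a minimal sketch of the same query (sum of the even numbers) written pull-style and push-style. It is a generic illustration only and does not reflect ClickHouse's actual processors design; all function and class names here are hypothetical.

```python
from typing import Callable, Iterable, Iterator


# Pull-based (iterator style): the consumer drives execution by asking
# its child operator for the next row.
def scan(rows: Iterable[int]) -> Iterator[int]:
    for row in rows:
        yield row


def filter_pull(child: Iterator[int], pred: Callable[[int], bool]) -> Iterator[int]:
    for row in child:
        if pred(row):
            yield row


def sum_pull(child: Iterator[int]) -> int:
    total = 0
    for row in child:  # each iteration pulls one row up through the plan
        total += row
    return total


# Push-based: the producer drives execution and each operator hands
# rows to its downstream operator.
class SumSink:
    def __init__(self) -> None:
        self.total = 0

    def push(self, row: int) -> None:
        self.total += row


class FilterPush:
    def __init__(self, pred: Callable[[int], bool], downstream) -> None:
        self.pred = pred
        self.downstream = downstream

    def push(self, row: int) -> None:
        if self.pred(row):
            self.downstream.push(row)


def run_push(rows: Iterable[int], first_operator) -> None:
    for row in rows:  # the source decides when data moves
        first_operator.push(row)
```

In the pull version, control flow and data flow are tied together by the nested iterator calls. In the push version, the loop in `run_push` is the only place that decides when rows move, which is a small-scale version of the decoupling the comment describes.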

sliding-window-aggregators

2 42 4.2 C++

Reference implementations of sliding window aggregation algorithms

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

We Built a 19 PiB Logging Platform with ClickHouse and Saved Millions
1 project | news.ycombinator.com | 2 Apr 2024

Erasure Coding versus Tail Latency
1 project | news.ycombinator.com | 28 Mar 2024

Malloy: A language for describing data relationships and transformations
1 project | news.ycombinator.com | 17 Mar 2024

Malloy: Open-source language for analyzing, transforming, and modeling data
1 project | news.ycombinator.com | 18 Feb 2024

Malloy
1 project | news.ycombinator.com | 16 Feb 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
SQL Database Data Dbms Dev Tools
Post date: 1 Aug 2023
