Top 23 distributed-database Open-Source Projects
-
tidb
TiDB is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at: https://www.pingcap.com/tidb-serverless/
-
shardingsphere
Distributed SQL transaction & query engine for data sharding, scaling, encryption, and more - on any database.
-
ArangoDB
🥑 ArangoDB is a native multi-model database with flexible data models for documents, graphs, and key-values. Build high performance applications using a convenient SQL-like query language or JavaScript extensions.
-
yugabyte-db
YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
-
starrocks
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries.
-
oceanbase
OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.
-
risingwave
SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
-
Crate
CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.
-
awesome-blockchains
A collection about awesome blockchains - open distributed public databases w/ crypto hashes incl. git ;-). Blockchains are the new tulips :tulip::tulip::tulip:. Distributed is the new centralized.
-
ydb
YDB is an open source Distributed SQL Database that combines high availability and scalability with strong consistency and ACID transactions
-
Olric
Distributed in-memory object store. It can be used as an embedded Go library and a language-independent service.
Each time we create or update a K8s resource, the Kubernetes API stores it in its database — etcd. etcd is a distributed key-value store used to store all of your resource configurations, such as deployments, services, and so on. A neat feature of etcd is that you can subscribe to changes in some keys in the database, which is used by other Kubernetes mechanisms.
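The subscription idea described above can be sketched with a toy in-memory store. This is a minimal illustration, not etcd's client API: the `WatchableStore` class, its method names, the event tuple shape, and the key paths are all hypothetical and chosen only to show how a prefix watch delivers change events.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical event shape: (event_type, key, value). Not etcd's wire format.
Event = Tuple[str, str, Optional[str]]


class WatchableStore:
    """Toy in-memory key-value store that notifies watchers of a key prefix."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def watch(self, prefix: str, callback: Callable[[Event], None]) -> None:
        # Register a callback for every change to a key starting with `prefix`.
        self._watchers[prefix].append(callback)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value
        self._notify(("PUT", key, value))

    def delete(self, key: str) -> None:
        # Only emit an event if the key actually existed.
        if self._data.pop(key, None) is not None:
            self._notify(("DELETE", key, None))

    def _notify(self, event: Event) -> None:
        key = event[1]
        for prefix, callbacks in self._watchers.items():
            if key.startswith(prefix):
                for callback in callbacks:
                    callback(event)


store = WatchableStore()
seen: List[Event] = []
store.watch("/registry/deployments/", seen.append)

store.put("/registry/deployments/default/web", "replicas=3")
store.put("/registry/services/default/web", "port=80")  # outside the watched prefix
store.delete("/registry/deployments/default/web")
```

After these calls, `seen` holds only the two events for the deployment key; the service key is ignored because it does not match the watched prefix. A real etcd watch additionally carries revisions and runs over the network, which this sketch leaves out.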
Project mention: A MySQL compatible database engine written in pure Go | news.ycombinator.com | 2024-04-09
TiDB has been around for a while; it is distributed, written in Go and Rust, and MySQL compatible. https://github.com/pingcap/tidb
Somewhat relatedly, StarRocks is also MySQL compatible, written in Java and C++, but it's tackling OLAP use-cases. https://github.com/StarRocks/starrocks
Project mention: Universal Data Migration: Using Slingdata to Transfer Data Between Databases | dev.to | 2024-05-24
ClickHouse installed and running.
Project mention: Show HN: Restate, low-latency durable workflows for JavaScript/Java, in Rust | news.ycombinator.com | 2024-06-12
Restate is built as a sharded replicated state machine similar to how TiKV (https://tikv.org/), Kudu (https://kudu.apache.org/kudu.pdf) or CockroachDB (https://github.com/cockroachdb/cockroach) are designed. Instead of relying on a specific consensus implementation, we have decided to encapsulate this part into a virtual log (inspired by Delos https://www.usenix.org/system/files/osdi20-balakrishnan.pdf) since it makes it possible to tune the system more easily for different deployment scenarios (on-prem, cloud, cost-effective blob storage). Moreover, it allows for some other cool things like seamlessly moving from one log implementation to another. Apart from that, the whole system design has been influenced by ideas from stream processing systems such as Apache Flink (https://flink.apache.org/), log storage systems such as LogDevice (https://logdevice.io/) and others.
We plan to publish a more detailed follow-up blog post where we explain why we developed a new stateful system, how we implemented it, and what the benefits are. Stay tuned!
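The virtual log idea in the excerpt above, where the log implementation can be swapped without callers noticing, might look roughly like the following. This is a hedged sketch of the general technique, loosely following the Delos description; it is not Restate's implementation, and `Loglet`, `VirtualLog`, `reconfigure`, and the loglet names are hypothetical.

```python
from typing import List, Tuple


class Loglet:
    """One concrete log implementation (hypothetical; kept in memory here)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.entries: List[str] = []
        self.sealed = False

    def append(self, entry: str) -> int:
        if self.sealed:
            raise RuntimeError(f"loglet {self.name} is sealed")
        self.entries.append(entry)
        return len(self.entries) - 1  # position local to this loglet


class VirtualLog:
    """Presents one contiguous log while chaining loglets underneath."""

    def __init__(self, first: Loglet) -> None:
        # Each segment is (start position in the virtual log, loglet).
        self._chain: List[Tuple[int, Loglet]] = [(0, first)]

    def append(self, entry: str) -> int:
        start, active = self._chain[-1]
        return start + active.append(entry)

    def reconfigure(self, new_loglet: Loglet) -> None:
        # Seal the active loglet, then continue the address space in the new one.
        start, active = self._chain[-1]
        active.sealed = True
        self._chain.append((start + len(active.entries), new_loglet))

    def read(self, position: int) -> str:
        # Find the newest segment whose start is at or before the position.
        for start, loglet in reversed(self._chain):
            if position >= start:
                return loglet.entries[position - start]
        raise IndexError(position)


log = VirtualLog(Loglet("local-disk"))
log.append("a")
log.append("b")
log.reconfigure(Loglet("blob-storage"))  # switch implementations mid-stream
position_c = log.append("c")
```

Callers keep using a single position space: `position_c` continues from where the sealed loglet stopped, and `log.read` resolves any position to the loglet that holds it. A production design would also need consensus on the chain metadata itself, which this sketch omits.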
Also, to store the products and their available units, we will be using a database called SurrealDB. I chose SurrealDB for a specific reason, which we will explore later in this article. Now that we have produced a message from the inventory, we need a consumer that connects to the Kafka broker and consumes this message. For this, we will create a shipment service in Go to simulate the shipping process when products are released from the inventory. To keep this project short and concise, we are not going to build the whole shipment system.
Project mention: Why SQLite Is Taking over with Brian Holt and Marco Bambini | news.ycombinator.com | 2024-06-12
SQLite is not competing with RDBMSes. SQLite is competing with fopen().
There are of course solutions which wrap this fopen() replacement in network/cluster-aware tools, e.g. https://github.com/rqlite/rqlite - these are competing with postgres.
Actually, Apple does this for iCloud! They use FoundationDB[1] to store billions of databases, one for each user (plus shared or global databases).
See: https://read.engineerscodex.com/p/how-apple-built-icloud-to-...
Discussed on HN at the time: https://news.ycombinator.com/item?id=39028672
[1]: https://github.com/apple/foundationdb https://en.wikipedia.org/wiki/FoundationDB
ArangoDB
12. Awesome Big Data
Apache ZooKeeper — a distributed coordination, synchronization, and configuration service (written in Java);
Project mention: Need insights to build a distributed key value store from scratch. | /r/DistributedComputing | 2023-12-08
Please check this course: https://github.com/pingcap/talent-plan . It covers how to implement SQL and a key-value store. It's an awesome course.
By the way, I wanted to continue to use the previous experiment with Flink SQL and Iceberg, but I found out Trino doesn't support Iceberg's DynamoDB catalog. Therefore, I had to create a new one.
In a distributed database, NTP synchronization is essential and should be carefully monitored and fixed in case of any failures. To allow some time drift, a maximum clock skew is set. This skew should be kept low enough for performance to avoid too many read retries and high enough for availability to avoid any node evictions caused by network errors. It is a good idea to check the NTP synchronization when starting a YugabyteDB node. This will be implemented by 22255.
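The startup check mentioned above can be sketched as a comparison of measured clock offsets against the configured maximum skew. This is an illustrative sketch only: the function names, the 500 ms threshold, and the offset values are assumptions made for the example, not YugabyteDB's implementation or a recommended setting. Offsets are assumed to be each node's deviation from a shared reference clock, as an NTP client would report them.

```python
from typing import Dict

# Illustrative threshold in milliseconds (an assumption for this example).
MAX_CLOCK_SKEW_MS = 500.0


def cluster_skew_ms(offsets_ms: Dict[str, float]) -> float:
    """Worst pairwise skew: spread between the fastest and slowest clocks."""
    return max(offsets_ms.values()) - min(offsets_ms.values())


def can_join(offsets_ms: Dict[str, float], new_offset_ms: float,
             max_skew_ms: float = MAX_CLOCK_SKEW_MS) -> bool:
    """Allow a node to start only if the cluster-wide skew stays within bounds."""
    candidates = list(offsets_ms.values()) + [new_offset_ms]
    return max(candidates) - min(candidates) <= max_skew_ms


# Hypothetical offsets (ms) of running nodes relative to the reference clock.
offsets = {"node-1": 12.0, "node-2": -40.0, "node-3": 95.0}

current_skew = cluster_skew_ms(offsets)      # 95.0 - (-40.0) = 135.0
drifted_node_ok = can_join(offsets, 480.0)   # spread would be 520.0 ms
healthy_node_ok = can_join(offsets, 300.0)   # spread would be 340.0 ms
```

Here the node with a 480 ms offset is rejected because it would push the spread to 520 ms, above the 500 ms bound, while the node at 300 ms is accepted. Lowering the bound reduces read retries but makes rejections like this more frequent, which is the performance versus availability trade-off the paragraph describes.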
Project mention: Proton, a fast and lightweight alternative to Apache Flink | news.ycombinator.com | 2024-01-30
How does this compare to RisingWave and Materialize?
https://github.com/risingwavelabs/risingwave
There is https://ydb.tech/, an open-source DB that uses erasure coding for replication within a single zone/region.
Project mention: Olric: Distributed, embeddable in-memory data structures in Go | news.ycombinator.com | 2024-02-05
distributed-database discussion
distributed-database related posts
-
Event Driven services using Kafka, SurrealDB, Rust, and Go.
-
Apache HoraeDB is a high-performance, distributed, time-series database in Rust
-
Crash on clock skew: performance vs availability
-
Why SurrealDB is the Future of Database Technology - An In-Depth Look
-
Advisory/Custom/Application Lock with YugabyteDB
-
A lightweight YugabyteDB docker image for CI/CD
-
Multi-region YugabyteDB deployment on AWS EKS with Istio
-
A note from our sponsor - InfluxDB
www.influxdata.com | 12 Jun 2024
Index
What are some of the best open-source distributed-database projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | etcd | 46,614 |
2 | tidb | 36,358 |
3 | ClickHouse | 34,961 |
4 | cockroach | 29,329 |
5 | surrealdb | 25,834 |
6 | shardingsphere | 19,555 |
7 | rqlite | 15,098 |
8 | foundationdb | 14,128 |
9 | ArangoDB | 13,404 |
10 | awesome-bigdata | 12,890 |
11 | Apache ZooKeeper | 11,997 |
12 | citus | 9,991 |
13 | talent-plan | 9,890 |
14 | Trino | 9,735 |
15 | yugabyte-db | 8,580 |
16 | starrocks | 8,092 |
17 | oceanbase | 7,534 |
18 | risingwave | 6,456 |
19 | dynomite | 4,164 |
20 | Crate | 3,982 |
21 | awesome-blockchains | 3,712 |
22 | ydb | 3,485 |
23 | Olric | 3,028 |