Top 23 data-warehouse Open-Source Projects
-
Greenplum
Greenplum Database - Massively Parallel PostgreSQL for Analytics. An open-source massively parallel data platform for analytics, machine learning and AI.
-
hydra
Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes. (by hydradatabase)
-
elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
-
Udacity-Data-Engineering-Projects
A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.
-
bigquery-utils
Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.
-
optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management. (by raystack)
-
DomainMOD
DomainMOD is an open source application written in PHP & MySQL used to manage your domains and other internet assets in a central location. DomainMOD also includes a Data Warehouse framework that allows you to import your web server data so that you can view, export, and report on your live data.
-
data-engineering-project-template
This is a template you can use for your next data engineering portfolio project.
Project mention: The Notifier Pattern for Applications That Use Postgres | news.ycombinator.com | 2024-05-14
Those updates are not retroactive. They apply on a go-forward basis. Each day's changes become Apache 2.0 licensed on that day four years in the future.
For example, v0.28 was released on October 18, 2022, and becomes Apache 2.0 licensed four years after that date (i.e., 2.5 years from today), on October 18, 2026.
[0]: https://github.com/MaterializeInc/materialize/blob/76cb6647d...
Project mention: Pg_lakehouse: Query Any Data Lake from Postgres | news.ycombinator.com | 2024-05-13
How does this compare to Hydra? https://www.hydra.so/
Project mention: Ask HN: Freelancer? Seeking freelancer? (December 2023) | news.ycombinator.com | 2023-12-03
SEEKING FREELANCER | REMOTE | GERMANY
dltHub is looking for freelance help in the following repos:
- https://github.com/dlt-hub/dlt
Project mention: Swirl: An open-source search engine with LLMs and ChatGPT to provide all the answers you need 🌌 | dev.to | 2023-09-06
Using the Galaxy UI, knowledge workers can systematically review the best results from all configured services including Apache Solr, ChatGPT, Elastic, OpenSearch, PostgreSQL, Google BigQuery, plus generic HTTP/GET/POST with configurations for premium services like Google's Programmable Search Engine, Miro and Northern Light Research.
The Go team does acknowledge [1] it as a bug, so there is some point here.
That said, I wonder if OP (duckdb) could have written their solution [2] differently. Shouldn't they be able to select from a Pipe as well as an Error channel simultaneously (similar to how they are doing it inside here [3])? If not, I would have created a goroutine that does a blocking read on the Pipe and then passes the data on to another channel to select on.
[1] https://github.com/golang/go/issues/66239
[2] https://github.com/scratchdata/scratchdata/blob/7c1a0fcd0e20...
[3] https://github.com/scratchdata/scratchdata/blob/7c1a0fcd0e20...
Project mention: Multiwoven Reverse ETL (0.2.0) – Open-Source Alternative to Hightouch and Census | news.ycombinator.com | 2024-04-19
Multiwoven is now a leading Open Source Alternative to Hightouch, Census, and Rudderstack.
It's been a great journey so far, and we are excited to announce a major update to Multiwoven - our new release, Multiwoven 0.2.0, is now available!
Repo: https://github.com/Multiwoven/multiwoven
This release brings a host of new features, enhancements, and bug fixes to streamline data syncs and user experience.
From new connectors to advanced reporting dashboards, as a team, we have been working hard on these updates based on the feedback and requests from our customers and the community.
- 10+ new connectors added to Multiwoven, including
Project mention: Shout out to Appsmith developers to check out this new tool! | /r/lowcode | 2023-07-09
I am one of the members of an open-source project VulcanSQL, a Data API Framework for data applications that helps data folks create and share data APIs faster.
DomainMOD - Application to manage your domains and other internet assets in a central location. DomainMOD includes a Data Warehouse framework that allows you to import your WHM/cPanel web server data so that you can view, export, and report on your data.
Project mention: Unified storage framework for the entire machine learning lifecycle | news.ycombinator.com | 2024-02-28
data-warehouse related posts
-
Using ClickHouse to scale an events engine
-
Debugging a Golang Bug with Non-Blocking Reads
-
Moving a Billion Postgres Rows on a $100 Budget
-
Hydra (YC W22) adds upsert to columnar Postgres
-
Show HN: ScratchDB – Open-Source Snowflake on ClickHouse
-
Hydra
-
A note from our sponsor - SaaSHub
www.saashub.com | 22 May 2024
Index
What are some of the best open-source data-warehouse projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | awesome-bigdata | 12,845 |
2 | Greenplum | 6,213 |
3 | materialize | 5,608 |
4 | Rudderstack | 3,947 |
5 | hydra | 2,651 |
6 | DXY-COVID-19-Data | 2,175 |
7 | elementary | 1,746 |
8 | dlt | 1,792 |
9 | Cubes | 1,490 |
10 | tensorbase | 1,429 |
11 | Udacity-Data-Engineering-Projects | 1,295 |
12 | bigquery-utils | 1,042 |
13 | scratchdata | 1,041 |
14 | optimus | 737 |
15 | Data-Engineering-Projects | 722 |
16 | multiwoven | 654 |
17 | vulcan-sql | 596 |
18 | DomainMOD | 454 |
19 | versatile-data-kit | 412 |
20 | space | 137 |
21 | data-engineering-project-template | 112 |
22 | beneath | 81 |
23 | pgwarehouse | 64 |