Top 23 Python dbt Projects
-
Mage
🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai
-
soda-core
:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
-
streamify
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
-
astronomer-cosmos
Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code
-
dbt-data-reliability
dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
-
recs-at-resonable-scale
Recommendations at "Reasonable Scale": joining dataOps with recSys through dbt, Merlin and Metaflow
-
dbt-ml-preprocessing
A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros.
-
valmi-activation
⚡ valmi.io reverse ETL (data activation) is the open-source (OSS) data activation platform to load data from warehouses into Webhooks and SaaS tools like Klaviyo, Facebook Ads, Salesforce, Braze, etc. Valmi.io Customer Data Platform (CDP) helps track and ingest user activity events from websites, Shopify, and server-side events. https://cloud.valmi.io
If the issue happens a lot, there is also: https://github.com/datafold/data-diff
That is a nice tool for doing it across databases as well.
I think it's based on a checksum method.
Project mention: Launch HN: Serra (YC S23) – Open-source, Python-based dbt alternative | news.ycombinator.com | 2023-08-14
There is also sqlmesh (https://sqlmesh.com/). Pretty new as well. It introduces some interesting concepts. For smaller dbt projects it could be a drop-in replacement as it allows importing dbt projects.
Project mention: Show HN: PipeRider – open-source Data Impact Analysis for dbt changes | news.ycombinator.com | 2023-09-06
Project mention: Launch HN: Grai (YC S22) – Open-Source Data Observability Platform | news.ycombinator.com | 2023-07-17
Elastic v2 if one is interested in such things: https://github.com/grai-io/grai-core/blob/v0.1.33/LICENSE
Good paper, and in response to that one a team from Coveo wrote this paper on behavioral tests for recommender systems... and also this repo.
End-to-end stuff, full-fledged stacks: https://github.com/jacopotagliabue/post-modern-stack
Project mention: Show HN: Valmi.io Open Source Reverse-ETL Engine | news.ycombinator.com | 2023-06-21
Python dbt related posts
-
Launch HN: Grai (YC S22) – Open-Source Data Observability Platform
-
When writing ML software - how do you use TDD?
-
[Advice] MLOps Course recommendations
-
Run dbt projects as Apache Airflow DAGs and Task Groups with a few lines of code
-
Curious if anyone has adopted a stack to do raw data ingestion in Databricks?
-
Running dbt core on airflow
-
dolly-v2-12b
Index
What are some of the best open-source dbt projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Mage | 7,202 |
2 | data-diff | 2,899 |
3 | soda-core | 1,786 |
4 | sqlmesh | 1,383 |
5 | dbt-duckdb | 754 |
6 | streamify | 474 |
7 | piperider | 471 |
8 | astronomer-cosmos | 476 |
9 | dbt-metabase | 432 |
10 | airflow-dbt | 382 |
11 | dbt-data-reliability | 349 |
12 | grai-core | 271 |
13 | recs-at-resonable-scale | 218 |
14 | dbt-clickhouse | 222 |
15 | dbt-coves | 216 |
16 | dbt-athena | 194 |
17 | dbt-databricks | 190 |
18 | post-modern-stack | 181 |
19 | dbt-ml-preprocessing | 176 |
20 | dbt-coverage | 174 |
21 | dbt2looker | 171 |
22 | dbterd | 171 |
23 | valmi-activation | 130 |