Top 23 Snowflake Open-Source Projects
-
airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
-
bytebase
The GitHub/GitLab for database DevOps. World's most advanced database DevOps and CI/CD for Developer, DBA and Platform Engineering teams.
-
Ockam
Orchestrate end-to-end encryption, cryptographic identities, mutual authentication, and authorization policies between distributed applications – at massive scale.
-
jitsu
Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set-up a real-time data pipeline in minutes, not days
-
snowflake
A simple to use Go (golang) package to generate or parse Twitter snowflake IDs (by bwmarrin)
-
peerdb
Fast, Simple and a cost effective tool to replicate data from Postgres to Data Warehouses, Queues and Storage
-
soda-core
Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
-
elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
-
dozer
Dozer is a real-time data movement tool that leverages CDC from various sources and moves data into various sinks. (by getdozer)
Project mention: How to Build a Chat App with Your Postgres Data using Agent Cloud | dev.to | 2024-05-13
AgentCloud uses Airbyte to build data pipelines, which allow us to split, chunk, and embed data from over 300 data sources, including Postgres.
Project mention: SQL Convertor for Easy Migration from Presto, Trino, ClickHouse, and Hive to Apache Doris | dev.to | 2024-05-27
Apache Doris is an all-in-one data platform that is capable of real-time reporting, ad-hoc queries, data lakehousing, log management and analysis, and batch data processing. As more and more companies have been replacing their component-heavy data architecture with Apache Doris, there is an increasing need for a more convenient data migration solution. That's why the Doris SQL Convertor is made.
Project mention: The Future of MySQL is PostgreSQL: an extension for the MySQL wire protocol | news.ycombinator.com | 2024-04-26
This is probably referring to "zero changes to your driver code" and not "zero changes to the SQL you send over this driver".
Translating between SQL dialects is notoriously hard, and attempts to translate [1] work in 95% of cases. But the last 5% would require 5x the amount of work. That's because "SQL dialect" also includes weird edge cases of type inference for things like COALESCE(5, FALSE) and emulation of system catalogs (pg_catalog, information_schema).
[1] https://github.com/tobymao/sqlglot
Project mention: GrowthBook: Open-source feature flagging and A/B testing platform | /r/opensource | 2023-10-20
disclosure: I work at Ockam.
The Portals for Mac app is an example of the type of thing you could build using the open source stack of protocols. The README (linked by parent) links out to all of the relevant parts of the protocol documentation to explain how these work together. The NAT Traversal (https://github.com/build-trust/ockam/blob/develop/examples/a...) part of the README is probably the best explanation of why the free relay you get via Ockam Orchestrator is a useful part of this demo.
As for why would anyone trust this: The protocols are designed so you absolutely don't have to trust the relay. Trust is pushed out to the edges that you control and so you're not susceptible to a MITM attack if something like a relay is compromised. The protocol design for all of this is open and documented, and was independently audited by (IMO) some of the best in the business, Trail of Bits: https://docs.ockam.io/reference/protocols.
Project mention: Show HN: SQLFrame – I ran PySpark without Spark on a SQL database | news.ycombinator.com | 2024-05-20
Fluent Migrator
If the issue happens a lot, there is also: https://github.com/datafold/data-diff
That is a nice tool for doing it across databases as well.
I think it's based on a checksum method.
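The checksum idea mentioned in that comment can be sketched as follows. This is a minimal illustration, not data-diff's actual implementation: it hashes a range of rows on each side, and only when the hashes disagree does it split the range and look closer. The table layout (`id`, `value`), the in-memory SQLite databases, and the helper names `segment_checksum`, `diff_ids`, and `make_db` are all assumptions made for the example.

```python
import hashlib
import sqlite3


def segment_checksum(conn, table, lo, hi):
    """Return (row count, MD5 hex digest) for rows with lo <= id < hi."""
    digest = hashlib.md5()
    count = 0
    rows = conn.execute(
        f"SELECT id, value FROM {table} WHERE id >= ? AND id < ? ORDER BY id",
        (lo, hi),
    )
    for row in rows:
        digest.update(repr(row).encode())
        count += 1
    return count, digest.hexdigest()


def diff_ids(conn_a, conn_b, table, lo, hi, threshold=4):
    """Return the sorted ids in [lo, hi) whose rows differ between the two sides."""
    if segment_checksum(conn_a, table, lo, hi) == segment_checksum(conn_b, table, lo, hi):
        return []  # matching checksums: skip the whole segment
    if hi - lo <= threshold:
        # Small segment: fetch the rows and compare them directly.
        query = f"SELECT id, value FROM {table} WHERE id >= ? AND id < ?"
        a = dict(conn_a.execute(query, (lo, hi)))
        b = dict(conn_b.execute(query, (lo, hi)))
        return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
    mid = (lo + hi) // 2  # otherwise bisect and recurse into both halves
    return (diff_ids(conn_a, conn_b, table, lo, mid, threshold)
            + diff_ids(conn_a, conn_b, table, mid, hi, threshold))


def make_db(rows):
    """Hypothetical helper: build an in-memory table named items."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    return conn
```

The appeal of this approach across databases is that only counts and digests cross the network for segments that match; full rows are fetched only for the small segments that differ.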
Project mention: The API database architecture – Stop writing HTTP-GET endpoints | news.ycombinator.com | 2024-05-10
Yeah, I fully agree. The tooling for putting that much logic into the database is just not great. I've been decently happy with Sqitch[0] for DB change management, but even with that you don't really get a good basis for testing some of the logic you could otherwise test in isolation in app code.
I've also tried to rely heavily on the database handling security and authorization, but as soon as you start to do somewhat non-trivial attribute-/relationship-based authorization (as you would find in many products nowadays), it really isn't fun anymore, and you spend a lot of the time you saved on manually building backend routes on trying to fit your authz model into those basic primitives (and avoiding performance bottlenecks). Especially compared to other modern authz solutions like OPA[1] or oso[2], it really doesn't stack up.
[0]: https://github.com/sqitchers/sqitch
[1]: https://www.openpolicyagent.org
[2]: https://www.osohq.com
Project mention: PeerDB Streams – Simple, Native Postgres Change Data Capture | news.ycombinator.com | 2024-05-06
Project mention: Show HN: Find simple open source bounties to solve and get paid | news.ycombinator.com | 2023-08-19
https://github.com/getdozer/dozer/issues/1631#issuecomment-1...
and then something has gone off the rails about the accounting process since
Trigger.dev
The Go team does acknowledge [1] it as a bug, so there is some point here.
However, that said, I wonder if OP (duckdb) could have written their solution [2] differently. Shouldn't they be able to select from a Pipe as well as an Error channel simultaneously (similar to how they are doing it here [3])? If not, I would have created a goroutine that does a blocking read on the Pipe and then passes the data on to another channel to select on.
[1] https://github.com/golang/go/issues/66239
[2] https://github.com/scratchdata/scratchdata/blob/7c1a0fcd0e20...
[3] https://github.com/scratchdata/scratchdata/blob/7c1a0fcd0e20...
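The pattern suggested in that comment is a dedicated reader that blocks on the pipe and forwards what it reads to something the main loop can wait on together with errors. The comment describes it with a goroutine and channels in Go; the sketch below is a rough Python analogue using a thread and a `queue.Queue`, not the scratchdata code. The names `forward_pipe` and `collect` and the `("data" | "error" | "eof", payload)` event shape are assumptions for illustration.

```python
import os
import queue
import threading


def forward_pipe(read_fd, out_queue, chunk_size=4096):
    """Blocking-read from a pipe and forward each chunk to the queue."""
    with os.fdopen(read_fd, "rb", buffering=0) as reader:
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:  # empty read: the write end was closed
                break
            out_queue.put(("data", chunk))
    out_queue.put(("eof", None))


def collect(read_fd, events):
    """Wait on one queue for pipe data and errors; stop at EOF or first error."""
    threading.Thread(
        target=forward_pipe, args=(read_fd, events), daemon=True
    ).start()
    received = []
    while True:
        kind, payload = events.get()  # single wait point for both sources
        if kind == "data":
            received.append(payload)
        elif kind == "error":
            return b"".join(received), payload
        else:  # "eof"
            return b"".join(received), None
```

Any other component that detects a failure can put an `("error", ...)` event on the same queue, so the consumer never sits in a blocking pipe read while an error goes unnoticed.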
Snowflake related posts
-
Show HN: SQLFrame – I ran PySpark without Spark on a SQL database
-
Vanna.ai: Chat with your SQL database
-
Show HN: SQL Polyglot
-
Migrate mongodb Datawarehouse to snowflake
-
The Chan Zuckerberg Initiative Originally Built the Snowflake Terraform Provider
-
Preventing replication slot overflow on Postgres DB (AWS RDS)
-
Preventing WAL Growth on Postgres DB Running on AWS RDS
-
Index
What are some of the best open-source Snowflake projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | airbyte | 14,379 |
2 | doris | 11,604 |
3 | bytebase | 10,280 |
4 | sqlglot | 5,778 |
5 | growthbook | 5,633 |
6 | Ockam | 4,360 |
7 | ibis | 4,358 |
8 | Rudderstack | 3,964 |
9 | jitsu | 3,894 |
10 | sqlchat | 3,946 |
11 | FluentMigrator | 3,153 |
12 | tbls | 3,122 |
13 | data-diff | 2,899 |
14 | snowflake | 2,880 |
15 | sqitch | 2,721 |
16 | ingestr | 2,357 |
17 | peerdb | 1,861 |
18 | soda-core | 1,786 |
19 | elementary | 1,773 |
20 | dozer | 1,459 |
21 | IdGen | 1,143 |
22 | scratchdata | 1,043 |
23 | yauaa | 737 |