Python Bigquery

Open-source Python projects categorized as Bigquery

Top 23 Python Bigquery Projects

  • Redash

    Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

  • Project mention: Redash: Connect to data source, easily visualize, dashboard and share your data | news.ycombinator.com | 2024-03-20
  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

  • Project mention: How to Build a Chat App with Your Postgres Data using Agent Cloud | dev.to | 2024-05-13

    AgentCloud uses Airbyte to build data pipelines, which allow us to split, chunk, and embed data from over 300 data sources, including Postgres.

  • sqlglot

    Python SQL Parser and Transpiler

  • Project mention: The Future of MySQL is PostgreSQL: an extension for the MySQL wire protocol | news.ycombinator.com | 2024-04-26

    This is probably referring to "zero changes to your driver code" and not "zero changes to the SQL you send over this driver".

    Translating between SQL dialects is notoriously hard, and attempts to translate [1] work in 95% of cases. But the last 5% would require 5x the amount of work. That's because "SQL dialect" also includes weird edge cases of type inference for things like COALESCE(5, FALSE) and emulation of system catalogs (pg_catalog, information_schema).

    [1] https://github.com/tobymao/sqlglot

  • ibis

    the portable Python dataframe library

  • Project mention: Show HN: Hashquery, a Python library for defining reusable analysis | news.ycombinator.com | 2024-04-23

    I really don't understand the appeal of dbt vs a proper programming language. The templating approach leads to massive spaghetti. I look forward to trying out something like Ibis [0]

    0: https://ibis-project.org/

  • ethereum-etl

    Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ

  • Project mention: Blockchain transactions decoding: making wallet activity understandable | dev.to | 2023-10-27

    An event is a log entity which EVM smart contracts can emit during transaction execution. Events are very good at signalling that some action has taken place on-chain. Applications can subscribe and listen to events to trigger some off-chain logic, or they can index, transform and store events in some off-chain storage (look at The Graph protocol or Ethereum ETL).

  • professional-services

    Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.

  • ingestr

    ingestr is a CLI tool to copy data between any databases with a single command seamlessly.

  • Project mention: FLaNK 04 March 2024 | dev.to | 2024-03-04
  • swirl-search
  • Project mention: GitHub - swirlai/swirl-search: Swirl is an open-source search platform that uses AI to search multiple content and data sources simultaneously, finds the best results using a reader LLM, then prompts Generative AI, enabling you to get answers based on your data. | /r/programming | 2023-12-05
  • jupysql

    Better SQL in Jupyter. 📊

  • Project mention: Show HN: JupySQL – a SQL client for Jupyter (ipython-SQL successor) | news.ycombinator.com | 2023-12-06

    Hey, HN community!

    We're stoked to launch JupySQL today! JupySQL is an open-source library that brings a modern SQL experience to Jupyter. JupySQL is compatible with all major databases, such as Snowflake, Redshift, PostgreSQL, MySQL, MariaDB, DuckDB, SQL Server, Clickhouse, Trino, and more!

    To get started, check out our tutorial: https://jupysql.ploomber.io/en/latest/quick-start.html

    SQL is the de facto language for data analysis; however, analysis often requires a mix of SQL and Python. JupySQL bridges this gap, allowing users to execute SQL queries seamlessly in Jupyter and continue their analysis in Python. Add %%sql to the top of your cell and start writing SQL.

    Here are some of JupySQL's main features:

    - Syntax highlighting

  • BigQuery-Python

    Simple Python client for interacting with Google BigQuery.

  • python-bigquery-pandas

    Google BigQuery connector for pandas

  • pypinfo

    Easily view PyPI download statistics via Google's BigQuery.

  • astro-sdk

    Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

  • Project mention: Orchestration: Thoughts on Dagster, Airflow and Prefect? | /r/dataengineering | 2023-06-01

    Have you tried the Astro SDK? https://github.com/astronomer/astro-sdk

  • bigquery-schema-generator

    Generates the BigQuery schema from newline-delimited JSON or CSV data records.

  • dbt-coves

    CLI tool for dbt users to simplify creation of staging model files (yml and sql)

  • CueObserve

    Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases

  • dbt-ml-preprocessing

    A SQL port of Python's scikit-learn preprocessing module, provided as cross-database dbt macros.

  • premier-league

    A Data Engineering project. Repository for backend infrastructure and Streamlit app files for a Premier League Dashboard.

  • Project mention: Google Cloud Portfolio Projects? | /r/googlecloud | 2023-12-09

    I have a data engineering project that uses BigQuery, Cloud Run, Compute Engine, Cloud SQL, Artifact Registry, Firestore, and Datastream.

  • dataproc-templates

    Dataproc templates and pipelines for solving simple in-cloud data tasks

  • bigquery_fdw

    BigQuery Foreign Data Wrapper for PostgreSQL

  • prism

    Prism is the easiest way to develop, orchestrate, and execute data pipelines in Python. (by runprism)

  • Project mention: Prism: the easiest way to create robust data workflows. Accessible via CLI | /r/coolgithubprojects | 2023-09-21
  • iris3

    An upgraded and improved version of the Iris automatic GCP-labeling project

  • dbd

    dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
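
Several entries above, such as bigquery-schema-generator, revolve around deriving a BigQuery schema from newline-delimited JSON records. The idea can be sketched with the standard library alone. This is a minimal illustration, not that project's implementation or API: the function names `infer_bq_type`, `merge_types` and `infer_schema` are hypothetical, only flat records with scalar values are handled, and the rule for resolving conflicting types is an assumption made for this sketch.

```python
import json


def infer_bq_type(value):
    """Map a JSON scalar to a BigQuery type name (hypothetical helper)."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    if isinstance(value, str):
        return "STRING"
    # Nested objects and arrays are out of scope for this sketch.
    raise TypeError(f"nested or unsupported value: {value!r}")


def merge_types(old, new):
    """Reconcile two observed types (assumed rule, for illustration only)."""
    if old == new:
        return old
    if {old, new} == {"INTEGER", "FLOAT"}:
        return "FLOAT"  # widen whole numbers to floating point
    return "STRING"  # fall back to STRING on any other conflict


def infer_schema(ndjson_text):
    """Return a list of BigQuery-style field definitions for flat records."""
    types = {}
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        for name, value in json.loads(line).items():
            if value is None:
                continue  # a null carries no type information
            new_type = infer_bq_type(value)
            types[name] = merge_types(types.get(name, new_type), new_type)
    # Every field is NULLABLE because any record may omit it.
    return [{"name": n, "type": t, "mode": "NULLABLE"} for n, t in types.items()]


if __name__ == "__main__":
    sample = '{"id": 1, "score": 2}\n{"id": 2, "score": 2.5, "active": true}\n'
    print(json.dumps(infer_schema(sample), indent=2))
```

Fields that only ever appear as null are dropped in this sketch, since no type can be inferred for them. A real tool also has to deal with nested records, repeated fields, timestamps and CSV input.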

Python Bigquery related posts

  • This Week In Python

    5 projects | dev.to | 17 Mar 2024
  • Show HN: I built an open-source data copy tool called ingestr

    3 projects | news.ycombinator.com | 27 Feb 2024
  • Ingestr: CLI tool to copy data between any databases with a single command

    1 project | news.ycombinator.com | 27 Feb 2024
  • JupySQL: Connecting to a SQL database from Jupyter

    1 project | /r/SQL | 9 Sep 2023
  • GitHub - ploomber/jupysql: Better SQL in Jupyter. 📊

    1 project | /r/coolgithubprojects | 6 Sep 2023
  • SQL CTE's in Jupyter notebooks, DuckDB integration and more

    1 project | /r/Jupyter | 2 Aug 2023
  • TL;DR incorporate SQL functionality within Jupyter, access to modern data processing DBs (like DuckDB), polars and data exploration through plotting easier with JupySQL.

    1 project | /r/coolgithubprojects | 2 Aug 2023

Index

What are some of the best open-source Bigquery projects in Python? This list will help you:

 #  Project                    Stars
 1  Redash                     25,057
 2  airbyte                    14,296
 3  sqlglot                     5,679
 4  ibis                        4,304
 5  ethereum-etl                2,836
 6  professional-services       2,738
 7  ingestr                     2,341
 8  swirl-search                1,552
 9  jupysql                       611
10  BigQuery-Python               451
11  python-bigquery-pandas        422
12  pypinfo                       394
13  astro-sdk                     323
14  bigquery-schema-generator     232
15  dbt-coves                     209
16  CueObserve                    208
17  dbt-ml-preprocessing          176
18  premier-league                154
19  dataproc-templates            112
20  bigquery_fdw                   89
21  prism                          79
22  iris3                          68
23  dbd                            56
