Top 12 Python apache-spark Projects
-
covid-19-data-engineering-pipeline
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
-
Traffic-Data-Analysis-with-Apache-Spark-Based-on-Mobile-Robot-Data
Mobile robot data were analyzed with Apache Spark to produce five statistical results: travel time, waiting time, average speed, occupancy, and density.
-
xonai-dashboard
A Grafana-based application to assist Big Data infrastructure optimization initiatives where Spark applications are a dominant cost driver
-
transactional-datalake-using-amazon-msk-and-apache-iceberg-on-aws-glue
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming using Amazon MSK and MSK Connect (Debezium)
Project mention: Mlflow: Open-source platform for the machine learning lifecycle | news.ycombinator.com | 2024-05-16
Project mention: Show HN: Open sourcing a Big Data monitoring tool | news.ycombinator.com | 2024-03-29
Python apache-spark related posts
-
Mlflow: Open-source platform for the machine learning lifecycle
-
Observations on MLOps–A Fragmented Mosaic of Mismatched Expectations
-
Explain me how websites like Dall-E, chatgpt, thispersondoesntexit process the user data so quickly
-
[D] What licensed software do you use for machine learning experimentation tracking?
-
[Q] Is there a tool to keep track of my ML experiments?
-
Remote file access vulnerability in `mlflow server` and `mlflow ui` CLIs
-
Critical CVE in `mlflow` 2.2.0 and under: Remote file access vulnerability in `mlflow server` and `mlflow ui` CLIs; possible lateral movement into aws creds
Index
What are some of the best open-source apache-spark projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | MLflow | 17,475 |
2 | flintrock | 633 |
3 | quinn | 583 |
4 | PySpark-Boilerplate | 391 |
5 | sparktorch | 335 |
6 | dataproc-templates | 112 |
7 | Apache-Spark-Guide | 28 |
8 | covid-19-data-engineering-pipeline | 22 |
9 | Traffic-Data-Analysis-with-Apache-Spark-Based-on-Mobile-Robot-Data | 10 |
10 | xonai-dashboard | 11 |
11 | livyc | 3 |
12 | transactional-datalake-using-amazon-msk-and-apache-iceberg-on-aws-glue | 1 |