Python data-lake

Open-source Python projects categorized as data-lake

Top 3 Python data-lake Projects

  • dlt

    data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

  • Project mention: Show HN: Automatically extract data from APIs with dlt and OpenAPI | news.ycombinator.com | 2024-05-29

    - You always have the last say. The generated code is declarative and ready to hack in case we pick the wrong paginator or response entity.

    The tool and dlt are open source. Find the code here: https://github.com/dlt-hub/dlt-init-openapi and here: https://github.com/dlt-hub/dlt

  • Udacity-Data-Engineering-Projects

    A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.

  • Project mention: Pitanje za data engineering? (Question about data engineering?) | /r/programiranje | 2023-06-30
  • amazon-s3-find-and-forget

    Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR)

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python data-lake related posts

  • Show HN: Automatically extract data from APIs with dlt and OpenAPI

    2 projects | news.ycombinator.com | 29 May 2024
  • Show HN: Data load tool(dlt)-Python library to automate the creation of datasets

    3 projects | news.ycombinator.com | 24 Oct 2023
  • Data load tool (dlt) – open-source Python library that makes data loading easy

    1 project | news.ycombinator.com | 17 Oct 2023
  • [Discussion] How to implement Data Contracts generically? Seeking advice from data contract users.

    1 project | /r/MachineLearning | 6 Sep 2023
  • Deleting particular data from S3 External Tables

    1 project | /r/dataengineering | 31 Oct 2022
  • Update S3 Files

    1 project | /r/aws | 27 Jan 2022

Index

What are some of the best open-source data-lake projects in Python? This list will help you:

#  Project                            Stars
1  dlt                                1,837
2  Udacity-Data-Engineering-Projects  1,363
3  amazon-s3-find-and-forget            233
