Top 23 Datascience Open-Source Projects
-
machine_learning_complete
A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
-
Mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
-
OpenMetadata
OpenMetadata is a unified platform for discovery, observability, and governance powered by a central metadata repository, in-depth lineage, and seamless team collaboration.
-
sql-translator
SQL Translator is a tool for converting natural language queries into SQL code using artificial intelligence. This project is 100% free and open source.
-
awesome-conformal-prediction
A professionally curated list of awesome Conformal Prediction videos, tutorials, books, papers, PhD and MSc theses, articles and open-source libraries.
-
An-Introduction-to-Statistical-Learning
This repository contains the exercises from the book "An Introduction to Statistical Learning" and their solutions in Python.
-
Fast-F1
FastF1 is a Python package for accessing and analyzing Formula 1 results, schedules, timing data, and telemetry.
-
CleverCSV
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the built-in csv module with improved dialect detection, and comes with a handy command-line application for working with CSV files.
-
code
Compilation of R and Python code from the Data Professor YouTube channel. (by dataprofessor)
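The CleverCSV entry above mentions dialect detection. As a rough illustration of what that means, here is a minimal sketch using only Python's standard-library `csv.Sniffer`, which is the baseline that CleverCSV describes itself as improving on. It does not use CleverCSV itself, and the sample data and the helper name `read_with_detected_dialect` are made up for this example.

```python
import csv
import io

# Hypothetical sample: a semicolon-delimited file, a common "messy CSV" case.
SAMPLE = (
    "name;age;city\n"
    "Alice;30;Paris\n"
    "Bob;25;Berlin\n"
    "Carol;41;Madrid\n"
)


def read_with_detected_dialect(text):
    """Guess the dialect from the text itself, then parse every row with it."""
    # Sniffer inspects the sample and returns a Dialect subclass
    # describing the delimiter, quote character, and so on.
    dialect = csv.Sniffer().sniff(text)
    rows = list(csv.reader(io.StringIO(text), dialect))
    return dialect.delimiter, rows


delimiter, rows = read_with_detected_dialect(SAMPLE)
print(delimiter)
print(rows[1])
```

The standard-library sniffer relies on simple heuristics and can guess wrong on irregular files, which is the kind of case the CleverCSV description refers to as "messy".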
Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07
This is a great project, a little bit similar to https://github.com/ludwig-ai/ludwig, but it includes testing capabilities and ablation.
Questions regarding the LLM testing aspect: How extensive is the test coverage for LLM use cases, and what is the current state of this project area? Do you offer any guarantees, or is it considered an open-ended problem?
Would love to see more progress in this area!
Project mention: Python Day 9: Building Interactive Web Apps without HTML/CSS and JavaScript | dev.to | 2024-04-26
Taipy is an open-source Python library that enables data scientists and developers to build robust end-to-end data pipelines.
panel – data exploration & web app framework for Python
Project mention: How to Dynamically Adjust the Height of a Textarea in ReactJS | dev.to | 2023-10-25
In this blog post, I have demonstrated how I addressed the challenge of dynamically adjusting the height of a textarea element based on its content, preventing the need for vertical scrolling in the title section of the OpenMetadata Knowledge article page.
Project mention: Dive Deep into Conformal Prediction with This Ultimate Resource Compilation | news.ycombinator.com | 2024-04-15
Project mention: Python: Uncovering the Overlooked Core Functionalities | news.ycombinator.com | 2023-07-24
If you actually think this code is better, there's a real library that does this: https://github.com/EntilZha/PyFunctional.
Project mention: Multiple Notepad++ Flaws Let Attackers Execute Arbitrary Code | news.ycombinator.com | 2023-09-04
https://github.com/microsoft/vscode/issues/4490
It looks like there are a number of vscode extensions for recording macros:
- https://www.google.com/search?q=vscode+macro+recorder
- https://marketplace.visualstudio.com/search?term=Macro&targe...
- the macro-commander README explains its JSON-based macro language. YAML might be easier to maintain than JSON. https://github.com/jeff-hykin/macro-commander#what-are-some-...
For teams with multiple editors, you can specify workflow automation scripts with shell scripts or ci container/cmd YAML, and/or pre-commit.yml instead of with an IDE-specific tool.
Isn't there native real-time collaboration functionality in vscode/vscodium that would be useful for a native macro recording feature? (Edit) Live Share can't be installed in vscodium. https://github.com/VSCodium/vscodium/issues/128
Support for jupyter-collaboration Y.js CRDT could be added to vscode-jupyter and/or a more generic extension: "Support for real-time collaboration in the extension?" https://github.com/microsoft/vscode-jupyter/discussions/1293...
jupyterlab/jupyter-collaboration:
Datascience related posts
-
Python Day 9: Building Interactive Web Apps without HTML/CSS and JavaScript
-
Dive Deep into Conformal Prediction with This Ultimate Resource Compilation
-
+10 Resources to Empower Women in Technology
-
Show HN: Building data and AI apps, an alternative to Streamlit
-
Our open-source project for building AI / Data full-stack apps got funded! 🎉 🎉
-
Plotting 1,000,000 points on a webpage using only Python
-
Forecasts need to have error bars
-
Index
What are some of the best open-source Datascience projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | ds-cheatsheets | 13,894 |
2 | ludwig | 10,893 |
3 | modin | 9,524 |
4 | Taipy | 9,282 |
5 | metaflow | 7,688 |
6 | machine_learning_complete | 4,529 |
7 | Mimesis | 4,315 |
8 | panel | 4,308 |
9 | OpenMetadata | 4,343 |
10 | datascience | 4,130 |
11 | sql-translator | 4,025 |
12 | awesome-conformal-prediction | 24 |
13 | PyFunctional | 2,347 |
14 | An-Introduction-to-Statistical-Learning | 2,285 |
15 | Fast-F1 | 2,238 |
16 | DataScienceR | 1,959 |
17 | ggstatsplot | 1,939 |
18 | openllmetry | 1,391 |
19 | vscode-jupyter | 1,232 |
20 | CleverCSV | 1,226 |
21 | easystats | 1,040 |
22 | code | 881 |
23 | streamlit-geospatial | 814 |