ML

Top 23 ML Open-Source Projects

  • tensorflow

    An Open Source Machine Learning Framework for Everyone

  • Project mention: Rebuilding TensorFlow 2.8.4 on Ubuntu 22.04 to patch vulnerabilities | dev.to | 2024-06-02

    The official 2.8.4 container was published in Nov 2022. That's 1.5 years of OS updates at least. I looked up the 2.8.4 source and found that it's using Ubuntu 20.04 as the base OS. Of note, we're using the x86_64 architecture according to the container image layer: ENV NVARCH=x86_64.

  • ML-For-Beginners

    12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all

  • Project mention: Good coding groups for black women? | news.ycombinator.com | 2024-01-13

    - https://github.com/microsoft/ML-For-Beginners

    Also check out this list Pitt puts out every year:

  • yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

  • Project mention: Mastering YOLOv10: A Complete Guide with Hands-On Projects | dev.to | 2024-05-30

    Docs

  • netron

    Visualizer for neural network, deep learning and machine learning models

  • Project mention: Giving Odin Intelligence | dev.to | 2024-05-21

    use handy visualizers, for example, https://netron.app/

  • handson-ml

    ⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.

  • MindsDB

    The platform for customizing AI from enterprise data

  • Project mention: How to build your Developer Portfolio with MindsDB: The symbiotic relationship between developers and Opensource in 2024. | dev.to | 2024-05-23

    Developers are able to check for issues to fix on MindsDB’s GitHub Issues Page. The issues are marked with labels which indicate what you can work on, which you can find here. Fixing bugs showcases that you are a problem solver and capable of resolving issues. Companies find this capability very valuable as it has an impact on the quality of their product and user experience.

  • MLflow

    Open source platform for the machine learning lifecycle

  • Project mention: Mlflow: Open-source platform for the machine learning lifecycle | news.ycombinator.com | 2024-05-16
  • StableLM

    StableLM: Stability AI Language Models

  • Project mention: The Era of 1-bit LLMs: ternary parameters for cost-effective computing | news.ycombinator.com | 2024-02-28

    https://github.com/Stability-AI/StableLM?tab=readme-ov-file#...

  • best-of-ml-python

    🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

  • kubeflow

    Machine Learning Toolkit for Kubernetes

  • awesome-mlops

    A curated list of references for MLOps

  • ludwig

    Low-code framework for building custom LLMs, neural networks, and other AI models

  • Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07

    This is a great project, little bit similar to https://github.com/ludwig-ai/ludwig, but it includes testing capabilities and ablation.

    Questions regarding the LLM testing aspect: How extensive is the test coverage for LLM use cases, and what is the current state of this project area? Do you offer any guarantees, or is it considered an open-ended problem?

    Would love to see more progress toward this area!

  • dopamine

    Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

  • ML.NET

    ML.NET is an open source and cross-platform machine learning framework for .NET.

  • pycaret

    An open-source, low-code machine learning library in Python

  • MNN

    MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

  • Project mention: [D][R] Deploying deep models on memory constrained devices | /r/MachineLearning | 2023-10-03

    However, I am looking at this subject through the problem of training/fine-tuning deep models on edge devices, which is becoming increasingly feasible. Looking at tflite, Alibaba's MNN, mit-han-lab's tinyengine, etc.

  • deeplake

    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

  • Project mention: FLaNK AI Weekly 25 March 2025 | dev.to | 2024-03-25
  • metaflow

    :rocket: Build and manage real-life ML, AI, and data science projects with ease!

  • Project mention: FLaNK Stack 05 Feb 2024 | dev.to | 2024-02-05
  • unstructured

    Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

  • Project mention: LlamaCloud and LlamaParse | news.ycombinator.com | 2024-02-20

    Be careful with unstructured:

    https://github.com/Unstructured-IO/unstructured/blob/d11c70c...

    from: https://github.com/open-webui/open-webui/issues/687

  • CoreML-Models

    Largest list of models for Core ML (for iOS 11+)

  • serving

    A flexible, high-performance serving system for machine learning models

  • Project mention: Llama.cpp: Full CUDA GPU Acceleration | news.ycombinator.com | 2023-06-12

    Yet another TEDIOUS BATTLE: Python vs. C++/C stack.

    This project gained popularity due to the HIGH DEMAND for running large models with 1B+ parameters, like `llama`. Python dominates the interface and training ecosystem, but prior to llama.cpp, non-ML professionals showed little interest in a fast C++ interface library. While existing solutions like tensorflow-serving [1] in C++ were sufficiently fast with GPU support, llama.cpp took the initiative to optimize for CPU and trim unnecessary code, essentially code-golfing and sacrificing some algorithm correctness for improved performance, which isn't favored by "ML research".

    NOTE: In my opinion, a true pioneer was DarkNet, which implemented the YOLO model series and significantly outperformed others [2]. Same trick basically like llama.cpp

    [1] https://github.com/tensorflow/serving

  • llm

    An ecosystem of Rust libraries for working with large language models

  • Project mention: Open-sourcing a simple automation/agent workflow builder | /r/ChatGPTPro | 2023-10-07

    We're open-sourcing a project that lets you build simple automations/agent workflows that use LLMs for different tasks. Kinda like Zapier or IFTTT but focused on using natural language to accomplish your tasks. It's super early but we'd love to start getting feedback to steer it in the right direction. It currently supports OpenAI and local models through llm.

  • oneflow

    OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

ML related posts

  • How to build your Developer Portfolio with MindsDB: The symbiotic relationship between developers and Opensource in 2024.

    1 project | dev.to | 23 May 2024
  • Mlflow: Open-source platform for the machine learning lifecycle

    1 project | news.ycombinator.com | 16 May 2024
  • Show HN: LLM-powered NPCs running on your hardware

    4 projects | news.ycombinator.com | 30 Apr 2024
  • Observations on MLOps–A Fragmented Mosaic of Mismatched Expectations

    1 project | dev.to | 26 Apr 2024
  • Machine Learning with PHP

    3 projects | dev.to | 22 Apr 2024
  • Show HN: Open-source Google Docs for audio transcriptions (Whisper)

    2 projects | news.ycombinator.com | 17 Apr 2024
  • What’s the Difference Between Fine-tuning, Retraining, and RAG?

    1 project | dev.to | 8 Apr 2024

Index

What are some of the best open-source ML projects? This list will help you:

Rank  Project            Stars
1     tensorflow         183,162
2     ML-For-Beginners   67,497
3     yolov5             47,719
4     netron             26,489
5     handson-ml         25,111
6     MindsDB            21,531
7     MLflow             17,475
8     StableLM           15,859
9     best-of-ml-python  15,869
10    kubeflow           13,778
11    awesome-mlops      11,865
12    ludwig             10,893
13    dopamine           10,397
14    ML.NET             8,879
15    pycaret            8,553
16    MNN                8,373
17    deeplake           7,799
18    metaflow           7,688
19    unstructured       7,017
20    CoreML-Models      6,274
21    serving            6,101
22    llm                5,980
23    oneflow            5,759
