Top 18 Python foundation-model Projects

ColossalAI

42 38,081 9.7 Python

Making large AI models cheaper, faster and more accessible

Project mention: FLaNK AI-April 22, 2024 | dev.to | 2024-04-22

unilm

42 18,689 9.0 Python

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Project mention: 1-Bit LLMs Could Solve AI's Energy Demands | news.ycombinator.com | 2024-05-30

LLaVA

21 17,102 9.3 Python

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Project mention: PaliGemma: Open-Source Multimodal Model by Google | news.ycombinator.com | 2024-05-15

Here's a tutorial https://wandb.ai/byyoung3/ml-news/reports/How-to-Fine-Tune-L...
There's not really a super easy to use software solution yet, but a few different ones have cropped up. Right now you'll have to read papers to get the training recipes.
- https://github.com/haotian-liu/LLaVA/blob/main/scripts/finet...

Otter

4 3,473 9.1 Python

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Project mention: OpenAI vs Google, Detect ChatGPT Content with 99% accuracy, Navigating AI compute costs | /r/ChatGPT | 2023-06-15

👀 Video-LLaMA - Empower large language models with video and audio understanding capability.
🦦 Otter - Multi-modal model with improved instruction-following and in-context learning ability.
🔗 Linkly.AI - AI-powered lead analytics and management platform that helps you track, analyze, and streamline your leads in one place.
🎬 Jet Cut Ready - AI plugin for Adobe Premiere Pro that automatically removes silent parts in videos.
💬 HeyGen's ChatGPT Plugin - Convert text into high-quality videos using AI text and video generation.

NExT-GPT

1 2,953 9.3 Python

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

Project mention: Show HN: NExT-GPT – First LLM working with multimodal input and output | news.ycombinator.com | 2023-09-21

Ask-Anything

3 2,758 8.1 Python

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

EVA

2 2,011 6.2 Python

EVA Series: Visual Representation Fantasies from BAAI (by baaivision)

chronos-forecasting

4 1,855 7.7 Python

Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting

Project mention: TimesFM (Time Series Foundation Model) for time-series forecasting | news.ycombinator.com | 2024-05-08

On a related note, Amazon also had a model for time series forecasting called Chronos.
https://github.com/amazon-science/chronos-forecasting

autodistill

13 1,606 9.1 Python

Images to inference with no labeling (use foundation models to train supervised models).

Project mention: Ask HN: Who is hiring? (February 2024) | news.ycombinator.com | 2024-02-01

Roboflow | Open Source Software Engineer, Web Designer / Developer, and more. | Full-time (Remote, SF, NYC) | https://roboflow.com/careers?ref=whoishiring0224
Roboflow is the fastest way to use computer vision in production. We help developers give their software the sense of sight. Our end-to-end platform[1] provides tooling for image collection, annotation, dataset exploration and curation, training, and deployment.
Over 250k engineers (including engineers from 2/3 Fortune 100 companies) build with Roboflow. We now host the largest collection of open source computer vision datasets and pre-trained models[2]. We are pushing forward the CV ecosystem with open source projects like Autodistill[3] and Supervision[4]. And we've built one of the most comprehensive resources for software engineers to learn to use computer vision with our popular blog[5] and YouTube channel[6].
We have several openings available but are primarily looking for strong technical generalists who want to help us democratize computer vision and like to wear many hats and have an outsized impact. Our engineering culture is built on a foundation of autonomy & we don't consider an engineer fully ramped until they can "choose their own loss function". At Roboflow, engineers aren't just responsible for building things but also for helping us figure out what we should build next. We're builders & problem solvers; not just coders. (For this reason we also especially love hiring past and future founders.)
We're currently hiring full-stack engineers for our ML and web platform teams, a web developer to bridge our product and marketing teams, several technical roles on the sales & field engineering teams, and our first applied machine learning researcher to help push forward the state of the art in computer vision.
[1]: https://roboflow.com/?ref=whoishiring0224
[2]: https://roboflow.com/universe?ref=whoishiring0224
[3]: https://github.com/autodistill/autodistill
[4]: https://github.com/roboflow/supervision
[5]: https://blog.roboflow.com/?ref=whoishiring0224
[6]: https://www.youtube.com/@Roboflow

Emu

2 1,519 7.4 Python

Emu Series: Generative Multimodal Models from BAAI (by baaivision)

Project mention: Show HN: Emu2 – A Gemini-like open-source 37B Multimodal Model | news.ycombinator.com | 2023-12-21

I'm excited to introduce Emu2, the latest generative multimodal model developed by the Beijing Academy of Artificial Intelligence (BAAI). Emu2 is an open-source initiative that reflects BAAI's commitment to fostering open, secure, and responsible AI research. It's designed to enhance AI's proficiency in handling tasks across various modalities with minimal examples and straightforward instructions.
Emu2 has demonstrated superior performance over other large-scale models like Flamingo-80B in few-shot multimodal understanding tasks. It serves as a versatile base model for developers, providing a flexible platform for crafting specialized multimodal applications.
Key features of Emu2 include:
- A more streamlined modeling framework than its predecessor, Emu.
- A decoder capable of reconstructing images from the encoder's semantic space.
- An expansion to 37 billion parameters, boosting both capabilities and generalization.
BAAI has also released fine-tuned versions, Emu2-Chat for visual understanding and Emu2-Gen for visual generation, which stand as some of the most powerful open-source models available today.
Here are the resources for those interested in exploring or contributing to Emu2:
- Project: https://baaivision.github.io/emu2/
- Model: https://huggingface.co/BAAI/Emu2
- Code: https://github.com/baaivision/Emu/tree/main/Emu2
- Demo: https://huggingface.co/spaces/BAAI/Emu2
- Paper: https://arxiv.org/abs/2312.13286
We're eager to see how the HN community engages with Emu2 and we welcome your feedback to help us improve. Let's collaborate to push the boundaries of multimodal AI!

lag-llama

2 1,029 8.6 Python

Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

Project mention: Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting | news.ycombinator.com | 2024-02-26

InternVideo

3 1,013 8.4 Python

Video Foundation Models & Data for Multimodal Understanding

ONE-PEACE

2 859 8.6 Python

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

meerkat

2 814 7.3 Python

Creative interactive views of any dataset.

MindVideo

7 352 4.9 Python

Official code base for MinD-Video

Project mention: This research project on reconstructing video stimulus to the brain using an MRI scanner and AI algorithms reminds me of the RDA brain reading technology | /r/Avatar | 2023-06-23

fondant

4 322 9.6 Python

Production-ready data processing made easy and shareable

Project mention: 25 million Creative Commons image dataset released! | /r/StableDiffusion | 2023-10-01

Github: https://github.com/ml6team/fondant

GRID-playground

1 246 4.9 Python

Platform for General Robot Intelligence Development

Project mention: GRID: General Robot Intelligence Development Platform | news.ycombinator.com | 2023-10-17

meta-prompting

1 41 8.9 Python

Official implementation of BGPT @ ICLR 2024 paper "Meta Prompting for AI Systems" (https://arxiv.org/abs/2311.11482)

Project mention: Meta Prompting for AGI Systems | news.ycombinator.com | 2024-02-29

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python foundation-models related posts

Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

1 project | news.ycombinator.com | 26 Feb 2024

Show HN: Emu2 – A Gemini-like open-source 37B Multimodal Model

1 project | news.ycombinator.com | 21 Dec 2023

25 million Creative Commons image dataset released!

1 project | /r/StableDiffusion | 1 Oct 2023

Show HN: Autodistill, automated image labeling with foundation vision models

1 project | news.ycombinator.com | 6 Sep 2023

[P] AI image generation without copyright infringement

1 project | /r/MachineLearning | 29 Jun 2023

This research project on reconstructing video stimulus to the brain using an MRI scanner and AI algorithms reminds me of the RDA brain reading technology

1 project | /r/Avatar | 23 Jun 2023

Autodistill: Use foundation vision models to train smaller, supervised models

1 project | news.ycombinator.com | 22 Jun 2023

Index

What are some of the best open-source foundation-model projects in Python? This list will help you:

#	Project	Stars
1	ColossalAI	38,081
2	unilm	18,689
3	LLaVA	17,102
4	Otter	3,473
5	NExT-GPT	2,953
6	Ask-Anything	2,758
7	EVA	2,011
8	chronos-forecasting	1,855
9	autodistill	1,606
10	Emu	1,519
11	lag-llama	1,029
12	InternVideo	1,013
13	ONE-PEACE	859
14	meerkat	814
15	MindVideo	352
16	fondant	322
17	GRID-playground	246
18	meta-prompting	41