Python foundation-models

Open-source Python projects categorized as foundation-models

Top 18 Python foundation-model Projects

  • ColossalAI

    Making large AI models cheaper, faster and more accessible

  • Project mention: FLaNK AI-April 22, 2024 | dev.to | 2024-04-22
  • unilm

    Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

  • Project mention: 1-Bit LLMs Could Solve AI's Energy Demands | news.ycombinator.com | 2024-05-30
  • LLaVA

    [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

  • Project mention: PaliGemma: Open-Source Multimodal Model by Google | news.ycombinator.com | 2024-05-15

    Here's a tutorial https://wandb.ai/byyoung3/ml-news/reports/How-to-Fine-Tune-L...

    There's not really a super easy to use software solution yet, but a few different ones have cropped up. Right now you'll have to read papers to get the training recipes.

    - https://github.com/haotian-liu/LLaVA/blob/main/scripts/finet...

  • Otter

    🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

  • Project mention: OpenAI vs Google, Detect ChatGPT Content with 99% accuracy, Navigating AI compute costs | /r/ChatGPT | 2023-06-15

    👀 Video-LLaMA - Empower large language models with video and audio understanding capability.
    🦦 Otter - Multi-modal model with improved instruction-following and in-context learning ability.
    🔗 Linkly.AI - AI-powered lead analytics and management platform that helps you track, analyze, and streamline your leads in one place.
    🎬 Jet Cut Ready - AI plugin for Adobe Premiere Pro that automatically removes silent parts in videos.
    💬 HeyGen's ChatGPT Plugin - Convert text into high-quality videos using AI text and video generation.

  • NExT-GPT

    Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

  • Project mention: Show HN: NExT-GPT – First LLM working with multimodal input and output | news.ycombinator.com | 2023-09-21
  • Ask-Anything

    [CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.

  • EVA

    EVA Series: Visual Representation Fantasies from BAAI (by baaivision)

  • chronos-forecasting

    Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting

  • Project mention: TimesFM (Time Series Foundation Model) for time-series forecasting | news.ycombinator.com | 2024-05-08

    On a related note, Amazon also had a model for time series forecasting called Chronos.

    https://github.com/amazon-science/chronos-forecasting

  • autodistill

    Images to inference with no labeling (use foundation models to train supervised models).

  • Project mention: Ask HN: Who is hiring? (February 2024) | news.ycombinator.com | 2024-02-01

    Roboflow | Open Source Software Engineer, Web Designer / Developer, and more. | Full-time (Remote, SF, NYC) | https://roboflow.com/careers?ref=whoishiring0224

    Roboflow is the fastest way to use computer vision in production. We help developers give their software the sense of sight. Our end-to-end platform[1] provides tooling for image collection, annotation, dataset exploration and curation, training, and deployment.

    Over 250k engineers (including engineers from 2/3 Fortune 100 companies) build with Roboflow. We now host the largest collection of open source computer vision datasets and pre-trained models[2]. We are pushing forward the CV ecosystem with open source projects like Autodistill[3] and Supervision[4]. And we've built one of the most comprehensive resources for software engineers to learn to use computer vision with our popular blog[5] and YouTube channel[6].

    We have several openings available but are primarily looking for strong technical generalists who want to help us democratize computer vision and like to wear many hats and have an outsized impact. Our engineering culture is built on a foundation of autonomy & we don't consider an engineer fully ramped until they can "choose their own loss function". At Roboflow, engineers aren't just responsible for building things but also for helping us figure out what we should build next. We're builders & problem solvers; not just coders. (For this reason we also especially love hiring past and future founders.)

    We're currently hiring full-stack engineers for our ML and web platform teams, a web developer to bridge our product and marketing teams, several technical roles on the sales & field engineering teams, and our first applied machine learning researcher to help push forward the state of the art in computer vision.

    [1]: https://roboflow.com/?ref=whoishiring0224

    [2]: https://roboflow.com/universe?ref=whoishiring0224

    [3]: https://github.com/autodistill/autodistill

    [4]: https://github.com/roboflow/supervision

    [5]: https://blog.roboflow.com/?ref=whoishiring0224

    [6]: https://www.youtube.com/@Roboflow

  • Emu

    Emu Series: Generative Multimodal Models from BAAI (by baaivision)

  • Project mention: Show HN: Emu2 – A Gemini-like open-source 37B Multimodal Model | news.ycombinator.com | 2023-12-21

    I'm excited to introduce Emu2, the latest generative multimodal model developed by the Beijing Academy of Artificial Intelligence (BAAI). Emu2 is an open-source initiative that reflects BAAI's commitment to fostering open, secure, and responsible AI research. It's designed to enhance AI's proficiency in handling tasks across various modalities with minimal examples and straightforward instructions.

    Emu2 has demonstrated superior performance over other large-scale models like Flamingo-80B in few-shot multimodal understanding tasks. It serves as a versatile base model for developers, providing a flexible platform for crafting specialized multimodal applications.

    Key features of Emu2 include:

    - A more streamlined modeling framework than its predecessor, Emu.

    - A decoder capable of reconstructing images from the encoder's semantic space.

    - An expansion to 37 billion parameters, boosting both capabilities and generalization.

    BAAI has also released fine-tuned versions, Emu2-Chat for visual understanding and Emu2-Gen for visual generation, which stand as some of the most powerful open-source models available today.

    Here are the resources for those interested in exploring or contributing to Emu2:

    - Project: https://baaivision.github.io/emu2/

    - Model: https://huggingface.co/BAAI/Emu2

    - Code: https://github.com/baaivision/Emu/tree/main/Emu2

    - Demo: https://huggingface.co/spaces/BAAI/Emu2

    - Paper: https://arxiv.org/abs/2312.13286

    We're eager to see how the HN community engages with Emu2 and we welcome your feedback to help us improve. Let's collaborate to push the boundaries of multimodal AI!

  • lag-llama

    Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

  • Project mention: Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting | news.ycombinator.com | 2024-02-26
  • InternVideo

    Video Foundation Models & Data for Multimodal Understanding

  • ONE-PEACE

    A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

  • meerkat

    Creative interactive views of any dataset.

  • MindVideo

    Official code base for MinD-Video

  • Project mention: This research project on reconstructing video stimulus to the brain using an MRI scanner and AI algorithms reminds me of the RDA brain reading technology | /r/Avatar | 2023-06-23
  • fondant

    Production-ready data processing made easy and shareable

  • Project mention: 25 million Creative Commons image dataset released! | /r/StableDiffusion | 2023-10-01

    Github: https://github.com/ml6team/fondant

  • GRID-playground

    Platform for General Robot Intelligence Development

  • Project mention: GRID: General Robot Intelligence Development Platform | news.ycombinator.com | 2023-10-17
  • meta-prompting

    Official implementation of BGPT @ ICLR 2024 paper "Meta Prompting for AI Systems" (https://arxiv.org/abs/2311.11482)

  • Project mention: Meta Prompting for AGI Systems | news.ycombinator.com | 2024-02-29
NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python foundation-models related posts

  • Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

    1 project | news.ycombinator.com | 26 Feb 2024
  • Show HN: Emu2 – A Gemini-like open-source 37B Multimodal Model

    1 project | news.ycombinator.com | 21 Dec 2023
  • 25 million Creative Commons image dataset released!

    1 project | /r/StableDiffusion | 1 Oct 2023
  • Show HN: Autodistill, automated image labeling with foundation vision models

    1 project | news.ycombinator.com | 6 Sep 2023
  • [P] AI image generation without copyright infringement

    1 project | /r/MachineLearning | 29 Jun 2023
  • This research project on reconstructing video stimulus to the brain using an MRI scanner and AI algorithms reminds me of the RDA brain reading technology

    1 project | /r/Avatar | 23 Jun 2023
  • Autodistill: Use foundation vision models to train smaller, supervised models

    1 project | news.ycombinator.com | 22 Jun 2023

Index

What are some of the best open-source foundation-model projects in Python? This list will help you:

#   Project              Stars
1   ColossalAI           38,081
2   unilm                18,689
3   LLaVA                17,102
4   Otter                 3,473
5   NExT-GPT              2,953
6   Ask-Anything          2,758
7   EVA                   2,011
8   chronos-forecasting   1,855
9   autodistill           1,606
10  Emu                   1,519
11  lag-llama             1,029
12  InternVideo           1,013
13  ONE-PEACE               859
14  meerkat                 814
15  MindVideo               352
16  fondant                 322
17  GRID-playground         246
18  meta-prompting           41
