Python multi-modality

Open-source Python projects categorized as multi-modality

Top 8 Python multi-modality Projects

  • LLaVA

    [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

  • Project mention: PaliGemma: Open-Source Multimodal Model by Google | news.ycombinator.com | 2024-05-15

    Here's a tutorial https://wandb.ai/byyoung3/ml-news/reports/How-to-Fine-Tune-L...

    There's not really a super easy to use software solution yet, but a few different ones have cropped up. Right now you'll have to read papers to get the training recipes.

    - https://github.com/haotian-liu/LLaVA/blob/main/scripts/finet...

  • clip-as-service

    🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP

  • Project mention: Search for anything ==> Immich fails to download textual.onnx | /r/immich | 2023-09-15
  • deep-daze

    Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun

  • Otter

    🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

  • Project mention: OpenAI vs Google, Detect ChatGPT Content with 99% accuracy, Navigating AI compute costs | /r/ChatGPT | 2023-06-15

    👀 Video-LLaMA - Empower large language models with video and audio understanding capability.
    🦦 Otter - Multi-modal model with improved instruction-following and in-context learning ability.
    🔗 Linkly.AI - AI-powered lead analytics and management platform that helps you track, analyze, and streamline your leads in one place.
    🎬 Jet Cut Ready - AI plugin for Adobe Premiere Pro that automatically removes silent parts in videos.
    💬 HeyGen's ChatGPT Plugin - Convert text into high-quality videos using AI text and video generation.

  • swarms

    Orchestrate swarms of agents from any framework, such as OpenAI and LangChain, for business operation automation. Join our Community: https://discord.gg/DbjBMJTSWD

  • Project mention: Swarms – Automating all digital activities with millions of autonomous AI Agents | news.ycombinator.com | 2023-07-10
  • Multi-Modality-Arena

    Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

  • Project mention: [R] Tiny LVLM-eHub: Early Multimodal Experiments with Bard - OpenGVLab, Shanghai AI Laboratory 2023 - Encourages innovative strategies aimed at advancing multimodal techniques! | /r/MachineLearning | 2023-08-13

    Github: https://github.com/OpenGVLab/Multi-Modality-Arena

  • Sophia

    Effortless plug-and-play optimizer to cut model training costs by 50%. New optimizer that is 2x faster than Adam on LLMs. (by kyegomez)

  • Project mention: [D] Potential scammer on github stealing work of other ML researchers? | /r/MachineLearning | 2023-08-17
  • multi_token

    Embed arbitrary modalities (images, audio, documents, etc) into large language models.

  • Project mention: Embed arbitrary modalities (images, audio, documents, etc.) into LLMs | news.ycombinator.com | 2023-12-18
NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python multi-modality related posts

  • [R] Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

    2 projects | /r/MachineLearning | 26 May 2023
  • The Sophia optimizer, a faster alternative to AdamW

    2 projects | news.ycombinator.com | 24 May 2023

Index

What are some of the best open-source multi-modality projects in Python? This list will help you:

#  Project               Stars
1  LLaVA                 17,102
2  clip-as-service       12,232
3  deep-daze              4,379
4  Otter                  3,473
5  swarms                   739
6  Multi-Modality-Arena     387
7  Sophia                   361
8  multi_token              150
