Top 23 Python llm Projects

MetaGPT

32 39,707 10.0 Python

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Project mention: Can AI replace a co-founder? | news.ycombinator.com | 2024-01-07

https://github.com/geekan/MetaGPT :
> MetaGPT takes a one line requirement as input and outputs user stories / competitive analysis / requirements / data structures / APIs / documents, etc.
https://news.ycombinator.com/item?id=29141796 ; "Co-Founder Equity Calculator"
"Ask HN: What are your go to SaaS products for startups/MVPs?" (2020) https://news.ycombinator.com/item?id=23535828 ; FounderKit, StackShare
> USA Small Business Administration: "10 steps to start your business." https://www.sba.gov/starting-business/how-start-business/10-...
>> "Startup Incorporation Checklist: How to bootstrap a Delaware C-corp (or S-corp) with employee(s) in California" https://github.com/leonar15/startup-checklist

llama_index

75 31,628 10.0 Python

LlamaIndex is a data framework for your LLM applications

Project mention: LlamaIndex: A data framework for your LLM applications | news.ycombinator.com | 2024-04-07

InfluxDB

www.influxdata.com featured

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
chatgpt-on-wechat

1 25,427 9.4 Python

基于大模型搭建的聊天机器人，同时支持微信公众号、企业微信应用、飞书、钉钉等接入，可选择GPT3.5/GPT-4o/GPT4.0/ Claude/文心一言/讯飞星火/通义千问/ Gemini/GLM-4/Claude/Kimi/LinkAI，能处理文本、语音和图片，访问操作系统和互联网，支持基于自有知识库进行定制企业智能客服。
MindsDB

78 21,424 10.0 Python

The platform for customizing AI from enterprise data

Project mention: What’s the Difference Between Fine-tuning, Retraining, and RAG? | dev.to | 2024-04-08

Check us out on GitHub.

LLaMA-Factory

3 21,791 9.9 Python

Unify Efficient Fine-Tuning of 100+ LLMs

Project mention: FLaNK-AIM Weekly 06 May 2024 | dev.to | 2024-05-06

vllm

31 19,344 9.9 Python

A high-throughput and memory-efficient inference and serving engine for LLMs

Project mention: AI leaderboards are no longer useful. It's time to switch to Pareto curves | news.ycombinator.com | 2024-04-30

I guess the root cause of my claim is that OpenAI won't tell us whether or not GPT-3.5 is an MoE model, and I assumed it wasn't. Since GPT-3.5 is clearly nondeterministic at temp=0, I believed the nondeterminism was due to FPU stuff, and this effect was amplified with GPT-4's MoE. But if GPT-3.5 is also MoE then that's just wrong.
What makes this especially tricky is that small models are truly 100% deterministic at temp=0 because the relative likelihoods are too coarse for FPU issues to be a factor. I had thought 3.5 was big enough that some of its token probabilities were too fine-grained for the FPU. But that's probably wrong.
On the other hand, it's not just GPT, there are currently floating-point difficulties in vllm which significantly affect the determinism of any model run on it: https://github.com/vllm-project/vllm/issues/966 Note that a suggested fix is upcasting to float32. So it's possible that GPT-3.5 is using an especially low-precision float and introducing nondeterminism by saving money on compute costs.
Sadly I do not have the money[1] to actually run a test to falsify any of this. It seems like this would be a good little research project.
[1] Or the time, or the motivation :) But this stuff is expensive.

unilm

41 18,548 9.0 Python

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Project mention: GPUs Go Brrr | news.ycombinator.com | 2024-05-12

SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
Chinese-LLaMA-Alpaca

4 17,539 8.3 Python

中文LLaMA&Alpaca大语言模型+本地CPU/GPU训练部署 (Chinese LLaMA & Alpaca LLMs)

Project mention: Chinese-Alpaca-Plus-13B-GPTQ | /r/LocalLLaMA | 2023-05-30

I'd like to share with you today the Chinese-Alpaca-Plus-13B-GPTQ model, which is the GPTQ format quantised 4bit models of Yiming Cui's Chinese-LLaMA-Alpaca 13B for GPU reference.

mlc-llm

89 17,150 9.9 Python

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Project mention: FLaNK 04 March 2024 | dev.to | 2024-03-04

ChatGLM2-6B

4 15,534 6.6 Python

ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型

Project mention: Are We Overlooking China's Progress in AI? | /r/singularity | 2023-06-26

peft

26 14,083 9.7 Python

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Project mention: LoftQ: LoRA-fine-tuning-aware Quantization | news.ycombinator.com | 2023-12-19

Qwen

5 11,430 9.4 Python

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Project mention: What the heck is so great about this model? | /r/SillyTavernAI | 2023-12-07

Qwen: https://github.com/QwenLM/Qwen

pandas-ai

14 11,140 9.8 Python

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

Project mention: PandasAI is great but is there a more general library? | news.ycombinator.com | 2023-08-23

ludwig

3 10,859 9.5 Python

Low-code framework for building custom LLMs, neural networks, and other AI models

Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07

This is a great project, little bit similar to https://github.com/ludwig-ai/ludwig, but it includes testing capabilities and ablation.
questions regarding the LLM testing aspect: How extensive is the test coverage for LLM use cases, and what is the current state of this project area? Do you offer any guarantees, or is it considered an open-ended problem?
Would love to see more progress toward this area!

h2ogpt

28 10,686 10.0 Python

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/

Project mention: Ask HN: How do I train a custom LLM/ChatGPT on my own documents in Dec 2023? | news.ycombinator.com | 2023-12-24

As others have said you want RAG.
The most feature complete implementation I've seen is h2ogpt[0] (not affiliated).
The code is kind of a mess (most of the logic is in an ~8000 line python file) but it supports ingestion of everything from YouTube videos to docx, pdf, etc - either offline or from the web interface. It uses langchain and a ton of additional open source libraries under the hood. It can run directly on Linux, via docker, or with one-click installers for Mac and Windows.
It has various model hosting implementations built in - transformers, exllama, llama.cpp as well as support for model serving frameworks like vLLM, HF TGI, etc or just OpenAI.
You can also define your preferred embedding model along with various other parameters but I've found the out of box defaults to be pretty sane and usable.
[0] - https://github.com/h2oai/h2ogpt

gorilla

52 10,276 8.9 Python

Gorilla: An API store for LLMs

Project mention: Raft: Sailing Llama towards better domain-specific RAG | news.ycombinator.com | 2024-05-09

Retrieval-Augmented Fine-Tuning is a really promising technique.
FTA:
> Tianjun and Shishir were looking to improve these deficiencies of RAG. They hypothesized that a student who studies the textbooks before the open-book exam would be more likely to perform better than a student who references the textbook only during the exam. Translating that back to LLMs, if a model “studied” the documents beforehand, could that improve its RAG performance?
Incidentally, the team who wrote the paper released some nice code to generate domain-specific fine-tuning datasets: https://github.com/ShishirPatil/gorilla/tree/main/raft

ml-engineering

9 9,890 9.7 Python

Machine Learning Engineering Open Book

Project mention: Accelerators | news.ycombinator.com | 2024-02-22

LLMSurvey

3 8,967 7.9 Python

The official GitHub page for the survey paper "A Survey of Large Language Models".

Project mention: Ask HN: Textbook Regarding LLMs | news.ycombinator.com | 2024-03-23

Here’s another one - it’s older but has some interesting charts and graphs.
https://arxiv.org/abs/2303.18223

OpenLLM

25 8,920 9.9 Python

Run any open-source LLMs, such as Llama 2, Mistral, as OpenAI compatible API endpoint in the cloud.

Project mention: First 15 Open Source Advent projects | dev.to | 2023-12-15

13. OpenLLM by BentoML | Github | tutorial

embedchain

6 8,576 9.8 Python

Personalizing LLM Responses

Project mention: Ask HN: How do I train a custom LLM/ChatGPT on my own documents in Dec 2023? | news.ycombinator.com | 2023-12-24

You can use embedchain[1] to connect various data sources and then get a RAG application running on your local and production very easily. Embedchain is an open source RAG framework and It follows a conventional but configurable approach.
The conventional approach is suitable for software engineer where they may not be less familiar with AI. The configurable approach is suitable for ML engineer where they have sophisticated uses and would want to configure chunking, indexing and retrieval strategies.
[1]: https://github.com/embedchain/embedchain

nebuly

105 8,363 8.4 Python

The user analytics platform for LLMs

Project mention: Nebuly – The LLM Analytics Platform | news.ycombinator.com | 2023-10-07

shell_gpt

38 8,391 8.0 Python

A command-line productivity tool powered by AI large language models like GPT-4, will help you accomplish your tasks faster and more efficiently.

Project mention: Oh My Zsh | news.ycombinator.com | 2024-01-22

https://github.com/TheR1D/shell_gpt?tab=readme-ov-file#shell...

unsloth

15 9,703 9.4 Python

Finetune Llama 3, Mistral & Gemma LLMs 2-5x faster with 80% less memory

Project mention: Ask HN: Most efficient way to fine-tune an LLM in 2024? | news.ycombinator.com | 2024-04-04

Gemma 7b is 2.4x faster than HF + FA2.
Check out https://github.com/unslothai/unsloth for full benchmarks!

SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The open source projects on this list are ordered by number of github stars. The number of mentions indicates repo mentiontions in the last 12 Months or since we started tracking (Dec 2020).

Python llm related posts

Build a simple RAG chatbot with LangChain...

2 projects | dev.to | 17 May 2024
Show HN: Generate a Quiz from Any Url

1 project | news.ycombinator.com | 17 May 2024
Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant

1 project | dev.to | 16 May 2024
Show HN: An Open source platform for building voice first multimodal agents

1 project | news.ycombinator.com | 15 May 2024
GPT-4o: Learn how to Implement a RAG on the new model, step-by-step!

1 project | dev.to | 13 May 2024
OSS framework for voice first multimodal assistants

1 project | news.ycombinator.com | 13 May 2024
FLaNK-AIM Weekly 13 May 2024

34 projects | dev.to | 13 May 2024
A note from our sponsor - InfluxDB
www.influxdata.com | 17 May 2024

Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality. Learn more →

Index

What are some of the best open-source llm projects in Python? This list will help you:

	Project	Stars
1	MetaGPT	39,707
2	llama_index	31,628
3	chatgpt-on-wechat	25,427
4	MindsDB	21,424
5	LLaMA-Factory	21,791
6	vllm	19,344
7	unilm	18,548
8	Chinese-LLaMA-Alpaca	17,539
9	mlc-llm	17,150
10	ChatGLM2-6B	15,534
11	peft	14,083
12	Qwen	11,430
13	pandas-ai	11,140
14	ludwig	10,859
15	h2ogpt	10,686
16	gorilla	10,276
17	ml-engineering	9,890
18	LLMSurvey	8,967
19	OpenLLM	8,920
20	embedchain	8,576
21	nebuly	8,363
22	shell_gpt	8,391
23	unsloth	9,703

Python llm

Top 23 Python llm Projects

Python llm related posts

Build a simple RAG chatbot with LangChain...

Show HN: Generate a Quiz from Any Url

Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant

Show HN: An Open source platform for building voice first multimodal agents

GPT-4o: Learn how to Implement a RAG on the new model, step-by-step!

OSS framework for voice first multimodal assistants

FLaNK-AIM Weekly 13 May 2024

Index