Top 23 Python large-language-model Projects

gpt_academic

2 58,363 9.8 Python

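Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons and function plugins, project analysis and self-translation for Python, C++, and other projects, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen (通义千问), deepseekcoder, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), llama2, rwkv, claude2, moss, and others.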

Project mention: Enhance Speed of AnkiBrain Addon | /r/ankibrain | 2023-12-06

I recently managed to manually install the AnkiBrain addon, utilizing my personal ChatGPT API key. I'd like to extend my appreciation for creating such a useful tool. However, I've noticed a significant difference in speed compared to a local GUI, similar to what's offered by GPT Academic.

LLaMA-Factory

3 22,453 9.9 Python

Unify Efficient Fine-Tuning of 100+ LLMs

Project mention: FLaNK-AIM Weekly 06 May 2024 | dev.to | 2024-05-06

Chinese-LLaMA-Alpaca

4 17,539 8.3 Python

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment

Project mention: Chinese-Alpaca-Plus-13B-GPTQ | /r/LocalLLaMA | 2023-05-30

I'd like to share with you today the Chinese-Alpaca-Plus-13B-GPTQ model, a 4-bit GPTQ-quantised version of Yiming Cui's Chinese-LLaMA-Alpaca 13B for GPU reference.

ChatGLM2-6B

4 15,546 6.6 Python

ChatGLM2-6B: An Open Bilingual Chat LLM

Project mention: Are We Overlooking China's Progress in AI? | /r/singularity | 2023-06-26

haystack

55 13,883 9.9 Python

🔍 LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Project mention: Haystack DB – 10x faster than FAISS with binary embeddings by default | news.ycombinator.com | 2024-04-28

I was confused for a bit but there is no relation to https://haystack.deepset.ai/

MOSS

4 11,823 4.4 Python

An open-source tool-augmented conversational language model from Fudan University

Qwen

5 11,430 9.4 Python

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Project mention: What the heck is so great about this model? | /r/SillyTavernAI | 2023-12-07

Qwen: https://github.com/QwenLM/Qwen

ml-engineering

9 9,928 9.7 Python

Machine Learning Engineering Open Book

Project mention: Accelerators | news.ycombinator.com | 2024-02-22

FlexGen

39 9,022 3.5 Python

Running large language models on a single GPU for throughput-oriented scenarios.

Project mention: Run 70B LLM Inference on a Single 4GB GPU with This New Technique | news.ycombinator.com | 2023-12-03

LLMSurvey

3 9,037 6.4 Python

The official GitHub page for the survey paper "A Survey of Large Language Models".

Project mention: Ask HN: Textbook Regarding LLMs | news.ycombinator.com | 2024-03-23

Here’s another one - it’s older but has some interesting charts and graphs.
https://arxiv.org/abs/2303.18223

petals

98 8,730 8.3 Python

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Project mention: Mistral Large | news.ycombinator.com | 2024-02-26

So how long until we can do an open source Mistral Large?
We could make a start on Petals or some other open source distributed training network cluster possibly?
[0] https://petals.dev/

nebuly

105 8,363 8.4 Python

The user analytics platform for LLMs

Project mention: Nebuly – The LLM Analytics Platform | news.ycombinator.com | 2023-10-07

deeplake

13 7,751 9.8 Python

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

Project mention: FLaNK AI Weekly 25 March 2025 | dev.to | 2024-03-25

Yi

9 7,250 9.4 Python

A series of large language models trained from scratch by developers @01-ai

Project mention: Yi: Open Foundation Models by 01.ai | news.ycombinator.com | 2024-03-10

The model license:
https://github.com/01-ai/Yi/blob/main/MODEL_LICENSE_AGREEMEN...
1) Your use of the Yi Series Models must comply with the Laws and Regulations as

txtai

356 7,111 9.3 Python

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Project mention: Show HN: FileKitty – Combine and label text files for LLM prompt contexts | news.ycombinator.com | 2024-05-01

PentestGPT

18 6,475 8.2 Python

A GPT-empowered penetration testing tool

Project mention: PentestGPT | news.ycombinator.com | 2023-06-18

Baichuan-7B

1 5,646 7.6 Python

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

Project mention: Baichuan 7B reaches top of LLM leaderboard for it's size (New foundation model 4K tokens) | /r/LocalLLaMA | 2023-06-17

GitHub: baichuan-inc/baichuan-7B: A large-scale 7B pretraining language model developed by BaiChuan-Inc. (github.com)

openchat

18 4,996 9.1 Python

OpenChat: Advancing Open-source Language Models with Imperfect Data (by imoneoi)

Project mention: Alternative of bard,bing, claude | /r/artificial | 2023-12-10

Depending on your use case, https://openchat.team/ might be worth looking into

camel

5 4,504 8.9 Python

🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society (NeurIPS'2023) https://www.camel-ai.org (by camel-ai)

awesome-pretrained-chinese-nlp-models

1 4,279 8.9 Python

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

marqo

114 4,189 9.3 Python

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

Project mention: Are we at peak vector database? | news.ycombinator.com | 2024-01-25

We (Marqo) are doing a lot on 1 and 2. There is a huge amount to be done on the ML side of vector search and we are investing heavily in it. I think it has not quite sunk in that vector search systems are ML systems and everything that comes with that. I would love to chat about 1 and 2 so feel free to email me (email is in my profile). What we have done so far is here -> https://github.com/marqo-ai/marqo

Baichuan2

1 3,960 7.3 Python

A series of large language models developed by Baichuan Intelligent Technology

Project mention: Baichuan 2 | news.ycombinator.com | 2023-10-12

AutoGPTQ

19 3,875 9.3 Python

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Project mention: Setting up LLAMA2 70B Chat locally | /r/developersIndia | 2023-08-18

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python large-language-models related posts

Show HN: Generate a Quiz from Any Url

1 project | news.ycombinator.com | 17 May 2024

TimesFM (Time Series Foundation Model) for time-series forecasting

4 projects | news.ycombinator.com | 8 May 2024

Financial Market Applications of LLMs

1 project | news.ycombinator.com | 20 Apr 2024

Implementation for Mini-Gemini

1 project | news.ycombinator.com | 17 Apr 2024

News DataStax just bought our startup Langflow

1 project | news.ycombinator.com | 4 Apr 2024

Show HN: I made a library for LLM prompt injection/exploit/jailbreak detection

1 project | news.ycombinator.com | 3 Apr 2024

Mini-Gemini: Mining the Potential of Multi-Modality Vision Language Models

2 projects | news.ycombinator.com | 31 Mar 2024

Index

What are some of the best open-source large-language-model projects in Python? This list will help you:

	Project	Stars
1	gpt_academic	58,363
2	LLaMA-Factory	22,453
3	Chinese-LLaMA-Alpaca	17,539
4	ChatGLM2-6B	15,546
5	haystack	13,883
6	MOSS	11,823
7	Qwen	11,430
8	ml-engineering	9,928
9	FlexGen	9,022
10	LLMSurvey	9,037
11	petals	8,730
12	nebuly	8,363
13	deeplake	7,751
14	Yi	7,250
15	txtai	7,111
16	PentestGPT	6,475
17	Baichuan-7B	5,646
18	openchat	4,996
19	camel	4,504
20	awesome-pretrained-chinese-nlp-models	4,279
21	marqo	4,189
22	Baichuan2	3,960
23	AutoGPTQ	3,875
