Top 23 Python rag Projects

llama_index

75 31,628 10.0 Python

LlamaIndex is a data framework for your LLM applications

Project mention: LlamaIndex: A data framework for your LLM applications | news.ycombinator.com | 2024-04-07

chatgpt-on-wechat

1 25,427 9.4 Python

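A chatbot built on large language models that supports integration with WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. It can use GPT3.5/GPT-4o/GPT4.0/Claude/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI, handles text, voice, and images, can access the operating system and the internet, and supports building customized enterprise customer-service bots on top of your own knowledge base.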
ragflow

7 7,404 9.7 Python

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Project mention: DeepSeek-V2 integrated, RAGFlow v0.5.0 is released | news.ycombinator.com | 2024-05-07

txtai

356 7,080 9.3 Python

💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows

Project mention: Show HN: FileKitty – Combine and label text files for LLM prompt contexts | news.ycombinator.com | 2024-05-01

TaskingAI

1 4,837 9.4 Python

The open source platform for AI-native application development.

Project mention: TaskingAI: AI-native app development platform | news.ycombinator.com | 2024-01-30

GenerativeAIExamples

1 1,575 7.5 Python

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.

Project mention: FLaNK Weekly 18 Dec 2023 | dev.to | 2023-12-18

swirl-search

32 1,542 9.8 Python

Swirl is an open-source search platform that uses AI to search multiple content and data sources simultaneously and return AI-ranked results. It also uses LLMs to summarize the answers from your searches, making it a one-click, easy-to-use Retrieval Augmented Generation (RAG) solution.

Project mention: GitHub - swirlai/swirl-search: Swirl is an open-source search platform that uses AI to search multiple content and data sources simultaneously, finds the best results using a reader LLM, then prompts Generative AI, enabling you to get answers based on your data. | /r/programming | 2023-12-05

cognita

4 1,320 7.9 Python

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

Project mention: FLaNK AI Weekly for 29 April 2024 | dev.to | 2024-04-29

llama_parse

4 1,108 9.1 Python

Parse files for optimal RAG

Project mention: Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant | dev.to | 2024-05-16

It's fair to think that undesirable artifacts and a lack of structural context would hurt search accuracy, performance, and ultimately cost. Consequently, it makes sense to pre-process the source documents before passing them to the RAG workflow. Third-party APIs and tools, such as LlamaParse and LayoutPDFReader, can help with pre-processing PDF data. Keep in mind, however, that source documents may take any form and there is no one-size-fits-all solution. You may have to develop custom processes to pre-process and search your unique data.
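As a rough illustration of the kind of custom pre-processing the excerpt above has in mind, here is a minimal sketch that tidies text already extracted from a PDF. The function name `clean_extracted_text` and its three cleanup rules are assumptions chosen for illustration. They do not come from the post, LlamaParse, or LayoutPDFReader, and real documents will likely need different rules.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Tidy text extracted from a PDF before chunking it for retrieval.

    Hypothetical helper: the rules below are illustrative assumptions.
    """
    # Re-join words split by a hyphen at a line break ("up-\ndated" -> "updated").
    # Caveat: this also removes the hyphen from genuine compounds split at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)

    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Drop lines that contain only a page number, e.g. "12" or "Page 12".
        if re.fullmatch(r"(?:page\s+)?\d+", stripped, flags=re.IGNORECASE):
            continue
        kept.append(stripped)
    text = "\n".join(kept)

    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Squeeze three or more newlines into a single paragraph break.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


sample = "Exchange rates are up-\ndated daily.\n\nPage 3\n\n\nRates   are quoted\tin USD."
print(clean_extracted_text(sample))
```

The cleaned string can then be chunked and embedded as usual. Keeping the cleanup in one small function makes it easy to add or remove rules as you learn which artifacts your own documents contain.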

canopy

16 895 9.8 Python

Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone

Project mention: Build a simple RAG chatbot with LangChain... | dev.to | 2024-05-17

To create a Pinecone account, sign up via this link: https://www.pinecone.io/

fastembed

4 822 9.5 Python

Fast, Accurate, Lightweight Python library to make State of the Art Embedding

Project mention: FastLLM by Qdrant – lightweight LLM tailored For RAG | news.ycombinator.com | 2024-04-01

raptor

3 491 6.6 Python

The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

Project mention: Show HN: A phone number to text with questions about current events | news.ycombinator.com | 2024-05-10

Hi HN! For my senior thesis in CS, I built an SMS-based application to make journalism more accessible. It works like this:
1) You text the topics you're interested in to my phone number. Every day, you'll receive a text with 5 headlines from The Associated Press (https://apnews.com/) related to those topics.
2) If you have questions about any of the current events the headlines describe, you just text them back. A response is generated from the contents of the articles using the RAPTOR retrieval framework (https://github.com/parthsarthi03/raptor) and texted right back to you.
The repo can be found here: https://github.com/tdh15/pressText
I'd really appreciate any and all feedback. Whatever you got, I'd love to hear it :)

StreamRAG

2 400 7.0 Python

Video Search and Streaming Agent 🕵️‍♂️

Project mention: Show HN: GPT-Powered Video Retrieval and Streaming | news.ycombinator.com | 2024-02-08

continuous-eval

4 327 8.7 Python

Open-Source Evaluation for GenAI Application Pipelines

Project mention: Show HN: Ellipsis – Automated PR reviews and bug fixes | news.ycombinator.com | 2024-05-09

Hi HN, hunterbrooks and nbrad here from Ellipsis (https://www.ellipsis.dev). Ellipsis automatically reviews your PRs when opened and on each new commit. If you tag @ellipsis-dev in a comment, it can make changes to the PR (via direct commit or side PR) and answer questions, just like a human.
Demo video: https://www.youtube.com/watch?v=X61NGZpaNQA
So far, we have dozens of open source projects and companies using Ellipsis. We seem to have landed in a kind of sweet spot where there’s a good match between the current capabilities of AI tools and the actual needs of software engineers - this doesn’t replace human review, but it saves you time by catching/fixing lots of small silly stuff.
Here’s an example in the wild: https://github.com/relari-ai/continuous-eval/pull/38, where Ellipsis (1) adds a PR summary; (2) finds a bug and adds a review comment; (3) after a (human) user comments, generates a side PR with the fix; and (4) after a (human) user merges the side PR and adds another commit, re-reviews the PR and approves it.
Here’s another example: https://github.com/SciPhi-AI/R2R/pull/350#pullrequestreview-..., where Ellipsis adds several comments with inline suggestions that were directly merged by the developer.
You can configure Ellipsis in natural language to enforce custom rules, style guides, or conventions. For example, here’s how the `jxnl/instructor` repo uses natural language rules to make sure that docs are kept in sync: https://github.com/jxnl/instructor/blob/main/ellipsis.yaml#L..., and here’s an example PR that Ellipsis came up with based on those rules: https://github.com/jxnl/instructor/pull/346.
Don’t worry, your code is never stored or used to train models (https://docs.ellipsis.dev/security).
Installing into your repo takes 2 clicks at https://www.ellipsis.dev. We’d really appreciate your feedback, thoughts, and ideas!

txtchat

17 226 6.9 Python

💭 Retrieval augmented generation (RAG) and language model powered search applications
Instrukt

4 221 9.2 Python

Integrated AI environment in the terminal. Build, test and instruct agents.

Project mention: Instrukt: a TUI AI assistant to explore and understand any complex code base. | /r/programming | 2023-09-07

tonic_validate

6 210 9.5 Python

Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.

Project mention: Validating the RAG Performance of Amazon Titan vs. Cohere Using Amazon Bedrock | news.ycombinator.com | 2024-02-09

I tried out Amazon Bedrock and used Tonic Validate to do a head-to-head comparison of very simple RAG systems built using the embedding and text models available in Amazon Bedrock. I compared Amazon Titan's embedding and text models to Cohere's embedding and text models in RAG systems that employ Amazon Bedrock Knowledge Bases as the vector DB and retrieval components of the system.
The code for the comparison is in this Jupyter notebook: https://github.com/TonicAI/tonic_validate/blob/main/examples...
Let me know what you think, and share your experiences building RAG with Amazon Bedrock!

open-assistant-api

0 172 8.3 Python

The Open Assistant API is a ready-to-use, open-source, self-hosted agent/gpts orchestration creation framework, supporting customized extensions for LLM, RAG, function call, and tools capabilities. It also supports seamless integration with the openai/langchain sdk.
ragna

4 163 9.2 Python

RAG orchestration framework ⛵️

Project mention: Reconquer your documents with Ragna | dev.to | 2023-11-22

git clone https://github.com/Quansight/ragna.git
cd ragna
pip install 'ragna[all]'

mychatGPT

10 123 8.1 Python

GPT chat with your docs!

Project mention: Can I let AI read a group of information from books and what not and then let it answer questions? | /r/ArtificialInteligence | 2023-06-04

enterprise-h2ogpte

1 66 7.8 Python

Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform

Project mention: FLaNK AI - 01 April 2024 | dev.to | 2024-04-01

beyondllm

1 62 7.6 Python

Build, evaluate and observe LLM apps

Project mention: FLaNK AI Weekly for 29 April 2024 | dev.to | 2024-04-29

NeoGPT

1 63 9.5 Python

Chat effortlessly, execute commands, and interpret code with Llama3, Phi3, and more - your local AI assistant. Enjoy seamless interaction while ensuring ultimate privacy

Project mention: HacktoberRest | dev.to | 2023-11-01

One of the most interesting projects I came across this month was NeoGPT. It's a GPT based application that is being built to converse with documents and videos. While still in its infancy, the project has outlined a cool roadmap and has a very active base of contributors continuously expanding on its functionality. The project appeals to my desire to learn how to work with AI and neural networks. It is also at a development stage that it is not outside of the reach of my comprehension. Icing on the cake being it's Py based, which is my sharpest tool at the moment. I see it as a decent project to stay tapped into and grow my skills as the application develops.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python rag related posts

Build a simple RAG chatbot with LangChain...

2 projects | dev.to | 17 May 2024
Adding an Amazon Bedrock Knowledge Base to the Forex Rate Assistant

1 project | dev.to | 16 May 2024
Show HN: A phone number to text with questions about current events

2 projects | news.ycombinator.com | 10 May 2024
DeepSeek-V2 integrated, RAGFlow v0.5.0 is released

1 project | news.ycombinator.com | 7 May 2024
RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation

1 project | news.ycombinator.com | 30 Apr 2024
Show HN: R2R – Open-source framework for production-grade RAG

5 projects | news.ycombinator.com | 26 Feb 2024
Show HN: GPT-Powered Video Retrieval and Streaming

1 project | news.ycombinator.com | 8 Feb 2024

Index

What are some of the best open-source rag projects in Python? This list will help you:

#	Project	Stars
1	llama_index	31,628
2	chatgpt-on-wechat	25,427
3	ragflow	7,404
4	txtai	7,080
5	TaskingAI	4,837
6	GenerativeAIExamples	1,575
7	swirl-search	1,542
8	cognita	1,320
9	llama_parse	1,108
10	canopy	895
11	fastembed	822
12	raptor	491
13	StreamRAG	400
14	continuous-eval	327
15	txtchat	226
16	Instrukt	221
17	tonic_validate	210
18	open-assistant-api	172
19	ragna	163
20	mychatGPT	123
21	enterprise-h2ogpte	66
22	beyondllm	62
23	NeoGPT	63
