Top 6 Python txtai Projects
- txtai: 💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Project mention: Show HN: FileKitty – Combine and label text files for LLM prompt contexts | news.ycombinator.com | 2024-05-01
Project mention: Oracle of Zotero: LLM QA of Your Research Library | news.ycombinator.com | 2023-11-26
Nice project!
I've spent quite a lot of time in the medical/scientific literature space. With regard to LLMs, and RAG specifically, how the data is chunked is quite important. With that in mind, I have a couple of projects that might be beneficial additions.
paperetl (https://github.com/neuml/paperetl) - parses arXiv and PubMed content and integrates with GROBID to extract metadata and text from arbitrary papers.
paperai (https://github.com/neuml/paperai) - builds embeddings databases of medical/scientific papers. Supports LLM prompting, semantic workflows and vector search. Built with txtai (https://github.com/neuml/txtai).
While arbitrary chunking/splitting can work, I've found that integrating parsing that has knowledge of medical/scientific paper structure increases the overall accuracy and experience of downstream applications.
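The contrast between arbitrary and structure-aware chunking can be sketched in a few lines. This is only an illustration, not how paperetl or paperai work internally: the heading list, the function names and the sample text are all assumptions made for the example.

```python
import re

# Hypothetical list of section headings; real papers vary widely.
SECTION_PATTERN = re.compile(
    r"^(Abstract|Introduction|Methods|Results|Discussion|Conclusion)[ \t]*$",
    re.MULTILINE,
)


def fixed_chunks(text, size=200):
    """Baseline: cut the text every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def section_chunks(text):
    """Split on known section headings and return (section, body) pairs."""
    matches = list(SECTION_PATTERN.finditer(text))
    chunks = []
    for i, match in enumerate(matches):
        # A section body runs until the next heading or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        if body:
            chunks.append((match.group(1), body))
    return chunks


# Made-up sample text, for illustration only.
sample = "Abstract\nWe study X.\nMethods\nWe did Y.\nResults\nZ improved.\n"

for section, body in section_chunks(sample):
    print(f"{section}: {body}")
```

Because each chunk carries its section name, a downstream search or prompt can, for example, prefer text from Results over text from Methods, which fixed-size chunks cannot express.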
As mentioned previously, all of the main components of txtai can be replaced with custom components. For example, there are external integrations for storing dense vectors in Weaviate and Qdrant to name a few.
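The general idea of a replaceable storage component can be sketched without any library. The classes below are hypothetical and do not reproduce txtai's actual interfaces; they only show an index that accepts any backend object exposing `add` and `search`, which is the shape an external vector store integration would fill.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryBackend:
    """Hypothetical default backend: keeps vectors in a Python list."""

    def __init__(self):
        self.rows = []

    def add(self, uid, vector):
        self.rows.append((uid, vector))

    def search(self, query, limit):
        scored = [(uid, cosine(query, vector)) for uid, vector in self.rows]
        return sorted(scored, key=lambda row: row[1], reverse=True)[:limit]


class RecordingBackend(InMemoryBackend):
    """Stand-in for an external store: records each call it receives."""

    def __init__(self):
        super().__init__()
        self.calls = []

    def add(self, uid, vector):
        self.calls.append("add")
        super().add(uid, vector)

    def search(self, query, limit):
        self.calls.append("search")
        return super().search(query, limit)


class Index:
    """Hypothetical index that delegates all storage to its backend."""

    def __init__(self, backend=None):
        self.backend = InMemoryBackend() if backend is None else backend

    def index(self, items):
        for uid, vector in items:
            self.backend.add(uid, vector)

    def search(self, query, limit=1):
        return self.backend.search(query, limit)


# Swapping the backend does not change the calling code.
index = Index(backend=RecordingBackend())
index.index([("a", [1.0, 0.0]), ("b", [0.0, 1.0])])
print(index.search([0.9, 0.1], limit=1))
```

A real integration would replace the list in `InMemoryBackend` with calls to the external service; the calling code stays the same.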
Python txtai related posts
- External database integration
- Introducing the Overflow Offline project
- [P] Stack Overflow Semantic Search
- Semantic search of Stack Overflow with codequestion
Index
What are some of the best open-source txtai projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | txtai | 7,111 |
2 | paperai | 1,206 |
3 | codequestion | 511 |
4 | tldrstory | 345 |
5 | txtchat | 226 |
6 | weaviate-txtai | 7 |