Llmsherpa Alternatives
Similar projects and alternatives to llmsherpa
-
txtai
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
-
unstructured
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
-
llama-hub
(Discontinued) A library of data loaders for LLMs made by the community, to be used with LlamaIndex and/or LangChain
-
nlm-ingestor
This repo provides the server-side code that the llmsherpa API connects to. It includes parsers for various file formats.
llmsherpa reviews and mentions
-
LlamaCloud and LlamaParse
To get good RAG performance you will need a good chunking strategy. Simply extracting all the text is not enough; knowing the boundaries of tables, lists, paragraphs, sections, etc. is helpful.
Great work by llamaindex team. Also feel free to try https://github.com/nlmatics/llmsherpa which takes into account some of the things I mentioned.
-
Show HN: Open-source Rule-based PDF parser for RAG
I wrote about split points and the need for including section hierarchy in this post: https://ambikasukla.substack.com/p/efficient-rag-with-docume...
All this is automated in the llmsherpa parser https://github.com/nlmatics/llmsherpa which you can use as an API over this library.
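To illustrate the idea of carrying section hierarchy into each chunk, here is a minimal sketch. It is not the llmsherpa implementation: the `Block` type, its `kind` and `level` fields, and the `chunk_with_hierarchy` function are hypothetical names invented for this example, and it assumes a parser has already split the document into labelled blocks.

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Hypothetical parsed unit; a real parser would supply something similar.
    kind: str   # "header", "paragraph", "list_item", or "table_row"
    level: int  # heading depth for headers; ignored for other kinds
    text: str


def chunk_with_hierarchy(blocks):
    """Return one chunk per non-header block, prefixed with its section path."""
    path = []    # stack of (level, title) for the currently open sections
    chunks = []
    for block in blocks:
        if block.kind == "header":
            # Close sections at the same or deeper level before opening this one.
            while path and path[-1][0] >= block.level:
                path.pop()
            path.append((block.level, block.text))
        else:
            prefix = " > ".join(title for _, title in path)
            chunks.append(f"{prefix}\n{block.text}" if prefix else block.text)
    return chunks


# Example input, made up for illustration.
blocks = [
    Block("header", 1, "Installation"),
    Block("paragraph", 0, "Install with pip."),
    Block("header", 2, "Requirements"),
    Block("list_item", 0, "Python 3.8+"),
    Block("header", 1, "Usage"),
    Block("paragraph", 0, "Call the reader."),
]
chunks = chunk_with_hierarchy(blocks)
```

Each chunk then carries its own context (for example, the list item is stored under "Installation > Requirements"), so a retriever can match it even when the item text alone is ambiguous.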
Stats
nlmatics/llmsherpa is an open source project licensed under the MIT License, which is an OSI-approved license.
The primary programming language of llmsherpa is Jupyter Notebook.