SemanticSlicer
A recursive text chunker that attempts to preserve context. (by drittich)
pg_vectorize
The simplest way to orchestrate vector search on Postgres (by tembo-io)
Metric | SemanticSlicer | pg_vectorize
---|---|---
Mentions | 1 | 5
Stars | 7 | 643
Growth | - | 9.6%
Activity | 7.5 | 9.2
Last commit | 5 months ago | 5 days ago
Language | C# | Rust
License | MIT License | -
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
SemanticSlicer
Posts with mentions or reviews of SemanticSlicer.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-03-06.
- Pg_vectorize: The simplest way to do vector search and RAG on Postgres
I wrote a C# library to do this. It is similar to other common chunking approaches, such as the way langchain does it: https://github.com/drittich/SemanticSlicer
Given a list of separators (regexes), it goes through them in order and keeps splitting the text by them until each chunk fits within the desired size. By putting the higher-level separators first (e.g., for HTML, splitting by higher-level tags before lower-level ones), it's a pretty good proxy for maintaining context.
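The approach described in this comment can be illustrated with a short Python sketch. This is not SemanticSlicer's actual code (the library is written in C#). The function name `slice_text`, the separator list, and the use of character count as the size measure are all assumptions made for illustration. The sketch also makes no attempt to merge small neighbouring pieces back together.

```python
import re


def slice_text(text, separators, max_len):
    """Recursively split text until every chunk is at most max_len characters.

    `separators` is an ordered list of regex patterns, highest-level first.
    Hypothetical helper for illustration only, not the SemanticSlicer API.
    """
    text = text.strip()
    if not text:
        return []
    if len(text) <= max_len:
        return [text]
    if not separators:
        # No separators left: fall back to a hard cut every max_len characters.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    first, rest = separators[0], separators[1:]
    chunks = []
    for part in re.split(first, text):
        # Pieces that are still too long are split by the next separator.
        chunks.extend(slice_text(part, rest, max_len))
    return chunks


# Assumed separator order: paragraphs, then lines, then sentences, then words.
SEPARATORS = [r"\n\s*\n", r"\n", r"(?<=[.!?])\s+", r"\s+"]

sample = "Title\n\nFirst sentence is here. Second sentence follows.\nA new line."
chunks = slice_text(sample, SEPARATORS, max_len=30)
for chunk in chunks:
    print(repr(chunk))
```

Because the paragraph separator is tried before the line, sentence, and word separators, text is only broken at a finer level when the coarser level still leaves a piece that is too large. The patterns avoid capture groups so that `re.split` returns only the text between matches.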
pg_vectorize
Posts with mentions or reviews of pg_vectorize.
We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-17.
- Embeddings are a good starting point for the AI curious app developer
Check out https://github.com/tembo-io/pg_vectorize - we're taking it a little bit beyond just the storage and index. The project uses pgvector for the indices and distance operators, but also adds a simpler API, hooks into pre-trained embedding models, and helps you keep embeddings updated as data changes or grows.
- Pg_vectorize: The simplest way to do vector search and RAG on Postgres
There is a RAG example here https://github.com/tembo-io/pg_vectorize?tab=readme-ov-file#...
You can provide your own prompts by adding them to the `vectorize.prompts` table. There's an API for this in the works. It is poorly documented at the moment.
- Pg_vectorize – The simplest way to orchestrate vector search on Postgres
What are some alternatives?
When comparing SemanticSlicer and pg_vectorize you can also consider the following projects:
OpenAI-DotNet - A Non-Official OpenAI RESTful API Client for DotNet
pgvector - Open-source vector similarity search for Postgres
nlm-ingestor - This repo provides the server side code for llmsherpa API to connect. It includes parsers for various file formats.
fastembed-rs - Library to generate vector embeddings. Rust implementation of Qdrant's FastEmbed.