Vector Databases: A Technical Primer [pdf]

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • chromem-go

    Embeddable vector database for Go with Chroma-like interface and zero third-party dependencies. In-memory with optional persistence.

  • For Python I believe Chroma [1] can be used embedded.

    For Go I recently started building chromem-go, inspired by the Chroma interface: https://github.com/philippgille/chromem-go

    It's neither advanced nor built for scale yet, but the RAG demo works.

    [1] https://github.com/chroma-core/chroma

  • chroma

    the AI-native open-source embedding database

  • pgvector

    Open-source vector similarity search for Postgres

  • You don't need a dedicated vector db, you can use pgvector.

    You could maybe use Cube for Euclidean space search, but you're better off using optimized algorithms for embedding space search.

    https://github.com/pgvector/pgvector

  • usearch

    Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍

  • I've used usearch successfully for a small project: https://github.com/unum-cloud/usearch/

  • qdrant-client

    Python client for Qdrant vector search engine

  • The Qdrant clients support a local mode where you point to a file [0].

    [0] -- https://github.com/qdrant/qdrant-client#local-mode

  • qdrant-lib

    Extract core logic from qdrant and make it available as a library.

  • google-research

    Google Research

  • There are options such as Google's ScaNN that may let you go farther before needing to consider specialized databases.

    https://github.com/google-research/google-research/blob/mast...
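
As a point of reference for the suggestions above, here is a minimal sketch of exact (brute-force) nearest-neighbor search by Euclidean distance, the baseline that tools like pgvector, usearch, and ScaNN aim to speed up with optimized or approximate algorithms. It is not taken from any of those projects; the function names (`euclidean`, `nearest`) and the toy `docs` data are hypothetical, chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(
    query: Sequence[float],
    items: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (id, distance) pairs closest to query.

    This is an exact linear scan: every stored vector is compared
    against the query, so cost grows linearly with the collection size.
    """
    scored = [(item_id, euclidean(query, vec)) for item_id, vec in items]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Hypothetical toy "embeddings"; real ones would come from an embedding model.
docs = [
    ("doc-a", [0.0, 0.0, 1.0]),
    ("doc-b", [1.0, 0.0, 0.0]),
    ("doc-c", [0.9, 0.1, 0.0]),
]

if __name__ == "__main__":
    for item_id, dist in nearest([1.0, 0.0, 0.0], docs, k=2):
        print(item_id, round(dist, 4))
```

A linear scan like this is often adequate for small collections. The libraries listed above become relevant once the number of vectors or the query rate makes scanning everything too slow.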

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Show HN: Chromem-go – Embeddable vector database for Go

    4 projects | news.ycombinator.com | 5 Apr 2024
  • How to reduce costs on embeddings up to 70%

    1 project | /r/OpenAI | 26 May 2023
  • Open-source AI /LLM Embedding Preprocessing Editor

    1 project | /r/selfhosted | 26 May 2023
  • Open-source AI /LLM Embedding Pre-processing Editor.

    1 project | /r/ChatGPT | 26 May 2023
  • AI Embedding Pre-processing Editor.

    1 project | /r/opensource | 26 May 2023