How We Made PostgreSQL a Better Vector Database

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • pgvecto.rs

    Scalable, Low-latency and Hybrid-enabled Vector Search in Postgres. Revolutionize Vector Search, not Database.

  • Hi, we've solved the problem you mentioned! Please take a look at our open source Postgres vector extension https://github.com/tensorchord/pgvecto.rs.

    Our index building process is significantly faster than pgvector's for HNSW because we can utilize all the cores, whereas pgvector can only use one core. As for filter support, we support pre-filtering, which guarantees enough results no matter what the condition is.

  • ann-benchmarks

    Benchmarks of approximate nearest neighbor libraries in Python

  • (Blog author here). Thanks for the question. In this case the index for both DiskANN and pgvector HNSW is small enough to fit in memory on the machine (8GB RAM), so there's no need to touch the SSD. We plan to test on a config where the index size is larger than memory (we couldn't this time due to limitations in ANN benchmarks [0], the tool we use).

    To your question about RAM usage, we provide a graph of index size. When enabling PQ, our new index is 10x smaller than pgvector HNSW. We don't have numbers for HNSWPQ in FAISS yet.

    [0]: https://github.com/erikbern/ann-benchmarks/
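
The pre-filtering claim in the pgvecto.rs comment above can be illustrated with a small sketch. This is not how pgvecto.rs or pgvector implement filtering; it is an exact brute-force search over a hypothetical in-memory table (the `rows` data, the `lang` column, and both function names are made up for illustration). It only shows why filtering before the nearest-neighbor step can return k results when filtering afterwards may return fewer.

```python
import math


def l2(a, b):
    # Euclidean distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def post_filter(rows, query, k, predicate):
    # Take the k nearest rows first, then drop the ones failing the predicate.
    # If the nearest rows mostly fail, fewer than k results survive.
    nearest = sorted(rows, key=lambda r: l2(r["vec"], query))[:k]
    return [r for r in nearest if predicate(r)]


def pre_filter(rows, query, k, predicate):
    # Apply the predicate first, then rank only the matching rows.
    # Returns k results whenever at least k rows match.
    candidates = [r for r in rows if predicate(r)]
    return sorted(candidates, key=lambda r: l2(r["vec"], query))[:k]


# Hypothetical data: the three rows closest to the query are all "en".
rows = [
    {"id": 1, "vec": [0.0, 0.0], "lang": "en"},
    {"id": 2, "vec": [0.1, 0.0], "lang": "en"},
    {"id": 3, "vec": [0.2, 0.0], "lang": "en"},
    {"id": 4, "vec": [5.0, 5.0], "lang": "de"},
    {"id": 5, "vec": [6.0, 6.0], "lang": "de"},
]
query = [0.0, 0.0]


def is_german(row):
    return row["lang"] == "de"


post_ids = [r["id"] for r in post_filter(rows, query, 2, is_german)]
pre_ids = [r["id"] for r in pre_filter(rows, query, 2, is_german)]
print("post-filter ids:", post_ids)
print("pre-filter ids:", pre_ids)
```

With this data, post-filtering returns no rows because both of the two nearest neighbors fail the condition, while pre-filtering returns the two matching rows. A real index would have to do this without scanning every row, which is the hard part the sketch leaves out.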

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Using Your Vector Database as a JSON (Or Relational) Datastore

    1 project | news.ycombinator.com | 23 Apr 2024
  • ANN Benchmarks

    1 project | news.ycombinator.com | 25 Jan 2024
  • pgvector vs Pinecone: cost and performance

    1 project | dev.to | 23 Oct 2023
  • Vector Dataset benchmark with 1536/768 dim data

    3 projects | news.ycombinator.com | 14 Aug 2023
  • Numbers every LLM Developer should know

    1 project | news.ycombinator.com | 12 Aug 2023