Building an Internet Scale Meme Search Engine

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • sonic

    🦔 Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM.

  • If you don't need advanced search features, you can use Sonic (https://github.com/valeriansaliou/sonic). It's blazing fast and you can save a lot of money on servers.

  • deep-text-recognition-benchmark

    PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR) (by roatienza)

  • https://github.com/roatienza/deep-text-recognition-benchmark (the available weights are for tasks that seem similar to OCR, so there is a good chance you can use it out of the box). With a good GPU it should process hundreds to thousands of images per second, so you can likely build your index in less than a day. (Maybe you can even port it to your iPhone stack :) )

    https://github.com/microsoft/GenerativeImage2Text (You'll probably have to train it on a custom dataset that you put together yourself)

    There are tons of other freely available solutions that you can find by searching for keywords like "image to text ocr", "transformers", "visual transformers"...
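To put the "index in less than a day" estimate above in numbers, here is a rough back-of-the-envelope sketch. The throughput figures are the comment's own ballpark (hundreds to thousands of images per second); the corpus size in the usage note is a made-up example, not a figure from the original post.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def images_per_day(images_per_second: float) -> int:
    """Number of images one GPU could OCR in 24 hours at a steady rate."""
    return int(images_per_second * SECONDS_PER_DAY)


def hours_to_index(total_images: int, images_per_second: float) -> float:
    """Wall-clock hours needed to OCR a corpus at a steady rate."""
    return total_images / images_per_second / 3600


# The two ends of the "hundreds to thousands" range quoted in the comment.
for rate in (100, 1000):
    print(f"{rate} images/s -> {images_per_day(rate):,} images/day")
# prints:
# 100 images/s -> 8,640,000 images/day
# 1000 images/s -> 86,400,000 images/day
```

As an illustration, `hours_to_index(10_000_000, 500)` gives about 5.6 hours for a hypothetical corpus of ten million images at 500 images per second. Real throughput depends on the model, image size, and batching, so treat these numbers as an upper bound.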

  • GenerativeImage2Text

    GIT: A Generative Image-to-text Transformer for Vision and Language

  • ocrit

    Simple command-line utility for performing OCR using Apple's Vision framework

  • There's ocrit, a CLI utility using Apple's Vision framework for OCR: https://github.com/insidegui/ocrit

  • macOCR

    Get any text on your screen into your clipboard.

  • Pretty insane. If you don’t want to use iPhones, a while back I made macOCR, which uses the same Vision APIs and has a very simple CLI. See: https://github.com/schappim/macOCR
