I built an online PDF management platform using open-source software

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tesseract-ocr

    Tesseract Open Source OCR Engine (main repository)

  • I used open-source solutions to build it, like LibreOffice, Ghostscript, Google's Tesseract, and a bunch of other tools. Google's Tesseract: https://github.com/tesseract-ocr/tesseract

  • EasyOCR

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

  • OK on cleaned, aligned data, but there are a few newer ones like EasyOCR [0] that can deal with much less organized text (albeit more slowly)

    [0] https://github.com/JaidedAI/EasyOCR

  • sist2

    Lightning-fast file system indexer and search tool

  • This is what I use for that

    https://github.com/simon987/sist2

  • Stirling-PDF

    #1 Locally hosted web application that allows you to perform various operations on PDF files
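
The original poster's comment above names LibreOffice, Ghostscript, and Tesseract as building blocks. As a rough illustration only, and not the poster's actual implementation, the sketch below shows how such command-line tools might be driven from Python. The function names are hypothetical, the binaries are assumed to be on PATH as `soffice`, `gs`, and `tesseract`, and the `/ebook` preset is just one possible Ghostscript quality setting.

```python
import shutil
import subprocess
from pathlib import Path


def office_to_pdf_cmd(src: Path, out_dir: Path) -> list[str]:
    # LibreOffice headless conversion; the binary name "soffice" is an assumption
    # and may differ between installations.
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", str(out_dir), str(src)]


def compress_pdf_cmd(src: Path, dst: Path) -> list[str]:
    # Ghostscript rewrites the PDF through its pdfwrite device.
    return ["gs", "-sDEVICE=pdfwrite", "-dPDFSETTINGS=/ebook",
            "-dNOPAUSE", "-dBATCH", "-dQUIET",
            f"-sOutputFile={dst}", str(src)]


def ocr_image_cmd(image: Path, out_base: Path, lang: str = "eng") -> list[str]:
    # With the "pdf" config, Tesseract writes <out_base>.pdf with a text layer.
    return ["tesseract", str(image), str(out_base), "-l", lang, "pdf"]


def run(cmd: list[str]) -> None:
    # Hypothetical helper: fail early with a clear message if the tool is missing.
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} is not installed or not on PATH")
    subprocess.run(cmd, check=True)
```

A call such as `run(ocr_image_cmd(Path("scan.png"), Path("scan")))` would then hand a scanned page to Tesseract. Building the argument lists separately from running them keeps each step easy to inspect without the tools installed.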

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide

    5 projects | dev.to | 27 Dec 2023
  • Finding an dictionary key value in an image or on an screen.

    2 projects | /r/learnpython | 2 Apr 2021
  • Multimodal AI: Bridging the Gap Between Human and Machine Understanding

    1 project | dev.to | 14 May 2024
  • Highlighting Image Text

    1 project | dev.to | 30 Apr 2024
  • one of the Codia AI Design technologies: OCR Technology

    1 project | dev.to | 14 Feb 2024