Frog: OCR Tool for Linux

normcap

Mentions: 18 | Stars: 1,718 | Activity: 9.3 | Language: Python

OCR powered screen-capture tool to capture information instead of images

tessdata

Mentions: 10 | Stars: 5,951 | Activity: 2.8

Trained models with fast variant of the "best" LSTM models + legacy models

Frog appears to be a nice wrapper around Tesseract:
https://github.com/tesseract-ocr/tessdata
https://en.wikipedia.org/wiki/Tesseract_(software)
The demo of course works perfectly on a Mac, as this feature is already built into macOS Ventura.
  In November 2020, Brewster Kahle from the Internet Archive praised Tesseract saying:

doctr

Mentions: 12 | Stars: 3,128 | Activity: 8.9 | Language: Python

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

There's also docTR, which can do text detection and extraction out of the box.
It's command line driven but can display the detected text as an overlay of the document.
https://github.com/mindee/doctr

OCRmyPDF

Mentions: 77 | Stars: 12,213 | Activity: 9.5 | Language: Python

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
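
As a rough illustration of adding a searchable text layer, the sketch below builds an `ocrmypdf` command line and runs it from Python. The helper names `ocrmypdf_command` and `add_text_layer` and the file names are hypothetical; it assumes the `ocrmypdf` executable is installed and on PATH, and uses only its `--language` and `--skip-text` options.

```python
import subprocess
from pathlib import Path


def ocrmypdf_command(src, dst, language="eng", skip_text=True):
    """Build the argument list for an ocrmypdf run (hypothetical helper)."""
    cmd = ["ocrmypdf", "--language", language]
    if skip_text:
        # Leave pages that already contain text untouched instead of failing.
        cmd.append("--skip-text")
    cmd += [str(src), str(dst)]
    return cmd


def add_text_layer(src, dst, language="eng"):
    """Run ocrmypdf; assumes the executable is available on PATH."""
    subprocess.run(ocrmypdf_command(src, dst, language), check=True)
    return Path(dst)
```

Calling `add_text_layer("scan.pdf", "scan_searchable.pdf")` would write a copy of the scan with an OCR text layer, provided the input file exists and the requested Tesseract language data is installed.
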
PaddleOCR

Mentions: 60 | Stars: 38,878 | Activity: 8.7 | Language: Python

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

I’ve had good results from PaddleOCR.
https://github.com/PaddlePaddle/PaddleOCR

flameshot

Mentions: 233 | Stars: 23,309 | Activity: 7.8 | Language: C++

Powerful yet simple to use screenshot software

Cool! I've seen similar ideas before and made my own, inspired by these, some years ago. It's a simple bash script that uses flameshot (https://flameshot.org/) to take the screenshot and Tesseract to recognize the text:
    #!/usr/bin/env bash
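
The commenter's script is cut off above, so the following is not their code. It is a minimal Python sketch of the same idea under stated assumptions: `flameshot gui --raw` writes the selected region as PNG to stdout, and `tesseract stdin stdout` reads that image and prints the recognized text. The function names and the default language `eng` are illustrative choices.

```python
import shutil
import subprocess


def flameshot_command():
    # `--raw` makes flameshot print the captured PNG to stdout.
    return ["flameshot", "gui", "--raw"]


def tesseract_command(lang="eng"):
    # "stdin" as the image and "stdout" as the output base stream data through pipes.
    return ["tesseract", "stdin", "stdout", "-l", lang]


def clean_text(raw):
    # Drop the form feed Tesseract appends per page and trailing spaces on each line.
    lines = [line.rstrip() for line in raw.replace("\f", "").splitlines()]
    return "\n".join(lines).strip()


def capture_text(lang="eng"):
    """Select a screen region interactively and return the recognized text."""
    for tool in ("flameshot", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")
    png = subprocess.run(flameshot_command(), check=True, capture_output=True).stdout
    ocr = subprocess.run(tesseract_command(lang), input=png, check=True, capture_output=True)
    return clean_text(ocr.stdout.decode("utf-8", errors="replace"))
```

Running `print(capture_text())` in a desktop session would open the flameshot selector and print whatever text Tesseract recognizes in the chosen region. Copying the result to the clipboard is left out, since the right tool depends on the display server.
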

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

TextSnatcher: Copy text from images, for the Linux Desktop
7 projects | news.ycombinator.com | 14 Mar 2024

A better document viewer
1 project | /r/linux4noobs | 13 Sep 2023

OCR for a full pdf on Neoreader
1 project | /r/Onyx_Boox | 25 Jun 2023

ELI5: why is PDF such a widespread text format, instead of a format that's actually easier to edit?
1 project | /r/explainlikeimfive | 3 Jun 2023

[Free-Post Friday!] Recommendations for high volume document scanners
1 project | /r/DataHoarder | 19 May 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Tags: OCR, ScreenShot, Python, Tesseract, Qt
Post date: 22 Nov 2022
