Show HN: BetterOCR combines and corrects multiple OCR engines with an LLM

BetterOCR

3 397 8.3 Python

🔍 Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo) with 🧠 LLM.

https://github.com/junhoyeo/BetterOCR#-box-detection
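The idea of combining several engines' readings and handing them to an LLM can be sketched roughly as below. This is only an illustration under stated assumptions, not BetterOCR's actual implementation: the function name `build_correction_prompt`, the prompt wording, and the sample engine outputs are all hypothetical, and the call to the LLM itself is left out.

```python
def build_correction_prompt(engine_outputs: dict[str, str]) -> str:
    """Assemble one prompt showing an LLM every engine's reading of the same image.

    Hypothetical helper for illustration; not part of BetterOCR's API.
    """
    if not engine_outputs:
        raise ValueError("need at least one OCR result")
    # One labelled section per engine so the model can compare the readings.
    sections = [f"[{name}]\n{text.strip()}" for name, text in engine_outputs.items()]
    return (
        "The following are OCR results for the same image from different engines.\n"
        "Merge them into the single most plausible text and fix obvious misreads.\n\n"
        + "\n\n".join(sections)
    )


# Made-up outputs; real ones would come from running each engine on the image.
outputs = {
    "EasyOCR": "Hel1o world",
    "Tesseract": "Hello w0rld",
    "Pororo": "Hello world",
}
prompt = build_correction_prompt(outputs)
print(prompt)
```

The resulting string would then be sent to whichever LLM is in use; keeping prompt construction separate from the model call makes it easy to add or drop engines.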

doctr

12 3,128 8.9 Python

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Yup! But I'm still exploring options. (Any recommendations would be welcome!) Here are some candidates I'm considering:
- https://github.com/mindee/doctr
- https://github.com/open-mmlab/mmocr
- https://github.com/PaddlePaddle/PaddleOCR (honestly I don't know Mandarin so I'm a bit stuck)
- https://github.com/clovaai/donut - While it's primarily an "OCR-free document understanding transformer," I think it's worth experimenting with. I think I can sort this out by letting the LLM reason through it multiple times (although this will hurt performance).
- Yesterday I got a suggestion to consider https://github.com/kakaobrain/pororo - I don't think it's still under active development, but the results are pretty great on Korean text.
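One way to read "letting the LLM reason through it multiple times" is to run several passes and keep the answer that appears most often. The sketch below assumes that interpretation; the commenter does not say how the passes would be combined, and `majority_vote` and the sample strings are hypothetical.

```python
from collections import Counter


def majority_vote(candidates: list[str]) -> str:
    """Return the most frequent candidate after collapsing whitespace differences.

    Hypothetical helper; each candidate stands for one LLM pass over the same input.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    # Normalize runs of whitespace so trivially different outputs count as equal.
    normalized = [" ".join(c.split()) for c in candidates]
    return Counter(normalized).most_common(1)[0][0]


# Made-up results from three passes over the same OCR output.
passes = ["Hello world", "Hello  world", "Hel1o world"]
print(majority_vote(passes))
```

Each extra pass costs another model call, which is the performance trade-off the comment mentions.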

mmocr

6 4,108 4.7 Python

OpenMMLab Text Detection, Recognition and Understanding Toolbox

PaddleOCR

60 38,878 8.7 Python

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

donut

19 5,370 3.6 Python

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

pororo

1 1,225 10.0 Python

PORORO: Platform Of neuRal mOdels for natuRal language prOcessing (discontinued)

llama.cpp

780 58,425 10.0 C++

LLM inference in C/C++

Please consider LLaMA.cpp (https://github.com/ggerganov/llama.cpp), which supports a lot of models and doesn't need an expensive GPU.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

OCR a lot of hand written invoice and records?
1 project | /r/selfhosted | 7 Dec 2023

[P] EasyOCR in C++!
2 projects | /r/MachineLearning | 2 Dec 2023

OCR at Edge on Cloudflare Constellation
3 projects | news.ycombinator.com | 3 Jul 2023

Help with OCR of pixel-y numbers
1 project | /r/computervision | 4 Apr 2023

How to perform document OCR?
1 project | /r/computervision | 14 Mar 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.
Tags: OCR, Pytorch, crnn, Deep Learning, text-detection
Post date: 28 Oct 2023
