Top 23 Python OCR Projects
-
PaddleOCR
Awesome multilingual OCR toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages, provides data annotation and synthesis tools, and supports training and deployment on server, mobile, embedded and IoT devices)
-
EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
-
paperless-ngx
A community-supported supercharged version of paperless: scan, index and archive all your physical documents
-
ragflow
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
-
donut
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
-
video-subtitle-extractor
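A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API is required; text recognition runs locally. It is a deep-learning-based video subtitle extraction framework that includes subtitle region detection and subtitle content extraction.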
-
PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
-
AdelaiDet
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
-
doctr
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
-
CnOCR
CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch/MXNet. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation.
-
pdftabextract
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
-
BallonsTranslator
Yet another computer-aided comic/manga translation tool powered by deep learning, with support for one-click machine translation and simple image/text editing.
-
RapidOCR
Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO and PaddlePaddle.
Project mention: Ask HN: I have many PDFs – what is the best local way to leverage AI for search? | news.ycombinator.com | 2024-05-30
If you want to run locally you can look into this https://github.com/PaddlePaddle/PaddleOCR
https://andrejusb.blogspot.com/2024/03/optimizing-receipt-pr...
But I suggest that you just skip that and use gpt-4o. They aren't actually going to steal your data.
Sort through it ahead of time to find anything with a credit card number or the like.
Or you could look into InternVL..
Or a combination of PaddleOCR first and then use a strong LLM via API, like gpt-4o or llama3 70b via together.ai
If you truly must do it locally, then if you have two 3090s or 4090s it might work out. Otherwise the LLMs may not be smart enough to give good results.
Leaving out the details of your hardware makes it impossible to give good advice about running locally, other than that it's not really necessary.
Project mention: I built an online PDF management platform using open-source software | news.ycombinator.com | 2024-05-12
Ok on cleaned aligned data, but there are a few newer ones like EasyOCR [0] that can deal with much less organized text (albeit more slowly)
[0] https://github.com/JaidedAI/EasyOCR
Project mention: Ask HN: I have many PDFs – what is the best local way to leverage AI for search? | news.ycombinator.com | 2024-05-30
Paperless supports OCR + full text indexing: https://docs.paperless-ngx.com/
As far as AI goes, not sure.
Project mention: TextSnatcher: Copy text from images, for the Linux Desktop | news.ycombinator.com | 2024-03-14
Try https://github.com/ocrmypdf/OCRmyPDF - it uses Tesseract behind the scenes and it is absolutely brilliant.
Project mention: Better RAG Results with Reciprocal Rank Fusion and Hybrid Search | news.ycombinator.com | 2024-05-30
Within our open source RAG product RAGFlow (https://github.com/infiniflow/ragflow), Elasticsearch is currently used instead of other general vector databases, because it can provide hybrid search right now. In the default case, an embedding-based reranker is not required and RRF alone is enough. Even if a reranker is used, keyword-based retrieval MUST still be hybridized with embedding-based retrieval, and that is what RAGFlow's latest 0.7 release provides.
On the other hand, let me introduce another database we developed, Infinity (https://github.com/infiniflow/infinity), which can provide the fastest hybrid search. You can see the performance here (https://github.com/infiniflow/infinity/blob/main/docs/refere...); both vector search and full-text search perform much faster than other open source alternatives.
From the next version (weeks later), Infinity will also provide more comprehensive hybrid search capabilities: the 3-way recall you mentioned (dense vector, sparse vector, keyword search) could be provided within a single request.
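For readers unfamiliar with the RRF mentioned above, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not RAGFlow's or Elasticsearch's implementation. The function name, the document IDs and the two sample result lists are made up for illustration, and `k=60` is the commonly used default constant, assumed here rather than taken from the post.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. `k` dampens the influence of top ranks;
    60 is a common default (an assumption, not taken from the post).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, and no score normalization between the keyword and embedding retrievers is needed because only ranks are used.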
maybe this is better? https://github.com/clovaai/donut
I'm not sure
Project mention: [DISC] - The angel who came to pick me up is a Gal (Oneshot by Shiraishi Kouhei) | /r/manga | 2023-09-06
OCR works pretty good. ocr.space, ocr.best and cotrans.touhou.ai/ are all pretty nice.
Project mention: Show HN: BetterOCR combines and corrects multiple OCR engines with an LLM | news.ycombinator.com | 2023-10-28
Yup! But I'm still exploring options. (any recommendations would be welcomed!) Here are some candidates I'm considering:
- https://github.com/mindee/doctr
- https://github.com/open-mmlab/mmocr
- https://github.com/PaddlePaddle/PaddleOCR (honestly I don't know Mandarin so I'm a bit stuck)
- https://github.com/clovaai/donut - While it's primarily an "OCR-free document understanding transformer," I think it's worth experimenting with. Think I can sort this out by letting the LLM reason through it multiple times (although this will impact performance)
- yesterday got a suggestion to consider https://github.com/kakaobrain/pororo - I don't think development is still active but the results are pretty great on Korean text
Project mention: Show HN: How do you OCR on a Mac using the CLI or just Python for free | news.ycombinator.com | 2024-01-02
https://github.com/mindee/doctr/issues/1049
I am looking for something this polished and reliable for handwriting, does anyone have any pointers? I want to integrate it in a workflow with my eink tablet I take notes on. A few years ago, I tried various models, but they performed poorly (around 80% accuracy) on my handwriting, which I can read almost 90% of the time.
Project mention: How can I install pytorch versions < 1.0 and torchvision==0.13 or lower? | /r/pytorch | 2023-07-16
Project mention: Show HN: Beyond text splitting – improved file parsing for LLM's | news.ycombinator.com | 2024-04-07
https://github.com/deepdoctection/deepdoctection
Have you tried this?
Python OCR related posts
-
Ask HN: I have many PDFs – what is the best local way to leverage AI for search?
-
Integrated Rerankers, implemented RAPTOR, RAGFlow 0.7 released
-
Ask HN: RAG and unstructured data from several docs
-
Show HN: Tarsier – vision for text-only LLM web agents that beats GPT-4o
-
DeepSeek-V2 integrated, RAGFlow v0.5.0 is released
-
ScrapeGraphAI: Web scraping using LLM and direct graph logic
-
🔍Underrated Open Source Projects You Should Know About 🧠
Index
What are some of the best open-source OCR projects in Python? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | PaddleOCR | 39,198 |
2 | EasyOCR | 22,419 |
3 | paperless-ngx | 17,416 |
4 | OCRmyPDF | 12,353 |
5 | LaTeX-OCR | 11,121 |
6 | ragflow | 8,245 |
7 | pytesseract | 5,582 |
8 | donut | 5,411 |
9 | video-subtitle-extractor | 5,024 |
10 | layout-parser | 4,558 |
11 | manga-image-translator | 4,447 |
12 | PyMuPDF | 4,305 |
13 | mmocr | 4,131 |
14 | AdelaiDet | 3,332 |
15 | doctr | 3,183 |
16 | TextRecognitionDataGenerator | 3,085 |
17 | CRAFT-pytorch | 2,982 |
18 | CnOCR | 2,981 |
19 | Papermerge | 2,382 |
20 | deepdoctection | 2,268 |
21 | pdftabextract | 2,152 |
22 | BallonsTranslator | 2,135 |
23 | RapidOCR | 2,117 |