6,216 questions
-9
votes
0
answers
115
views
EasyOCR doesn't recognize any text in image
I'm trying to extract text from a video. I created cropped frames from the video using ffmpeg and saved them as PNGs. Now, I'm using those PNGs to extract the text using EasyOCR. At first, I tried ...
-6
votes
0
answers
91
views
Why is ADB image transfer slow compared to local file processing in Python OCR, and what is a faster alternative? [closed]
I am building a system where:
An Android device captures an image using the camera
I use adb pull to transfer the image from /sdcard/DCIM/Camera
Then I process it using pytesseract OCR in Python
...
Best practices
0
votes
3
replies
72
views
OCR output contains “garbage” characters after special symbols (mojibake / control chars) — how to reliably clean before returning from LLM?
I have an on-prem OCR pipeline that returns extracted text inside a JSON blob. I parse the LLM response and call a local normalizer before returning the text to callers. Example call site:
result = ...
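A minimal sketch of the kind of local normalizer this question describes, assuming the goal is only to drop control characters and Unicode replacement characters while keeping newlines and tabs. The function name `clean_ocr_text` and its `keep` parameter are hypothetical, not taken from the asker's code, and the sketch does not attempt to repair mojibake from a wrong decoding step.

```python
import unicodedata


def clean_ocr_text(text, keep="\n\t"):
    """Drop control characters (category Cc) and U+FFFD; keep newlines/tabs.

    Hypothetical helper: format characters such as ZWJ/ZWNJ are left alone
    because some scripts need them for correct rendering.
    """
    # Normalize first so decomposed accents become single code points.
    normalized = unicodedata.normalize("NFC", text)
    return "".join(
        ch
        for ch in normalized
        if ch in keep
        or (unicodedata.category(ch) != "Cc" and ch != "\ufffd")
    )
```

Filtering by Unicode category rather than a hand-written character list keeps legitimate non-ASCII text intact.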
Advice
0
votes
0
replies
28
views
Implementing a KIE pipeline for hybrid document types (1 dynamic schema, 4 static templates)
I’m architecting a document processing pipeline for a real-time workflow. I have 5 document types, but they require two completely different extraction strategies.
The Document for Dynamic Form: This ...
Advice
0
votes
1
replies
94
views
How to extract rooms and dimensions from MEP/floor plan drawings using AI or computer vision?
I’m working on a project where I want to use AI / computer vision to read MEP (Mechanical, Electrical, Plumbing) drawings or floor plans.
My goal is to:
Detect rooms and extract their labels (e.g., “...
3
votes
0
answers
90
views
How to handle "False Positive" string matches in a state-dependent OCR engine (Chess Move Validation)? [closed]
Problem: I am building an OCR-based engine to digitize chess scoresheets using Python and python-chess. I use a string-similarity metric (specifically Jaro-Winkler) to map recognized text (e.g., "...
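One possible way to cut down false positives, sketched under assumptions: score the recognized text only against moves that are legal in the current position, and reject the match when the best candidate does not clearly beat the runner-up. This sketch swaps in the standard library's `difflib.SequenceMatcher` ratio in place of Jaro-Winkler so it runs without extra packages. The function name, the thresholds, and the hardcoded move list are hypothetical; in a real engine the SAN strings would presumably come from python-chess (for example by iterating `board.legal_moves` and calling `board.san(move)`).

```python
from difflib import SequenceMatcher


def best_legal_move(ocr_text, legal_sans, min_score=0.6, min_margin=0.1):
    """Return the legal SAN move closest to ocr_text, or None if ambiguous.

    min_score and min_margin are illustrative thresholds, not tuned values.
    """
    # Score every legal candidate and sort best-first.
    scored = sorted(
        ((SequenceMatcher(None, ocr_text, san).ratio(), san) for san in legal_sans),
        reverse=True,
    )
    if not scored:
        return None
    best_score, best_san = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    # Reject weak matches and near-ties instead of guessing.
    if best_score < min_score or best_score - runner_up < min_margin:
        return None
    return best_san


# Hypothetical legal moves for some position.
legal = ["Nf3", "Nc3", "e4", "d4"]
print(best_legal_move("Nf3", legal))
print(best_legal_move("N3", legal))
```

Returning `None` on a near-tie lets the caller flag the move for manual review rather than silently accepting a wrong but legal move.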
Best practices
0
votes
7
replies
107
views
Which is the best way to detect lines in historical book pages
I am working on an OCR project and need to create a dataset consisting of approximately 1247 pages from 6 books. I need to crop the images line by line and transcribe the text for training a model. ...
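As a rough illustration of line-by-line cropping, here is a minimal horizontal projection profile sketch. It assumes the page has already been binarized and deskewed into rows of 0/1 values (1 = ink); the function name and thresholds are hypothetical. Historical pages with skew, bleed-through, or touching lines would likely need more than this.

```python
def find_text_lines(binary_rows, min_ink=1, min_height=2):
    """Return (top, bottom) row ranges that contain ink, bottom exclusive.

    binary_rows: list of rows, each a list of 0/1 pixels (1 = ink).
    """
    lines = []
    start = None
    for y, row in enumerate(binary_rows):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y  # a text line begins
        elif not has_ink and start is not None:
            if y - start >= min_height:
                lines.append((start, y))  # a text line ends
            start = None
    # Close a line that runs to the bottom edge of the page.
    if start is not None and len(binary_rows) - start >= min_height:
        lines.append((start, len(binary_rows)))
    return lines
```

Each returned range can then be used to crop the original image into one strip per text line for transcription.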
Advice
1
vote
1
replies
79
views
How can I build my own offline text recognition library like Google ML Kit that gives text and bounding boxes, but supports my own custom language?
How can I create an Android library similar to Google ML Kit Text Recognition that works fully offline, detects text from images, and returns both the recognized text and the bounding boxes? I want to ...
0
votes
0
answers
72
views
Extract Bangla Text From Image
var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Bengali;
// Optimization for high-density forms (18 boxes)
ocr.Configuration.ReadBarCodes = false;
ocr.Configuration.PageSegmentationMode = ...
0
votes
0
answers
48
views
Segmentation fault when initializing OCR on Raspberry Pi 5 (Python 3.13)
I'm encountering a segmentation fault when trying to initialize PaddleOCR on a Raspberry Pi 5 (8GB RAM) running Python 3.13 in a virtual environment. The error occurs during the model loading phase.
...
Advice
0
votes
1
replies
73
views
Use OCR on a large PDF (of scans) with 4 columns per page
I work on toponymy in France and need to know, for each city, its département and the origins of the city name. To do so, I have a PDF of a scanned book (700 pages) that contains this information. ...
3
votes
2
answers
1k
views
PaddleOCR predict() method throws NotImplementedError
I'm trying to use PaddleOCR:
from paddleocr import PaddleOCR
from PIL import Image
# Initialize the OCR engine
ocr = PaddleOCR(use_textline_orientation=False, lang='es')
# Run OCR on an image path
...
3
votes
0
answers
401
views
Training Tesseract to decode the Epstein Files
The Department of Justice has recently released Volumes 09 and 10 of the Epstein files. Among them is a PDF: https://www.justice.gov/epstein/files/DataSet%209/EFTA01012650.pdf
This PDF contains ...
Advice
0
votes
3
replies
79
views
Flutter digital odometer OCR
I am working on a Flutter app that needs to read a digital odometer's state from an image. The goal is to take an image of the odometer, crop it as closely as possible around the number, and then do text ...
3
votes
0
answers
92
views
Python OCR Tesseract: Extraction of time stamps and coordinates from video frames
I have a Python problem regarding OCR of timestamps/GPS overlays from video files.
The video files contain camera footage of seagrass meadows (screenshot).
I would like to extract date + time (left ...