5 questions from the last 30 days
-9 votes, 0 answers, 115 views
EasyOCR doesn't recognize any text in image
I'm trying to extract text from a video. I created cropped frames from the video using ffmpeg and saved them as PNGs. Now, I'm using those PNGs to extract the text using EasyOCR. At first, I tried ...
-6 votes, 0 answers, 91 views
Why is ADB image transfer slow compared to local file processing in Python OCR, and what is a faster alternative? [closed]
I am building a system where:
An Android device captures an image using the camera
I use adb pull to transfer the image from /sdcard/DCIM/Camera
Then I process it using pytesseract OCR in Python
...
Best practices
0 votes, 3 replies, 72 views
OCR output contains “garbage” characters after special symbols (mojibake / control chars) — how to reliably clean before returning from LLM?
I have an on-prem OCR pipeline that returns extracted text inside a JSON blob. I parse the LLM response and call a local normalizer before returning the text to callers. Example call site:
result = ...
Advice
0
votes
1
replies
94
views
How to extract rooms and dimensions from MEP/floor plan drawings using AI or computer vision?
I’m working on a project where I want to use AI / computer vision to read MEP (Mechanical, Electrical, Plumbing) drawings or floor plans.
My goal is to:
Detect rooms and extract their labels (e.g., “...
Advice
0
votes
0
replies
28
views
Implementing a KIE pipeline for hybrid document types (1 dynamic schema, 4 static templates)
I’m architecting a document processing pipeline for a real-time workflow. I have 5 document types, but they require two completely different extraction strategies.
The Document for Dynamic Form: This ...