I'm trying to extract text from an image such as this one:

[input image]

However, if I simply run OCR on the whole image, the extracted text starts with the first line of the first column and then continues with the first line of the second column, which is wrong. The OCR should read all lines of the first column, and then all lines of the second column, treating each column separately.
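To make the reading order I expect concrete, here is a minimal sketch. It assumes the text blocks are already available as `(x1, y1, x2, y2)` boxes and that the page splits into equal-width columns; the function name `column_reading_order`, the box format, and the sample coordinates are hypothetical and only serve as illustration, they are not part of layoutparser.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels, origin at top-left.
Box = Tuple[float, float, float, float]


def column_reading_order(boxes: List[Box], page_width: float,
                         n_columns: int = 2) -> List[Box]:
    """Sort boxes column by column (left to right), then top to bottom.

    Assumes equal-width columns, which is a simplification.
    """
    column_width = page_width / n_columns

    def sort_key(box: Box) -> Tuple[int, float]:
        x1, y1, x2, _ = box
        center_x = (x1 + x2) / 2
        # Clamp so a box touching the right edge stays in the last column.
        column = min(int(center_x // column_width), n_columns - 1)
        return (column, y1)

    return sorted(boxes, key=sort_key)


# Made-up boxes listed in the (wrong) row-by-row order that plain OCR produces.
boxes = [
    (40, 40, 480, 80),     # column 1, line 1
    (520, 40, 960, 80),    # column 2, line 1
    (40, 100, 480, 140),   # column 1, line 2
    (520, 100, 960, 140),  # column 2, line 2
]

ordered = column_reading_order(boxes, page_width=1000)
for box in ordered:
    print(box)
```

With these sample boxes, both lines of the left column come out before any line of the right column, which is the order I would like the OCR text to follow.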

While searching the web I found this question on Stack Overflow: How to detect figures in a paper news image in Python? It is based on this article: https://www.linkedin.com/pulse/how-segment-figures-text-region-newspaper-using-layout-mohammad-oghli

In both articles you can clearly see that all "columns" are detected with layoutparser.

However, if I run the same code on the image above, the boxes drawn on the image are completely wrong.

These are the packages that need to be installed:

pip install layoutparser # Install the base layoutparser library
pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit
pip install "layoutparser[ocr]" # Install OCR toolkit

Then we need to install the dependencies for the detectron2 deep learning model backend:

pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"

And here is the code:

import layoutparser as lp
import cv2
import matplotlib.pyplot as plt

# Convert the image from BGR (cv2 default loading style)
# to RGB
image = cv2.imread("test.jpg")
image = image[..., ::-1]

# Load the deep layout model from the layoutparser API.
# For all the supported models, please check the Model
# Zoo page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html

model = lp.models.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.7],
                                 label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})

# Detect layout
layout = model.detect(image)

# Draw and display results
visualized_image = lp.draw_box(image, layout, box_width=10)
plt.figure(figsize=(12, 8))
plt.imshow(visualized_image)
plt.axis('off')

plt.show()

[output image: the input page with the detected layout boxes drawn on it]

Does anyone have an idea of how to tackle this issue?

Hopefully someone can help me with my question. Thanks in advance.

  • What do you use to extract it? Where is your code? What does "the image are totally wrong." mean? We can't see your code, we can't see your computer, and we can't read your mind. You have to put all the details in the question (not in comments). It would also be better if you created minimal working code that we could use for tests. Commented Sep 30 at 15:30
  • Totally right! I updated my post with the code, together with the input and output. Commented Sep 30 at 17:11
