I'm trying to extract text from an image such as this one:
However, if I just use OCR, then the text extracted starts from the first line in the first column, and then continues to the first line of the second column, which is wrong. OCR should read all lines from the first column, and all lines from the second column separately.
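To make the reading order I'm after concrete, here is a minimal sketch in plain Python. It assumes the OCR engine returns one bounding box per text line and that the page has a known number of equal-width columns. The `read_by_columns` helper, the `(x1, y1, x2, y2, text)` tuple layout, and the coordinates are all made up for illustration and are not part of layoutparser or any OCR library.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2, text), one box per OCR'd text line.
Box = Tuple[int, int, int, int, str]


def read_by_columns(boxes: List[Box], page_width: int, n_columns: int = 2) -> List[str]:
    """Return the texts column by column, top to bottom within each column.

    Assumes equal-width columns; each box is assigned to a column
    by the horizontal centre of its bounding box.
    """
    column_width = page_width / n_columns

    def sort_key(box: Box):
        x1, y1, x2, _, _ = box
        centre = (x1 + x2) / 2
        # Clamp so a box touching the right edge still lands in the last column.
        column = min(int(centre // column_width), n_columns - 1)
        return (column, y1, x1)

    return [box[4] for box in sorted(boxes, key=sort_key)]


# Made-up boxes in the row-by-row order that plain OCR gives me.
boxes = [
    (10, 10, 190, 30, "col1 line1"),
    (210, 10, 390, 30, "col2 line1"),
    (10, 40, 190, 60, "col1 line2"),
    (210, 40, 390, 60, "col2 line2"),
]

print(read_by_columns(boxes, page_width=400))
```

This only works when the column boundaries are known in advance, which is why I'm looking for a layout detector that finds the column regions automatically.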
Searching the web, I found this question on Stack Overflow: "How to detect figures in a paper news image in Python?" It is based on this article: https://www.linkedin.com/pulse/how-segment-figures-text-region-newspaper-using-layout-mohammad-oghli
In both articles you can clearly see that all "columns" are detected with layoutparser.
However, if I run the same code on the image above, the boxes drawn on the image are totally wrong.
These are the packages that need to be installed:
pip install layoutparser # Install the base layoutparser library
pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit
pip install "layoutparser[ocr]" # Install OCR toolkit
Then we need to install the Detectron2 backend that the deep learning layout models depend on:
pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"
And here is the code:
import layoutparser as lp
import cv2
import matplotlib.pyplot as plt
# Convert the image from BGR (cv2 default loading style)
# to RGB
image = cv2.imread("test.jpg")
image = image[..., ::-1]
# Load the deep layout model from the layoutparser API
# For all the supported models, please check the Model
# Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html
model = lp.models.Detectron2LayoutModel(
    'lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.7],
    label_map={1: "TextRegion", 2: "ImageRegion", 3: "TableRegion",
               4: "MathsRegion", 5: "SeparatorRegion", 6: "OtherRegion"})
# Detect layout
layout = model.detect(image)
# Draw and display results
visualized_image = lp.draw_box(image, layout, box_width=10)
plt.figure(figsize=(12, 8))
plt.imshow(visualized_image)
plt.axis('off')
plt.show()
Does anyone have an idea of how to tackle this issue?
Hopefully someone can help me with my question. Thanks in advance.


minimal working code which we could use for tests.