Split the code
This long method does a lot of things:
- it reads the file
- preprocesses it
- searches for the bounding box of the area of interest
- crops the result
- writes the result to a file
It would be better to split this into smaller parts. That way you don't need to comment as much; the function names speak for themselves.
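As a sketch of what that buys you, the top-level flow then mirrors the bullet list above almost word for word (the filenames are placeholders, and the function bodies are worked out under *Final result* below):
```python
import cv2

image = cv2.imread("in.png")                  # read the file
image_preprocessed = preprocess_image(image)  # preprocess it
edges = find_edges(image_preprocessed)        # bounding box of the area of interest
height, width = image.shape[:2]
(x_min, x_max), (y_min, y_max) = adapt_edges(edges, height=height, width=width)
cv2.imwrite("out.png", image[x_min:x_max, y_min:y_max])  # crop and write the result
```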
Comments
If you do feel the need to comment, you can do that in the docstring. If you want to comment on the code, explain why you do it, not how.
```python
#Ignore the limits/extremities of the document (sometimes are black, so they distract the algorithm)
```
is a useful comment.
```python
# Apply adaptive mean thresholding
```
is not. It doesn't explain what problem the adaptive thresholding solves, or how you arrived at the parameters you use: `255`, `ADAPTIVE_THRESH_MEAN_C`, `35` and `15`.
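For example (a minimal sketch: the docstring wording and the exact shape of `preprocess_image` here are just illustrative):
```python
import cv2
import numpy as np

def preprocess_image(image, *, crop_side=50):
    """Threshold and erode the image so the area of interest stands
    out as a solid dark region for the bounding-box search."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    eroded = cv2.erode(gray, np.ones((15, 15), np.uint8))
    # Ignore the extremities of the document: they are sometimes black
    # and would distract the algorithm.
    return eroded[crop_side:-crop_side, crop_side:-crop_side]
```
The docstring says *what* the function is for; the inline comment explains *why* the borders are dropped.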
Indexing
Negative indices start counting from the back of a sequence, so
```python
(height, width) = img.shape[0:2]
image = erosion[50:height - 50, 50: width - 50]
```
can be replaced by `erosion[50:-50, 50:-50]`.
There is also no need for the parentheses around the `height, width` tuple.
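A quick way to convince yourself of the equivalence, on a toy array:
```python
import numpy as np

img = np.arange(200 * 200).reshape(200, 200)
height, width = img.shape[:2]
# Slicing with explicit bounds and with negative indices selects the same view.
assert (img[50:height - 50, 50:width - 50] == img[50:-50, 50:-50]).all()
```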
Magic numbers
There are a lot of magic numbers in your code: 35 and 15 in the adaptive threshold, 15 in the kernel size, 50 in the cropping, ...
It would be better to give them names, and either define them in the function or pass them in as parameters.
Keyword arguments
```python
amtImage = cv2.adaptiveThreshold(bit, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 35, 15)
```
would be a lot clearer as:
```python
blocksize = 35
constant = 15
max_value = 255  # 8 bits
amtImage = cv2.adaptiveThreshold(
    src=bit,
    maxValue=max_value,
    adaptiveMethod=cv2.ADAPTIVE_THRESH_MEAN_C,
    thresholdType=cv2.THRESH_BINARY,
    blockSize=blocksize,
    C=constant,
)
```
Vectorizing
OpenCV uses NumPy arrays internally, so you can use all the vectorisation goodies instead of iterating over each pixel twice in Python land.
```python
bw_threshold = 150
limits = 0.2, 0.15

mask = image < bw_threshold
edges = []
for axis in (0, 1):
    count = mask.sum(axis=axis)
    limit = limits[axis] * image.shape[axis]
    index = np.where(count > limit)
    _min, _max = index[0][0], index[0][-1]
    edges.append((_min, _max))
```
does the same, but vectorized and about 1000 times faster.
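If you want to check the speed-up on your own machine, here is a rough timing sketch (the exact factor depends on image size and hardware):
```python
import timeit

import numpy as np

image = np.random.randint(0, 256, (500, 500), dtype=np.uint8)
mask = image < 150

# Vectorized column counts vs. the same counts computed in pure Python.
fast = timeit.timeit(lambda: mask.sum(axis=0), number=100) / 100
slow = timeit.timeit(
    lambda: [sum(mask[i, j] for i in range(mask.shape[0]))
             for j in range(mask.shape[1])],
    number=1,
)
print(f"vectorized: {fast:.6f} s, pure python: {slow:.6f} s")
```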
Final result
```python
import cv2
import numpy as np


def preprocess_image(
    image,
    *,
    kernel_size=15,
    crop_side=50,
    blocksize=35,
    constant=15,
    max_value=255,
):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    bit = cv2.bitwise_not(gray)
    image_adapted = cv2.adaptiveThreshold(
        src=bit,
        maxValue=max_value,
        adaptiveMethod=cv2.ADAPTIVE_THRESH_MEAN_C,
        thresholdType=cv2.THRESH_BINARY,
        blockSize=blocksize,
        C=constant,
    )
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    erosion = cv2.erode(image_adapted, kernel, iterations=2)
    return erosion[crop_side:-crop_side, crop_side:-crop_side]


def find_edges(image_preprocessed, *, bw_threshold=150, limits=(0.2, 0.15)):
    mask = image_preprocessed < bw_threshold
    edges = []
    for axis in (1, 0):
        count = mask.sum(axis=axis)
        limit = limits[axis] * image_preprocessed.shape[axis]
        index_ = np.where(count >= limit)
        _min, _max = index_[0][0], index_[0][-1]
        edges.append((_min, _max))
    return edges


def adapt_edges(edges, *, height, width):
    (x_min, x_max), (y_min, y_max) = edges
    x_min2 = x_min
    x_max2 = x_max + min(250, (height - x_max) * 10 // 11)
    # could do with less magic numbers
    y_min2 = max(0, y_min)
    y_max2 = y_max + min(250, (width - y_max) * 10 // 11)
    return (x_min2, x_max2), (y_min2, y_max2)


if __name__ == "__main__":
    filename_in = "NHnV7.png"
    filename_out = "res_NHnV7.png"

    image = cv2.imread(str(filename_in))
    height, width = image.shape[0:2]
    image_preprocessed = preprocess_image(image)
    edges = find_edges(image_preprocessed)
    (x_min, x_max), (y_min, y_max) = adapt_edges(
        edges, height=height, width=width
    )

    image_cropped = image[x_min:x_max, y_min:y_max]
    cv2.imwrite(str(filename_out), image_cropped)
```