Matching corresponding masks of objects between 2 images

Question

Here is a program that processes 2 grayscale images. Both represent the positions of objects, in the form of multiple masks per image. Each mask is a shape filled with a single color, with 0 (black) for the background. Each object takes the next available color when it is created, so if there are n objects, the colors 1 to n are used.

The first image is the result of an AI prediction, while the second is estimated by a user. I want to match objects between the 2 images, since numbering is not done in the same order in both. I therefore sort each object into one of three categories:

no match
exactly one match
more than one match

This is done by first extracting a boolean image of the position for each mask, storing them in 2 lists, and summing (equivalent to and in bool) each combination to see if they overlap. I don't have multiple matches in this example, but there are in other cases. For exact 1-1 matches, I redo the numbering for img_2 by matching it with what is stored in one.

The code is working as intended, but it is a bit slow when the image size gets bigger, and especially when the number of objects gets higher. The matching part is the main culprit, since its complexity is O(n²). It takes about 2 min for images with 250 objects (is it possible to share those? I can't create them in the code as I did here), which is most likely the upper bound of what I'll encounter.

My question is : is it possible to reduce the complexity of the matching part of my code ?

Any feedback on the rest of the code is also welcome, although it is not my main concern.

import numpy as np
import matplotlib.pyplot as plt


### Creating both sample images
img_1 = np.array([[0]*27+[1]*22+[0]*10,
                  [0]*27+[1]*22+[0]*10,
                  [0]*27+[1]*22+[0]*10,
                  [0]*28+[1]*20+[0]*11,
                  [0]*28+[1]*20+[0]*11,
                  [0]*29+[1]*18+[0]*12,
                  [0]*29+[1]*18+[0]*12,
                  [0]*30+[1]*16+[0]*13,
                  [0]*12+[3]*2+[0]*18+[1]*12+[0]*15,
                  [0]*11+[3]*4+[0]*19+[1]*8+[0]*17,
                  [0]*10+[3]*6+[0]*43,
                  [0]*10+[3]*6+[0]*43,
                  [0]*10+[3]*6+[0]*43,
                  [0]*11+[3]*4+[0]*44,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*12+[2]*2+[0]*45,
                  [0]*10+[2]*6+[0]*43,
                  [0]*8+[2]*10+[0]*41,
                  [0]*7+[2]*12+[0]*40,
                  [0]*5+[2]*16+[0]*38,
                  [0]*4+[2]*18+[0]*37,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*3+[2]*20+[0]*36,
                  [0]*4+[2]*18+[0]*37,
                  [0]*5+[2]*16+[0]*38,
                  [0]*7+[2]*12+[0]*40,
                  [0]*8+[2]*10+[0]*41,
                  [0]*10+[2]*6+[0]*43,
                  [0]*11+[2]*4+[0]*44,
                  [0]*59,
                  [0]*59],dtype=np.int32)

img_2 = np.array([[0]*24+[3]*23+[0]*12,
                  [0]*23+[3]*25+[0]*11,
                  [0]*23+[3]*25+[0]*11,
                  [0]*21+[2]*2+[3]*25+[0]*11,
                  [0]*20+[2]*3+[3]*25+[0]*11,
                  [0]*19+[2]*4+[3]*25+[0]*11,
                  [0]*18+[2]*5+[3]*25+[0]*11,
                  [0]*18+[2]*6+[3]*23+[0]*12,
                  [0]*18+[2]*6+[3]*23+[0]*12,
                  [0]*18+[2]*7+[3]*21+[0]*13,
                  [0]*19+[2]*7+[3]*19+[0]*14,
                  [0]*20+[2]*7+[3]*17+[0]*15,
                  [0]*21+[2]*5+[0]*3+[3]*13+[0]*17,
                  [0]*32+[3]*7+[0]*20,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*59,
                  [0]*13+[1]*3+[0]*43,
                  [0]*11+[1]*6+[0]*42,
                  [0]*9+[1]*10+[0]*40,
                  [0]*8+[1]*12+[0]*39,
                  [0]*6+[1]*16+[0]*37,
                  [0]*4+[1]*20+[0]*35,
                  [0]*3+[1]*22+[0]*34,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*2+[1]*24+[0]*33,
                  [0]*3+[1]*22+[0]*34,
                  [0]*4+[1]*20+[0]*35,
                  [0]*6+[1]*16+[0]*37,
                  [0]*7+[1]*14+[0]*38,
                  [0]*9+[1]*10+[0]*40,
                  [0]*11+[1]*6+[0]*42,
                  [0]*12+[1]*4+[0]*43,
                  [0]*59,
                  [0]*59,
                  [0]*59],dtype=np.int32)



### Side to side images
print("Number of items predicted in img_1 :",len(np.unique(img_1)))
print("Number of items in ground-truth, img_2 :",len(np.unique(img_2)))

plt.figure(0)
plt.subplot(1,2,1), plt.imshow(img_1)
plt.axis("off")
plt.subplot(1,2,2), plt.imshow(img_2)
plt.axis("off")


### Creating one boolean image for each mask
masks_1 = []
masks_2 = []
leng = max(len(np.unique(img_1)), len(np.unique(img_2)))
for k in range(1,leng):
    mask_1 = img_1==k
    masks_1.append(mask_1)
    mask_2 = img_2==k
    masks_2.append(mask_2)


### Each possible combination, from img_1 compared to img_2, and the opposite
zero=[]     # No match from img_1 to img_2
one={}      # Exactly one match from img_1 to img_2
more={}     # More than one match from img_1 to img_2
zero_2=[]   # Same as zero but from img_2 to img_1
one_2={}    # Same as one but from img_2 to img_1
more_2={}   # Same as more but from img_2 to img_1
for i in range(leng-1):
    number_match = 0
    number_match_2 = 0
    index_match = []
    index_match_2 = []
    
    ### Checks if there are pixels in common in both images, and the total number
    for j in range(leng-1):
        if not (sum(sum(masks_1[i]*masks_2[j]))==0):
            number_match += 1
            index_match.append(j)
        if not (sum(sum(masks_2[i]*masks_1[j]))==0):
            number_match_2 += 1
            index_match_2.append(j)
    
    ### Sorts each index depending on the number of match
    if number_match==0:
        zero.append(i)
    elif number_match>1:
        more[i] = index_match
    elif number_match==1:
        one[i] = index_match[0]
    
    if number_match_2==0:
        zero_2.append(i)
    elif number_match_2>1:
        more_2[i] = index_match_2
    elif number_match_2==1:
        one_2[i] = index_match_2[0]


### Remove pairs if it's not a 1-1 match
temp = []
for m in one.keys():
    if one_2.get(one[m]) is None:
        if more_2.get(one[m]) is None:
            zero_2.append(one[m])
            zero.append(m)
            temp.append(m)
        else:
            more.update({m:one[m]})
            temp.append(m)
for k in range(len(temp)):
    del one[temp[k]]

temp = []
for n in one_2.keys():
    if one.get(one_2[n]) is None:
        if more.get(one_2[n]) is None:
            zero.append(one_2[n])
            zero_2.append(n)
            temp.append(n)
        else:
            more_2.update({n:one_2[n]})
            temp.append(n)
for k in range(len(temp)):
    del one_2[temp[k]]



### Second pair of image, with all 1-1 match
xSize, ySize = np.shape(img_1)
img_1 = np.zeros((xSize,ySize),dtype=np.int32)
img_2 = np.zeros((xSize,ySize),dtype=np.int32)
if bool(one):
    lis = list(one.keys())
    for m in range(len(lis)):
        img_1 = img_1+(masks_1[lis[m]])*(lis[m]+1)

if bool(one_2):
    lis2 = list(one_2.keys())
    for n in range(len(lis2)):
        img_2 = img_2+(masks_2[lis2[n]])*(one_2[lis2[n]]+1)


plt.figure(1)
plt.subplot(1,2,1), plt.imshow(img_1)
plt.axis("off")
plt.subplot(1,2,2), plt.imshow(img_2)
plt.axis("off")

np.unique(img_1) produces [0, 1, 2, 3], but in your mask loop you start at 1, so 0 will never get a mask. Is this intentional?
– Reinderien, Commented Jul 2, 2022 at 17:50
It seems so, because you've described 0 as being reserved for the background.
– Reinderien, Commented Jul 2, 2022 at 17:51

Reinderien · Accepted Answer · 2022-07-03 01:21:25Z

This:

summing (equivalent to and in bool)

is untrue. "And" is equivalent to multiplication.
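A quick check of this point, as a minimal sketch on two made-up boolean arrays (`a` and `b` are illustrative and not taken from the question): for NumPy boolean arrays, `*` behaves like `np.logical_and`, whereas `+` behaves like `np.logical_or`.

```python
import numpy as np

# Two small illustrative boolean "masks"
a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

product = a * b  # element-wise multiplication of booleans
total = a + b    # element-wise addition of booleans

# Multiplication agrees with logical AND; addition agrees with logical OR
print(np.array_equal(product, np.logical_and(a, b)))
print(np.array_equal(total, np.logical_or(a, b)))
```

So an overlap test has to be built on the product (or on `&`), which is what the question's code actually does despite the wording.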

As for your code:

Move all of your code out of the global namespace into functions, and hint their signatures with PEP484 types.

Increase vectorisation by holding your images in one array with an outer dimension of 2.

These prints:

print("Number of items predicted in img_1 :", len(np.unique(img_1)))
print("Number of items in ground-truth, img_2 :", len(np.unique(img_2)))

do not seem accurate, because you include the background (0). You should probably exclude this.

Rather than implicit axis references with plt., prefer explicit axis object references, such as ax.imshow().

Do not hold masks_x as lists; instead hold them as one np.ndarray.

Combine zero and zero_2 into one tuple of two lists, and similar for one and more.

This loop:

for i in range(leng - 1):

though it doesn't say so due to a poor variable name, is actually looping through colours ("items" in your parlance). But why are you using a range? Why not actually iterate through the colour values themselves? This will relieve your code from needing to assume that your colours are contiguous integers.
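As a rough illustration of iterating over the colour values themselves, here is a small sketch on a made-up label image whose colours are deliberately not contiguous (`img` and `pixel_counts` are hypothetical names, not part of the question):

```python
import numpy as np

# Illustrative label image; the colours 2, 5 and 9 are not contiguous
img = np.array([[0, 5, 5],
                [0, 0, 9],
                [2, 2, 9]])

# np.unique returns the sorted distinct values; drop the background label 0
colours = np.unique(img)
colours = colours[colours != 0]

# Loop over the actual colour values instead of range(n)
pixel_counts = {int(c): int(np.count_nonzero(img == c)) for c in colours}
print(pixel_counts)
```

Nothing here depends on the labels running from 1 to n, so gaps in the numbering are harmless.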

Don't write not (sum(sum(x * y)) == 0). This is really just an np.any() on an np.logical_and() over the proper dimensions.
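A minimal sketch of that overlap test, assuming boolean masks of equal shape. `masks_overlap` is a hypothetical helper name, and the small masks are made up for illustration. The second part shows one way the same test might be broadcast over two stacks of masks at once; it builds an intermediate array of size n × m × height × width, so it trades memory for fewer Python-level loops.

```python
import numpy as np


def masks_overlap(mask_a: np.ndarray, mask_b: np.ndarray) -> bool:
    """True if the two boolean masks share at least one pixel."""
    return bool(np.any(np.logical_and(mask_a, mask_b)))


mask_a = np.zeros((4, 4), dtype=bool)
mask_a[:2, :2] = True            # top-left 2x2 block
mask_b = np.zeros((4, 4), dtype=bool)
mask_b[1:3, 1:3] = True          # centre block, shares pixel (1, 1) with mask_a
mask_c = np.zeros((4, 4), dtype=bool)
mask_c[3, 3] = True              # isolated pixel in the corner

print(masks_overlap(mask_a, mask_b))
print(masks_overlap(mask_a, mask_c))

# Broadcast version: entry [i, j] says whether stack_1[i] overlaps stack_2[j]
stack_1 = np.stack([mask_a, mask_c])   # shape (2, 4, 4)
stack_2 = np.stack([mask_b])           # shape (1, 4, 4)
overlap = np.any(stack_1[:, None] & stack_2[None, :], axis=(-2, -1))
print(overlap.shape)
```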

Change this loop:

for m in one.keys():

so that it's calling .items() instead of .keys(), so that you can use the value from it instead of re(re-re-re)-writing one[m].

Do not write if one_2.get(one[m]) is None; instead write if one[m] not in one_2.
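Both points can be seen in a small sketch. The dictionaries below are made-up stand-ins for `one` and `one_2`, and `unreciprocated` is a hypothetical name:

```python
# Hypothetical forward matches (img_1 index -> img_2 index)
one = {0: 2, 1: 5, 2: 0}
# Hypothetical reverse matches (img_2 index -> img_1 index)
one_2 = {2: 0, 0: 2}

# .items() yields key and value together, so one[m] is never looked up again;
# "value not in one_2" tests key membership directly
unreciprocated = [key for key, value in one.items() if value not in one_2]
print(unreciprocated)
```

The two spellings only differ when a stored value is itself `None`, in which case `.get(...) is None` would wrongly report the key as missing.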

Have you tested this?

for k in range(len(temp)):
    del one[temp[k]]

Your current test data do not exercise this path. Since one is a dict and temp holds its keys, the deletions work in any order; front-to-back deletion would only be a problem if you were deleting from a list by position. Also don't iterate over a range; just iterate over the values of temp.

if bool(one): and similar can be entirely deleted because your later for loop, if one is empty, will already be a no-op. Even if you didn't delete it, bool() is unnecessary because a collection can be tested for truthiness directly.

You should really add titles to your figures.

Suggested

Basically equivalent in my limited testing.

import numpy as np
import matplotlib.pyplot as plt


def make_samples() -> np.ndarray:  # 2x55x59
    samples = np.empty((2, 55, 59), dtype=np.int32)

    samples[0, ...] = [[0] * 27 + [1] * 22 + [0] * 10,
                      [0] * 27 + [1] * 22 + [0] * 10,
                      [0] * 27 + [1] * 22 + [0] * 10,
                      [0] * 28 + [1] * 20 + [0] * 11,
                      [0] * 28 + [1] * 20 + [0] * 11,
                      [0] * 29 + [1] * 18 + [0] * 12,
                      [0] * 29 + [1] * 18 + [0] * 12,
                      [0] * 30 + [1] * 16 + [0] * 13,
                      [0] * 12 + [3] * 2 + [0] * 18 + [1] * 12 + [0] * 15,
                      [0] * 11 + [3] * 4 + [0] * 19 + [1] * 8 + [0] * 17,
                      [0] * 10 + [3] * 6 + [0] * 43,
                      [0] * 10 + [3] * 6 + [0] * 43,
                      [0] * 10 + [3] * 6 + [0] * 43,
                      [0] * 11 + [3] * 4 + [0] * 44,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 12 + [2] * 2 + [0] * 45,
                      [0] * 10 + [2] * 6 + [0] * 43,
                      [0] * 8 + [2] * 10 + [0] * 41,
                      [0] * 7 + [2] * 12 + [0] * 40,
                      [0] * 5 + [2] * 16 + [0] * 38,
                      [0] * 4 + [2] * 18 + [0] * 37,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 3 + [2] * 20 + [0] * 36,
                      [0] * 4 + [2] * 18 + [0] * 37,
                      [0] * 5 + [2] * 16 + [0] * 38,
                      [0] * 7 + [2] * 12 + [0] * 40,
                      [0] * 8 + [2] * 10 + [0] * 41,
                      [0] * 10 + [2] * 6 + [0] * 43,
                      [0] * 11 + [2] * 4 + [0] * 44,
                      [0] * 59,
                      [0] * 59]

    samples[1, ...] = [[0] * 24 + [3] * 23 + [0] * 12,
                      [0] * 23 + [3] * 25 + [0] * 11,
                      [0] * 23 + [3] * 25 + [0] * 11,
                      [0] * 21 + [2] * 2 + [3] * 25 + [0] * 11,
                      [0] * 20 + [2] * 3 + [3] * 25 + [0] * 11,
                      [0] * 19 + [2] * 4 + [3] * 25 + [0] * 11,
                      [0] * 18 + [2] * 5 + [3] * 25 + [0] * 11,
                      [0] * 18 + [2] * 6 + [3] * 23 + [0] * 12,
                      [0] * 18 + [2] * 6 + [3] * 23 + [0] * 12,
                      [0] * 18 + [2] * 7 + [3] * 21 + [0] * 13,
                      [0] * 19 + [2] * 7 + [3] * 19 + [0] * 14,
                      [0] * 20 + [2] * 7 + [3] * 17 + [0] * 15,
                      [0] * 21 + [2] * 5 + [0] * 3 + [3] * 13 + [0] * 17,
                      [0] * 32 + [3] * 7 + [0] * 20,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59,
                      [0] * 13 + [1] * 3 + [0] * 43,
                      [0] * 11 + [1] * 6 + [0] * 42,
                      [0] * 9 + [1] * 10 + [0] * 40,
                      [0] * 8 + [1] * 12 + [0] * 39,
                      [0] * 6 + [1] * 16 + [0] * 37,
                      [0] * 4 + [1] * 20 + [0] * 35,
                      [0] * 3 + [1] * 22 + [0] * 34,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 2 + [1] * 24 + [0] * 33,
                      [0] * 3 + [1] * 22 + [0] * 34,
                      [0] * 4 + [1] * 20 + [0] * 35,
                      [0] * 6 + [1] * 16 + [0] * 37,
                      [0] * 7 + [1] * 14 + [0] * 38,
                      [0] * 9 + [1] * 10 + [0] * 40,
                      [0] * 11 + [1] * 6 + [0] * 42,
                      [0] * 12 + [1] * 4 + [0] * 43,
                      [0] * 59,
                      [0] * 59,
                      [0] * 59]

    return samples


def get_unique(before_images: np.ndarray) -> np.ndarray:  # one-dimensional (3)
    img_1, img_2 = before_images
    items_1 = np.unique(img_1)
    items_2 = np.unique(img_2)
    items = np.unique(before_images)

    # exclude background
    items_1 = items_1[items_1.nonzero()]
    items_2 = items_2[items_2.nonzero()]
    items = items[items.nonzero()]

    print('Number of items:')
    print("    img_1, predicted:", items_1.size)
    print("    img_2, ground-truth:", items_2.size)
    print("    shared:", items.size)

    return items


def make_masks(
    before_images: np.ndarray,  # 2 x 55 x 59
    items: np.ndarray,          # one-dimensional (3)
) -> np.ndarray:  # 3 x 2 x 55 x 59
    """Creating one boolean image for each mask"""
    # item * left/right * height * width
    k = np.expand_dims(items, (1, 2, 3))
    return before_images[np.newaxis, ...] == k


def make_combinations(
    items: np.ndarray,  # one-dimensional (3)
    masks: np.ndarray,  # 3 x 2 x 55 x 59
) -> tuple[
    tuple[list[int], ...],       # zeros
    tuple[dict[int, int], ...],  # ones
    tuple[dict, ...],            # more
]:
    """Each possible combination, from img_1 compared to img_2, and the opposite"""

    zeros = [], []    # No match from image to image
    ones = {}, {}     # Exactly one match from image to image
    mores = {}, {}    # More than one match from image to image

    for masks_a, item_a in zip(masks, items):
        for i_image, (zero, one, more) in enumerate(zip(
            zeros, ones, mores,
        )):
            number_match = 0
            index_match = []

            # Checks if there are pixels in common in both images, and the total number
            for masks_b, item_b in zip(masks, items):
                if np.any(np.logical_and(masks_a[i_image, ...], masks_b[1 - i_image, ...])):
                    number_match += 1
                    index_match.append(item_b)

            # Sorts each index depending on the number of match
            if number_match == 0:
                zero.append(item_a)
            elif number_match == 1:
                one[item_a] = index_match[0]
            elif number_match > 1:
                more[item_a] = index_match

    return zeros, ones, mores


def remove_pairs(
    zeros: tuple[list[int], ...],
    ones: tuple[dict[int, int], ...],
    more: tuple[dict, ...],
) -> None:
    """Remove pairs if it's not a 1-1 match"""

    for direction in (1, -1):
        this_zero, other_zero = zeros[::direction]
        this_one, other_one = ones[::direction]
        this_more, other_more = more[::direction]
        to_remove = []

        for one_key, one_val in this_one.items():
            if one_val not in other_one:
                if one_val in other_more:
                    this_more[one_key] = one_val
                else:
                    other_zero.append(one_val)
                    this_zero.append(one_key)
                to_remove.append(one_key)
        for t in sorted(to_remove, reverse=True):
            del this_one[t]


def make_new_pair(
    ones: tuple[dict[int, int], ...],
    items: np.ndarray,
    masks: np.ndarray,
) -> np.ndarray:
    """Second pair of image, with all 1-1 match"""

    item_indices = dict(np.stack((items, np.arange(items.size))).T)

    coefficients = np.zeros((*masks.shape[:2], 1, 1), dtype=np.int32)  # 3x2x1x1
    one, one_2 = ones
    for item in one.keys():
        coefficients[item_indices[item], 0, ...] = 1 + item
    for item, count in one_2.items():
        coefficients[item_indices[item], 1, ...] = 1 + count

    return (masks*coefficients).sum(axis=0)


def show(images: np.ndarray, title: str) -> plt.Figure:
    """Side to side images"""
    fig, axes = plt.subplots(nrows=1, ncols=images.shape[0])
    fig.suptitle(title)

    for axis, image in zip(axes, images):
        axis.imshow(image)
        axis.axis('off')

    return fig


def main() -> None:
    before_images = make_samples()
    show(before_images, 'Before matching')

    items = get_unique(before_images)
    masks = make_masks(before_images, items)
    zeros, ones, more = make_combinations(items, masks)
    remove_pairs(zeros, ones, more)
    after_images = make_new_pair(ones, items, masks)

    show(after_images, 'After matching')
    plt.show()


if __name__ == '__main__':
    main()

Output

Number of items:
    img_1, predicted: 3
    img_2, ground-truth: 3
    shared: 3

Simplification

is it possible to reduce the complexity of the matching part of my code ?

Yes, extremely. I think the algorithm may have been overthought. For example, you go through the trouble of building a zero collection but then never use it. This really boils down to:

find the subset of pixel coordinates for which both the left and right images have foreground (are non-zero);
reduce to a map from left value to right value at those coordinates;
apply the map; and
demote to background anything not in the map.

The results are the same:

def main() -> None:
    images = make_samples()
    show(images, 'Before matching')

    both_foreground = np.logical_and.reduce(images, axis=0)
    mapping = np.unique(images[:, both_foreground], axis=1)

    # Substitution routine loosely inspired by https://stackoverflow.com/questions/3403973
    v, k = mapping
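    # Size the lookup by the largest right-image label, so that unmatched
    # labels above k.max() still index safely (they map to background, 0)
    sparse_map = np.zeros(images[1].max() + 1, dtype=np.int32)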
    sparse_map[k] = v
    images[1, ...] = sparse_map[images[1, ...]]
    images[~np.isin(images, mapping[0, :])] = 0

    show(images, 'After matching')
    plt.show()
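Where one-to-many overlaps need to be reported rather than merged, the same pixel-pair idea might be extended to rebuild the zero / one / more categories in a single pass over the pixels. The following is only a sketch under that assumption, not the method used above: `classify_matches` is a hypothetical helper, and the two small label images are made up for illustration.

```python
import numpy as np


def classify_matches(left: np.ndarray, right: np.ndarray):
    """Classify each non-zero label of `left` by how many labels of `right` it overlaps."""
    both = (left != 0) & (right != 0)
    # Distinct (left label, right label) pairs over pixels where both are foreground
    pairs = np.unique(np.stack((left[both], right[both])), axis=1)

    partners = {}
    for l, r in pairs.T:
        partners.setdefault(int(l), []).append(int(r))

    labels = np.unique(left)
    labels = labels[labels != 0]
    zero = [int(l) for l in labels if int(l) not in partners]
    one = {l: rs[0] for l, rs in partners.items() if len(rs) == 1}
    more = {l: rs for l, rs in partners.items() if len(rs) > 1}
    return zero, one, more


left = np.array([[1, 1, 0, 2],
                 [1, 1, 0, 2],
                 [0, 0, 0, 0],
                 [3, 3, 0, 0]])
right = np.array([[4, 4, 0, 0],
                  [5, 5, 0, 0],
                  [0, 0, 0, 0],
                  [0, 6, 6, 0]])

zero, one, more = classify_matches(left, right)
print(zero, one, more)

# The opposite direction only swaps the arguments
zero_2, one_2, more_2 = classify_matches(right, left)
print(zero_2, one_2, more_2)
```

The cost is dominated by the `np.unique` call over the overlapping pixels, so it should grow with image size rather than with the square of the number of objects. The one-to-one filtering between the two directions would still have to be applied afterwards.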

Thanks a lot for the answer! I've already gone through it once, but it'll take a bit longer to digest everything. To clarify, I do use zero, just not in the example I provided, since I focused on the matching part.
– Clement B, Commented Jul 5, 2022 at 11:33
Your Suggested code works really well for the upscaled images with 250 objects, running in about 10 s. There is an offset of +1 in After Matching, coming from make_new_pair(). The Simplification doesn't work for my case, since it merges objects on the left that have a 1-2 or 1-3 match (my fault, I should have included an example of the expected behaviour). The offset is gone there, though.
– Clement B, Commented Jul 5, 2022 at 13:57
