Can someone help me further optimize the following Cython code snippets? Specifically, a and b are np.ndarray with int values (range(256)) in them. They are one-dimensional arrays with dynamic length. resultHamming is a one-dimensional array of float values (dynamic length). bits is an int list (size 256).
The function compares two dynamic-length bit vectors and returns a similarity value as the distance between the two, where the length of each vector is a multiple of 2048 bits (256 bytes). I want to find the best match between these two bit vectors by comparing each 2048-bit block, where each bit vector is represented as an ndarray (the bit sequence is read byte by byte, so each position holds a value from 0 to 255, i.e. one of 2^8 = 256 possible values). The matching rule is to find the global minimum distance over all block pairs, allowing one block in A to be matched with more than one block in B if that gives a smaller distance. Always compare the smaller vector against the larger one.
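To make the rule concrete, here is a rough NumPy reference sketch of the comparison described above. It is not the Cython code in question, only one possible reading of the rule, and it rests on assumptions: that the distance is the plain Hamming distance (count of differing bits), that bits is a popcount lookup table (rebuilt here as BITS), and that "best match" means the minimum distance for each block of the smaller vector. The names BLOCK_BYTES, block_distances and best_matches are made up for illustration.

```python
import numpy as np

BLOCK_BYTES = 256  # one block = 2048 bits = 256 bytes

# Assumption: the 256-entry table maps a byte value to its number of set bits.
BITS = np.array([bin(v).count("1") for v in range(256)], dtype=np.int64)


def block_distances(a, b):
    """Hamming distance between every block pair.

    Rows index blocks of the smaller vector, columns index blocks of the
    larger one, so the result size is known before any distance is computed.
    """
    a = np.asarray(a, dtype=np.uint8)
    b = np.asarray(b, dtype=np.uint8)
    if a.size % BLOCK_BYTES or b.size % BLOCK_BYTES:
        raise ValueError("vector lengths must be multiples of 256 bytes")
    # Always compare the smaller vector against the larger one.
    if a.size < b.size:
        a, b = b, a
    blocks_large = a.reshape(-1, BLOCK_BYTES)
    blocks_small = b.reshape(-1, BLOCK_BYTES)
    result = np.empty((blocks_small.shape[0], blocks_large.shape[0]),
                      dtype=np.float64)
    for i, block in enumerate(blocks_small):
        # XOR marks differing bits; the table lookup counts them per byte.
        xor = np.bitwise_xor(blocks_large, block)
        result[i] = BITS[xor].sum(axis=1)
    return result


def best_matches(a, b):
    """Minimum distance for each block of the smaller vector.

    A block of the larger vector may be the best match for several blocks.
    """
    return block_distances(a, b).min(axis=1)
```

A sketch like this is mainly useful as a slow but easy-to-check baseline: the optimized Cython version can be validated against it on small random inputs.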
The following code assumes the b vector is smaller. We can limit resultHamming to at most numArrayB entries and only record the numArrayB smallest distance values, but then we need to track the current size when inserting a new value into it. Even in the current case (recording all pairwise distances), we actually know the final size of resultHamming at the beginning.