I think a different approach is needed to achieve a better performance. The current approach recomputes the Jaccard similarity from scratch for each possible threshold value. However, going from one threshold to the next, only a small fraction of prediction values change as well as the intersection and the union. Therefore a lot of unnecessary computation is being performed.
A better approach can first compute for each patch a 2D histogram of the ground truth and the prediction probabilities, using the thresholds as bin edges. The frequency counts in each bin give the amount of predictions affected going from one threshold to the next. Therefore, it should be possible to directly compute the Jaccard similarity values for all thresholds from cummulative frequency counts derived from the histogram.