debugging
Snowbody

The final column's call to MATCH() is likely the biggest source of slowdown. The 0 parameter forces a linear search, so the whole thing takes time proportional to n^2 where n is the number of elements. This needs to be avoided. The list of counts is already in increasing order, so it doesn't need to be sorted, and we can use 1 or TRUE as that parameter, which makes MATCH() perform a binary search -- but unfortunately it might pick any element with that value. Instead we tweak the value slightly and end up at the right spot!

=IF(ROWS($G$3:G7)<=MaxOfColumnG,INDEX($E$3:$E$22,1+MATCH(ROWS($G$3:G7)-0.5,$G$3:$G$22,1)),"")

This depends on a helper cell somewhere else, named MaxOfColumnG, which is just =MAX($G$3:$G$1000) or whatever.

How it works: The list in column G ("third column") is something like

0
0
0
1
1
1
2
2
3
4

Suppose we want to find the first 2. So we search for 1.5. We find the largest value less than or equal to 1.5, which ends up being the last 1. Add on one more and we're at the right place. I had to change the IFERROR to an IF, which works because we already know the max-index. In addition, the IF() will short-circuit and not perform the MATCH() unless it is necessary.

The final column's call to MATCH() is causing the slowdown. The 0 as its third parameter forces a linear search, and the formula is repeated in every row, so the whole thing takes time proportional to n^2 where n is the number of elements. This needs to be avoided. The list of counts in column G (or wherever you put it, if you combined the columns) is in non-decreasing order, so it is permissible to use 1 or TRUE as the third parameter to MATCH(), causing it to perform a binary search -- but unfortunately, when a value is repeated, it might pick any element with that value. In order to end up at the right spot, take advantage of the other feature of the approximate match: it picks the largest value less than or equal to the searched item. Search for a value slightly below the target, so that nothing matches exactly, and you end up at the right spot!

=IF(ROWS($G$3:G3)<=MAX($G$3:$G$1000),
 INDEX($E$3:$E$22,1+IFERROR(MATCH(ROWS($G$3:G3)-0.5,$G$3:$G$22,1),0)),"")

The MAX($G$3:$G$1000) in the first argument could instead live in a helper cell somewhere else, named MaxOfColumnG, which is just =MAX($G$3:$G$1000) or whatever range covers your data.

How it works: The list in column G ("third column") is something like

0
0
0
1
1
1
2
2
3
4

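The formula in H3 wants to find the first 1. So we search for 0.5. We find the largest value less than or equal to 0.5, which ends up being the last 0. Add on one more and we're at the right place. The outer IF() takes over the end-of-list check that IFERROR used to do, which works because we already know the max-index. In addition, the IF() will short-circuit and not perform the MATCH() unless it is necessary. The inner IFERROR() only covers the case where nothing in column G is less than or equal to the searched value, in which case the position is treated as 0.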
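
The offset trick can also be checked outside the spreadsheet. What follows is a minimal Python sketch, not the spreadsheet formula itself. The names `match_approx` and `first_position_of` are hypothetical, and `bisect_right` from the standard library stands in for MATCH() with a third parameter of 1, assuming the list is sorted in non-decreasing order.

```python
from bisect import bisect_right


def match_approx(lookup, sorted_values):
    """Model of MATCH(lookup, range, 1) on a non-decreasing list.

    Returns the 1-based position of the last value <= lookup,
    or None where the spreadsheet would give #N/A.
    """
    pos = bisect_right(sorted_values, lookup)  # binary search
    return pos if pos > 0 else None


def first_position_of(k, counts):
    """1-based position of the first element equal to k.

    Searching for k - 0.5 can never hit an exact match, so the search
    lands on the last element below k; the next element is the first k.
    The `or 0` plays the role of IFERROR(..., 0) in the formula.
    """
    pos = match_approx(k - 0.5, counts)
    return 1 + (pos or 0)


# The sample column G from the explanation above.
counts = [0, 0, 0, 1, 1, 1, 2, 2, 3, 4]

print(first_position_of(1, counts))  # 4: the first 1 is the 4th entry
print(first_position_of(2, counts))  # 7: the first 2 is the 7th entry
```

Each lookup is a binary search, so filling this down n rows costs on the order of n log n comparisons rather than n^2.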

added an actual speedup
The final column's call to MATCH() is likely the biggest source of slowdown. This needs to be avoided. The 0 parameter forces a linear search, so the whole thing takes time proportional to n^2 where n is the number of elements. It would be better to use 1 or true as that parameter, allowing a binary search -- but unfortunately this just picks the closest match. An additional helper column is needed to check if the match is correct. Still, that's only another n calculations.
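
The "binary search, then check the match" idea can be sketched in Python as a model of the approach rather than as a spreadsheet formula. The function name `exact_match_checked` is hypothetical, and `bisect_right` stands in for MATCH() with 1 as the third parameter, assuming the list is sorted in non-decreasing order.

```python
from bisect import bisect_right


def exact_match_checked(lookup, sorted_values):
    """Binary search for the last value <= lookup, then verify it.

    Returns the 1-based position if the value found equals lookup,
    otherwise None (the approximate match was only a near miss).
    """
    pos = bisect_right(sorted_values, lookup)  # approximate match
    if pos > 0 and sorted_values[pos - 1] == lookup:  # the extra check
        return pos
    return None


values = [1, 3, 3, 7]
print(exact_match_checked(3, values))  # 3: position of the last 3
print(exact_match_checked(5, values))  # None: closest value is 3, not 5
```

The check is a single comparison per row, which is why it only adds another n calculations on top of the binary searches.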
