Commit 067ebcd
ARM: branchless binary search for smallPrefixes_
Summary:
Extract a branchless upper_bound into its own function and use it on ARM for
the smallPrefixes_ lookup. Its conditional index updates compile to CSEL
instructions, avoiding the branch mispredictions that hurt
sorted_vector_map::upper_bound(), which goes through iterator/comparator wrappers.
On x86, the hardware branch predictor handles upper_bound's branches well,
so this optimization is gated with `#ifdef __aarch64__`.
Common code (fullPrefixes_ upper_bound and previousPrefix_ walk) is shared
between both paths.
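As an illustration only, the idea described above can be sketched as follows. This is not the code from the commit: `branchlessUpperBound` and `upperBoundIndex` are hypothetical names, the container is a plain `std::vector` rather than the map's real storage, and whether the conditional updates actually become CSEL depends on the compiler and optimization flags.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: returns the index of the first element strictly
// greater than `key` in the sorted range [data, data + n), or n if none.
template <class T, class Key, class Less>
std::size_t branchlessUpperBound(
    const T* data, std::size_t n, const Key& key, Less less) {
  std::size_t base = 0;
  std::size_t len = n;
  while (len > 0) {
    const std::size_t half = len / 2;
    // True when data[base + half] <= key, so the answer lies to the right.
    const bool goRight = !less(key, data[base + half]);
    // Both updates are plain conditional selects with no early exit; a
    // compiler may lower them to CSEL on AArch64 (assumption, not guaranteed).
    base = goRight ? base + half + 1 : base;
    len = goRight ? len - half - 1 : half;
  }
  return base;
}

// Hypothetical dispatch: branchless path on AArch64, std::upper_bound elsewhere.
template <class T>
std::size_t upperBoundIndex(const std::vector<T>& v, const T& key) {
#ifdef __aarch64__
  return branchlessUpperBound(v.data(), v.size(), key, std::less<T>{});
#else
  return static_cast<std::size_t>(
      std::upper_bound(v.begin(), v.end(), key) - v.begin());
#endif
}
```

Both paths return the same index for sorted input, so any code that consumes the result can stay shared between them; only the search itself differs per architecture.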
Benchmark command:
```
buck run fbcode/mode/opt fbcode//mcrouter/lib/fbi/cpp/test/facebook:string_prefix_map_benchmark -- --bm_regex _Lb --bm_mode=adaptive --bm_min_secs=5 --bm_max_secs=30
```
# ARM
Before:
```
============================================================================
[...]facebook/StringPrefixMapBenchmark.cpp     relative  time/iter   iters/s
============================================================================
PrefixMapInCache_Lb                                       197.37ns     5.07M
PrefixMapNotInCache_Lb                                      1.83us   547.10K
```
After:
```
============================================================================
[...]facebook/StringPrefixMapBenchmark.cpp     relative  time/iter   iters/s
============================================================================
PrefixMapInCache_Lb                                        92.27ns    10.84M
PrefixMapNotInCache_Lb                                    449.13ns     2.23M
```
- PrefixMapInCache_Lb: 197.37ns → 92.27ns (53% less time/iter)
- PrefixMapNotInCache_Lb: 1.83us → 449.13ns (75% less time/iter)
Reviewed By: DenisYaroshevskiy
Differential Revision: D97136185
fbshipit-source-id: 2c06d420378264b595a03efab56358857bc30fcc
1 parent 8ae299b, commit 067ebcd
1 file changed
Lines changed: 34 additions & 1 deletion