optimize lz matchfinding loop (#826) by Victor-C-Zhang · Pull Request #826 · facebook/openzl

Victor-C-Zhang · 2026-06-18T19:43:17Z

Summary:

This is v2 of D108382572. During testing, an optimized scalar loop (this diff) proved to be as fast as, and in some cases faster than, a vectorized implementation on both x86 and Arm. This diff lands the scalar loop and leaves a marker for future contributors interested in optimizing the LZ kernels.

To future contributors: the vectorized code attached in D108382572 is neutral to slightly worse on Skylake, Bergamo, and Turin, and significantly worse on Grace. This does not hold uniformly across the test corpus: individual files sometimes vary by +/-5% in total compression speed, so the result likely depends on the data being compressed. In the general case, though, it is worse.

Reviewed By: terrelln

Differential Revision: D109051308
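
For readers unfamiliar with the kind of loop discussed in the summary, here is a minimal sketch of a scalar match-length kernel that compares one 64-bit word per iteration. It is not the code from this diff or from D108382572, neither of which is shown on this page. The name `match_length_scalar` and its signature are hypothetical, and the bit-scan builtins assume a GCC- or Clang-compatible compiler.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the OpenZL source: returns how many
 * leading bytes of `a` and `b` are equal, reading at most `limit` bytes
 * from each buffer. */
static size_t match_length_scalar(const uint8_t* a, const uint8_t* b, size_t limit)
{
    size_t len = 0;

    /* Compare one 64-bit word per iteration. memcpy keeps the unaligned
     * loads well-defined; compilers typically lower it to a plain load. */
    while (len + sizeof(uint64_t) <= limit) {
        uint64_t wa, wb;
        memcpy(&wa, a + len, sizeof wa);
        memcpy(&wb, b + len, sizeof wb);
        uint64_t diff = wa ^ wb;
        if (diff != 0) {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
            /* Lowest-addressed byte is least significant: count trailing zeros. */
            return len + (size_t)(__builtin_ctzll(diff) >> 3);
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
            /* Lowest-addressed byte is most significant: count leading zeros. */
            return len + (size_t)(__builtin_clzll(diff) >> 3);
#else
            break; /* Unknown byte order: let the byte loop find the mismatch. */
#endif
        }
        len += sizeof(uint64_t);
    }

    /* Byte-wise tail for the last few bytes (or the unknown-endian fallback). */
    while (len < limit && a[len] == b[len]) {
        len++;
    }
    return len;
}
```

A loop of this shape has one load pair, one XOR, and one well-predicted branch per eight bytes, which is one plausible reason a scalar version can keep pace with a vectorized one when most matches are short. Whether that holds on a given machine is an empirical question, as the summary notes.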

meta-codesync · 2026-06-18T19:43:26Z

@Victor-C-Zhang has exported this pull request. If you are a Meta employee, you can view the originating Diff in D109051308.


meta-codesync · 2026-06-18T22:58:18Z

This pull request has been merged in ea4e4e3.

meta-cla Bot added the cla signed label Jun 18, 2026

meta-codesync Bot added the meta-exported label Jun 18, 2026

meta-codesync Bot changed the title ~~optimize lz matchfinding loop~~ optimize lz matchfinding loop (#826) Jun 18, 2026

Victor-C-Zhang force-pushed the export-D109051308 branch from b6a8d8a to b841fa1 Compare June 18, 2026 20:30

meta-codesync Bot closed this in ea4e4e3 Jun 18, 2026

meta-codesync Bot added the Merged label Jun 18, 2026

Victor-C-Zhang wants to merge 1 commit into facebook:dev from Victor-C-Zhang:export-D109051308
