Skip to content

Disable parallel compression for fast built-in compressors#14841

Closed
pdillinger wants to merge 1 commit into
facebook:mainfrom
pdillinger:fast_compressors_no_parallel
Closed

Disable parallel compression for fast built-in compressors#14841
pdillinger wants to merge 1 commit into
facebook:mainfrom
pdillinger:fast_compressors_no_parallel

Conversation

@pdillinger

Copy link
Copy Markdown
Contributor

Summary: Use Compressor::GetRecommendedParallelThreads() overrides to disable parallel compression for Snappy, LZ4 (accelerated, not LZ4HC), and for accelerated levels of ZSTD (level < 0). These do not generally benefit from parallel compression.

Also add --verify_compression option to sst_dump for some basic "recompress" benchmarking with that option.

Test Plan: unit test updated.

In planning the scope of this change, I manually tested some production SST files with release build sst_dump --command=recompress and various settings for --compression_parallel_threads, --compression_types, and --compression_level, while I had the CPUs mostly busy using cache_bench in the background. Here's an example, before the change to override parallel_threads:

$ for PT in 1 8; do /usr/bin/time ./sst_dump --command=recompress --compression_parallel_threads=$PT --block_size=16384 --compression_types=kLZ4Compression --verify_compression=1 test.sst; done
...
Compression: kLZ4Compression          Block Size: 16384  Threads: 1
Cx level: 32767 Cx size:  168501634 Uncx size:  791721664 Ratio:   4.698599 Write usec:    2345894  Read usec:     429022  Cx count:  48661 (100.0%) Not cx for ratio:      0 (  0.0%) Not cx otherwise:      0 (  0.0%)
2.54user 0.29system 0:02.84elapsed 99%CPU (0avgtext+0avgdata 460816maxresident)k
...
Compression: kLZ4Compression          Block Size: 16384  Threads: 8
Cx level: 32767 Cx size:  168501634 Uncx size:  791721664 Ratio:   4.698599 Write usec:    2476459  Read usec:     439464  Cx count:  48661 (100.0%) Not cx for ratio:      0 (  0.0%) Not cx otherwise:      0 (  0.0%)
3.95user 0.33system 0:02.98elapsed 143%CPU (0avgtext+0avgdata 455504maxresident)k

Here as in many of the cases I'm changing, it actually takes longer to compress with parallel, despite the added parallel opportunity of verify_compression. And overall CPU is much higher from 2.54 CPUs to 3.95 CPUs.

The difference disappears with the change, because both use single-threaded SST construction.

Summary: Use Compressor::GetRecommendedParallelThreads() overrides to
disable parallel compression for Snappy, LZ4 (accelerated, not LZ4HC),
and for accelerated levels of ZSTD (level < 0). These do not generally
benefit from parallel compression.

Also add --verify_compression option to sst_dump for some basic
"recompress" benchmarking with that option.

Test Plan: unit test updated.

In planning the scope of this change, I manually tested some production
SST files with release build sst_dump --command=recompress and various settings for
--compression_parallel_threads, --compression_types, and
--compression_level, while I had the CPUs mostly busy using cache_bench
in the background. Here's an example, before the change to override
parallel_threads:

```
$ for PT in 1 8; do /usr/bin/time ./sst_dump --command=recompress --compression_parallel_threads=$PT --block_size=16384 --compression_types=kLZ4Compression --verify_compression=1 test.sst; done
...
Compression: kLZ4Compression          Block Size: 16384  Threads: 1
Cx level: 32767 Cx size:  168501634 Uncx size:  791721664 Ratio:   4.698599 Write usec:    2345894  Read usec:     429022  Cx count:  48661 (100.0%) Not cx for ratio:      0 (  0.0%) Not cx otherwise:      0 (  0.0%)
2.54user 0.29system 0:02.84elapsed 99%CPU (0avgtext+0avgdata 460816maxresident)k
...
Compression: kLZ4Compression          Block Size: 16384  Threads: 8
Cx level: 32767 Cx size:  168501634 Uncx size:  791721664 Ratio:   4.698599 Write usec:    2476459  Read usec:     439464  Cx count:  48661 (100.0%) Not cx for ratio:      0 (  0.0%) Not cx otherwise:      0 (  0.0%)
3.95user 0.33system 0:02.98elapsed 143%CPU (0avgtext+0avgdata 455504maxresident)k
```

Here as in many of the cases I'm changing, it actually takes longer to compress
with parallel, despite the added parallel opportunity of verify_compression.
And overall CPU is much higher from 2.54 CPU*s to 3.95 CPU*s.

The difference disappears with the change, because both use
single-threaded SST construction.
@pdillinger pdillinger requested a review from joshkang97 June 9, 2026 22:27
@meta-cla meta-cla Bot added the CLA Signed label Jun 9, 2026
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

⚠️ clang-tidy: 1 warning(s) on changed lines

Completed in 171.4s.

Summary by check

Check Count
modernize-make-shared 1
Total 1

Details

tools/sst_dump_test.cc (1 warning(s))
tools/sst_dump_test.cc:287:22: warning: use std::make_shared instead [modernize-make-shared]
@meta-codesync

meta-codesync Bot commented Jun 9, 2026

Copy link
Copy Markdown

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D108075938.

@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit b1b0162


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit b1b0162


Summary

Clean, well-scoped change that correctly leverages the existing GetRecommendedParallelThreads() infrastructure to disable parallel compression for fast algorithms. The implementation is correct, thread-safe, and follows existing patterns. No high-severity findings.

High-severity findings (0):
No high-severity findings.

Full review (click to expand)

Findings

🔴 HIGH

None.

🟡 MEDIUM

M1. Test gap: ZSTD edge cases not covered -- util/compression_test.cc
  • Issue: The updated test verifies ZSTD with level=3 (non-accelerated, keeps parallel) and level=-1 (accelerated, overrides to 1), but does not test level=0 or level=kDefaultCompressionLevel (32767). Both should NOT trigger the override (0 is not < 0, and 32767 is not < 0), but explicit test coverage would prevent regressions.
  • Root cause: Test only covers one positive and one negative case for ZSTD level threshold.
  • Suggested fix: Add test cases for level=0 and level=kDefaultCompressionLevel:
    for (int level : {0, CompressionOptions::kDefaultCompressionLevel}) {
      opts.level = level;
      opts.parallel_threads = 8;
      auto compressor = mgr->GetCompressor(opts, kZSTD);
      if (compressor) {
        ASSERT_EQ(compressor->GetRecommendedParallelThreads(), 8U);
      }
    }
M2. Test gap: LZ4HC non-override not explicitly verified -- util/compression_test.cc
  • Issue: The test loop includes kLZ4HCCompression in the iteration and expects it to return 8 (correct), but there's no explicit comment or assertion calling out that LZ4HC intentionally does NOT get the override. A future maintainer might accidentally add an override to LZ4HC's separate class (BuiltinLZ4HCCompressorV2 at compression.cc:918) without realizing it's intentionally excluded.
  • Root cause: LZ4HC's non-override behavior is implicitly tested but not explicitly documented in the test.
  • Suggested fix: Add a comment in the test or a separate explicit assertion for LZ4HC:
    // LZ4HC is intentionally NOT overridden (it's not a "fast" compressor)

🟢 LOW / NIT

L1. Silent user configuration override -- behavior change documentation
  • Issue: Users who explicitly set parallel_threads > 1 with Snappy/LZ4 will have their setting silently ignored. The release note and db_bench/sst_dump help text mention this, which is appropriate. However, no runtime log or warning is emitted.
  • Suggested fix: This is a design choice, not a bug. The release note adequately documents it. Consider adding an ROCKS_LOG_INFO when the override takes effect, but this is optional and may add noise.
L2. NIT: Comment wording for LZ4 -- util/compression.cc:764
  • Issue: The comment says "LZ4 (accelerated, not LZ4HC)" which could be misread as "only accelerated LZ4" when it means "LZ4 the fast algorithm, as opposed to LZ4HC". The code returns 1 unconditionally for all LZ4 (correct behavior since all LZ4 is fast).
  • Suggested fix: Consider rewording to "LZ4 (the fast variant, not LZ4HC)" for clarity. Minor nit.
L3. sst_dump test only verifies non-crash -- tools/sst_dump_test.cc
  • Issue: The new VerifyCompression test only checks that tool.Run() succeeds with --verify_compression=1 and --verify_compression=0. It doesn't verify that verification actually happened (e.g., by checking output or statistics).
  • Root cause: This is a basic smoke test, consistent with the existing CompressedSizes test pattern.
  • Suggested fix: Acceptable as-is for a CLI tool test. Deeper verification would require output parsing.

Cross-Component Analysis

Context Does code execute? Assumptions hold? Action needed?
WritePreparedTxnDB Yes (SST building) Yes (compression independent of txn visibility) None
ReadOnly DB No (no SST building) N/A None
CompactionService Yes (uses same table builder) Yes (override happens at compressor level, not serialized) None
User-defined timestamps Yes Yes (no interaction with compression threading) None
CompressorWrapper Yes (forwards to wrapped) Yes (custom wrappers can override) None
FIFO/Universal compaction Yes Yes (same table builder path) None

CompressorWrapper behavior: CompressorWrapper::GetRecommendedParallelThreads() (advanced_compression.h:545) forwards to the wrapped compressor. A wrapper around a fast compressor inherits return 1. Custom wrappers can override by implementing their own GetRecommendedParallelThreads(). Custom CompressionManager can also create entirely custom compressors. The release note claim that "this behavior can be overridden with a custom Compressor from a custom CompressionManager" is accurate.

ZSTD kDefaultCompressionLevel (32767): The override checks opts_.level < 0. kDefaultCompressionLevel = 32767 is not < 0, so it correctly falls through to the base class. The mapping to ZSTD_CLEVEL_DEFAULT(3) only happens in the compression code (compression.cc:1104-1107), not in the parallel threads logic. This is correct behavior.

Positive Observations

  • Leverages existing GetRecommendedParallelThreads() infrastructure exactly as designed
  • Clean separation: Snappy/LZ4 unconditionally return 1, ZSTD conditionally checks level
  • LZ4HC correctly excluded via separate class hierarchy (BuiltinLZ4HCCompressorV2)
  • BuiltinLZ4CompressorV2NoDict correctly inherits override from parent WithDict class
  • Hard structural constraints in table builder (block_based_table_builder.cc:1223-1227) still override compressor recommendations when needed
  • sst_dump --verify_compression follows existing patterns for CLI option parsing
  • Empirical benchmark data provided in PR description validates the approach

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

@joshkang97 joshkang97 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review automatically exported from Phabricator review in Meta.

@meta-codesync meta-codesync Bot closed this in 01084e7 Jun 10, 2026
@meta-codesync meta-codesync Bot added the Merged label Jun 10, 2026
@meta-codesync

meta-codesync Bot commented Jun 10, 2026

Copy link
Copy Markdown

@pdillinger merged this pull request in 01084e7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

2 participants