Change default compression from Snappy to LZ4 #14818
Summary:
The historical default block compression `kSnappyCompression` dates to
when Snappy was the obvious fast/cheap choice. On modern server CPUs LZ4
matches or beats Snappy on compression ratio while decompressing at
substantially lower CPU cost, so it is a better default. This changes the
default `ColumnFamilyOptions::compression` to `kLZ4Compression`, with a
runtime fallback of LZ4 -> Snappy -> none depending on what is compiled
into the binary (new `GetDefaultCompressionType()` in util/compression.h).
Only column families that do not explicitly set `compression` are
affected (including compaction output when
`CompactionOptions::compression == kDisableCompressionOption`), and only
newly written SST files; existing data is read as before. Doc comments,
the Java bindings, and the sorted_run_builder example are updated to
describe the new default.
Test Plan:
Adjusted two unit tests that implicitly depended on the old Snappy
default to pin `compression = kSnappyCompression`: db_iterator_test's
ReadAhead (readahead byte thresholds assume Snappy-sized files) and
compaction_service_test's CustomFileChecksum (LSM shape after
auto-compaction determined whether a manual CompactRange had work to do).
This change is primarily validated by performance testing. Added a
`compressreject` db_bench benchmark (output buffer sized just below the
predicted compressed size) to measure the cost of attempting then
declining compression, alongside `compress`/`uncompress`. Both db_bench
and sst_dump compress/decompress a modest block at a time, as in a real
workload.
sst_dump on SST files from production workloads (4 files, 16KB blocks,
single thread) on recent server-class AMD, Intel, and ARM CPUs,
LZ4 vs Snappy:
- Compression ratio: comparable; LZ4 is slightly smaller on the more
compressible files (up to ~8%) and within ~2% on the rest.
- Compression (write) CPU: a wash, within ~2% either direction.
- Decompression (read) CPU: the clear win -- Snappy costs ~1.2x-1.5x as
much as LZ4, i.e. LZ4 saves ~25-30% read CPU, consistently across
AMD, Intel, and ARM.
db_bench synthetic workload (100-byte values), at 1 / 12 / 160 threads,
LZ4 vs Snappy:
- Compression throughput: LZ4 ~10-30% higher.
- Decompression throughput: LZ4 much higher, and the advantage grows
with core count -- from ~+12% at 12 threads to ~+45-50% at 160
threads, i.e. better multi-core scaling.
- Rejection (insufficient ratio) path: comparable; LZ4 ~15-18% faster
on compressible-but-rejected blocks and Snappy within ~10% on
barely-compressible blocks. No meaningful regression, confirming
incompressible data is still efficiently detected and stored raw.
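
The rejection path that `compressreject` measures can be pictured with a small sketch: the caller caps the acceptable output size and stores the block raw when the compressor cannot fit under that cap. This is only an illustration under stated assumptions. The names `MaxAcceptableCompressedSize` and `ChooseEncoding` are hypothetical, and the 7/8 cap is an assumed threshold chosen for the example, not necessarily the one RocksDB or the benchmark uses.

```cpp
#include <cstddef>

// Hypothetical outcome of trying to compress one block.
enum class BlockEncoding { kStoredRaw, kStoredCompressed };

// Assumed cap for illustration: accept compressed output only if it is
// at most 7/8 of the raw size (i.e. it saves at least 1/8).
constexpr std::size_t MaxAcceptableCompressedSize(std::size_t raw_size) {
  return raw_size - raw_size / 8;
}

// Decide how to store a block given the size the compressor produced.
// Output above the cap is "rejected" and the block is stored raw.
constexpr BlockEncoding ChooseEncoding(std::size_t raw_size,
                                       std::size_t compressed_size) {
  return compressed_size <= MaxAcceptableCompressedSize(raw_size)
             ? BlockEncoding::kStoredCompressed
             : BlockEncoding::kStoredRaw;
}
```

With a 16KB block, this cap is 14336 bytes, so an output of 14337 bytes would be declined and the block written uncompressed.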
✅ clang-tidy: No findings on changed lines. Completed in 432.2s.
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D107536490.
🟡 Codex Code Review
Auto-triggered after CI passed; reviewing commit 126d671.
❌ Codex review failed before producing findings.
Generated by Codex CLI.
✅ Claude Code Review
Auto-triggered after CI passed; reviewing commit 126d671.
Summary
Well-motivated change to switch the default compression from Snappy to LZ4. The core implementation is correct:
High-severity findings (0):
Findings
🔴 HIGH
None.
🟡 MEDIUM
M1.
| Context | Does code execute? | Assumptions hold? | Action needed? |
|---|---|---|---|
| WritePreparedTxnDB | YES (inherits CF options) | YES (compression is orthogonal to txn visibility) | Safe |
| ReadOnly DB | NO (read-only, doesn't write SSTs) | N/A | Safe |
| CompactionService | YES (uses OPTIONS file) | YES (compression type in OPTIONS file) | Safe, but remote worker needs LZ4 |
| User-defined timestamps | YES | YES (compression independent of UDT) | Safe |
| MemPurge | NO (memtable-to-memtable) | N/A | Safe |
| BlobDB | NO (blob_compression_type is separate) | N/A | Safe (separate config) |
| Old snapshots | N/A (compression is per-write, not per-read) | N/A | Safe |
| FIFO/Universal compaction | YES (inherits CF options) | YES | Safe |
Fallback chain verification:
- LZ4 compiled in, Snappy compiled in → `kLZ4Compression` ✅
- LZ4 not compiled in, Snappy compiled in → `kSnappyCompression` ✅ (no regression)
- LZ4 compiled in, Snappy not compiled in → `kLZ4Compression` ✅
- Neither compiled in → `kNoCompression` ✅ (same as old behavior when Snappy unavailable)
Key debate resolution: Multiple review agents initially flagged the fallback chain as "critical", claiming it drops directly to `kNoCompression` when LZ4 is unavailable. This was incorrect: the actual `GetDefaultCompressionType()` implementation properly includes Snappy as an intermediate fallback: `LZ4_Supported() ? kLZ4Compression : (Snappy_Supported() ? kSnappyCompression : kNoCompression)`. On systems without LZ4 but with Snappy, behavior is identical to the old default.
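
The fallback expression quoted above can be checked in isolation with a minimal sketch. This is not the code in util/compression.h: the `CompressionType` enum below is a local stand-in with only three values, and the two `bool` parameters of the hypothetical `PickDefaultCompression` substitute for `LZ4_Supported()` and `Snappy_Supported()`.

```cpp
// Local stand-in for RocksDB's CompressionType; only the three values
// relevant to the fallback chain are modeled.
enum class CompressionType {
  kNoCompression,
  kSnappyCompression,
  kLZ4Compression
};

// Hypothetical helper mirroring the quoted expression. The parameters
// stand in for LZ4_Supported() and Snappy_Supported().
constexpr CompressionType PickDefaultCompression(bool lz4_supported,
                                                 bool snappy_supported) {
  return lz4_supported
             ? CompressionType::kLZ4Compression
             : (snappy_supported ? CompressionType::kSnappyCompression
                                 : CompressionType::kNoCompression);
}

// The four build configurations from the verification list, checked at
// compile time.
static_assert(PickDefaultCompression(true, true) ==
                  CompressionType::kLZ4Compression, "");
static_assert(PickDefaultCompression(false, true) ==
                  CompressionType::kSnappyCompression, "");
static_assert(PickDefaultCompression(true, false) ==
                  CompressionType::kLZ4Compression, "");
static_assert(PickDefaultCompression(false, false) ==
                  CompressionType::kNoCompression, "");
```

Because Snappy is consulted only when LZ4 is absent, a binary built with Snappy alone still resolves to `kSnappyCompression`, which is the "no regression" case.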
SST format safety: Compression type is stored per-block in the SST block trailer (1 byte at block_data[block_size]). Decompression reads this stored type, never the default. MANIFEST does not store compression type. Existing data is fully readable regardless of the default change.
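
To illustrate why a reader never consults the default, here is a minimal sketch of recovering the compression type from the byte that follows a block's payload. It rests on assumptions that should be verified against the RocksDB source: the trailer is taken to be a 1-byte type followed by a 4-byte checksum, the numeric type codes are assumed values, and the helpers `ReadBlockCompressionType` and `MakeFakeBlock` are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed on-disk type codes (stand-ins for RocksDB's CompressionType).
constexpr std::uint8_t kNoCompressionCode = 0x0;
constexpr std::uint8_t kSnappyCode = 0x1;
constexpr std::uint8_t kLZ4Code = 0x4;

// Assumed trailer layout: 1-byte compression type + 4-byte checksum.
constexpr std::size_t kTrailerSize = 5;

// Hypothetical helper: `block` holds `block_size` payload bytes followed
// by the trailer. The type is whatever the writer stored, independent of
// the current default.
inline std::uint8_t ReadBlockCompressionType(
    const std::vector<std::uint8_t>& block, std::size_t block_size) {
  assert(block.size() >= block_size + kTrailerSize);
  return block[block_size];
}

// Builds a fake block for demonstration: filler payload, the type byte,
// then a zeroed 4-byte checksum placeholder.
inline std::vector<std::uint8_t> MakeFakeBlock(std::size_t block_size,
                                               std::uint8_t type_code) {
  std::vector<std::uint8_t> block(block_size, std::uint8_t{0xAB});
  block.push_back(type_code);
  block.insert(block.end(), std::size_t{4}, std::uint8_t{0});
  return block;
}
```

A block written under the old default would carry the Snappy code in this byte and be decoded accordingly, whatever the binary's current default is.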
Positive Observations
- Centralizing the default via `GetDefaultCompressionType()` is good design; it eliminates the scattered `Snappy_Supported() ? ...` pattern.
- The fallback chain is well-designed: LZ4 → Snappy → None preserves backward compatibility.
- Test pinning strategy (using `kNoCompression` with clear comments) is the right approach for test stability.
- The `CompressReject` benchmark fills a genuine gap in db_bench's compression measurement capabilities.
- The `UNLIKELY` macros in error paths of `CompressReject` follow existing conventions.
- The release note in `unreleased_history/behavior_changes/` is clear and covers the key points.
- LZ4 has been supported in RocksDB since ~2015, so forward/backward compatibility is excellent.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
@pdillinger merged this pull request in 8053b94.