[rocksdb] Pass file metadata to IngestExternalFile to improve ingestion latency#14835
[rocksdb] Pass file metadata to IngestExternalFile to improve ingestion latency#14835joshkang97 wants to merge 1 commit into
Conversation
✅ clang-tidy: No findings on changed linesCompleted in 373.9s. |
🟡 Codex Code ReviewAuto-triggered after CI reached the early-review threshold — reviewing commit 4c60623 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI reached the early-review threshold — reviewing commit 4c60623 SummaryWell-designed optimization that adds a metadata fast-path to external SST file ingestion, achieving ~34% latency reduction. The architecture is sound: opaque High-severity findings (1):
Medium findings (4): Low/nit findings (5): TableProperties copy by value is acceptable, raw pointer lifetime is documented, ABI change is expected, stress test coverage is good, db_bench options usage is correct. The full review with detailed analysis, cross-component tables, and assumption stress-test results is in ℹ️ About this responseGenerated by Claude Code. Limitations:
Commands:
|
Stack from ghstack (oldest at bottom):
External file ingestion (
DB::IngestExternalFile[s]) re-opens every SST file and scans it -- footer, properties, index, filter, and the first/last data blocks -- to recompute the boundary keys, sequence-number bounds, and table properties before committing. For cold, I/O-bound files this scan dominates ingest latency, even when the file is moved/linked rather than copied.This change lets a caller skip that work when it already has the file's metadata.
SstFileWriter::Finish()now populates the fullExternalSstFileInfo(itssmallest_internal_key,largest_internal_key, andtable_properties), and a newIngestExternalFileArg::file_infosfield carries oneExternalSstFileInfoper file intoIngestExternalFiles(). When set, ingestion reuses that metadata instead of re-opening and scanning each file. The file is still copied/linked, and the checksum is still verified whenverify_checksums_before_ingestis set (the fast path opens the file only for that). Range tombstones and user-defined-timestamp files (including the "UDT in Memtables only" format, whose boundary keys carry no timestamp) are handled.Internally, ingestion-job metadata acquisition was split into
GetIngestedFileInfoFromFileInfo(reuse caller metadata) andGetIngestedFileInfoFromFile(open + scan).Benchmark results
db_benchingestexternalfilebenchmark, release build, on SSD (btrfs). Files are linked (move_files) so the measurement isolates the ingest path rather than file-copy throughput.--ingest_external_file_use_file_info=true~34% lower median ingest latency by skipping the per-file open + scan (a warm/page-cache number; the gain is larger for cold files).
Command:
Differential Revision: D107721261
NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on Phabricator!