Skip to content

[rocksdb] Pass file metadata to IngestExternalFile to improve ingestion latency#14835

Closed
joshkang97 wants to merge 1 commit into
gh/joshkang97/7/basefrom
gh/joshkang97/7/head
Closed

[rocksdb] Pass file metadata to IngestExternalFile to improve ingestion latency#14835
joshkang97 wants to merge 1 commit into
gh/joshkang97/7/basefrom
gh/joshkang97/7/head

Conversation

@joshkang97

@joshkang97 joshkang97 commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

Stack from ghstack (oldest at bottom):

External file ingestion (DB::IngestExternalFile[s]) re-opens every SST file and scans it -- footer, properties, index, filter, and the first/last data blocks -- to recompute the boundary keys, sequence-number bounds, and table properties before committing. For cold, I/O-bound files this scan dominates ingest latency, even when the file is moved/linked rather than copied.

This change lets a caller skip that work when it already has the file's metadata. SstFileWriter::Finish() now populates the full ExternalSstFileInfo (its smallest_internal_key, largest_internal_key, and table_properties), and a new IngestExternalFileArg::file_infos field carries one ExternalSstFileInfo per file into IngestExternalFiles(). When set, ingestion reuses that metadata instead of re-opening and scanning each file. The file is still copied/linked, and the checksum is still verified when verify_checksums_before_ingest is set (the fast path opens the file only for that). Range tombstones and user-defined-timestamp files (including the "UDT in Memtables only" format, whose boundary keys carry no timestamp) are handled.

Internally, ingestion-job metadata acquisition was split into GetIngestedFileInfoFromFileInfo (reuse caller metadata) and GetIngestedFileInfoFromFile (open + scan).

Benchmark results

db_bench ingestexternalfile benchmark, release build, on SSD (btrfs). Files are linked (move_files) so the measurement isolates the ingest path rather than file-copy throughput.

Config Ingest latency / call (P50)
Baseline (open + scan each file) 28,111 us
--ingest_external_file_use_file_info=true 18,444 us

~34% lower median ingest latency by skipping the per-file open + scan (a warm/page-cache number; the gain is larger for cold files).

Command:

make clean && DEBUG_LEVEL=0 make -j$(nproc) db_bench
B=/data/$USER/ingest_bench
for fi in false true; do
  rm -rf $B
  ./db_bench --benchmarks=ingestexternalfile --num=1000000 --value_size=100 \
    --compression_type=none --ingest_external_file_batch_size=10 \
    --ingest_external_file_num_batches=10 --disable_auto_compactions=1 \
    --statistics --db=$B --ingest_external_file_use_file_info=$fi 2>&1 \
    | grep rocksdb.ingest.external.file.micros
done

Differential Revision: D107721261

NOTE FOR REVIEWERS: This PR has internal Meta-specific changes or comments, please review them on Phabricator!

[ghstack-poisoned]
@github-actions

github-actions Bot commented Jun 8, 2026

Copy link
Copy Markdown

✅ clang-tidy: No findings on changed lines

Completed in 373.9s.

@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

🟡 Codex Code Review

Auto-triggered after CI reached the early-review threshold — reviewing commit 4c60623


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

✅ Claude Code Review

Auto-triggered after CI reached the early-review threshold — reviewing commit 4c60623


Summary

Well-designed optimization that adds a metadata fast-path to external SST file ingestion, achieving ~34% latency reduction. The architecture is sound: opaque PreparedFileInfo from SstFileWriter::Finish() is reused by IngestExternalFiles() to skip re-opening/scanning files. One high-severity bug found in the fast-path for range-deletion-only files with user-defined timestamps.

High-severity findings (1):

  • [external_sst_file_ingestion_job.cc: GetIngestedFileInfoFromFileInfo] UDT padding runs before range-tombstone bound extension; crashes on range-deletion-only files when UDT is enabled with persist=false. PadInternalKeyWithMinTimestamp asserts key.size() >= kNumInternalBytes (db/dbformat.cc:122), which fails on empty keys. Fix: move UDT padding after MaybeUpdateRange, and guard with an emptiness check.

Medium findings (4): global_seqno_offset validation relaxed for both paths (intentional but warrants a comment), no TOCTOU guard on PreparedFileInfo vs actual file, missing test for range-deletion-only file via fast path, missing test for UDT with fast path.

Low/nit findings (5): TableProperties copy by value is acceptable, raw pointer lifetime is documented, ABI change is expected, stress test coverage is good, db_bench options usage is correct.

The full review with detailed analysis, cross-component tables, and assumption stress-test results is in review-findings.md.


ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

1 participant