Skip to content

Skip MultiScan cache lookup when block cache is disabled#14785

Closed
xingbowang wants to merge 1 commit into
facebook:mainfrom
xingbowang:codex/multiscan-no-block-cache
Closed

Skip MultiScan cache lookup when block cache is disabled#14785
xingbowang wants to merge 1 commit into
facebook:mainfrom
xingbowang:codex/multiscan-no-block-cache

Conversation

@xingbowang

@xingbowang xingbowang commented May 25, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Return OK from BlockBasedTable::LookupAndPinBlocksInCache when a table has no block cache installed.
  • Teach IODispatcher to detect the missing block cache once and send all requested blocks directly to the read path instead of probing for impossible cache hits.
  • Add IO dispatcher coverage for no-block-cache tables, including verification that SubmitJob skips cache lookups.

Test Plan

CI

@meta-cla meta-cla Bot added the CLA Signed label May 25, 2026
@github-actions

github-actions Bot commented May 25, 2026

Copy link
Copy Markdown

✅ clang-tidy: No findings on changed lines

Completed in 266.9s.

Tables configured without a block cache cannot produce cache hits during MultiScan Prepare, so probing the cache for every candidate block only adds CPU overhead. Teach IODispatcher to detect the missing block cache once and send every requested block directly to the read path.

Keep LookupAndPinBlocksInCache safe for no-cache configurations by returning OK when no block cache is installed. Existing cache-enabled behavior is unchanged: the dispatcher still looks up each block, pins hits, and reads only misses, including when ReadOptions::fill_cache is false.

Tests: built io_dispatcher_test; ran timeout 60 ./io_dispatcher_test --gtest_filter=IODispatcherTest.LookupAndPinBlocksInCacheNoOpsWithoutBlockCache:IODispatcherTest.SubmitJobWithoutBlockCacheSkipsCacheLookup:IODispatcherTest.FillCacheFalseStillLooksUpExistingCacheEntries; ran git diff --check.
@xingbowang xingbowang force-pushed the codex/multiscan-no-block-cache branch from ed91273 to 5ab3614 Compare May 25, 2026 14:45
@xingbowang xingbowang marked this pull request as draft May 25, 2026 14:56
@github-actions

Copy link
Copy Markdown

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit 5ab3614


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase
@github-actions

Copy link
Copy Markdown

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 5ab3614


Summary

Clean, low-risk PR that fixes a real bug (assertion failure when IODispatcher is used with no_block_cache=true) and adds a sensible optimization to skip N pointless cache lookups. Follows established RocksDB patterns throughout.

High-severity findings (0):

No high-severity findings.

Full review (click to expand)

Findings

🔴 HIGH

None.

🟡 MEDIUM

M1. reserve() over-allocates in common case -- io_dispatcher_imp.cc:708
  • Issue: block_indices_to_read.reserve(job->block_handles.size()) is called unconditionally. In the with-cache path, most blocks may be cache hits, so only a small fraction of indices are added. This over-allocates memory for the common high-cache-hit-rate case.
  • Root cause: The reserve() was added as a blanket optimization but is only truly beneficial for the no-cache path where all indices are added.
  • Suggested fix: Move reserve() inside the if (!has_block_cache) branch only, or use a conditional strategy:
    if (!has_block_cache) {
      block_indices_to_read.reserve(job->block_handles.size());
      // ... populate all indices
    } else {
      // no reserve -- let vector grow dynamically for the (likely small) miss set
    }
M2. Consider LIKELY hint for has_block_cache -- io_dispatcher_imp.cc:711
  • Issue: Most production deployments use a block cache, making has_block_cache == true the overwhelmingly common case. Adding LIKELY(has_block_cache) would help branch prediction on the hot path.
  • Suggested fix: if (LIKELY(has_block_cache)) { ... } else { ... } (swap branch order).

🟢 LOW / NIT

L1. Pre-existing: silent block drop on LookupAndPinBlocksInCache error -- io_dispatcher_imp.cc:717-718
  • Issue: In the original (with-cache) code, if (!s.ok()) { continue; } skips blocks that fail cache lookup -- they are neither cached nor added to the disk-read list, resulting in a silent drop. This is a pre-existing issue, not introduced by this PR, but worth noting.
  • Suggested fix: Consider adding the block index to block_indices_to_read on error so it can be read from disk as a fallback.
L2. Simpler no-cache index fill -- io_dispatcher_imp.cc:712-714
  • Issue: The no-cache loop for (i=0; i<size; ++i) emplace_back(i) is a sequential iota. Could use std::iota or resize+iota for slightly cleaner code.
  • Suggested fix:
    block_indices_to_read.resize(job->block_handles.size());
    std::iota(block_indices_to_read.begin(), block_indices_to_read.end(), size_t{0});
L3. Header documentation for null-cache behavior -- block_based_table_reader.h
  • Issue: LookupAndPinBlocksInCache now has a new contract: returns Status::OK() with null out_parsed_block when no block cache is configured. This should be documented in the header declaration.
  • Suggested fix: Add a brief comment: // Returns OK with empty out_parsed_block when no block cache is configured.

Cross-Component Analysis

Context Executes? Assumptions hold? Action needed?
WritePreparedTxnDB YES (via iterator) YES -- cache config is orthogonal to txn visibility Safe
ReadOnly DB YES YES -- common no-cache scenario Enabled by this PR
CompactionService YES YES -- remote workers often use no_block_cache Enabled by this PR
User-defined timestamps YES YES -- orthogonal Safe
FIFO / Universal compaction YES YES -- compaction style doesn't affect cache config Safe
Prefix seek YES YES -- orthogonal Safe
BlockCacheLookupForReadAheadSize NO when no cache N/A -- guarded by block_cache.get() check at iterator.cc:74 Safe

Assumption stress test results:

  • "Returning OK with null is safe for all callers" -- VERIFIED. Both callers (SubmitJob and BlockCacheLookupForReadAheadSize) correctly handle OK-with-null as cache miss. BlockCacheLookupForReadAheadSize is additionally never called when no cache.
  • "Checking block_cache once is sufficient" -- VERIFIED. rep_->table_options is immutable after table construction.
  • "Dictionary skip is safe" -- VERIFIED. Dictionary is only needed for block_cache.LookupFull() decompression context. No cache means no LookupFull. Normal RetrieveBlock handles its own decompression.

Positive Observations

  • Correct pattern usage: Follows the established if (!block_cache) guard pattern used throughout RocksDB (MaybeReadBlockAndLoadToCache, EraseFromCache, TEST_BlockInCache).
  • Good test design: Three well-structured tests covering: (1) direct API behavior, (2) SubmitJob integration with sync point verification that cache is NOT consulted, (3) regression guard ensuring fill_cache=false still does lookups when cache exists.
  • Sensible optimization: Eliminates N cache key generations + N dictionary lookups + N cache probes that would all return null anyway.
  • Enables real use cases: CompactionService remote workers, embedded systems, and streaming workloads frequently use no_block_cache=true.
  • Appropriate use of TEST_SYNC_POINT: Allows precise verification that the function is entered (or not) without introducing flakiness.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase
@xingbowang xingbowang closed this May 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

2 participants