Add reverse order support to MultiScan (#14876)#14876
Conversation
|
@jaykorean has exported this pull request. If you are a Meta employee, you can view the originating Diff in D109357730. |
✅ clang-tidy: No findings on changed linesCompleted in 548.0s. |
Summary: Add reverse scan support to `MultiScanArgs`/`MultiScan` so callers can iterate bounded scan ranges in descending order. Plumb reverse scan state through `DBIter`, `BlockBasedTableIterator`, and `MultiScanIndexIterator`, including reverse prefetch windowing and async I/O support. Update `db_bench` `multiscanrandom` to honor `--reverse_iterator` and report rows scanned so reverse MultiScan and iterator baselines can be compared on row throughput. Add `db_stress --multiscan_reverse` coverage so the prepared MultiScan path can be stressed with `SeekForPrev()`/`Prev()` against a control iterator. During review I fixed the table child iterator path to accept reverse `SeekForPrev()` targets above a file's prepared ranges. `MergingIterator` and `LevelIterator` can legally seek every child to the current range limit; a child file may only have lower prepared ranges, so treating that as invalid caused intermittent `SeekForPrev target is outside prepared ranges` exceptions in the benchmark. Differential Revision: D109357730
0b40fae to
659a490
Compare
Summary: Add reverse scan support to `MultiScanArgs`/`MultiScan` so callers can iterate bounded scan ranges in descending order. Plumb reverse scan state through `DBIter`, `BlockBasedTableIterator`, and `MultiScanIndexIterator`, including reverse prefetch windowing and async I/O support. Update `db_bench` `multiscanrandom` to honor `--reverse_iterator` and report rows scanned so reverse MultiScan and iterator baselines can be compared on row throughput. Add `db_stress --multiscan_reverse` coverage so the prepared MultiScan path can be stressed with `SeekForPrev()`/`Prev()` against a control iterator. During review I fixed the table child iterator path to accept reverse `SeekForPrev()` targets above a file's prepared ranges. `MergingIterator` and `LevelIterator` can legally seek every child to the current range limit; a child file may only have lower prepared ranges, so treating that as invalid caused intermittent `SeekForPrev target is outside prepared ranges` exceptions in the benchmark. Differential Revision: D109357730
659a490 to
d253e4d
Compare
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit d253e4d ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit d253e4d SummaryThis PR adds reverse scan support to MultiScan, plumbing reverse state through DBIter, BlockBasedTableIterator, and MultiScanIndexIterator. The implementation is well-structured and correctly mirrors the forward path with appropriate reverse semantics. Good test coverage including unit, table-reader, stress, and benchmark integration. High-severity findings (0): No high-severity findings. Full review (click to expand)Findings🟡 MEDIUMM1. Reverse
|
| Context | Executes? | Assumptions hold? | Notes |
|---|---|---|---|
| WritePreparedTxnDB | Yes | Yes | read_callback_ handles visibility |
| User-defined timestamps | Yes | Yes | CompareWithoutTimestamp used throughout |
| ReadOnly DB | Yes | Yes | Read-only, no write assumptions |
| Prefix seek | No | N/A | MultiScan requires total_order_seek |
| Multi-level (LevelIterator) | Yes | Yes | SeekForPrev accepts above-range targets |
| Concurrent writers | Yes | Yes | Snapshot isolation via ReadOptions |
| FIFO/Universal compaction | Yes | Yes | No compaction-specific assumptions |
| BlobDB | Yes | Yes | DBIter handles blob dereferencing |
Positive Observations
- Clean architectural layering: DBIter drives range transitions, MultiScanIndexIterator handles per-file block navigation, BlockBasedTableIterator bridges the two.
- The
prefetch_start_idx_/prefetch_max_idx_windowing for reverse prefetch is well-designed, prefetching from the end of blocks. IsPrefetchedBlock()helper consolidates range checks and prevents off-by-one errors.- ReadSet index mapping
(block_idx - prefetch_start_idx_)correctly translates indices. - Comprehensive validation: reverse requires upper bounds, Seek blocked on reverse, SeekForPrev blocked on forward.
- Good defensive fix: accepting SeekForPrev targets above a file's prepared ranges handles MergingIterator/LevelIterator edge cases.
- Strong test coverage: unit test with 3 ranges across multiple files, negative test for missing bounds, table-reader-level test, stress test with control iterator, db_bench integration.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
Summary: Add reverse scan support to `MultiScanArgs`/`MultiScan` so callers can iterate bounded scan ranges in descending order. Plumb reverse scan state through `DBIter`, `BlockBasedTableIterator`, and `MultiScanIndexIterator`, including reverse prefetch windowing and async I/O support. Update `db_bench` `multiscanrandom` to honor `--reverse_iterator` and report rows scanned so reverse MultiScan and iterator baselines can be compared on row throughput. Add `db_stress --multiscan_reverse` coverage so the prepared MultiScan path can be stressed with `SeekForPrev()`/`Prev()` against a control iterator. During review I fixed the table child iterator path to accept reverse `SeekForPrev()` targets above a file's prepared ranges. `MergingIterator` and `LevelIterator` can legally seek every child to the current range limit; a child file may only have lower prepared ranges, so treating that as invalid caused intermittent `SeekForPrev target is outside prepared ranges` exceptions in the benchmark. Differential Revision: D109357730
d253e4d to
828c9e8
Compare
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit 828c9e8 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit 828c9e8 SummaryThis PR adds reverse scan support to MultiScan, plumbing the High-severity findings (1):
Full review (click to expand)Findings🔴 HIGHH1.
|
| Context | Does code execute? | Assumptions hold? | Action needed? |
|---|---|---|---|
| Multi-file scans (LevelIterator) | YES | NO -- CopyConfigFrom missing reverse |
Fix H1 |
| WritePreparedTxnDB | YES | YES -- visibility handled by read_callback | None |
| ReadOnly DB | YES | YES -- no writes needed | None |
| User-defined timestamps | YES | YES -- CompareWithoutTimestamp used consistently |
None |
| Prefix seek | Possible | Untested -- SeekForPrev + prefix extractor | Worth testing |
| FIFO/Universal compaction | YES | YES -- scan is read-only | None |
| Async IO reverse | YES | Partially -- no async reverse test | Add test |
Positive Observations
- Clean separation of forward/reverse logic at each layer
- Good use of
valid_flag instead of sentinel index values for termination - Proper validation in
DBIter::SeekForPrevthat blocks cross-direction calls ValidateScanOptionscorrectly requires upper bounds for reverse- Stress test includes divergence detection with detailed error output
- db_bench reports
rows_scannedfor meaningful performance comparison - The fix for SeekForPrev targets above a file's prepared ranges (gap handling) is correctly implemented in
MultiScanIndexIterator::SeekForPrev - Block double-release safety verified:
ReadSet::ReleaseBlock,CachableEntry::Reset, andReleaseAsyncIOForBlockare all idempotent - Copy/move constructors and assignment operators all properly copy the
reversefield
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
Summary: Add reverse scan support to `MultiScanArgs`/`MultiScan` so callers can iterate bounded scan ranges in descending order. Plumb reverse scan state through `DBIter`, `BlockBasedTableIterator`, and `MultiScanIndexIterator`, including reverse prefetch windowing and async I/O support. Update `db_bench` `multiscanrandom` to honor `--reverse_iterator` and report rows scanned so reverse MultiScan and iterator baselines can be compared on row throughput. Add `db_stress --multiscan_reverse` coverage so the prepared MultiScan path can be stressed with `SeekForPrev()`/`Prev()` against a control iterator. Also fix the table child iterator path to accept reverse `SeekForPrev()` targets above a file's prepared ranges. `MergingIterator` and `LevelIterator` can legally seek every child to the current range limit; a child file may only have lower prepared ranges, so treating that as invalid caused intermittent `SeekForPrev target is outside prepared ranges` exceptions in the benchmark. Differential Revision: D109357730
828c9e8 to
41a36f0
Compare
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit 41a36f0 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit 41a36f0 SummarySolid implementation of reverse MultiScan plumbing through the iterator stack. The core design (bool reverse, SeekForPrev+Prev, ascending range specification) is consistent with RocksDB patterns. One high-severity bug found: High-severity findings (1):
Full review (click to expand)Findings🔴 HIGHH1.
|
| Context | Does code execute? | Assumptions hold? | Action needed? |
|---|---|---|---|
| WritePreparedTxnDB | YES | YES | Safe |
| ReadOnly DB | YES | YES | Safe |
| User-defined timestamps | YES | YES | Safe |
| Trie index | YES | YES — kInbound correct for Prev | Safe |
| LevelIterator multi-file | YES | NO — CopyConfigFrom bug (H1) | Must fix |
| FIFO/Universal compaction | YES | YES | Safe |
Positive Observations
- Clean separation of forward/reverse logic with clear
scan_opts_->reversegates - Thorough validation in DBIter matching the forward path's pattern
- Well-designed reverse prefetch window calculation
- Proper stress test verification with control iterator
- Sound reasoning for TrieIndex PrevAndGetResult kInbound change
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
Summary:
Add reverse scan support to
MultiScanArgs/MultiScanso callers can iterate bounded scan ranges in descending order. Plumb reverse scan state throughDBIter,BlockBasedTableIterator, andMultiScanIndexIterator, including reverse prefetch windowing and async I/O support. Updatedb_benchmultiscanrandomto honor--reverse_iteratorand report rows scanned so reverse MultiScan and iterator baselines can be compared on row throughput.Add
db_stress --multiscan_reversecoverage so the prepared MultiScan path can be stressed withSeekForPrev()/Prev()against a control iterator.Also fix the table child iterator path to accept reverse
SeekForPrev()targets above a file's prepared ranges.MergingIteratorandLevelIteratorcan legally seek every child to the current range limit; a child file may only have lower prepared ranges, so treating that as invalid caused intermittentSeekForPrev target is outside prepared rangesexceptions in the benchmark.Differential Revision: D109357730