Add async I/O release UAF regression coverage#14844
Conversation
✅ clang-tidy: No findings on changed linesCompleted in 82.2s. |
Add a controlled async filesystem regression test that models io_uring completing a released async read while polling another read. Without the ReleaseBlock async-cancellation fix, the test reproduces the stale callback as an ASAN heap-use-after-free in RandomAccessFileReader::ReadAsyncCallback; with the fix, the released request is aborted before Poll can complete it. Add MultiScan async IO stress coverage Have db_stress sometimes abandon a prepared MultiScan after --num_iterations results, which exercises cleanup of prefetched blocks when callers do not drain all ranges. Re-enable randomized --multiscan_use_async_io in db_crashtest so crash stress can cover the async MultiScan path again.
269b8ba to
d668ad8
Compare
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D108191370. |
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit d668ad8 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit d668ad8 SummaryWell-structured PR adding valuable regression test coverage for async I/O use-after-free bugs. The High-severity findings (0): No high-severity findings. Full review (click to expand)Findings🟡 MEDIUMM1. ControlledAsyncFS::AbortIO skips callback for handles not in pending list --
|
| Context | Affected? | Assessment |
|---|---|---|
| No io_uring | YES | Tests properly skip with ROCKSDB_GTEST_SKIP |
| Production code | NO | All changes are test-only |
| Sync fallback | Partially | ControlledAsyncRandomAccessFile doesn't track MultiRead (L1) |
| db_stress early exit | YES | Correctness verified -- cmp_iter stays synchronized via Seek on next range |
| Handle lifecycle | OK | del_fn is called by ReadSet::DeleteAsyncIOHandle() after AbortIO |
Assumption stress-test:
- Claim: "ReleaseBlock(0) ensures Poll() never completes block 0." Verified:
ReleaseAsyncIOForBlockcallsfs_->AbortIO()which removes handle frompending_handles_, thenDeleteAsyncIOHandlecallsdel_fn. SubsequentPoll()cannot find the handle. - Claim: "Early exit doesn't cause verification failures." Verified: break happens before
iter->Next(), so both iterators are at the same verified position. Next range re-Seeks both.
Positive Observations
- The
ControlledAsyncFSdesign is elegant -- it uses real file reads for data while controlling only the async scheduling, avoiding the complexity of mocking file content. - The
complete_stray_first_flag is a clever way to deterministically test io_uring's out-of-order completion behavior. - The
CreateAndOpenSSTsignature extension withenv_overrideas the last default parameter is backward-compatible and clean. - The early-exit op_logs (
"E"for abandon,"R"for range-skip) provide good debugging context. - The test properly separates two UAF scenarios (released block completed by stray vs. skipped range remainder blocking later read).
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
@xingbowang merged this pull request in 4975341. |
Summary
Poll()can complete them.--num_iterationsresults and randomize--multiscan_use_async_ioagain.Test Plan
CI