Add OnCompactionPreCommit listener callback#14740
Conversation
Summary: Adds a new `EventListener::OnCompactionPreCommit` callback that fires after a compaction job's output files are written but *before* the manifest write commits the new Version. At that point input files still have `FileMetaData::being_compacted == true`, so listeners that maintain bookkeeping of "files currently being compacted" can clean up that state without racing the compaction picker. A check implemented by a Meta-internal RocksDB user crashes when the same file appears as input to two concurrent compactions. Tracking that set in `OnCompactionBegin` / `OnCompactionCompleted` produces false positives because `Compaction::ReleaseCompactionFiles()` flips `being_compacted` back to false before `OnCompactionCompleted` fires, so another thread can pick the same file and trigger `OnCompactionBegin` before the previous compaction's `Completed` callback runs. Doing the cleanup in `OnCompactionPreCommit` closes that race. Default implementation is a no-op, so no API break. Bonus: `CLAUDE.md` update for when to call `make clean` as I've recently had it get thoroughly confused TWICE mixing build modes. Test Plan: - New unit test in `db/listener_test.cc` checks that `OnCompactionPreCommit` fires strictly between `OnCompactionBegin` and `OnCompactionCompleted` for the same compaction, and that input files still have `being_compacted == true` at the time it fires. - Crash test: adds a concurrent-compaction sanity check to `db_stress`'s listener resembling the Meta-internal intended usage: tracks input file numbers from `OnCompactionBegin` to `OnCompactionPreCommit`, aborting if the same file appears as input to two concurrent compactions. Exercises all compaction styles and trivial-move/FIFO paths under load.
|
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D105065326. |
|
| Check | Count |
|---|---|
concurrency-mt-unsafe |
1 |
| Total | 1 |
Details
db_stress_tool/db_stress_test_base.cc (1 warning(s))
db_stress_tool/db_stress_test_base.cc:3973:9: warning: function is not thread safe [concurrency-mt-unsafe]
|
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D105065326. |
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit 33b66e3 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit 33b66e3 SummaryWell-designed addition of High-severity findings (0): No high-severity findings. Full review (click to expand)Findings🔴 HIGHNone. 🟡 MEDIUMM1. Java bindings not updated —
|
| Context | Code executes? | Assumptions hold? | Action needed? |
|---|---|---|---|
| BackgroundCompaction (all 5 paths) | YES | YES | All covered by PR |
| CompactFilesImpl | NO | N/A | Consistent — no Begin/Completed either |
| Remote compaction | YES (normal path) | YES | Goes through BackgroundCompaction |
| ReadOnly DB | NO | N/A | No compactions |
| WritePreparedTxnDB | YES | YES | No transaction state interaction |
| FIFO compaction | YES | YES | Dedicated paths covered |
| Trivial move | YES | YES | Covered, fires before PerformTrivialMove |
| Concurrent compactions | YES | YES | being_compacted prevents file reuse during callback |
Ordering guarantee verification:
- Begin sets
notify_on_compaction_completion_ = true→ PreCommit checks this → Completed checks this. Chain is correct. - PreCommit is called before every
ReleaseCompactionFiles(via explicit call sites or safety net). Verified at all 5 paths. - Safety net at end of
BackgroundCompaction(when!compaction_released) ensures PreCommit fires even on error paths after Begin.
Idempotency verification:
notify_on_compaction_pre_commit_called_is set on first call, checked on all subsequent calls. Plain bool is safe because Compaction objects are single-owner (one background thread) and all accesses are undermutex_.- Follows identical pattern to existing
notify_on_compaction_completion_(also a plain bool).
Thread safety verification:
- Mutex is released during callback (
mutex_.Unlock()), butbeing_compactedremains true on input files, preventing any other compaction from picking them. Re-acquired before returning. - Pattern is identical to existing
NotifyOnCompactionBeginandNotifyOnCompactionCompleted.
Positive Observations
- Correct problem identification: The race between
ReleaseCompactionFilesandOnCompactionCompletedis a genuine issue. PreCommit provides a clean solution. - Defensive design: The safety-net fallback ensures PreCommit fires even on unusual error paths, preventing silent missed callbacks.
- Consistent patterns: The implementation closely mirrors existing
NotifyOnCompactionBegin/NotifyOnCompactionCompleted, making it easy to review and maintain. - Good stress test integration: The
db_stress_listener.hchanges exercise the exact race condition the PR fixes, providing ongoing CI coverage. - Clean API: Default no-op virtual method preserves full backward compatibility for existing EventListener subclasses.
- Well-written documentation: The
listener.hdoc comments clearly explain when the callback fires, what state is observable, and how it relates to Begin/Completed.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D105065326. |
|
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D105065326. |
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit e35a258 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit e35a258 SummaryWell-designed feature that solves a real race condition in compaction listener bookkeeping. The implementation is thorough, covering all compaction paths with a sound idempotency mechanism and safety-net pattern. The refactoring of High-severity findings (1):
Full review (click to expand)Findings🔴 HIGHH1. Shutdown race can cause stress test listener state leak —
|
| Context | Code executes? | Assumptions hold? | Action needed? |
|---|---|---|---|
| WritePreparedTxnDB | YES (compactions run) | YES (callback is about compaction lifecycle, not transactions) | Safe |
| ReadOnly DB | NO (no compactions) | N/A | Safe |
| CompactionService / Remote | YES (Install runs locally) | YES (PreCommit fires before local LogAndApply) | Safe |
| FIFO compaction | YES (deletion + copy paths) | YES (both paths instrumented) | Safe |
| User-defined timestamps | YES | YES (no timestamp-specific logic in callback) | Safe |
| Concurrent writers | YES | YES (mutex held during flag checks) | Safe |
| Old snapshots | YES | YES (no snapshot interaction) | Safe |
Positive Observations
- Thorough coverage: All 4 compaction paths in
BackgroundCompactionare instrumented, plus a safety-net. - Sound idempotency: The per-Compaction boolean flag with mutex-protected access prevents double-firing cleanly.
- Good refactoring: Splitting
PerformTrivialMoveintoPrepareTrivialMoveEdit+CommitTrivialMoveis a clean separation of concerns that enables the notification insertion point. - Excellent stress test: The
db_stress_listener.hchanges provide real-world validation of the race condition fix (tracking files between Begin and PreCommit). - Comprehensive documentation: The
listener.hdoc comments clearly explain timing semantics, caveats for trivial moves/FIFO, and threading model. - Defensive safety-net: The idempotent call before
ReleaseCompactionFilesensures listeners always get notified, even on unusual error paths.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D105065326. |
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit 74b403a ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit 74b403a SummaryWell-designed PR that adds a new High-severity findings (0): No high-severity findings. Full review (click to expand)Findings🟡 MEDIUMM1. Missing
|
| Context | Covered by PR? | Assessment |
|---|---|---|
| FIFO deletion compaction | YES (call site at line 4297) | Correct |
| FIFO trivial copy compaction | YES (call site at line 4517) | Correct |
| Trivial move (BackgroundCompaction) | YES (call site at line 4597) | Correct — split into Prepare+Commit enables callback |
| Normal compaction (CompactionJob) | YES (call site at line 4723) | Correct — fires before Install() |
| Safety net (error paths) | YES (call site at line 4749) | Correct — idempotent backstop |
| CompactFilesImpl | NO | Consistent — CompactFilesImpl also skips Begin/Completed |
| ReadOnly DB | N/A | No compactions in ReadOnly mode |
| Remote CompactionService | Via normal path | Works correctly |
| WritePreparedTxnDB | Via normal path | Works correctly |
Positive Observations
- Idempotency design is robust: The
notify_on_compaction_pre_commit_called_flag +ShouldNotifyOnCompactionCompleted()guard ensures exactly-once semantics without complex control flow. Simple and effective. - PerformTrivialMove split is clean: Separating edit preparation from commit is a sensible refactor that enables the callback without changing semantics. Both halves retain
mutex_.AssertHeld()documentation. - Stress test is thorough: The
DbStressListeneradditions (file tracking, ordering verification, deletion-while-compacting check) exercise the callback under realistic concurrent load with all compaction styles. - Documentation quality is good: The
listener.hdocstrings clearly explain timing, state guarantees,ci.statussemantics, and the relationship to Begin/Completed. TheOnCompactionCompleteddocs are also updated with complementary information. - Safety net shows defensive thinking: The backstop call before
ReleaseCompactionFilesensures listeners always get the callback even in unusual error paths, without requiring perfect coverage of every early-exit branch.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
@pdillinger merged this pull request in 805a476. |
Summary:
Adds a new
EventListener::OnCompactionPreCommitcallback that fires after a compaction job's output files are written but before the manifest write commits the new Version. At that point input files still haveFileMetaData::being_compacted == true, so listeners that maintain bookkeeping of "files currently being compacted" can clean up that state without racing the compaction picker.A check implemented by a Meta-internal RocksDB user crashes when the same file appears as input to two concurrent compactions. Tracking that set in
OnCompactionBegin/OnCompactionCompletedproduces false positives becauseCompaction::ReleaseCompactionFiles()flipsbeing_compactedback to false beforeOnCompactionCompletedfires, so another thread can pick the same file and triggerOnCompactionBeginbefore the previous compaction'sCompletedcallback runs. Doing the cleanup inOnCompactionPreCommitcloses that race. Default implementation is a no-op, so no API break.A trivial refactoring to split PerformTrivialMove ensures data is populated for the new callback, while calling back before the trivial move compaction is committed.
Bonus:
CLAUDE.mdupdate for when to callmake cleanas I've recently had it get thoroughly confused TWICE mixing build modes.Task: T269479969
Test Plan:
db/listener_test.ccchecks thatOnCompactionPreCommitfires strictly betweenOnCompactionBeginandOnCompactionCompletedfor the same compaction, and that input files still havebeing_compacted == trueat the time it fires.db_stress's listener resembling the Meta-internal intended usage: tracks input file numbers fromOnCompactionBegintoOnCompactionPreCommit, aborting if the same file appears as input to two concurrent compactions. Also checks other ordering constraints and checks for "leaks" from possible failure to call OnCompactionPreCommit(). Exercises all compaction styles and trivial-move/FIFO paths under load.