feat(write_thread): add writer state transition validation guards#14904
Open
darion-yaphet wants to merge 1 commit into
Open
feat(write_thread): add writer state transition validation guards#14904darion-yaphet wants to merge 1 commit into
darion-yaphet wants to merge 1 commit into
Conversation
Introduce IsValidStateTransition(), a static helper that encodes the writer state machine as executable rules, and assert on it in SetState() on both the fast atomic-publish path and the STATE_LOCKED_WAITING slow path. This catches illegal state transitions in debug and stress builds while leaving the lock-free handoff protocol and release-build behavior unchanged. Document the full set of allowed transitions and the STATE_LOCKED_WAITING wait-marker semantics in the header next to the state enum.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Every writer state transition in
WriteThreadis funneled throughSetState(), but until now the set of legal transitions was only implicit — encoded indirectly in the control flow ofJoinBatchGroup,ExitAsBatchGroupLeader, the parallel-memtable path, and a handful of scattered comments. There was no single place that stated "a writer in state X may only move to Y or Z," which makes the lock-free handoff protocol harder to reason about and easy to break silently during refactors. A bad transition would not fail loudly; it would just corrupt the writer state machine and surface later as a hang or a mysterious assertion far from the root cause.This PR makes the writer state machine explicit and self-checking, without changing any runtime behavior.
IsValidStateTransition(old_state, new_state), astatichelper that encodes the allowed writer state transitions as executable rules. It rejects no-op transitions (old == new), rejects publishing the internal-only states (STATE_INIT,STATE_LOCKED_WAITING) as a destination, and otherwise enforces the per-state transition table via aswitchonold_state.SetState()on both paths: once before the fast atomic-publishcompare_exchange_strong, and again on theSTATE_LOCKED_WAITINGslow path after the observed state has been re-read. The slow-path check matters because the waker may overwrite the observed state between the initial relaxed load and taking the mutex.write_thread.h, next to the state enum, including the special role ofSTATE_LOCKED_WAITINGas a temporary wait marker installed byBlockingAwaitState(). Because that marker hides the writer's prior state from the waker,SetState()accepts any normal wake-up state out of it — the comment now spells this out so the asymmetry is not mistaken for a bug.Why debug-only
IsValidStateTransition()is only ever called fromassert(), so it compiles out entirely in release builds (NDEBUG/-fno-rtti).SetState()is on the hot write path, hit thousands of times per second under load, so paying zero cost in production is deliberate. The value is concentrated where it belongs: debug, ASAN/TSAN, and stress builds, where an illegal transition now fails immediately at the offending call site instead of much later.What this does not change
SetState().IsValidStateTransition()is a private static helper.Test Plan
make check(debug) passes.ASSERT_STATUS_CHECKED=1 make checkpasses.COERCE_CONTEXT_SWITCH=1stress runs of the write-path tests (db_write_test,write_batch_test,db_test) to drive the new assertions under contention and exercise theSTATE_LOCKED_WAITINGslow path.NDEBUGis set.