Per-StressTest fault injection with env/fs cleanup (#14750)#14757
Conversation
|
@hx235 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104959945. |
|
| Check | Count |
|---|---|
concurrency-mt-unsafe |
2 |
cppcoreguidelines-avoid-non-const-global-variables |
4 |
| Total | 6 |
Details
db_stress_tool/db_stress_common.cc (2 warning(s))
db_stress_tool/db_stress_common.cc:22:25: warning: variable 'raw_env' is non-const and globally accessible, consider making it const [cppcoreguidelines-avoid-non-const-global-variables]
db_stress_tool/db_stress_common.cc:22:25: warning: variable 'raw_env' provides global access to a non-const object; consider making the pointed-to data 'const' [cppcoreguidelines-avoid-non-const-global-variables]
db_stress_tool/db_stress_test_base.cc (2 warning(s))
db_stress_tool/db_stress_test_base.cc:3919:7: warning: function is not thread safe [concurrency-mt-unsafe]
db_stress_tool/db_stress_test_base.cc:5317:7: warning: function is not thread safe [concurrency-mt-unsafe]
db_stress_tool/db_stress_tool.cc (2 warning(s))
db_stress_tool/db_stress_tool.cc:39:49: warning: variable 'fault_fs_for_crash_report' is non-const and globally accessible, consider making it const [cppcoreguidelines-avoid-non-const-global-variables]
db_stress_tool/db_stress_tool.cc:39:49: warning: variable 'fault_fs_for_crash_report' provides global access to a non-const object; consider making the pointed-to data 'const' [cppcoreguidelines-avoid-non-const-global-variables]
Summary: Pull Request resolved: facebook#14757 Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others. FS/Env architecture change: Before (upstream): raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open After (this diff): Global: raw_env (no wrappers) Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction Used inside StressTest: cleanup ops that must succeed (external file delete) Per StressTest: raw_fs → DbStressFSWrapper (db_stress_fs_, always) → FaultInjectionTestFS (db_fault_injection_fs_, optional) → CompositeEnvWrapper (db_env_, always) → DB::Open Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote) FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other. Stored members (per StressTest): - db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags. - db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal. - db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env). Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest. Other changes: - CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed) - SstFileManager recreated with per-StressTest env in Open() - Remote compaction override env uses options.env directly (fixes pre-existing silent bug) - Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR - TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation. - Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path. Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Differential Revision: D104959945
df8f323 to
1909be9
Compare
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). It's safe because these two do and should not depend on each other.
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). Safe because:
- Error injection returns early; inner wrapper never executes (same behavior)
- DbStressFSWrapper assertions do not modify data (checksum validation, IOActivity checks)
- MANIFEST rename tracking slightly better for crash simulation in new order
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). Safe because:
- Error injection returns early; inner wrapper never executes (same behavior)
- DbStressFSWrapper assertions do not modify data (checksum validation, IOActivity checks)
- MANIFEST rename tracking slightly better for crash simulation in new order
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
Per StressTest:
raw_fs → DbStressFSWrapper (db_stress_fs_, always)
→ FaultInjectionTestFS (db_fault_injection_fs_, optional)
→ CompositeEnvWrapper (db_env_, always) → DB::Open
Expected values state: Env::Default() (always local PosixEnv even when raw_env is remote)
FS layer order swapped (DbStressFSWrapper now innermost). Safe because:
- Error injection returns early; inner wrapper never executes (same behavior)
- DbStressFSWrapper assertions do not modify data (checksum validation, IOActivity checks)
- MANIFEST rename tracking slightly better for crash simulation in new order
Stored members (per StressTest):
- db_stress_fs_: DbStressFSWrapper. Always active regardless of fault injection flags.
- db_fault_injection_fs_: FaultInjectionTestFS. Only when fault injection flags set. Direct access for Enable/Disable/SetThreadLocal.
- db_env_: CompositeEnvWrapper. Always present. THE env for all DB I/O (options_.env).
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
- CleanupOutputDirectory simplified (uses raw_env, no disable/enable needed)
- SstFileManager recreated with per-StressTest env in Open()
- Remote compaction override env uses options.env directly (fixes pre-existing silent bug)
- Comments added: Env::Default() always local, DbStressDestroyDb MANIFEST explanation, fault injection log path in TEST_TMPDIR
- TestFSWritableFile::Close() now mirrors the production FSWritableFile close boundary. After the first Close() attempt, later explicit or destructor Close() calls are wrapper-level no-ops, while FaultInjectionTestFS still records the first close attempt for crash/recovery simulation.
- Added targeted fault_injection_fs_test coverage for injected metadata-close failures to ensure FaultInjectionTestFS does not retry the inner Close() path.
Reviewed By: anand1976
Differential Revision: D104959945
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit 73aa105 ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit 73aa105 SummaryWell-structured refactoring that moves fault injection from global state to per-StressTest instance, a necessary prerequisite for multi-DB stress test support. The FS layer order swap and Close() idempotency changes are sound. A few issues worth addressing. High-severity findings (1):
Full review (click to expand)Findings🔴 HIGHH1. Pre-existing memory leak in
|
| Context | Impact | Assessment |
|---|---|---|
| Remote compaction | Uses options.env instead of global db_stress_env |
Correct -- fixes pre-existing silent bug |
| Background threads | SetBackgroundThreads on raw_env; DB uses db_env_ |
Safe -- CompositeEnvWrapper delegates thread ops to base env |
--destroy_db_and_exit |
Uses raw_env before StressTest creation |
Safe -- functionally equivalent to old db_stress_listener_env |
| Secondary DB | Opens with GetDbEnv() |
Correct |
| Expected values | Uses Env::Default() |
Unchanged, correct |
| Crash callback | Raw pointer instead of shared_ptr | Practically safe (see M1) |
Positive Observations
- Clean separation of concerns:
raw_envfor infrastructure,db_env_for DB I/O is a clear ownership model. - Fixes pre-existing bug: Remote compaction override now correctly uses
options.env. - Close() idempotency mirrors production:
TestFSWritableFilenow matches theFSWritableFileproduction close boundary contract. - Signal-safe crash callback: Moving from
shared_ptrto raw pointer is correct for signal safety. - DbStressListener stores shared_ptr: Ensures the FS outlives the listener. Good defensive design.
- Thorough manual testing: 6 test scenarios cover the key behavioral changes well.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
This pull request has been merged in 97551b7. |
Summary:
Prerequisite for multi-DB stress test support where each StressTest instance owns one DB. Moves fault injection from a single global to per-StressTest instance so each DB gets isolated fault injection; errors injected into one DB do not leak to others.
FS/Env architecture change:
Before (upstream):
raw_fs → FaultInjectionTestFS (global) → DbStressFSWrapper → db_stress_env → DB::Open
After (this diff):
Global: raw_env (no wrappers)
Used outside StressTest: non-FS ops (threads, time, sleep), test framework FS setup (dirs), DB destruction
Used inside StressTest: cleanup ops that must succeed (external file delete)
FS layer order swapped (DbStressFSWrapper now innermost). Safe because:
Stored members (per StressTest):
Eliminated globals: db_stress_env → renamed to raw_env (no wrappers, DbStressFSWrapper moved to per-StressTest); db_stress_listener_env, db_stress_raw_fs removed; fault_fs_guard, fault_env_guard moved to per-StressTest.
Other changes:
Test plan:
TEST 2: Smoke test without fault injection
$ db_stress --db=/tmp/v2_final/1/db --expected_values_dir=/tmp/v2_final/1/ev --destroy_db_initially --max_key=1000 --ops_per_thread=500 --threads=4 --clear_column_family_one_in=0 --reopen=0
Verification successful
TEST 3: Fault injection smoke
$ db_stress --db=/tmp/v2_final/2/db --expected_values_dir=/tmp/v2_final/2/ev --destroy_db_initially --max_key=100 --ops_per_thread=200 --threads=2 --clear_column_family_one_in=0 --write_fault_one_in=1 --sync_fault_injection=1 --open_write_fault_one_in=16 --reopen=3
Verification successful
DB LOG: [ERROR] OpenCompactionOutputFiles for table #50 fails at NewWritableFile with status IO error: injected write error
DB LOG: ErrorHandler: Set background retryable IO error
TEST 4: MANIFEST preservation with swapped FS order
$ db_stress --db=/tmp/v2_final/3/db ... --reopen=3
Verification successful
$ ls /tmp/v2_final/3/db/MANIFEST*renamed
MANIFEST-000001_renamed_ MANIFEST-000005_renamed_ MANIFEST-000019_renamed_
TEST 5: DestroyDB cleans MANIFEST*renamed files
$ db_stress --destroy_db_and_exit=1 --db=/tmp/v2_final/3/db
Successfully destroyed db
$ ls /tmp/v2_final/3/db
No such file or directory
TEST 6: Instrumented wrapper-order validation in db_stress_env_wrapper.h
Temporary instrumentation added to db_stress_env_wrapper.h:
static std::atomic<uint64_t> io_activity_check_count{0};
static std::atomic<uint64_t> sst_checksum_verify_count{0};
Printed at StressTest::CleanUp().
Test A: without fault injection
$ ./db_stress --db=/tmp/d2_inst_final/db --expected_values_dir=/tmp/d2_inst_final/ev --destroy_db_initially --max_key=500 --ops_per_thread=200 --threads=2 --clear_column_family_one_in=0 --reopen=3 --continuous_verification_interval=0 --test_secondary=0 --enable_checksum_handoff=1 --file_checksum_impl=crc32c
Verification successful
DbStressFSWrapper instrumentation: IOActivity checks=942, SST checksum verifications=66
MANIFEST-000001_renamed_ MANIFEST-000005_renamed_ MANIFEST-000019_renamed_ MANIFEST-000033_renamed_
Test B: with fault injection
$ ./db_stress --db=/tmp/d2_inst_final2/db --expected_values_dir=/tmp/d2_inst_final2/ev --destroy_db_initially --max_key=500 --ops_per_thread=200 --threads=2 --clear_column_family_one_in=0 --reopen=3 --continuous_verification_interval=0 --test_secondary=0 --write_fault_one_in=100 --sync_fault_injection=1 --open_write_fault_one_in=16 --enable_checksum_handoff=1 --file_checksum_impl=crc32c
Verification successful
DbStressFSWrapper instrumentation: IOActivity checks=318, SST checksum verifications=94
MANIFEST-000001_renamed_ MANIFEST-000005_renamed_ MANIFEST-000019_renamed_ MANIFEST-000037_renamed_
LOG.old.1778909841646222: 4 injected errors
LOG.old.1778909841886623: 4 injected errors
LOG.old.1778909842158291: 4 injected errors
Proves: after FS order swap, IOActivity assertion checks and SST checksum verifications are still executing, and MANIFEST preservation and fault injection still work together.
Reviewed By: anand1976
Differential Revision: D104959945