env: suppress liburing TSAN false positives#14710
Conversation
liburing initializes io_uring mappings through a raw __sys_mmap path, so TSAN does not clear stale shadow state before setup.c reads the kernel-managed ring pointers. Suppress the known liburing setup frames while keeping RocksDB's explicit mmap annotations for later accesses. Imported from D101303486. Test Plan: COMPILE_WITH_TSAN=1 make -j128 env_test timeout 60s ./env_test --gtest_filter=EnvPosixTest.IOUringAddressReuseNoTsanFalsePositive
✅ clang-tidy: No findings on changed linesCompleted in 31.4s. |
Codex Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit b553120 ❌ Codex review failed before producing any output. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
Claude Code Review - OBSOLETESuperseded by a newer AI review. Expand to see the original review.✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit b553120 SummaryThis PR adds High-severity findings (1):
Full review (click to expand)Findings🔴 HIGHH1. ODR conflict:
|
| Context | Does code execute? | Assumptions hold? | Action needed? |
|---|---|---|---|
| Non-TSAN builds | NO | N/A | Correctly guarded by __SANITIZE_THREAD__ |
| Non-io_uring builds | NO | N/A | Correctly guarded by ROCKSDB_IOURING_PRESENT |
| RocksDB as static library | YES | NO — ODR conflict possible | H1 applies |
| RocksDB as shared library | YES | NO — symbol conflict possible | H1 applies |
| Clang TSAN | YES | YES | port/lang.h:62 defines __SANITIZE_THREAD__ for Clang |
| GCC TSAN | YES | YES | GCC defines __SANITIZE_THREAD__ natively |
Positive Observations
- The comment explaining the false positive mechanism is excellent — it documents the root cause (raw
__sys_mmapbypassing TSAN interceptors), why the existing annotation is insufficient (too late), and why the suppression is safe. - The preprocessor guards are correct and well-scoped.
- The approach complements the existing
TsanAnnotateIOUringMemory()mechanism — the annotation handles post-init shadow state, while the suppression handles during-init reads.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
Move the liburing TSAN suppressions into a shipped suppressions file and wire RocksDB TSAN make/crash-test flows to use it by default without overriding caller-provided TSAN_OPTIONS.
Initialize the no-secondary-cache helper fields directly instead of delegating through the secondary-cache constructor with this. This avoids GCC 16 -Wmaybe-uninitialized in the unity build while preserving the same without_secondary_compat value.
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit bf450a6 ❌ Codex review failed before producing any output. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit bf450a6 SummaryClean, well-structured PR that ships liburing TSAN suppressions as a file and fixes a GCC 16 unity-build warning in High-severity findings (0): No high-severity findings. Full review (click to expand)Findings🔴 HIGHNone. 🟡 MEDIUMM1. User-provided
|
| Context | Safe? | Notes |
|---|---|---|
| Unity builds (GCC 16) | YES | Primary motivation for CacheItemHelper change; warning eliminated |
| TSAN builds via Makefile | YES | TSAN_OPTIONS exported under COMPILE_WITH_TSAN guard |
| TSAN builds via crash_test.mk standalone | YES | crash_test.mk has its own COMPILE_WITH_TSAN guard |
| Direct db_crashtest.py usage | YES | stress_cmd_env() handles subprocess propagation |
| Non-TSAN builds | YES | All changes guarded by COMPILE_WITH_TSAN or file-existence check |
| User with custom TSAN_OPTIONS | PARTIAL | Suppressions dropped (by design, documented in file header) |
| Other test scripts (write_stress_runner.py, ldb_test.py) | N/A | Don't exercise io_uring paths heavily; not blocking |
CacheItemHelper caller verification: 10 static/global instantiations of the 2-arg constructor were identified across cache.cc, typed_cache.h, tiered_secondary_cache.h, cache_test.cc, lru_cache_test.cc, and block_based_table_factory.cc. All produce identical runtime values with the new direct initialization.
Assumption stress-test results:
- "Constructor equivalence" claim: Verified. Member init order matches declaration order; values identical; skipped assertions are tautologies.
- "TSAN_OPTIONS won't override user values" claim: Verified.
?=andnot in envcorrectly preserve existing values. Edge case ofTSAN_OPTIONS=""is consistent with standard semantics.
Positive Observations
- Good use of
os.path.exists()guard for graceful degradation when suppressions file is absent. - Defensive
os.environ.copy()instress_cmd_env()avoids mutating global state. - Well-documented suppressions file with explanation of the root cause and guidance for downstream users.
- Tests properly save/restore
TSAN_OPTIONSin setUp/tearDown to avoid test pollution. - The
_TSAN_SUPPRESSIONS_FILEusesos.path.abspath(os.path.join(os.path.dirname(__file__), ...))for reliable path resolution regardless of working directory.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D104103800. |
|
@xingbowang merged this pull request in 224e849. |
Summary:
tools/tsan_suppressions.txtinstead of defining the process-wide__tsan_default_suppressions()hook, avoiding conflicts with downstream applications.TSAN_OPTIONS.db_crashtest.pylaunches by passing the default suppressions todb_stresssubprocesses.CacheItemHelperby directly initializing the no-secondary-cache helper fields instead of delegating withthis.Imported from D101303486.
Test Plan:
COMPILE_WITH_TSAN=1 make -j128 env_testtimeout 60s ./env_test --gtest_filter=EnvPosixTest.IOUringAddressReuseNoTsanFalsePositivepython3 tools/db_crashtest_test.pypython3 -m py_compile tools/db_crashtest.py tools/db_crashtest_test.pybuild-linux-unity-and-headerspassed after theCacheItemHelperfix