Skip to content

Fix flaky preallocation tests on btrfs, zfs, tmpfs, and overlayfs#14744

Closed
pdillinger wants to merge 4 commits into
facebook:mainfrom
pdillinger:flaky_fallocate
Closed

Fix flaky preallocation tests on btrfs, zfs, tmpfs, and overlayfs#14744
pdillinger wants to merge 4 commits into
facebook:mainfrom
pdillinger:flaky_fallocate

Conversation

@pdillinger

Copy link
Copy Markdown
Contributor

Summary:
Fix flaky test failures in EnvPosixTestWithParam.AllocateTest and DBWALTest.TruncateLastLogAfterRecoverWithFlush that occur when running on filesystems where preallocated space is not reliably reflected in st_blocks.

The tests use stat() to check st_blocks and verify that fallocate with FALLOC_FL_KEEP_SIZE actually preallocates disk space. However, on certain filesystems, this check is unreliable:

  1. btrfs and zfs: Copy-on-write filesystems where preallocated extents are not reliably reflected in st_blocks, especially under load.

  2. tmpfs: Memory filesystem that may not report preallocated blocks in st_blocks.

  3. overlayfs: Union filesystem common in containers that may not pass through fallocate properly or report preallocated space.

  4. Any filesystem where FALLOC_FL_KEEP_SIZE is not supported: The preallocation will fail silently (error ignored with PermitUncheckedError), leaving only the written data in st_blocks.

Changes made:

env/env_test.cc:

  • Added filesystem magic number definitions for TMPFS_MAGIC, OVERLAYFS_SUPER_MAGIC, and ZFS_SUPER_MAGIC
  • Extended the AllocateTest to skip block count checks on zfs, tmpfs, and overlayfs in addition to btrfs
  • Added runtime fallback: if st_blocks is less than expected, print a warning and skip the check instead of failing. This handles unknown filesystems or configurations where preallocation isn't supported.

db/db_wal_test.cc:

  • Added includes and filesystem magic number definitions
  • Added ShouldSkipAllocationCheck() helper function to detect problematic filesystems
  • Modified TruncateLastLogAfterRecoverWithoutFlush, TruncateLastLogAfterRecoverWithFlush, TruncateLastLogAfterRecoverWALEmpty, and ReadOnlyRecoveryNoTruncate tests to skip allocation checks on problematic filesystems
  • Added runtime fallback checks similar to env_test.cc

These changes make the tests robust against filesystem differences while still validating preallocation behavior on filesystems where it works correctly (ext4, xfs).

Test Plan:
Many local 'make -j100 check' runs that would previously fail with good probability.

Summary:
Fix flaky test failures in EnvPosixTestWithParam.AllocateTest and
DBWALTest.TruncateLastLogAfterRecoverWithFlush that occur when running
on filesystems where preallocated space is not reliably reflected in
st_blocks.

The tests use stat() to check st_blocks and verify that fallocate with
FALLOC_FL_KEEP_SIZE actually preallocates disk space. However, on certain
filesystems, this check is unreliable:

1. btrfs and zfs: Copy-on-write filesystems where preallocated extents
   are not reliably reflected in st_blocks, especially under load.

2. tmpfs: Memory filesystem that may not report preallocated blocks in
   st_blocks.

3. overlayfs: Union filesystem common in containers that may not pass
   through fallocate properly or report preallocated space.

4. Any filesystem where FALLOC_FL_KEEP_SIZE is not supported: The
   preallocation will fail silently (error ignored with
   PermitUncheckedError), leaving only the written data in st_blocks.

Changes made:

env/env_test.cc:
- Added filesystem magic number definitions for TMPFS_MAGIC,
  OVERLAYFS_SUPER_MAGIC, and ZFS_SUPER_MAGIC
- Extended the AllocateTest to skip block count checks on zfs, tmpfs,
  and overlayfs in addition to btrfs
- Added runtime fallback: if st_blocks is less than expected, print a
  warning and skip the check instead of failing. This handles unknown
  filesystems or configurations where preallocation isn't supported.

db/db_wal_test.cc:
- Added includes and filesystem magic number definitions
- Added ShouldSkipAllocationCheck() helper function to detect
  problematic filesystems
- Modified TruncateLastLogAfterRecoverWithoutFlush,
  TruncateLastLogAfterRecoverWithFlush, TruncateLastLogAfterRecoverWALEmpty,
  and ReadOnlyRecoveryNoTruncate tests to skip allocation checks on
  problematic filesystems
- Added runtime fallback checks similar to env_test.cc

These changes make the tests robust against filesystem differences while
still validating preallocation behavior on filesystems where it works
correctly (ext4, xfs).

Test Plan:
Many local 'make -j100 check' runs that would previously fail with good
probability.
@pdillinger pdillinger requested a review from anand1976 May 15, 2026 15:31
@meta-cla meta-cla Bot added the CLA Signed label May 15, 2026
@github-actions

github-actions Bot commented May 15, 2026

Copy link
Copy Markdown

⚠️ clang-tidy: 1 warning(s) on changed lines

Completed in 185.1s.

Summary by check

Check Count
cppcoreguidelines-pro-type-member-init 1
Total 1

Details

db/db_wal_test.cc (1 warning(s))
db/db_wal_test.cc:117:5: warning: uninitialized record type: 'fs_stat' [cppcoreguidelines-pro-type-member-init]
@meta-codesync

meta-codesync Bot commented May 15, 2026

Copy link
Copy Markdown

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D105331256.

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
@meta-codesync

meta-codesync Bot commented May 15, 2026

Copy link
Copy Markdown

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D105331256.

@github-actions

github-actions Bot commented May 15, 2026

Copy link
Copy Markdown

Codex Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit 611b91a


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase
@github-actions

github-actions Bot commented May 15, 2026

Copy link
Copy Markdown

Claude Code Review - OBSOLETE

Superseded by a newer AI review. Expand to see the original review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 611b91a


Summary

Sound fix for flaky preallocation tests on non-ext4 filesystems. Test-only change, no production code affected. The approach is consistent with the existing btrfs skip pattern and correctly extends it to zfs, tmpfs, and overlayfs.

High-severity findings (0):

No high-severity findings.

Full review (click to expand)

Findings

🟡 MEDIUM

M1. %lu format specifier for uint64_t / size_tdb/db_wal_test.cc
  • Issue: The warning messages use %lu to print uint64_t allocated and size_t preallocated_size. While this works on 64-bit Linux (where unsigned long is 64 bits), it is technically incorrect for uint64_t which should use PRIu64, and non-portable for size_t which should use %zu. RocksDB uses PRIu64 throughout (200+ occurrences).
  • Suggested fix: Use PRIu64 for uint64_t values and %zu for size_t values. Requires #include <cinttypes>.
M2. Code duplication of filesystem magic numbers and skip logic
  • Issue: Filesystem magic number #defines are now duplicated in 4+ files: env/env_test.cc, db/db_wal_test.cc, env/io_posix.cc, env/fs_posix.cc, env/env_posix.cc. The skip check logic also differs in style between the two test files (inline boolean in env_test.cc vs helper method in db_wal_test.cc).
  • Suggested fix: Consider extracting a common header with the magic number definitions. Not blocking for this PR but would reduce maintenance burden.
M3. Preprocessor guard inconsistency between files
  • Issue: In env/env_test.cc, magic numbers are defined with #if !defined(X) (no OS_LINUX guard), matching existing convention (env/fs_posix.cc, env/env_posix.cc). In db/db_wal_test.cc, they use #if defined(OS_LINUX) && !defined(X). The extra OS_LINUX guard on the definitions is unnecessary since the macros go unused on non-Linux anyway.
  • Suggested fix: Follow existing convention — use #if !defined(X) without OS_LINUX guard on macro definitions.

🟢 LOW / NIT

L1. Runtime fallback could theoretically mask real bugs on ext4
  • Issue: When allocated < preallocated_size on an unknown filesystem, the test skips. Acceptable pragmatic choice — ext4 reliably reflects preallocation, so real regressions won't be masked there.
L2. Consider ROCKSDB_GTEST_BYPASS when skipping
  • Issue: Skipped tests still report PASSED. Using ROCKSDB_GTEST_BYPASS would improve test output clarity. The fprintf(stderr) approach is consistent with existing btrfs pattern though.

Positive Observations

  • Conservative approach: adds detection rather than removing assertions. On ext4/xfs, all assertions remain enforced.
  • ShouldSkipAllocationCheck helper is well-structured.
  • Magic number values verified correct against existing definitions in env/io_posix.cc, env/fs_posix.cc, env/env_posix.cc.
  • ReadOnlyRecoveryNoTruncate correctly skips both initial and post-reopen assertions consistently.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase
@meta-codesync

meta-codesync Bot commented May 22, 2026

Copy link
Copy Markdown

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D105331256.

@github-actions

Copy link
Copy Markdown

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit ce4cb13


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase
@github-actions

Copy link
Copy Markdown

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit ce4cb13


Summary

Clean test-only PR that fixes flaky preallocation tests on btrfs, zfs, tmpfs, and overlayfs. The approach is sound and consistent with the existing btrfs-skip pattern. No production code changes.

High-severity findings (0):

No high-severity findings.

Full review (click to expand)

Findings

🔴 HIGH

None.

🟡 MEDIUM

M1. Printf format specifier portability -- db/db_wal_test.cc
  • Issue: The diff uses %lu to print uint64_t (allocated) and size_t (preallocated_size) variables. On 32-bit platforms, %lu is 32-bit but uint64_t is 64-bit, producing undefined behavior or truncated output.
  • Mitigating factor: These code paths are guarded by ROCKSDB_FALLOCATE_PRESENT and ROCKSDB_PLATFORM_POSIX, so they effectively only run on 64-bit Linux where unsigned long == 64-bit. The risk is low in practice.
  • Suggested fix: Use static_cast<unsigned long long> with %llu, or use PRIu64 from <inttypes.h> for uint64_t values and %zu for size_t values.
M2. Inconsistent macro guard patterns between files -- env/env_test.cc vs db/db_wal_test.cc
  • Issue: In env_test.cc, the new filesystem magic constants (TMPFS_MAGIC, ZFS_SUPER_MAGIC, OVERLAYFS_SUPER_MAGIC) are guarded only by !defined(X), matching the existing BTRFS_SUPER_MAGIC pattern in that file. In db_wal_test.cc, all macros are additionally guarded by #if defined(OS_LINUX).
  • Root cause: The two files have different existing conventions. env_test.cc already had unguarded BTRFS_SUPER_MAGIC; db_wal_test.cc is adding these for the first time so uses the more defensive pattern.
  • Suggested fix: Either approach works (the macros are only used inside #ifdef OS_LINUX blocks anyway), but for consistency consider matching one pattern. The db_wal_test.cc approach (guarding with OS_LINUX) is more defensive.

🟢 LOW / NIT

L1. Code duplication of filesystem magic constants
  • Issue: TMPFS_MAGIC, ZFS_SUPER_MAGIC, OVERLAYFS_SUPER_MAGIC, and BTRFS_SUPER_MAGIC are now defined in both env/env_test.cc and db/db_wal_test.cc.
  • Suggested fix: Could be extracted to a shared test utility header, but given these are simple #ifndef/#define guards on well-known kernel constants, the duplication is tolerable for a test-only fix.
L2. Filesystem detection logic duplication
  • Issue: env_test.cc uses inline statfs() checks while db_wal_test.cc uses a ShouldSkipAllocationCheck() helper. The helper approach is cleaner since it's used 8 times across 4 tests.
  • Suggested fix: The asymmetry is justified by usage frequency. No action needed, but future changes could consolidate into a shared utility.
L3. Post-recovery check lacks runtime fallback (intentional)
  • Issue: Pre-recovery checks use a two-tier approach (filesystem detection + runtime fallback for unknown FS). Post-recovery checks only use filesystem detection.
  • Analysis: This asymmetry is actually correct. Post-recovery checks assert allocated < preallocated_size (truncation happened). If preallocation was never reflected in st_blocks, allocated would already be small, so allocated < preallocated_size trivially passes. A runtime fallback is unnecessary here.

Cross-Component Analysis

Context Does code execute? Assumptions hold? Action needed?
Linux ext4/xfs YES YES - all assertions fire None
Linux btrfs/zfs/tmpfs/overlayfs YES NO - st_blocks unreliable Skipped (this PR)
Non-Linux (macOS, Windows) NO N/A - guarded by POSIX/fallocate None
32-bit Linux Theoretically YES %lu format risk M1 covers this

Positive Observations

  • The (void)file_name; pattern in ShouldSkipAllocationCheck correctly suppresses unused parameter warnings on non-Linux.
  • The static_cast<decltype(fs_stat.f_type)> pattern for magic number comparison is correct and avoids signed/unsigned comparison warnings.
  • Filesystem magic numbers are all verified correct against Linux kernel headers (linux/magic.h).
  • The runtime fallback in env_test.cc is a nice defense-in-depth: even if a new unknown filesystem doesn't support preallocation, the test gracefully degrades instead of flaking.
  • ShouldSkipAllocationCheck fails safe -- if statfs() fails, it returns false (checks are NOT skipped), which is the conservative direction.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase
@meta-codesync

meta-codesync Bot commented May 22, 2026

Copy link
Copy Markdown

@pdillinger merged this pull request in acfa68c.

pdillinger added a commit that referenced this pull request Jun 4, 2026
…4744)

Summary:
Fix flaky test failures in EnvPosixTestWithParam.AllocateTest and DBWALTest.TruncateLastLogAfterRecoverWithFlush that occur when running on filesystems where preallocated space is not reliably reflected in st_blocks.

The tests use stat() to check st_blocks and verify that fallocate with FALLOC_FL_KEEP_SIZE actually preallocates disk space. However, on certain filesystems, this check is unreliable:

1. btrfs and zfs: Copy-on-write filesystems where preallocated extents are not reliably reflected in st_blocks, especially under load.

2. tmpfs: Memory filesystem that may not report preallocated blocks in st_blocks.

3. overlayfs: Union filesystem common in containers that may not pass through fallocate properly or report preallocated space.

4. Any filesystem where FALLOC_FL_KEEP_SIZE is not supported: The preallocation will fail silently (error ignored with PermitUncheckedError), leaving only the written data in st_blocks.

Changes made:

env/env_test.cc:
- Added filesystem magic number definitions for TMPFS_MAGIC, OVERLAYFS_SUPER_MAGIC, and ZFS_SUPER_MAGIC
- Extended the AllocateTest to skip block count checks on zfs, tmpfs, and overlayfs in addition to btrfs
- Added runtime fallback: if st_blocks is less than expected, print a warning and skip the check instead of failing. This handles unknown filesystems or configurations where preallocation isn't supported.

db/db_wal_test.cc:
- Added includes and filesystem magic number definitions
- Added ShouldSkipAllocationCheck() helper function to detect problematic filesystems
- Modified TruncateLastLogAfterRecoverWithoutFlush, TruncateLastLogAfterRecoverWithFlush, TruncateLastLogAfterRecoverWALEmpty, and ReadOnlyRecoveryNoTruncate tests to skip allocation checks on problematic filesystems
- Added runtime fallback checks similar to env_test.cc

These changes make the tests robust against filesystem differences while still validating preallocation behavior on filesystems where it works correctly (ext4, xfs).

Pull Request resolved: #14744

Test Plan: Many local 'make -j100 check' runs that would previously fail with good probability.

Reviewed By: hx235

Differential Revision: D105331256

Pulled By: pdillinger

fbshipit-source-id: 862a1512da1466cb037af15342404939b677c02a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

2 participants