Skip to content

Fix use-after-free in EmbeddedBlobResolvingIterator when key() called before value()#14888

Open
anand1976 wants to merge 1 commit into
facebook:mainfrom
anand1976:export-D109796428
Open

Fix use-after-free in EmbeddedBlobResolvingIterator when key() called before value()#14888
anand1976 wants to merge 1 commit into
facebook:mainfrom
anand1976:export-D109796428

Conversation

@anand1976

Copy link
Copy Markdown
Contributor

Differential Revision: D109796428

… before value()

Differential Revision: D109796428
@meta-codesync

meta-codesync Bot commented Jun 26, 2026

Copy link
Copy Markdown

@anand1976 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D109796428.

@github-actions

Copy link
Copy Markdown

✅ clang-tidy: No findings on changed lines

Completed in 178.4s.

@github-actions

Copy link
Copy Markdown

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit ad944dd


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase
@github-actions

Copy link
Copy Markdown

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit ad944dd


Summary

This is a well-targeted, minimal fix for a real use-after-free bug in EmbeddedBlobResolvingIterator::MaterializeValue(). The one-line guard (&& !key_resolved_) correctly prevents std::move from overwriting resolved_internal_key_ when it was already populated by a prior key() call. The regression test accurately reproduces the CompactionIterator usage pattern (key-then-value) and would catch the bug under ASAN. The fix is correct for all execution paths including wide-column entities.

No high-severity findings.

Full review (click to expand)

Findings

🟡 MEDIUM

M1. Test does not exercise the wide-column entity path -- sst_file_reader_test.cc
  • Issue: The regression test only covers the kTypeBlobIndex (whole-value same-file blob) path. The kTypeWideColumnEntity path through MaterializeValue() also has the resolved_key move at line 329, but the guard !key_resolved_ is not exercised for wide columns. For wide-column entities, ResolveKeyType() is not called (line 307 condition is false), so key_resolved_ stays false and the move still proceeds -- which is the correct behavior. However, there is no test confirming this path remains correct.
  • Root cause: The fix is only relevant for kTypeBlobIndex entries where ResolveKeyType() sets key_resolved_ = true. For kTypeWideColumnEntity, the guard condition doesn't change behavior (key_resolved_ is false in both old and new code). But a test would prevent future regressions.
  • Suggested fix: Consider adding a test with a wide-column entity containing same-file blob columns, exercising the key-then-value pattern to confirm no regression.

�� LOW / NIT

L1. Redundant resolved key computation -- embedded_blob_resolving_iterator.h:307 + block_based_table_reader.cc:1456
  • Issue: When MaterializeValue() is called for a kTypeBlobIndex entry, it first calls ResolveKeyType() (line 307) which parses the key, decodes the BlobIndex, and builds the resolved key. Then MaybeResolveEmbeddedValue() (line 319) performs the exact same parsing and key construction again internally. There's a FIXME comment at block_based_table_reader.cc:1456 acknowledging this: // FIXME: de-dup ParseInternalKey() work from callers.
  • Root cause: The two resolution paths (ResolveKeyType for cheap key-only resolution, MaybeResolveEmbeddedValue for full value materialization) were designed independently with overlapping concerns.
  • Suggested fix: This is pre-existing and out of scope for this PR. The FIXME is already tracked.
L2. Test uses TableReaderCaller::kCompaction but doesn't use allow_unprepared_value=true -- sst_file_reader_test.cc:442
  • Issue: The test creates the iterator with TableReaderCaller::kCompaction but the default allow_unprepared_value is false (it's not passed). The real CompactionIterator may use allow_unprepared_value=true. When EmbeddedBlobResolvingIterator wraps the inner iterator, it forces allow_unprepared_value=false on the inner iterator regardless (line 2768-2769 of block_based_table_reader.cc), so this doesn't affect correctness. But the test could more accurately mirror the real compaction call pattern.
  • Root cause: Minor test fidelity issue; does not affect the validity of the test.
  • Suggested fix: No action needed -- the test correctly triggers the bug path regardless.

Cross-Component Analysis

Context Does code execute? Fix correct? Notes
CompactionIterator YES YES Primary trigger: key_ = input_.key(); value_ = input_.value(); at cc:710-711
DBIter (user reads) YES (via MergingIterator) YES key()-then-value() pattern possible
SstFileReader/sst_dump Excluded (kSSTDumpTool check) N/A EmbeddedBlobResolvingIterator not created
ReadOnly DB YES YES Same iterator path
WritePreparedTxnDB YES YES No additional interaction with this code
Wide-column entities YES YES key_resolved_ stays false, move still occurs correctly

Assumption stress test results:

  • key() then value(): Fix prevents the move. key_resolved_ is true, guard skips move. Correct.
  • value() only: ResolveKeyType() called first (line 307), sets key_resolved_. Guard skips move. resolved_internal_key_ already set by ResolveKeyType() with identical key. Correct.
  • key() only: ResolveKeyType() sets resolved_internal_key_. MaterializeValue() never called. Correct.
  • PrepareValue(): Calls MaterializeValue(). Same as value()-only analysis. Correct.
  • Wide-column + key() then value(): ResolveKeyType() returns early (not kTypeBlobIndex). key_resolved_ stays false. Move proceeds. Correct.

Positive Observations

  • The fix is minimal and precisely targeted -- a single boolean guard that prevents the specific use-after-free without changing any other behavior.
  • Both ResolveKeyType() and MaybeResolveEmbeddedValue() produce byte-identical resolved keys (same InternalKey(user_key, seq, kTypeValue) construction), making the skip-if-already-resolved semantics correct.
  • The regression test is well-structured: it mimics the actual CompactionIterator pattern, saves the key Slice, calls value(), then verifies the key Slice is still valid. Under ASAN, the original bug would be caught as heap-use-after-free.
  • The test exercises both entries (key1, key2) covering the Next() + re-resolution path.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

1 participant