Add blog post on Blob Direct Write and partitioned blob files#14873

Open: xingbowang wants to merge 2 commits into facebook:main from xingbowang:2026_06_20_bdw_blog

@xingbowang

Contributor

Summary

  • Add a RocksDB blog post explaining Blob Direct Write and partitioned blob files.
  • Cover the write-path transformation to BlobIndex, partition selection, blob-file lifecycle, read fallback for in-flight direct-write files, and the current v1 scope.
  • Describe policy-driven grouping use cases such as TTL bucketing with Universal Compaction and wide-column metadata.

Test Plan

  • git diff --check upstream/main...2026_06_20_bdw_blog
@meta-cla meta-cla Bot added the CLA Signed label Jun 21, 2026
@github-actions

✅ clang-tidy: No findings on changed lines

Completed in 0.0s.

@github-actions

github-actions Bot commented Jun 21, 2026

Codex Code Review - OBSOLETE

Superseded by a newer AI review.

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit b6f9299


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase
@github-actions

github-actions Bot commented Jun 21, 2026

Claude Code Review - OBSOLETE

Superseded by a newer AI review.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit b6f9299


Summary

Blog post is well-written, technically accurate, and the code examples match the actual codebase. The API signatures, class names, option names, and lifecycle methods all exist and match their implementations.

High-severity findings (1):

  • [authors.yml] Author xbw is not defined in docs/_data/authors.yml. The blog post will not display the author name correctly. Per CLAUDE.md: "Blog post authors must be defined in docs/_data/authors.yml to be displayed."

Full review

Findings

🔴 HIGH

H1. Missing author definition -- docs/_data/authors.yml
  • Issue: The blog post front matter specifies author: xbw, but xbw is not defined in docs/_data/authors.yml.
  • Root cause: Author entry was not added alongside the blog post.
  • Suggested fix: Add an entry to docs/_data/authors.yml:
    xbw:
      full_name: Xingbo Wang
    (Adjust full_name to the author's actual name.)

🟡 MEDIUM

No medium-severity findings.

🟢 LOW / NIT

L1. No #include shown in code examples
  • Issue: The code examples don't show the necessary #include directives. Omitting them is standard for blog posts, but including them could help readers who try the examples directly.

Cross-Component Analysis

Documentation-only change. No code modified. Only impact is author rendering in Jekyll (blocked by H1).

Positive Observations

  • Technically thorough and well-structured, covering write path, read path, lifecycle, partitioning, wide-column interaction, and v1 scope limitations.
  • All code examples verified accurate against actual API signatures in include/rocksdb/blob_file_partition_strategy.h.
  • All referenced option names (enable_blob_direct_write, blob_direct_write_partitions, blob_direct_write_partition_strategy) exist in include/rocksdb/advanced_options.h.
  • All referenced internal classes and methods (BlobWriteBatchTransformer, BlobFilePartitionManager, RotateCurrentGeneration, PrepareFlushAdditions, ResolveBlobDirectWriteIndex) exist in the codebase.
  • The "Current Scope" section is commendably transparent about v1 limitations.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase
@xingbowang xingbowang force-pushed the 2026_06_20_bdw_blog branch from b6f9299 to cc398c8 Compare June 23, 2026 16:11
@github-actions

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit cc398c8


Codex review failed before producing findings.

WARNING: proceeding, even though we could not create PATH aliases: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

@github-actions

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit cc398c8


Summary

This PR adds a well-written blog post explaining the Blob Direct Write feature and partitioned blob files. The technical claims align with the codebase. The author (xbw) is properly defined in authors.yml. No non-ASCII characters were found.

High-severity findings (0):

No high-severity findings.

Full review

Findings

🔴 HIGH

None.

🟡 MEDIUM

M1. Code example sets blob_direct_write_partitions = 8 but default is 1 — could confuse readers
  • Issue: The code example in the "Partitioned Blob Files" section sets options.blob_direct_write_partitions = 8; without comment. While valid, the default value in advanced_options.h:1239 is 1. A reader might think 8 is a recommended or default value.
  • Suggested fix: Add a brief inline comment like // default is 1 or mention in the surrounding prose that the default is a single partition and the example uses 8 to illustrate multi-partition behavior.
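A minimal sketch of how the suggested comment might read in the example. The struct below is a hypothetical stand-in, not the real RocksDB options class: only the two field names and the default of 1 partition come from this review, while the field types and the `false` default for enable_blob_direct_write are assumptions made so the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the relevant RocksDB options. Field names and the
// default of 1 partition are taken from the review; the field types and the
// `false` default are assumptions for this sketch.
struct BlobDirectWriteOptionsSketch {
  bool enable_blob_direct_write = false;      // assumed default
  uint32_t blob_direct_write_partitions = 1;  // default is 1 (per the review)
};

// Builds the configuration the way the blog example might, with the inline
// comment that M1 asks for.
inline BlobDirectWriteOptionsSketch MakeExampleOptions() {
  BlobDirectWriteOptionsSketch options;
  options.enable_blob_direct_write = true;
  // Default is 1; 8 is used here only to illustrate multi-partition behavior.
  options.blob_direct_write_partitions = 8;
  assert(options.blob_direct_write_partitions >= 1);
  return options;
}
```

With the comment in place, a reader can tell that 8 is an illustrative choice rather than a recommended or default value.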

🟢 LOW / NIT

L1. TtlBucketPartitionStrategy example calls undefined ExtractTtlBucket
  • Issue: The example strategy calls ExtractTtlBucket(key, value) which is an undefined function. This is fine for a blog post (it's illustrative), but a reader might attempt to compile it verbatim.
  • Suggested fix: Add a comment like // application-defined helper next to the call.
L2. Blog could mention blob_compression_type interaction
  • Issue: The blog mentions "If blob compression is enabled, the strategy still receives the original uncompressed value" which is good. However, it doesn't mention blob_compression_type as the configurable option name. Minor omission.
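To illustrate what an application-defined ExtractTtlBucket helper (see L1 above) could look like, here is one possible sketch. Everything in it is an assumption: the blog post does not define the helper, so the std::string_view parameters, the uint64_t return type, the value layout (an 8-byte little-endian expiry timestamp at the start of the value), the one-hour bucket width, and the BucketToPartition mapping are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Assumed bucket width: one hour. Purely illustrative.
constexpr uint64_t kBucketSeconds = 3600;

// Hypothetical application-defined helper. Assumes the first 8 bytes of the
// value hold the expiry time as a little-endian Unix timestamp in seconds.
// Values shorter than 8 bytes fall into bucket 0. The key is unused here.
inline uint64_t ExtractTtlBucket(std::string_view /*key*/,
                                 std::string_view value) {
  if (value.size() < 8) return 0;
  uint64_t expiry = 0;
  for (size_t i = 0; i < 8; ++i) {
    expiry |= static_cast<uint64_t>(static_cast<unsigned char>(value[i]))
              << (8 * i);
  }
  return expiry / kBucketSeconds;
}

// Hypothetical mapping from a TTL bucket to one of `num_partitions`
// partitions, so values expiring in the same bucket share a partition.
inline uint32_t BucketToPartition(uint64_t bucket, uint32_t num_partitions) {
  assert(num_partitions > 0);
  return static_cast<uint32_t>(bucket % num_partitions);
}
```

A real strategy would plug a helper like this into whatever signature blob_file_partition_strategy.h actually defines. Marking the call site with // application-defined helper, as L1 suggests, signals that the helper is not part of RocksDB.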

Technical Accuracy Verification

| Claim | Verified? | Notes |
|---|---|---|
| BlobWriteBatchTransformer is the core class | YES | db/blob/blob_write_batch_transformer.{h,cc} |
| BlobFilePartitionManager manages partitions | YES | db/blob/blob_file_partition_manager.{h,cc} |
| WriteBlob() method exists | YES | blob_file_partition_manager.h:76 |
| RotateCurrentGeneration() method exists | YES | blob_file_partition_manager.h:92 |
| PrepareFlushAdditions() method exists | YES | blob_file_partition_manager.h:100 |
| ResolveBlobDirectWriteIndex() method exists | YES | blob_file_partition_manager.h:151 |
| BlobFilePartitionStrategy API matches | YES | blob_file_partition_strategy.h |
| enable_blob_direct_write is immutable option | YES | advanced_options.h:1227 |
| IngestWriteBatchWithIndex blocked | YES | db_impl_write.cc:272-275 |
| allow_concurrent_memtable_write option name | YES | options.h:1429 |
| Single manager mutex in v1 | YES | blob_file_partition_manager.h:278 |
| Author xbw in authors.yml | YES | authors.yml:109-111 |

Positive Observations

  • Well-structured with clear sections covering write path, read path, lifecycle, partitioning use cases, and limitations.
  • Limitations section is thorough and honest about v1 constraints including crash recovery.
  • Code examples are syntactically correct and match the actual API.
  • No non-ASCII characters (compliant with CLAUDE.md).
  • Front matter follows the same convention as other recent blog posts.

@meta-codesync

meta-codesync Bot commented Jun 24, 2026

@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D109564817.
