Skip to content

Bug fix: Reject empty string as a column family name#14732

Closed
pdillinger wants to merge 2 commits into
facebook:mainfrom
pdillinger:empty_cf_name
Closed

Bug fix: Reject empty string as a column family name#14732
pdillinger wants to merge 2 commits into
facebook:mainfrom
pdillinger:empty_cf_name

Conversation

@pdillinger

Copy link
Copy Markdown
Contributor

Summary:
Previously, calling DB::CreateColumnFamily(opts, "", &handle) returned Status::OK() with a usable handle, but the column family was not persisted in the manifest. Any data written to the empty-named CF was silently lost on DB reopen, and ListColumnFamilies would not show it.

The empty string is also reserved as a sentinel meaning "no/unknown column family" in various RocksDB APIs and serialization formats (e.g. TablePropertiesCollectorFactory::Context::kUnknownColumnFamily and table properties), so allowing it as a real CF name is ambiguous in addition to being broken.

This change rejects an empty CF name with Status::InvalidArgument at the top of DBImpl::CreateColumnFamilyImpl, which covers the single-CF CreateColumnFamily API as well as both CreateColumnFamilies overloads (by-names and by-descriptors).

Test Plan:

  • Added ColumnFamilyTest.EmptyNameRejected covering all three Create entry points; verifies IsInvalidArgument() and that no spurious handles are returned.
Summary:
Previously, calling `DB::CreateColumnFamily(opts, "", &handle)` returned
`Status::OK()` with a usable handle, but the column family was not
persisted in the manifest. Any data written to the empty-named CF was
silently lost on DB reopen, and `ListColumnFamilies` would not show it.

The empty string is also reserved as a sentinel meaning "no/unknown
column family" in various RocksDB APIs and serialization formats (e.g.
`TablePropertiesCollectorFactory::Context::kUnknownColumnFamily` and
table properties), so allowing it as a real CF name is ambiguous in
addition to being broken.

This change rejects an empty CF name with `Status::InvalidArgument` at
the top of `DBImpl::CreateColumnFamilyImpl`, which covers the
single-CF `CreateColumnFamily` API as well as both `CreateColumnFamilies`
overloads (by-names and by-descriptors).

Test Plan:
* Added `ColumnFamilyTest.EmptyNameRejected` covering all three
  Create entry points; verifies `IsInvalidArgument()` and that no
  spurious handles are returned.
@meta-cla meta-cla Bot added the CLA Signed label May 12, 2026
@meta-codesync

meta-codesync Bot commented May 12, 2026

Copy link
Copy Markdown

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D104753911.

@github-actions

github-actions Bot commented May 12, 2026

Copy link
Copy Markdown

✅ clang-tidy: No findings on changed lines

Completed in 254.2s.

@meta-codesync

meta-codesync Bot commented May 12, 2026

Copy link
Copy Markdown

@pdillinger has imported this pull request. If you are a Meta employee, you can view this in D104753911.

@github-actions

Copy link
Copy Markdown

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit 2e18528


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase
@github-actions

Copy link
Copy Markdown

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 2e18528


Summary

Clean, well-targeted bug fix. The empty-name check is correctly placed, uses the right error code, preserves all invariants, and has no performance impact. The test covers the three main API entry points. Two minor test coverage gaps identified.

High-severity findings (0):

No high-severity findings.

Full review (click to expand)

Findings

🔴 HIGH

None.

🟡 MEDIUM

M1. Missing test for DB::Open with create_missing_column_familiesdb/column_family_test.cc
  • Issue: The new check also affects the DB::Open path when create_missing_column_families=true and a user passes a ColumnFamilyDescriptor with an empty name (db/db_impl/db_impl_open.cc:2634). This path is not tested.
  • Root cause: The test only exercises the three CreateColumnFamily/CreateColumnFamilies entry points, not the DB::Open entry point that also calls CreateColumnFamilyImpl.
  • Suggested fix: Add a test case that opens a DB with create_missing_column_families=true and includes a ColumnFamilyDescriptor("", opts) in the descriptor list, verifying IsInvalidArgument().
M2. Bulk API with empty name as first element not tested — db/column_family_test.cc
  • Issue: The by-names bulk test only covers the case where the empty name is the second element (after a valid "ok_name"). The by-descriptors bulk test covers empty as the only element. But the case where all names are empty (testing handles.empty() with success_once == false) is not explicitly tested for the by-names overload.
  • Root cause: Minor coverage gap.
  • Suggested fix: Consider adding a case with {"", "another"} to verify handles.empty() and that WrapUpCreateColumnFamilies is skipped when no CF was created.

🟢 LOW / NIT

L1. Release note is accurate and well-written — unreleased_history/bug_fixes/reject_empty_cf_name.md
  • The note clearly explains the prior broken behavior and the fix. No changes needed.
L2. Comment could mention the DB::Open path — db/db_impl/db_impl.cc:4274
  • The comment explains the sentinel usage and prior broken behavior well. It could optionally note that the check also covers the create_missing_column_families path in DB::Open, but this is not required.

Cross-Component Analysis

Context Does code execute? Assumptions hold? Action needed?
Internal CF creation (persistent stats) YES Uses non-empty constants Safe, no action
DB::Open with create_missing_column_families YES Correctly rejects empty names Safe (test gap noted in M1)
WritePreparedTxnDB YES (creates CFs) Same check applies Safe
ReadOnly DB NO (can't create CFs) N/A N/A
Table properties sentinel ("" = unknown CF) N/A (read path, not creation) No conflict Safe

Assumption stress-test:

  • Claim: "Empty string is reserved as sentinel." Verified: builder.cc:57 asserts kUnknownColumnFamily ID ↔ empty name; table_properties.cc:150 substitutes "N/A" for empty; meta_blocks.cc:164 skips serializing empty names. The claim holds.
  • Claim: "Prior behavior was silently broken." Verified: Empty-named CFs bypass manifest persistence because the name is empty in serialization. Data loss on reopen confirmed by the PR description and corroborated by the serialization skip logic.

Positive Observations

  • The check is placed optimally — before any expensive operations (option validation, directory creation, mutex acquisition).
  • *handle = nullptr is set before the early return, maintaining the contract that handle is null on error.
  • Error message is clear and follows existing conventions (cf. "Column family already exists" at line 4292).
  • The Status s; reordering is semantically neutral (local default-constructed variable).
  • All four review agents independently confirmed correctness, thread safety, and performance neutrality.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase
@meta-codesync

meta-codesync Bot commented May 12, 2026

Copy link
Copy Markdown

@pdillinger merged this pull request in 59633e3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

2 participants