Finish C API code generation (continues #14572) #14868
Summary:

RocksDB's C API (include/rocksdb/c.h / db/c.cc) is large and hand-maintained, with hundreds of repetitive setter/getter wrappers that follow strict naming and type conventions. Adding new C++ fields requires manually adding matching C wrappers, which is tedious and error-prone. This PR introduces a semi-automated code generation system for the mechanical parts of the C API while keeping complex wrappers (callbacks, ownership-transfer, multi-get families) hand-written.

The system has two complementary generators:
1. Auto-discovery (auto_simple_bindings.py): Scans selected C++ public option/metadata structs and auto-generates getter/setter wrappers for simple scalar, enum, string, and chrono fields. Generated .inc files are checked in and compiled into c.h/c.cc via #include. Fields that cannot be auto-generated must be explicitly blocklisted in auto_simple_bindings_blocklist.json with a policy and reason.
2. Spec-driven generator (generate_c_api.py): Takes spec.json as input and emits consistent boilerplate for method-style wrappers whose C shape cannot be inferred from the C++ header alone (e.g. rocksdb_put, rocksdb_transaction_commit, WriteBatch methods).

Coverage improvement: This PR adds 725 new C API functions, growing the public C API surface from 1,053 to 1,778 exported functions -- a 68.9% increase. The bulk of the new coverage (436 functions) comes from auto-discovered option struct setters/getters that were previously missing.

Coverage breakdown by family:
- Option structs (auto-discovered): 436 new functions
- Metadata structs (auto-discovered): 89 new functions
- ReadOptions (auto-discovered): 46 new functions
- JobInfo structs (auto-discovered): 46 new functions
- Spec-driven (subset wrappers): 56 new functions
- DB simple operations: 27 new functions
- Transaction/TransactionDB/WriteBatch: 29 new functions

The generated code emits the same style as today's hand-written wrappers (SaveError, Slice(), malloc-owned buffers, unsigned char booleans) and is organized in clearly marked generated sections within c.h/c.cc.

Test Plan:
- Existing db_test.c C API tests pass (1743 lines of tests extended/verified)
- python3 tools/c_api_gen/regen_all.py && git diff --exit-code -- include/rocksdb/c_api_gen db/c_api_gen verifies generated output is stable
- python3 tools/c_api_gen/validate_generated_equivalence.py --ref HEAD verifies equivalence with reference hand-written wrappers
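The fail-closed blocklist rule described for the auto-discovery generator can be sketched as follows. This is a minimal illustration, not the code in auto_simple_bindings.py: the `SIMPLE_TYPES` table, the `classify_fields` helper, and the blocklist layout (field name mapped to `policy` and `reason`) are assumptions made for the example.

```python
# Hypothetical C++ -> C type table; the generator's real mapping is not shown
# in this PR description.
SIMPLE_TYPES = {
    "bool": "unsigned char",
    "int": "int",
    "uint64_t": "uint64_t",
    "size_t": "size_t",
    "std::string": "const char*",
}


def classify_fields(fields, blocklist):
    """Split (name, cpp_type) pairs into generated and blocklisted fields.

    Fails closed: a field that is neither simple nor blocklisted is an error,
    so a new complex field cannot be skipped silently.
    """
    generated, skipped, unhandled = [], [], []
    for name, cpp_type in fields:
        if name in blocklist:
            # Assumed entry layout: {"policy": ..., "reason": ...}
            skipped.append((name, blocklist[name]["policy"]))
        elif cpp_type in SIMPLE_TYPES:
            generated.append((name, SIMPLE_TYPES[cpp_type]))
        else:
            unhandled.append(name)
    if unhandled:
        raise ValueError(
            "fields need a wrapper or a blocklist entry: " + ", ".join(unhandled)
        )
    return generated, skipped
```

Raising instead of skipping is the point of the design: an unrecognized field type forces a reviewer to either extend the generator or record a reason for leaving the field out.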
# Conflicts: # utilities/trie_index/trie_index_factory.cc
…ream)

Reconciles the semi-automated C API code generation branch (PR facebook#14572, authored by Xingbo Wang) with ~4 months of upstream main. Conflicts were limited to the generated/test files: include/rocksdb/c.h, db/c.cc, db/c_test.c.

Reconciliation strategy (generated files are resolved by regeneration, not hand-merge):
- Bucket A (already generated by this branch; upstream added hand-written copies): rocksdb_options_{get,set}_memtable_batch_lookup_optimization, rocksdb_readoptions_{get,set}_optimize_multiget_for_io, rocksdb_block_based_options_set_uniform_cv_threshold, rocksdb_transactiondb_options_set_write_policy. Verified the generated signatures are ABI-identical to upstream's hand-written ones; the generated versions win and upstream's hand-written copies are dropped on regeneration.
- Bucket B (simple new C++ option fields; auto-discovery now emits them from the merged headers): rocksdb_options_{get,set}_async_wal_precreate, rocksdb_options_{get,set}_use_direct_io_for_compaction_reads.
- Bucket C (complex wrappers; ported by hand into the c_base.* templates): rocksdb_set_db_options, rocksdb_backup_engine_stop_backup, rocksdb_backup_engine_options_set_backup_rate_limiter, rocksdb_backup_engine_options_set_restore_rate_limiter.
- New upstream ReadOptions field read_scoped_block_buffer_provider (a complex provider pointer) added to auto_simple_bindings_blocklist.json with policy "manual"; the fail-closed generator surfaced it automatically.

include/rocksdb/c.h and db/c.cc regenerated via tools/c_api_gen/regen_all.py; db/c_test.c resolved by hand (kept both sides' new test coverage). All 14 upstream-added C API functions verified present; db/c.cc syntax-checks clean under C++20.
…lates

Addresses RocksDB maintainer feedback and the build-system issues found while taking over PR facebook#14572.

Maintainer concern (pdillinger): include/rocksdb must contain only user-includable headers. The hand-written generator template c_base.h was a non-includable "SOURCE TEMPLATE" living in include/rocksdb and still carrying unresolved #include "c_api_gen/*.inc" directives, which broke `make check-headers` and was shipped by `make install`. Move the templates next to the generator inputs they are:
  include/rocksdb/c_base.h -> tools/c_api_gen/c_base.h
  db/c_base.cc -> tools/c_api_gen/c_base.cc
include/rocksdb/ now contains only the user-facing, self-contained, @generated c.h. Generator/validator paths and the generated banners are updated to match; the inlined c.h/c.cc are unaffected at the API level.

Build system:
- Remove the dead `$(OBJ_DIR)/db/c.o: $(C_API_CODEGEN_STAMP)` prerequisites (the variable was never defined, so they were silent no-ops) and the orphan tools/c_api_gen/stamp .gitignore entry. Regeneration requires clang++ and is intentionally not on the compile path; the committed c.h/c.cc are self-contained.
- `make check`'s C API staleness check is moved inside the SKIP_FORMAT_BUCK_CHECKS guard (so it can be opted out like the other format checks) and now runs the robust temp-dir verifier instead of regenerating in place + `git diff`. It skips gracefully (with a message) when clang++ is not installed, so `make check` works without the codegen toolchain. CI remains the authoritative gate.

clang-format reproducibility:
- Thread a pinned `--clang-format` through regen_all.py and down from verify_generated_up_to_date.py, and format with an explicit `--style=file:.clang-format`. This closes the hole where the regenerated c.h/c.cc were formatted with an unpinned clang-format while CI pinned clang-format-21, which could produce spurious staleness failures. README documents the required clang/clang-format versions.

Cleanup:
- Remove a duplicated _find_clang_binary/get_clang_binary pair (a merge artifact; the first definitions were dead) and broaden the clang++ version fallback list (newest first, incl. clang++-21).
- CI: `apt-get update` before installing clang-format-21.

Generated c.h/c.cc/.inc refreshed; tools/c_api_gen/verify_generated_up_to_date.py passes and db/c.cc syntax-checks clean under C++20.
…s gate

Fixes a P0 link-time breakage for downstream C bindings (rust-rocksdb, etc.): five public C API functions were declared in include/rocksdb/c.h but their hand-written implementations had been removed during the codegen migration and never regenerated, so they were defined nowhere. They build fine into librocksdb (the symbols are simply absent) and RocksDB's own c_test does not call them, so CI never noticed -- but any binding that references them fails to link.

Restore the implementations in tools/c_api_gen/c_base.cc (verbatim from the historical/upstream versions, preserving their signatures and therefore ABI):
- rocksdb_options_set_bottommost_compression_options
- rocksdb_options_set_bottommost_compression_options_zstd_max_train_bytes
- rocksdb_options_set_bottommost_compression_options_use_zstd_dict_trainer
- rocksdb_options_set_bottommost_compression_options_max_dict_buffer_bytes
- rocksdb_options_set_max_bytes_for_level_multiplier_additional

These set multiple fields of a nested struct / take a caller-owned array, which is exactly why the simple auto-generator cannot emit them; a comment documents that they are intentionally hand-written.

Add tools/c_api_gen/check_api_completeness.py: a cheap, dependency-free gate (no clang/libclang needed) asserting every `extern ROCKSDB_LIBRARY_API` function declared in c.h has exactly one definition in c.cc. This is the check that would have caught the bug above automatically; it is wired into `make check-c-api-gen` (runs unconditionally, before the clang-gated staleness check) and into CI.

Running the new gate also surfaced a pre-existing upstream bug unrelated to codegen: rocksdb_writebatch_wi_create_from(const char*, size_t) has been declared in c.h but defined nowhere in RocksDB for a long time. WriteBatchWithIndex has no constructor from a serialized representation (unlike plain WriteBatch), so the function is unimplementable as declared and no consumer can have been linking against it successfully. Remove the dead declaration so the API is honest and the completeness gate passes.

After the fix: all 1737 declared C API functions have exactly one definition; verify_generated_up_to_date passes; db/c.cc syntax-checks clean under C++20.
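The completeness gate can be approximated with two regular expressions. The sketch below is not check_api_completeness.py itself: the patterns, the `check_completeness` name, and the assumption that definitions start at column 0 are illustrative simplifications.

```python
import re
from collections import Counter

# Declarations: `extern ROCKSDB_LIBRARY_API <ret> rocksdb_name(`.
DECL_RE = re.compile(
    r"extern\s+ROCKSDB_LIBRARY_API\s+[^;]*?\b(rocksdb_\w+)\s*\(", re.S
)
# Definitions: a return type at column 0, then the name, params, and `{`.
# Assumption: call sites are indented, so they do not match.
DEF_RE = re.compile(
    r"^(?:[\w:]+[\s\*]+)+(rocksdb_\w+)\s*\([^;{}]*\)\s*\{", re.M
)


def check_completeness(header_text, source_text):
    """Return (missing, duplicated) names for functions declared in the header."""
    declared = set(DECL_RE.findall(header_text))
    defined = Counter(DEF_RE.findall(source_text))
    missing = sorted(n for n in declared if defined[n] == 0)
    duplicated = sorted(n for n in declared if defined[n] > 1)
    return missing, duplicated


if __name__ == "__main__":
    header = (
        "extern ROCKSDB_LIBRARY_API void rocksdb_close(rocksdb_t* db);\n"
        "extern ROCKSDB_LIBRARY_API rocksdb_writebatch_wi_t*\n"
        "rocksdb_writebatch_wi_create_from(const char* rep, size_t size);\n"
    )
    source = "void rocksdb_close(rocksdb_t* db) {\n  delete db;\n}\n"
    print(check_completeness(header, source))
```

Because the check only reads text, it needs no compiler, which is what lets it run unconditionally ahead of the clang-gated staleness check.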
RocksDB's C API has a strong backward-compatibility guarantee, but the auto-generator derived each wrapper's C type purely from the C++ field type. For functions that shipped before codegen this silently changed 5 public signatures -- an ABI break for every downstream binding (Rust, Go, Python, ...) that was compiled against the old headers:
  rocksdb_writeoptions_disable_WAL               int    -> unsigned char
  rocksdb_restore_options_set_keep_log_files     int    -> unsigned char
  rocksdb_block_based_options_set_checksum       char   -> int
  rocksdb_block_based_options_set_format_version int    -> uint32_t
  rocksdb_block_based_options_set_block_size     size_t -> uint64_t
rocksdb_writeoptions_disable_WAL in particular is one of the most widely used C API functions.

Add an ABI type-pinning layer:
- tools/c_api_gen/abi_type_overrides.json records, per function, the C type that must be preserved for compatibility, with a reason.
- auto_simple_bindings.py applies these overrides to the generated specs: the public C parameter/return type is pinned to the historical type while the body still casts to the real C++ field type via static_cast<decltype(field)>(value), so behavior is identical and only the signature is held stable.

Policy: only already-shipped functions get overrides; brand-new functions use the natural C++ -> C mapping.

A repo-wide signature diff against the merge-base now reports 0 ABI drifts (previously 5). The hardened equivalence validator (next commit) enforces this so future drift fails CI. verify_generated_up_to_date and the link-completeness check pass; db/c.cc syntax-checks clean under C++20.
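A minimal sketch of how a type override could be applied when a setter is emitted. The `emit_setter` helper, the `c_type` and `reason` keys, and the `opt->rep.<field>` wrapper layout are assumptions for illustration; the actual schema of abi_type_overrides.json is not shown in this PR.

```python
def emit_setter(fn_name, wrapper_type, field, natural_c_type, overrides):
    """Emit a C++ setter whose public parameter type may be pinned by an override.

    The body always casts to the real field type, so only the signature is
    affected by the override.
    """
    c_type = overrides.get(fn_name, {}).get("c_type", natural_c_type)
    return (
        f"void {fn_name}({wrapper_type}* opt, {c_type} v) {{\n"
        f"  opt->rep.{field} = static_cast<decltype(opt->rep.{field})>(v);\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # Hypothetical override entry for one of the five functions listed above.
    overrides = {
        "rocksdb_block_based_options_set_format_version": {
            "c_type": "int",
            "reason": "shipped as int before codegen",
        }
    }
    print(
        emit_setter(
            "rocksdb_block_based_options_set_format_version",
            "rocksdb_block_based_table_options_t",
            "format_version",
            "uint32_t",
            overrides,
        )
    )
```

Keeping the override in data rather than in the generator logic means a new function with no entry automatically gets the natural mapping, which matches the stated policy.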
…ze validator

Adds the authoritative, ongoing backward-compatibility enforcement that the ABI-preservation work needs, and fixes the existing validator's misleading defaults (per review feedback that it provided near-zero real assurance).

New: tools/c_api_gen/check_api_compatibility.py
- Compares the current public c.h against a reference revision at the SIGNATURE level (return type + parameter types; parameter names are ignored since they do not affect the C ABI). Robust: no false positives from formatting or function-body token differences.
- Fails on any removed function or changed signature; brand-new functions are reported, never an error.
- Intentional, reviewed incompatibilities are recorded in api_compatibility_allowlist.json with a reason. Seeded with the one intentional removal (rocksdb_writebatch_wi_create_from, which was never implemented).
- Dependency-free (git + python). Wired into `make check-c-api-gen` (graceful skip if the baseline ref is not resolvable locally) and into CI against origin/main.

Confirmed: 1070 reference functions preserved, 668 new, 0 removed/changed.

Fix: validate_generated_equivalence.py
- `--ref` was defaulting to HEAD, i.e. comparing the generated output against itself (a tautology now that the wrappers are generated). Make `--ref` required and document that it must be a pre-migration revision; note that it is a best-effort body-equivalence aid and check_api_compatibility.py is the authoritative back-compat gate.

README: document the source templates, the verification tools, and abi_type_overrides.json.
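The signature-level comparison can be illustrated with a small normalizer that drops parameter names before diffing. This is a heuristic sketch, not check_api_compatibility.py: `param_type`, `signature`, and `compare` are hypothetical names, and the name-stripping rule does not handle function-pointer parameters or parameters whose names end in `_t`.

```python
import re

TYPE_WORDS = {"void", "char", "short", "int", "long", "float", "double",
              "signed", "unsigned", "const"}


def _norm(text):
    """Collapse whitespace and put spaces around `*` so spacing never matters."""
    return " ".join(text.replace("*", " * ").split())


def param_type(param):
    """Return the parameter's type with a trailing parameter name removed."""
    tokens = _norm(param).split()
    if not tokens:
        return ""
    last = tokens[-1]
    named = (len(tokens) > 1 and last != "*" and last not in TYPE_WORDS
             and not last.endswith("_t"))
    return " ".join(tokens[:-1] if named else tokens)


def signature(decl):
    """Map one declaration to (name, (return type, parameter types))."""
    m = re.match(r"\s*(.*?)\b(rocksdb_\w+)\s*\((.*)\)\s*;\s*$", decl, re.S)
    ret, name, params = m.groups()
    ret = _norm(ret.replace("extern", "").replace("ROCKSDB_LIBRARY_API", ""))
    return name, (ret, tuple(param_type(p) for p in params.split(",")))


def compare(ref_decls, cur_decls):
    """Return (removed, changed, added) function names."""
    ref = dict(signature(d) for d in ref_decls)
    cur = dict(signature(d) for d in cur_decls)
    removed = sorted(set(ref) - set(cur))
    changed = sorted(n for n in set(ref) & set(cur) if ref[n] != cur[n])
    added = sorted(set(cur) - set(ref))
    return removed, changed, added
```

With this normalization, renaming `disable` to `v` is not reported, while changing `int` to `unsigned char` is, which is the distinction the commit message draws between names and the C ABI.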
…se.cc

All of these live in the hand-written template (the generators emit clean code), so they are fixed in tools/c_api_gen/c_base.cc and propagated by regeneration; db/c.cc is never edited directly.

Correctness:
- CopyStringVector: null-check the malloc result before strdup'ing into it (previously segfaulted on allocation failure).
- WalFilter::LogRecordFound: std::move the WriteBatch out of the local c_new_batch into *new_batch instead of copying a soon-to-be-destroyed batch.
- set_exclude_files_callback: capture the callback and state by value rather than the options wrapper pointer, so the stored std::function does not depend on the wrapper outliving it (matches the safer progress_callback pattern).

clang-tidy (clears all 20 warnings the bot reported on db/c.cc changed lines):
- SliceTransformWrapper: initialize rep_ and declare the rule-of-5 (it owns rep_ and is neither copyable nor movable) [cppcoreguidelines-pro-type-member-init, -special-member-functions].
- Use std::make_shared for the VectorRepFactory and the two WriteBufferManager allocations [modernize-make-shared].
- Add braces to the table-properties positional accessors, the WAL filter destructor, and the callback logger [readability-braces-around-statements].
- Split "Slice a, b;" declarations [readability-isolate-declaration].

The remaining clang-tidy findings are on pre-existing wrapper structs (rocksdb_comparator_t, rocksdb_mergeoperator_t, ...) that this PR does not modify and that the bot did not flag; left untouched to avoid churning unrelated code. db/c.cc syntax-checks clean under C++20; completeness, compatibility, and staleness checks pass.
…at check
Merge-completeness fix. While reconciling with main, regeneration overwrote
three enums that upstream had extended by hand after the branch diverged,
dropping five public enum constants from c.h (a backward-incompatible removal
that broke C consumers and the merged c_test.c which references them):
- rocksdb tickers enum: add rocksdb_blob_cache_read_byte and bump
rocksdb_total_metric_count 85 -> 86
- index block search type enum: add
rocksdb_block_based_table_index_block_search_type_auto = 2
- transaction DB write policy enum (was entirely absent):
rocksdb_txndb_write_policy_write_{committed,prepared,unprepared}
Restored in tools/c_api_gen/c_base.h.
My function-level reconciliation and the link-completeness check only covered
functions, so these slipped through. Extend check_api_compatibility.py to also
diff non-function public symbols (enum constants and typedefs) against the
reference: it now confirms 229 enum/typedef symbols are preserved and would
have flagged this regression. (Enum constants do not affect linking, so the
completeness check cannot see them; this is the right gate.)
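A sketch of how enum constants could be extracted from the header and diffed against a reference, assuming plain `enum { ... }` bodies without preprocessor conditionals. The `enum_constants` and `removed_constants` helpers are hypothetical and are not taken from check_api_compatibility.py.

```python
import re


def enum_constants(header_text):
    """Collect the enumerator names from every enum body in a C header."""
    # Strip block and line comments so they cannot be mistaken for names.
    text = re.sub(r"/\*.*?\*/|//[^\n]*", "", header_text, flags=re.S)
    names = set()
    for body in re.findall(r"\benum\b[^{;]*\{([^}]*)\}", text):
        for item in body.split(","):
            name = item.split("=")[0].strip()
            if name:
                names.add(name)
    return names


def removed_constants(ref_header, cur_header):
    """Enumerators present in the reference header but absent from the current one."""
    return sorted(enum_constants(ref_header) - enum_constants(cur_header))


if __name__ == "__main__":
    ref = "enum { rocksdb_txndb_write_policy_write_committed = 0,\n" \
          "       rocksdb_txndb_write_policy_write_prepared };\n"
    cur = "enum { rocksdb_txndb_write_policy_write_committed = 0 };\n"
    print(removed_constants(ref, cur))
```

A text-level diff like this complements the link-completeness check, since a dropped enumerator never produces an unresolved symbol.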
~44% of the newly generated C API functions had no test coverage, the bulk being auto-discovered option getters/setters. Rather than hand-writing hundreds of assertions (which would drift as the API grows), derive the tests from the same checked-in generated fragments.

Add tools/c_api_gen/gen_roundtrip_tests.py: it parses the generated declaration fragments, pairs every rocksdb_<obj>_set_<field> with its rocksdb_<obj>_get_<field>, and for each option object with a parameterless create/destroy emits a create -> set sentinel -> assert getter returns it -> destroy test. The sentinel is chosen from the getter's return type (unsigned char -> 1, const char* -> "test", else 42), which round-trips reliably even through ABI-pinned setter parameter types because the generated setters are direct field assignments. No clang is required (it only reads the committed .inc files).

Output c_api_gen/c_generated_roundtrip_tests.c.inc (232 assertions across 25 option objects) is included by db/c_test.c and invoked from main() in a new "generated_option_roundtrips" phase. Wired into regen_all.py so coverage tracks the generated surface automatically. c_test.c compiles clean.
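The setter/getter pairing step can be sketched with one regular expression over function names. This is an illustration under the naming convention stated above; `pair_accessors` is a hypothetical helper, and the real script parses full declarations rather than bare names.

```python
import re

ACCESSOR_RE = re.compile(r"rocksdb_(\w+?)_(set|get)_(\w+)")


def pair_accessors(function_names):
    """Return sorted (obj, field) pairs that have both a setter and a getter."""
    seen = {"set": set(), "get": set()}
    for name in function_names:
        m = ACCESSOR_RE.fullmatch(name)
        if m:
            obj, kind, field = m.groups()
            seen[kind].add((obj, field))
    # Only fields with both halves can be round-tripped.
    return sorted(seen["set"] & seen["get"])


if __name__ == "__main__":
    names = [
        "rocksdb_options_set_use_fsync",
        "rocksdb_options_get_use_fsync",
        "rocksdb_block_based_options_set_format_version",
    ]
    print(pair_accessors(names))
```

Setters without a matching getter, such as the third name in the usage example, are simply left out of the generated tests.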
…nerated files
- Add the required unreleased_history/public_api_changes entry for the large set of new C API functions.
- add_public_api.md Step 5 and the file checklist still told contributors to hand-edit include/rocksdb/c.h and db/c.cc, which are now @generated and would be clobbered on regeneration. Point manual wrappers at the source templates (tools/c_api_gen/c_base.h / c_base.cc) + regen, and mark c.h/c.cc @generated.
… enum type
P2 cleanup:
- Remove tools/c_api_gen/generated/c_preview.{h,cc}.inc. These were a checked-in
preview output of generate_c_api.py, not referenced by regen_all.py or any
build/CI step, so they silently drifted. Update the README to describe
regen_all.py as the entry point and to preview spec output to a scratch path
instead.
- Document why FamilyConfig.enum_c_type is per-family (option families ship
enums as int, job-info families as uint32_t, e.g. the pre-existing
rocksdb_flushjobinfo_flush_reason): unifying them would be an ABI break, so
the apparent inconsistency is intentional and must stay.
Running c_test surfaced a real bug in the generated round-trip tests: rocksdb_options_get_use_fsync(obj) == 42 failed. use_fsync is a bool field whose spec-driven C wrapper uses the legacy `int` getter shape (not `unsigned char`), so the getter-return-type heuristic picked the scalar sentinel 42 -- which a bool field collapses to 1. A bool field is not distinguishable from a real scalar by the C signature alone, so test with 1 and 0 instead: both round-trip through bool, scalar, and enum fields, and exercising both directions also verifies the setter actually changes the value regardless of its default. 462 assertions across 25 option objects.
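The sentinel problem and the fix can be sketched as below. `through_bool` models an integer stored into a C++ `bool` field and read back; `emit_roundtrip` is a hypothetical emitter, and the `CheckCondition` macro and the `rocksdb_<obj>_create`/`_destroy` names are assumed from the description rather than copied from gen_roundtrip_tests.py.

```python
def through_bool(value):
    """Model writing an integer into a C++ bool field and reading it back."""
    return int(bool(value))


def emit_roundtrip(obj, field):
    """Emit a C test block that sets and reads back both 1 and 0."""
    setter = f"rocksdb_{obj}_set_{field}"
    getter = f"rocksdb_{obj}_get_{field}"
    lines = ["  {", f"    rocksdb_{obj}_t* o = rocksdb_{obj}_create();"]
    # 1 and 0 survive bool, scalar, and enum fields; 42 does not survive bool.
    for sentinel in (1, 0):
        lines.append(f"    {setter}(o, {sentinel});")
        lines.append(f"    CheckCondition({getter}(o) == {sentinel});")
    lines.append(f"    rocksdb_{obj}_destroy(o);")
    lines.append("  }")
    return "\n".join(lines)


if __name__ == "__main__":
    print(through_bool(42))  # the old sentinel does not round-trip
    print(emit_roundtrip("options", "use_fsync"))
```

Testing both values also covers the case where the field's default already equals one of them, since at least one of the two writes must change the stored value.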
✅ clang-tidy: No findings on changed lines. Completed in 73.5s.
…heck ref)

Addresses the red checks on the PR:
- check-format-and-targets (clang-format): my hand-written additions to the c_base templates weren't clang-format-clean and propagated into the generated c.h/c.cc. Reflowed the two backup-engine rate-limiter declarations (return type on its own line) and the exclude-files callback body to match clang-format, and shortened the generated banner's "Edit ..." line so it stays <= 80 columns for the longer `c_base.cc` edit target (clang-format was reflowing it). Verified with git-clang-format against the merge base: "clang-format did not modify any files".
- check-format-and-targets (check-sources): replaced a non-ASCII em dash in a regen_all.py comment with "--" (RocksDB requires ASCII-only source files).
- build-linux-clang-21-no_test_run: the backward-compat CI step referenced origin/main, which `git fetch origin main` does not create (it writes FETCH_HEAD). Use --ref FETCH_HEAD.

Verified locally: check-sources, check-buck-targets (BUCK unchanged), git-clang-format, link-completeness, backward-compat (FETCH_HEAD), and the generated-up-to-date check all pass.

(The 3 build-windows-vs2022 failures are unrelated CI infra: CMake reports "could not find any instance of Visual Studio" during configuration, before any compilation; all Linux/macOS CMake builds pass.)
The backward-compat CI step runs `git fetch` and the checker runs `git show`, but the containerized checkout (/__w/rocksdb/rocksdb) is owned by a different user, so git aborts with "detected dubious ownership" (exit 128). Mark the workspace as a safe directory before the git operations.
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D109149150. |
|
Peter had another comment before that I have not addressed. Please address accordingly. Thanks.
|
|
Both of these are already taken care of in this PR. I think Peter was looking at the original #14572, before the cleanup. The non-includable file he means is So the only thing this PR touches under
The // Copyright (c) Meta Platforms, Inc. and affiliates.
// This source code is licensed under both the GPLv2 (found in the
// COPYING file in the root directory) and Apache 2.0 License
// (found in the LICENSE.Apache file in the root directory).
// @generated
// -----------------------------------------------------------------------------
// Auto-generated by tools/c_api_gen/regen_all.py.
// DO NOT EDIT THIS FILE DIRECTLY.
// Edit tools/c_api_gen/c_base.h or the inputs under tools/c_api_gen/,
// then rerun: python3 tools/c_api_gen/regen_all.py
// -----------------------------------------------------------------------------
|
|
From Peter:
|
…ew feedback

Follow-up to maintainer feedback on the PR.

check-c-api-gen could silently lose coverage (a reviewer concern): the make target is skippable (no clang++ -> skip; unresolvable compat ref -> skip; the whole thing is behind SKIP_FORMAT_BUCK_CHECKS), and CI doesn't even run the make target -- it called the three checkers directly. So a CHECK_C_API_GEN_STRICT flag that only lived in the make target would be inert in CI.
- Add CHECK_C_API_GEN_STRICT: when set, every "skip" in `make check-c-api-gen` becomes a hard error. Treat 0/no/false (not just empty) as "off" so CHECK_C_API_GEN_STRICT=0 doesn't accidentally enable it.
- Actually use it: the dedicated build-linux-clang-21-no_test_run CI job now runs `make check-c-api-gen CHECK_C_API_GEN_STRICT=1 API_COMPAT_REF=FETCH_HEAD CLANG_FORMAT_BINARY=clang-format-21` (single source of truth instead of three ad-hoc python invocations), so if a prerequisite ever goes missing the job hard-fails instead of quietly passing.
- Fix the clang-availability gate: it tested `$(CXX)` verbatim, which mis-detects a ccache-prefixed/versioned CXX (e.g. "ccache clang++-21") -- both as a false "no clang" skip and, for g++, as a false "have clang" that then crashes the generator. Detect a clang the way the generator does. Correspondingly, make the generator's _find_clang_binary pick the clang *token* out of a multi-word $CXX (use the bare clang for -ast-dump, not the launcher).
- Fix the stale "does NOT modify the working tree" comment (the verifier regenerates in place and compares against a pre-run snapshot).

Buck c_test: db/c_test.c #includes the generated c_api_gen/*.inc, so under Buck's sandbox the c_test_bin target needs them as headers. That target lives in the internal //rocks/buckifier:defs.bzl (the OSS buckifier only emits a parameterless add_c_test_wrapper()), so the headers glob has to be added there. Document the coupling in buckifier/targets_builder.py so it isn't lost. Make/CMake already resolve the include via -I. / PROJECT_SOURCE_DIR.
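Two of the fixes above are small parsing rules that are easy to get wrong, so here is a hedged sketch of each. `strict_enabled` and `clang_token` are hypothetical helpers written for this example; the Makefile logic and the body of `_find_clang_binary` are not shown in this PR and may differ.

```python
import os
import re
import shlex

CLANG_RE = re.compile(r"clang(\+\+)?(-\d+)?")


def strict_enabled(value):
    """Treat empty, 0, no, and false as off; anything else turns strict mode on."""
    return (value or "").strip().lower() not in ("", "0", "no", "false")


def clang_token(cxx):
    """Pick the clang executable out of a possibly multi-word CXX value.

    Returns None when no word looks like clang, e.g. for g++.
    """
    for token in shlex.split(cxx or ""):
        if CLANG_RE.fullmatch(os.path.basename(token)):
            return token
    return None


if __name__ == "__main__":
    print(strict_enabled("0"), strict_enabled("1"))
    print(clang_token("ccache clang++-21"), clang_token("g++"))
```

Matching on the basename of each word, rather than on the whole string, is what lets a launcher prefix such as ccache be skipped while still rejecting a non-clang compiler.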
|
Went through all three: Buck c_test: the Windows CI: I merged check-c-api-gen skipping: good catch, and it was worse than just the skips - the dedicated
While there I also fixed the clang-detection gate (it tested |
🟡 Codex Code Review
Auto-triggered after CI passed; reviewing commit 25d3b04
❌ Codex review failed before producing findings.
|
✅ Claude Code Review
Auto-triggered after CI passed; reviewing commit 25d3b04
Summary
This is a well-structured, large-scale PR that finishes the C API code generation infrastructure. The design is sound: hand-written templates + two generators (spec-driven and auto-discovery) producing .inc fragments inlined into the final c.h/c.cc, with comprehensive CI gates for completeness, backward compatibility, and staleness. The ABI type-pinning layer is a thoughtful solution to the historical signature drift problem. The 462 auto-generated round-trip tests provide good coverage of the generated getters/setters.
High-severity findings (2):
Findings
🔴 HIGH
H1.
|
| Context | Code executes? | Assumptions hold? | Action needed? |
|---|---|---|---|
| WritePreparedTxnDB | N/A | N/A | C API wrappers are thin, no transaction-layer assumptions |
| ReadOnly DB | N/A | N/A | Wrappers delegate to C++ methods which handle this |
| User-defined timestamps | Yes (via `_with_ts` functions) | Yes | Properly wrapped with ts/tslen parameters |
| FIFO / Universal compaction | N/A | N/A | No compaction-specific logic in wrappers |
| Concurrent writers | N/A | N/A | Thread safety delegated to C++ layer |
| Windows / MSVC | Yes | Yes | c.h is pure C with _WIN32 DLL export guards |
Positive Observations
- Comprehensive static_asserts: The PR adds layout assertions for 15+ wrapper structs, validating that `reinterpret_cast` between C++ types and their C wrapper structs is safe. This is a strong defensive pattern.
- ABI type-pinning is elegant: Rather than silently changing signatures or maintaining manual overrides scattered throughout the code, the `abi_type_overrides.json` + `static_cast<decltype(field)>` pattern centralizes ABI compatibility in one place.
- Three-layer CI gate (completeness + compatibility + staleness) provides defense-in-depth against common code-generation migration bugs.
- Exclude-files callback fix: Capturing callback + state by value in the lambda (line 1769) rather than capturing the wrapper pointer is the correct pattern, avoiding a use-after-free if the options object is destroyed before the lambda runs.
- WalFilter WriteBatch move: Using `std::move(c_new_batch.rep)` (line 368) instead of copy is both correct and efficient since `c_new_batch` is a local about to be destroyed.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
- `/claude-review [context]`: Request a code review
- `/claude-query <question>`: Ask about the PR or codebase
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D109149150. |
|
There are some internal CI failures that I am fixing. This should be merged in a day or two.
|
@xingbowang merged this pull request in 1cec28d. |
Summary
This continues and finishes #14572 ("Add semi-automated code generation for RocksDB C API bindings") by @xingbowang. The original author is unavailable to finish it, so I've taken it over. All 13 of the original commits are preserved (this branch was created from the PR head and builds on top of it; `git log` shows the original `Xingbo Wang` authorship intact); my follow-up work is in the commits prefixed `C API codegen:`.

The underlying design is unchanged and is the original author's: hand-written source templates (`tools/c_api_gen/c_base.h` / `c_base.cc`) plus two generators (auto-discovery from the C++ headers + a spec-driven generator) are inlined into a single, self-contained, `@generated` `include/rocksdb/c.h` and `db/c.cc`. This grows the public C API by 668 functions while keeping `c.h` a single includable header with no `-I` requirement (so `bindgen` and other FFI tools keep working unchanged).

This branch reconciles the PR with ~4 months of `main` and addresses the outstanding review feedback (clang-tidy bot, the automated code review, the `c.h` self-containedness discussion, and @pdillinger's points about `include/rocksdb` hygiene and `@generated` marking).

What changed on top of the original PR

Reconciled with current `main`
- … `main` (conflicts were confined to the generated/test files) and regenerated. Reconciled the 14 C API functions `main` added since the merge-base (e.g. `rocksdb_set_db_options`, the backup-engine rate limiters, `memtable_batch_lookup_optimization`, `optimize_multiget_for_io`, …) and restored 5 enum constants that upstream had added by hand (`rocksdb_txndb_write_policy_*`, `..._index_block_search_type_auto`, `rocksdb_blob_cache_read_byte`).

Maintainer feedback (@pdillinger)
- … `include/rocksdb`. Moved the hand-written templates out of `include/rocksdb/` and `db/` to `tools/c_api_gen/c_base.{h,cc}`. They were `#include`-ing generated fragments, which broke `make check-headers` and was shipped by `make install`. `include/rocksdb/` now contains only the user-facing, self-contained, `@generated` `c.h`.
- … `c.h`/`c.cc` carry the `// @generated` marker.

Backward compatibility (zero ABI break)
- … `rocksdb_writeoptions_disable_WAL` `int` → `unsigned char`). Added an ABI type-pinning layer (`tools/c_api_gen/abi_type_overrides.json`) so already-shipped functions keep their historical C signature (the body still casts to the real field type). A repo-wide diff against the merge-base now reports 0 ABI drift.
- … `check_api_compatibility.py` gate (wired into CI + `make`) fails on any removed/changed public function or removed enum/typedef symbol, vs a reference revision. Intentional changes go in an allowlist with a reason.

Correctness (from the automated review)
- … `c.h` but defined nowhere (link failure for downstream bindings such as `rust-rocksdb`). Added `check_api_completeness.py` (dependency-free; runs in CI + `make`) asserting every declared function has exactly one definition; this is the gate that would have caught it.
- `CopyStringVector` now null-checks `malloc`; the WAL filter `std::move`s the `WriteBatch`; the backup exclude-files callback captures by value instead of the wrapper pointer.

Build / CI robustness
- … `C_API_CODEGEN_STAMP` Makefile prerequisite (it was a silent no-op).
- `make check` staleness check is now opt-out-able (behind `SKIP_FORMAT_BUCK_CHECKS`) and skips gracefully when `clang++` is unavailable, so `make check` works without the codegen toolchain. CI remains the authoritative gate.
- … `clang-format` consistently through `regen_all.py` / `verify_generated_up_to_date.py` (CI uses clang-format-21) so regeneration is byte-reproducible across environments.
- … `clang-tidy` warnings the bot reported on `db/c.cc` changed lines (fixed in the `c_base.cc` template, not the generated output).

Test coverage
- … `gen_roundtrip_tests.py`, which derives 462 set→get→assert round-trip checks across 25 option objects from the same generated fragments and wires them into `db/c_test.c`. Coverage now tracks the generated surface automatically.

Docs
- … `unreleased_history/public_api_changes` note and fixed `claude_md/add_public_api.md`, which still told contributors to hand-edit the now-`@generated` `c.h`/`c.cc`.

Test Plan
- `make c_test && ./c_test`: passes, including the 462 generated round-trip assertions (a successful link also confirms the API is complete).
- `python3 tools/c_api_gen/check_api_completeness.py`: all 1737 declared functions defined exactly once.
- `python3 tools/c_api_gen/check_api_compatibility.py --ref <release>`: 1070 reference functions + 229 enum/typedef symbols preserved, 0 removed/changed.
- `python3 tools/c_api_gen/verify_generated_up_to_date.py`: generated output is stable.
- `include/rocksdb/c.h` confirmed self-contained (only `<stdbool.h>`, `<stddef.h>`, `<stdint.h>`).

cc @xingbowang