You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Finish C API code generation (continues #14572) (#14868)
Summary:
This continues and finishes **#14572 ("Add semi-automated code generation for RocksDB C API bindings") by xingbowang. The original author is unavailable to finish it, so I've taken it over. **All 13 of the original commits are preserved** (this branch was created from the PR head and builds on top of it — `git log` shows the original `Xingbo Wang` authorship intact); my follow-up work is in the commits prefixed `C API codegen:`.
The underlying design is unchanged and is the original author's: hand-written source templates (`tools/c_api_gen/c_base.h` / `c_base.cc`) plus two generators (auto-discovery from the C++ headers + a spec-driven generator) are inlined into a single, self-contained, `generated` `include/rocksdb/c.h` and `db/c.cc`. This grows the public C API by **668 functions** while keeping `c.h` a single includable header with no `-I` requirement (so `bindgen` and other FFI tools keep working unchanged).
This branch reconciles the PR with ~4 months of `main` and addresses the outstanding review feedback (clang-tidy bot, the automated code review, the `c.h` self-containedness discussion, and pdillinger's points about `include/rocksdb` hygiene and `generated` marking).
## What changed on top of the original PR
### Reconciled with current `main`
- Merged current `main` (conflicts were confined to the generated/test files) and regenerated. Reconciled the 14 C API functions `main` added since the merge-base (e.g. `rocksdb_set_db_options`, the backup-engine rate limiters, `memtable_batch_lookup_optimization`, `optimize_multiget_for_io`, …) and restored 5 enum constants that upstream had added by hand (`rocksdb_txndb_write_policy_*`, `..._index_block_search_type_auto`, `rocksdb_blob_cache_read_byte`).
### Maintainer feedback (pdillinger)
- **No non-user-includable files in `include/rocksdb`.** Moved the hand-written templates out of `include/rocksdb/` and `db/` to `tools/c_api_gen/c_base.{h,cc}`. They were `#include`-ing generated fragments, which broke `make check-headers` and was shipped by `make install`. `include/rocksdb/` now contains only the user-facing, self-contained, `generated` `c.h`.
- `c.h` / `c.cc` carry the `// generated` marker.
### Backward compatibility (zero ABI break)
- The generator derived each wrapper's C type purely from the C++ field, which had silently changed **5 already-shipped signatures** (e.g. `rocksdb_writeoptions_disable_WAL` `int` → `unsigned char`). Added an ABI type-pinning layer (`tools/c_api_gen/abi_type_overrides.json`) so already-shipped functions keep their historical C signature (the body still casts to the real field type). A repo-wide diff against the merge-base now reports **0 ABI drift**.
- New `check_api_compatibility.py` gate (wired into CI + `make`) fails on any removed/changed public function **or** removed enum/typedef symbol, vs a reference revision. Intentional changes go in an allowlist with a reason.
### Correctness (from the automated review)
- Restored 5 option setters that were declared in `c.h` but **defined nowhere** (link failure for downstream bindings such as `rust-rocksdb`). Added `check_api_completeness.py` (dependency-free; runs in CI + `make`) asserting every declared function has exactly one definition — this is the gate that would have caught it.
- `CopyStringVector` now null-checks `malloc`; the WAL filter `std::move`s the `WriteBatch`; the backup exclude-files callback captures by value instead of the wrapper pointer.
### Build / CI robustness
- Removed the dead `C_API_CODEGEN_STAMP` Makefile prerequisite (it was a silent no-op).
- The `make check` staleness check is now opt-out-able (behind `SKIP_FORMAT_BUCK_CHECKS`) and skips gracefully when `clang++` is unavailable, so `make check` works without the codegen toolchain. CI remains the authoritative gate.
- Pinned `clang-format` consistently through `regen_all.py` / `verify_generated_up_to_date.py` (CI uses clang-format-21) so regeneration is byte-reproducible across environments.
- Cleared all 20 `clang-tidy` warnings the bot reported on `db/c.cc` changed lines (fixed in the `c_base.cc` template, not the generated output).
- Updated the internal Buck `c_test_bin` wrapper to expose generated `c_api_gen/*.inc` fragments as headers, so sandboxed Buck builds can compile `db/c_test.c` after the generated round-trip tests are included.
### Test coverage
- Added `gen_roundtrip_tests.py`, which derives **462 set→get→assert round-trip checks across 25 option objects** from the same generated fragments and wires them into `db/c_test.c`. Coverage now tracks the generated surface automatically.
### Docs
- Added the `unreleased_history/public_api_changes` note and fixed `claude_md/add_public_api.md`, which still told contributors to hand-edit the now-`generated` `c.h`/`c.cc`.
Pull Request resolved: #14868
Test Plan:
- `make c_test && ./c_test` — **passes**, including the 462 generated round-trip assertions (a successful link also confirms the API is complete).
- `python3 tools/c_api_gen/check_api_completeness.py` — all 1737 declared functions defined exactly once.
- `python3 tools/c_api_gen/check_api_compatibility.py --ref <release>` — 1070 reference functions + 229 enum/typedef symbols preserved, 0 removed/changed.
- `python3 tools/c_api_gen/verify_generated_up_to_date.py` — generated output is stable.
- `include/rocksdb/c.h` confirmed self-contained (only `<stdbool.h>`, `<stddef.h>`, `<stdint.h>`).
cc xingbowang
- `buck2 build --flagfile fbcode//mode/dev fbcode//internal_repo_rocksdb/repo:c_test_bin` — passes.
- `buck2 build --flagfile fbcode//mode/dev --config fbcode.arch=aarch64 fbcode//internal_repo_rocksdb/repo:c_test_bin` — passes.
Reviewed By: pdillinger
Differential Revision: D109149150
Pulled By: xingbowang
fbshipit-source-id: 3417375345f360a4c78bdfe27e9850b89d0a226a
0 commit comments