Skip to content

Commit 1cec28d

Browse files
zaidoon1meta-codesync[bot]
authored andcommitted
Finish C API code generation (continues #14572) (#14868)
Summary: This continues and finishes **#14572 ("Add semi-automated code generation for RocksDB C API bindings") by xingbowang. The original author is unavailable to finish it, so I've taken it over. **All 13 of the original commits are preserved** (this branch was created from the PR head and builds on top of it — `git log` shows the original `Xingbo Wang` authorship intact); my follow-up work is in the commits prefixed `C API codegen:`. The underlying design is unchanged and is the original author's: hand-written source templates (`tools/c_api_gen/c_base.h` / `c_base.cc`) plus two generators (auto-discovery from the C++ headers + a spec-driven generator) are inlined into a single, self-contained, `generated` `include/rocksdb/c.h` and `db/c.cc`. This grows the public C API by **668 functions** while keeping `c.h` a single includable header with no `-I` requirement (so `bindgen` and other FFI tools keep working unchanged). This branch reconciles the PR with ~4 months of `main` and addresses the outstanding review feedback (clang-tidy bot, the automated code review, the `c.h` self-containedness discussion, and pdillinger's points about `include/rocksdb` hygiene and `generated` marking). ## What changed on top of the original PR ### Reconciled with current `main` - Merged current `main` (conflicts were confined to the generated/test files) and regenerated. Reconciled the 14 C API functions `main` added since the merge-base (e.g. `rocksdb_set_db_options`, the backup-engine rate limiters, `memtable_batch_lookup_optimization`, `optimize_multiget_for_io`, …) and restored 5 enum constants that upstream had added by hand (`rocksdb_txndb_write_policy_*`, `..._index_block_search_type_auto`, `rocksdb_blob_cache_read_byte`). ### Maintainer feedback (pdillinger) - **No non-user-includable files in `include/rocksdb`.** Moved the hand-written templates out of `include/rocksdb/` and `db/` to `tools/c_api_gen/c_base.{h,cc}`. They were `#include`-ing generated fragments, which broke `make check-headers` and was shipped by `make install`. `include/rocksdb/` now contains only the user-facing, self-contained, `generated` `c.h`. - `c.h` / `c.cc` carry the `// generated` marker. ### Backward compatibility (zero ABI break) - The generator derived each wrapper's C type purely from the C++ field, which had silently changed **5 already-shipped signatures** (e.g. `rocksdb_writeoptions_disable_WAL` `int` → `unsigned char`). Added an ABI type-pinning layer (`tools/c_api_gen/abi_type_overrides.json`) so already-shipped functions keep their historical C signature (the body still casts to the real field type). A repo-wide diff against the merge-base now reports **0 ABI drift**. - New `check_api_compatibility.py` gate (wired into CI + `make`) fails on any removed/changed public function **or** removed enum/typedef symbol, vs a reference revision. Intentional changes go in an allowlist with a reason. ### Correctness (from the automated review) - Restored 5 option setters that were declared in `c.h` but **defined nowhere** (link failure for downstream bindings such as `rust-rocksdb`). Added `check_api_completeness.py` (dependency-free; runs in CI + `make`) asserting every declared function has exactly one definition — this is the gate that would have caught it. - `CopyStringVector` now null-checks `malloc`; the WAL filter `std::move`s the `WriteBatch`; the backup exclude-files callback captures by value instead of the wrapper pointer. ### Build / CI robustness - Removed the dead `C_API_CODEGEN_STAMP` Makefile prerequisite (it was a silent no-op). - The `make check` staleness check is now opt-out-able (behind `SKIP_FORMAT_BUCK_CHECKS`) and skips gracefully when `clang++` is unavailable, so `make check` works without the codegen toolchain. CI remains the authoritative gate. - Pinned `clang-format` consistently through `regen_all.py` / `verify_generated_up_to_date.py` (CI uses clang-format-21) so regeneration is byte-reproducible across environments. - Cleared all 20 `clang-tidy` warnings the bot reported on `db/c.cc` changed lines (fixed in the `c_base.cc` template, not the generated output). - Updated the internal Buck `c_test_bin` wrapper to expose generated `c_api_gen/*.inc` fragments as headers, so sandboxed Buck builds can compile `db/c_test.c` after the generated round-trip tests are included. ### Test coverage - Added `gen_roundtrip_tests.py`, which derives **462 set→get→assert round-trip checks across 25 option objects** from the same generated fragments and wires them into `db/c_test.c`. Coverage now tracks the generated surface automatically. ### Docs - Added the `unreleased_history/public_api_changes` note and fixed `claude_md/add_public_api.md`, which still told contributors to hand-edit the now-`generated` `c.h`/`c.cc`. Pull Request resolved: #14868 Test Plan: - `make c_test && ./c_test` — **passes**, including the 462 generated round-trip assertions (a successful link also confirms the API is complete). - `python3 tools/c_api_gen/check_api_completeness.py` — all 1737 declared functions defined exactly once. - `python3 tools/c_api_gen/check_api_compatibility.py --ref <release>` — 1070 reference functions + 229 enum/typedef symbols preserved, 0 removed/changed. - `python3 tools/c_api_gen/verify_generated_up_to_date.py` — generated output is stable. - `include/rocksdb/c.h` confirmed self-contained (only `<stdbool.h>`, `<stddef.h>`, `<stdint.h>`). cc xingbowang - `buck2 build --flagfile fbcode//mode/dev fbcode//internal_repo_rocksdb/repo:c_test_bin` — passes. - `buck2 build --flagfile fbcode//mode/dev --config fbcode.arch=aarch64 fbcode//internal_repo_rocksdb/repo:c_test_bin` — passes. Reviewed By: pdillinger Differential Revision: D109149150 Pulled By: xingbowang fbshipit-source-id: 3417375345f360a4c78bdfe27e9850b89d0a226a
1 parent 30295a4 commit 1cec28d

57 files changed

Lines changed: 40627 additions & 3169 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

‎.github/workflows/pr-jobs.yml‎

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -314,6 +314,23 @@ jobs:
314314
- run: CC=clang-21 CXX=clang++-21 USE_CLANG=1 make -j32 all microbench
315315
- run: make clean
316316
- run: CC=clang-21 CXX=clang++-21 USE_CLANG=1 DEBUG_LEVEL=0 make -j32 release
317+
- name: Install clang-format-21 for C API gen checks
318+
run: apt-get update && apt-get install -y clang-format-21
319+
- name: Check C API codegen (link-complete, backward-compatible, up to date)
320+
# Run the make target in STRICT mode so this core CI job HARD-FAILS rather
321+
# than silently skipping if a prerequisite (clang++, clang-format, or the
322+
# compat baseline ref) is ever missing. See `make check-c-api-gen`.
323+
run: |
324+
# The container checkout is owned by a different user, so git refuses to
325+
# operate on it ("dubious ownership") -- mark it safe for the `git fetch`
326+
# and the `git show` the compat checker runs.
327+
git config --global --add safe.directory "$GITHUB_WORKSPACE"
328+
git config --global --add safe.directory '*'
329+
git fetch --no-tags --depth=1 origin main
330+
CXX=clang++-21 make check-c-api-gen \
331+
CHECK_C_API_GEN_STRICT=1 \
332+
API_COMPAT_REF=FETCH_HEAD \
333+
CLANG_FORMAT_BINARY=clang-format-21
317334
- uses: "./.github/actions/teardown-ccache"
318335
if: always()
319336
- uses: "./.github/actions/post-steps"

‎.gitignore‎

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -88,6 +88,8 @@ fbcode/
8888
fbcode
8989
buckifier/*.pyc
9090
buckifier/__pycache__
91+
tools/__pycache__/
92+
tools/c_api_gen/__pycache__/
9193
.arcconfig
9294

9395
compile_commands.json

‎Makefile‎

Lines changed: 83 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -764,7 +764,6 @@ util/build_version.cc: $(filter-out $(OBJ_DIR)/util/build_version.o, $(LIB_OBJEC
764764
$(AM_V_at)$(gen_build_version) > $@
765765
endif
766766
CLEAN_FILES += util/build_version.cc
767-
768767
default: all
769768

770769
#-----------------------------------------------
@@ -809,7 +808,8 @@ endif # PLATFORM_SHARED_EXT
809808
.PHONY: check clean coverage ldb_tests package dbg gen-pc build_size \
810809
release tags tags0 valgrind_check format static_lib shared_lib all \
811810
rocksdbjavastatic rocksdbjava install install-static install-shared \
812-
uninstall analyze tools tools_lib check-headers checkout_folly clang-tidy
811+
uninstall analyze tools tools_lib check-headers checkout_folly clang-tidy \
812+
check-c-api-c_test
813813

814814
# Auto-configure git hooks on first build so developers do not need to run
815815
# "make install-hooks" manually. This is a no-op if already set.
@@ -1127,8 +1127,89 @@ ifndef SKIP_FORMAT_BUCK_CHECKS
11271127
$(MAKE) check-buck-targets
11281128
$(MAKE) check-sources
11291129
$(MAKE) check-workflow-yaml
1130+
$(MAKE) check-c-api-gen
11301131
endif
11311132

1133+
# Check that the auto-generated C API files are up to date. It regenerates the
1134+
# fragments and the inlined c.h/c.cc and compares them against a snapshot of the
1135+
# checked-in copies (no net change when everything is up to date). It requires
1136+
# clang++ (libclang, used to parse the C++ headers) and clang-format. When those
1137+
# are unavailable the staleness/compat sub-checks are SKIPPED with a message so
1138+
# `make check` still works without the codegen toolchain; the link-completeness
1139+
# sub-check always runs (it needs no toolchain).
1140+
#
1141+
# Pin the formatter to match CI by setting CLANG_FORMAT_BINARY, e.g.:
1142+
# make check-c-api-gen CLANG_FORMAT_BINARY=clang-format-21
1143+
# This target is part of `make check` and is skipped by SKIP_FORMAT_BUCK_CHECKS.
1144+
#
1145+
# Set CHECK_C_API_GEN_STRICT=1 to turn every "skip" below into a hard error, so a
1146+
# core CI job that runs this target cannot silently regress to a no-op if a
1147+
# prerequisite (clang++, or the compat baseline ref) goes missing. The dedicated
1148+
# build-linux-clang-21-no_test_run CI job runs this target with the flag set.
1149+
# Any non-empty value other than 0/no/false enables strict mode.
1150+
CLANG_FORMAT_BINARY ?=
1151+
# Backward-compatibility baseline for the C API (signature-level) check. CI
1152+
# overrides this with the PR's merge target; locally it falls back to main /
1153+
# origin/main and is skipped if neither resolves.
1154+
API_COMPAT_REF ?= main
1155+
CHECK_C_API_GEN_STRICT ?=
1156+
check-c-api-gen:
1157+
# Link-completeness is a property of the checked-in c.h/c.cc and needs no
1158+
# clang toolchain, so it always runs: every declared public C API function
1159+
# must have exactly one definition (guards against dropped wrappers that
1160+
# would break downstream language bindings at link time).
1161+
$(PYTHON) tools/c_api_gen/check_api_completeness.py
1162+
# Backward-compatibility: no public C function may be removed or have its
1163+
# signature changed vs the baseline. Skipped if the baseline ref is not
1164+
# resolvable locally (CI passes an explicit ref), unless CHECK_C_API_GEN_STRICT.
1165+
@strict=""; case "$(CHECK_C_API_GEN_STRICT)" in ""|0|no|NO|false|FALSE) ;; *) strict=1 ;; esac; \
1166+
ref=""; \
1167+
if git rev-parse --verify --quiet "$(API_COMPAT_REF)^{commit}" >/dev/null; then ref="$(API_COMPAT_REF)"; \
1168+
elif git rev-parse --verify --quiet "origin/$(API_COMPAT_REF)^{commit}" >/dev/null; then ref="origin/$(API_COMPAT_REF)"; fi; \
1169+
if [ -n "$$ref" ]; then \
1170+
$(PYTHON) tools/c_api_gen/check_api_compatibility.py --ref "$$ref"; \
1171+
elif [ -n "$$strict" ]; then \
1172+
echo "ERROR: C API compat baseline '$(API_COMPAT_REF)' not resolvable and CHECK_C_API_GEN_STRICT is set" >&2; exit 1; \
1173+
else \
1174+
echo "Skipping C API backward-compatibility check ($(API_COMPAT_REF) not found; set API_COMPAT_REF)"; \
1175+
fi
1176+
# Staleness: regenerate and confirm the checked-in output is current. Needs a
1177+
# clang++ (the generator parses C++ ASTs); detect one the way the generator
1178+
# does (a clang in $(CXX) -- which may be ccache-prefixed/versioned -- else a
1179+
# bare/versioned clang++ on PATH) rather than testing $(CXX) verbatim.
1180+
@strict=""; case "$(CHECK_C_API_GEN_STRICT)" in ""|0|no|NO|false|FALSE) ;; *) strict=1 ;; esac; \
1181+
cf_arg=""; \
1182+
if [ -n "$(CLANG_FORMAT_BINARY)" ]; then cf_arg="--clang-format $(CLANG_FORMAT_BINARY)"; fi; \
1183+
have_clang=""; \
1184+
for c in $$(printf '%s\n' $(CXX) | grep -i clang) clang++ clang++-21 clang++-20 clang++-19 clang++-18 clang++-17 clang++-16 clang++-15 clang++-14 clang++-13; do \
1185+
if command -v "$$c" >/dev/null 2>&1; then have_clang=1; break; fi; \
1186+
done; \
1187+
if [ -n "$$have_clang" ]; then \
1188+
$(PYTHON) tools/c_api_gen/verify_generated_up_to_date.py $$cf_arg; \
1189+
elif [ -n "$$strict" ]; then \
1190+
echo "ERROR: no clang++ found and CHECK_C_API_GEN_STRICT is set; cannot run the C API staleness check" >&2; exit 1; \
1191+
else \
1192+
echo "Skipping C API codegen staleness check (no clang++ found; install clang++ or set CXX to a clang to enable)"; \
1193+
fi
1194+
1195+
# Quick local validation for C API generation plus the focused C API test.
1196+
# This verifies the checked-in generated fragments as well as the inlined
1197+
# include/rocksdb/c.h and db/c.cc outputs, then runs c_test in an isolated
1198+
# TEST_TMPDIR to avoid stale-state failures.
1199+
check-c-api-c_test:
1200+
$(PYTHON) tools/c_api_gen/verify_generated_up_to_date.py
1201+
$(MAKE) c_test
1202+
@tmpdir=$$(mktemp -d); \
1203+
trap 'rm -rf "$$tmpdir"' EXIT; \
1204+
echo "===== Running c_test with TEST_TMPDIR=$$tmpdir"; \
1205+
if command -v timeout >/dev/null 2>&1; then \
1206+
TEST_TMPDIR="$$tmpdir" timeout 60 ./c_test; \
1207+
elif command -v gtimeout >/dev/null 2>&1; then \
1208+
TEST_TMPDIR="$$tmpdir" gtimeout 60 ./c_test; \
1209+
else \
1210+
TEST_TMPDIR="$$tmpdir" ./c_test; \
1211+
fi
1212+
11321213
# TODO add ldb_tests
11331214
check_some: $(ROCKSDBTESTS_SUBSET)
11341215
for t in $(ROCKSDBTESTS_SUBSET); do echo "===== Running $$t (`date`)"; ./$$t || exit 1; done

‎buckifier/targets_builder.py‎

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -118,6 +118,12 @@ def add_binary(
118118
self.total_bin = self.total_bin + 1
119119

120120
def add_c_test(self):
121+
# The actual c_test_bin target is defined by add_c_test_wrapper in the
122+
# internal //rocks/buckifier:defs.bzl (not in this OSS repo). db/c_test.c
123+
# #includes the generated c_api_gen/*.inc fragments, so under Buck's
124+
# hermetic sandbox that wrapper must expose them as headers, e.g.
125+
# headers = native.glob(["c_api_gen/**/*.inc"])
126+
# (Make/CMake resolve the include via -I. / PROJECT_SOURCE_DIR.)
121127
with open(self.path, "ab") as targets_file:
122128
targets_file.write(
123129
b"""
Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,39 @@
1+
// @generated
2+
// -----------------------------------------------------------------------------
3+
// Auto-generated by tools/c_api_gen/generate_c_api.py.
4+
// DO NOT EDIT THIS FILE DIRECTLY.
5+
// Source: tools/c_api_gen/spec.json
6+
// -----------------------------------------------------------------------------
7+
8+
// To regenerate this file:
9+
// python3 tools/c_api_gen/generate_c_api.py
10+
// --spec tools/c_api_gen/spec.json --section 'BlockBasedOptions simple'
11+
// --header-out
12+
// c_api_gen/c_generated_block_based_options_subset.h.inc
13+
// --source-out
14+
// c_api_gen/c_generated_block_based_options_subset.cc.inc
15+
16+
/* BlockBasedOptions simple */
17+
18+
void rocksdb_block_based_options_set_data_block_hash_ratio(
19+
rocksdb_block_based_table_options_t* options, double v) {
20+
options->rep.data_block_hash_table_util_ratio = v;
21+
}
22+
23+
void rocksdb_block_based_options_set_top_level_index_pinning_tier(
24+
rocksdb_block_based_table_options_t* options, int v) {
25+
options->rep.metadata_cache_options.top_level_index_pinning =
26+
static_cast<ROCKSDB_NAMESPACE::PinningTier>(v);
27+
}
28+
29+
void rocksdb_block_based_options_set_partition_pinning_tier(
30+
rocksdb_block_based_table_options_t* options, int v) {
31+
options->rep.metadata_cache_options.partition_pinning =
32+
static_cast<ROCKSDB_NAMESPACE::PinningTier>(v);
33+
}
34+
35+
void rocksdb_block_based_options_set_unpartitioned_pinning_tier(
36+
rocksdb_block_based_table_options_t* options, int v) {
37+
options->rep.metadata_cache_options.unpartitioned_pinning =
38+
static_cast<ROCKSDB_NAMESPACE::PinningTier>(v);
39+
}
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
// @generated
2+
// -----------------------------------------------------------------------------
3+
// Auto-generated by tools/c_api_gen/generate_c_api.py.
4+
// DO NOT EDIT THIS FILE DIRECTLY.
5+
// Source: tools/c_api_gen/spec.json
6+
// -----------------------------------------------------------------------------
7+
8+
// To regenerate this file:
9+
// python3 tools/c_api_gen/generate_c_api.py
10+
// --spec tools/c_api_gen/spec.json --section 'BlockBasedOptions simple'
11+
// --header-out
12+
// c_api_gen/c_generated_block_based_options_subset.h.inc
13+
// --source-out
14+
// c_api_gen/c_generated_block_based_options_subset.cc.inc
15+
16+
/* BlockBasedOptions simple */
17+
18+
extern ROCKSDB_LIBRARY_API void
19+
rocksdb_block_based_options_set_data_block_hash_ratio(
20+
rocksdb_block_based_table_options_t* options, double v);
21+
22+
extern ROCKSDB_LIBRARY_API void
23+
rocksdb_block_based_options_set_top_level_index_pinning_tier(
24+
rocksdb_block_based_table_options_t* options, int v);
25+
26+
extern ROCKSDB_LIBRARY_API void
27+
rocksdb_block_based_options_set_partition_pinning_tier(
28+
rocksdb_block_based_table_options_t* options, int v);
29+
30+
extern ROCKSDB_LIBRARY_API void
31+
rocksdb_block_based_options_set_unpartitioned_pinning_tier(
32+
rocksdb_block_based_table_options_t* options, int v);
Lines changed: 21 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,21 @@
1+
// @generated
2+
// -----------------------------------------------------------------------------
3+
// Auto-generated by tools/c_api_gen/generate_c_api.py.
4+
// DO NOT EDIT THIS FILE DIRECTLY.
5+
// Source: tools/c_api_gen/spec.json
6+
// -----------------------------------------------------------------------------
7+
8+
// To regenerate this file:
9+
// python3 tools/c_api_gen/generate_c_api.py
10+
// --spec tools/c_api_gen/spec.json --section 'CuckooOptions simple'
11+
// --header-out
12+
// c_api_gen/c_generated_cuckoo_options_subset.h.inc
13+
// --source-out
14+
// c_api_gen/c_generated_cuckoo_options_subset.cc.inc
15+
16+
/* CuckooOptions simple */
17+
18+
void rocksdb_cuckoo_options_set_hash_ratio(
19+
rocksdb_cuckoo_table_options_t* options, double v) {
20+
options->rep.hash_table_ratio = v;
21+
}
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
// @generated
2+
// -----------------------------------------------------------------------------
3+
// Auto-generated by tools/c_api_gen/generate_c_api.py.
4+
// DO NOT EDIT THIS FILE DIRECTLY.
5+
// Source: tools/c_api_gen/spec.json
6+
// -----------------------------------------------------------------------------
7+
8+
// To regenerate this file:
9+
// python3 tools/c_api_gen/generate_c_api.py
10+
// --spec tools/c_api_gen/spec.json --section 'CuckooOptions simple'
11+
// --header-out
12+
// c_api_gen/c_generated_cuckoo_options_subset.h.inc
13+
// --source-out
14+
// c_api_gen/c_generated_cuckoo_options_subset.cc.inc
15+
16+
/* CuckooOptions simple */
17+
18+
extern ROCKSDB_LIBRARY_API void rocksdb_cuckoo_options_set_hash_ratio(
19+
rocksdb_cuckoo_table_options_t* options, double v);

0 commit comments

Comments
 (0)