Add RocksDB rate limiter I/O usage metrics#14884
Conversation
Summary: Add rate limiter statistics that make it easier to diagnose whether RocksDB background IO is being throttled by the GenericRateLimiter. New ticker counters: - rocksdb.rate.limiter.bytes.read/write: bytes granted by the limiter, split by operation type. - rocksdb.rate.limiter.requests.read/write: requests granted by the limiter, split by operation type. - rocksdb.rate.limiter.delayed.requests.read/write: requests that actually had to wait for a future refill because available tokens were exhausted. - rocksdb.rate.limiter.total.wait.micros.read/write: cumulative wait time for delayed requests. New histograms: - rocksdb.rate.limiter.wait.micros.read/write: per-request wait latency for delayed requests, enabling percentile debugging. GenericRateLimiter now overrides the OpType-aware Request() path so these stats preserve the configured limiter mode. The legacy three-argument Request() path is treated as write IO for compatibility. Existing NUMBER_RATE_LIMITER_DRAINS is left intact, but the new delayed request counters are the clearer signal for out-of-token throttling because one delayed request can observe multiple drain intervals. Also update the Java ticker and histogram mappings so Java consumers can access the new statistics. Test Plan: - make format-auto - make -j16 rate_limiter_test statistics_test - ./rate_limiter_test - ./statistics_test - make check-sources
✅ clang-tidy: No findings on changed linesCompleted in 62.2s. |
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D109694030. |
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit fb90cde ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
|
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D109694030. |
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit fb90cde SummaryWell-structured PR that adds useful rate limiter I/O diagnostics with clean separation of concerns. The RequestImpl refactoring correctly narrows the mutex scope and records stats outside the lock. No correctness bugs found. High-severity findings (0): No high-severity findings. Full review (click to expand)Findings🔴 HIGHNone. 🟡 MEDIUMM1. Test does not cover read-path delayed requests or read histogram —
|
| Context | Affected? | Notes |
|---|---|---|
| Custom RateLimiter subclasses | No | Base class 4-arg Request unchanged |
| WritePreparedTxnDB | No | Same Request API |
| statistics_test sanity checks | Must pass | PR correctly maintains TickersNameMap/HistogramsNameMap ordering |
| Java JNI mappings | Correct | No value collisions (-0x71 to -0x78 are free) |
| C++ enum value shift | Safe | Header documents values are not stable |
Positive Observations
- Mutex scope narrowing reduces contention on
request_mutex_— meaningful improvement for high-throughput I/O. - Clean separation of request granting (under mutex) from stats recording (outside mutex) using local flag variables.
- Correct
stop_handling — no stats recorded for interrupted requests. - Backward-compatible — 3-arg legacy path defaults to kWrite; custom subclasses unaffected.
- Stats only on delayed path — histogram and wait time recorded only when request actually waited, avoiding fast-path noise.
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
@xingbowang merged this pull request in 75c1ffd. |
Summary:
Add rate limiter statistics that make it easier to diagnose whether RocksDB background IO is being throttled by the GenericRateLimiter.
New ticker counters:
New histograms:
GenericRateLimiter now overrides the OpType-aware Request() path so these stats preserve the configured limiter mode. The legacy three-argument Request() path is treated as write IO for compatibility. Existing NUMBER_RATE_LIMITER_DRAINS is left intact, but the new delayed request counters are the clearer signal for out-of-token throttling because one delayed request can observe multiple drain intervals.
Also update the Java ticker and histogram mappings so Java consumers can access the new statistics.
Test Plan: