Add file ingestion histograms (#14836)#14836
Conversation
f2dbd3e to
75aae94
Compare
|
@joshkang97 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D107721260. |
✅ clang-tidy: No findings on changed linesCompleted in 258.4s. |
Summary: Adds a public `Statistics` histogram, `INGEST_EXTERNAL_FILE_TIME` (`"rocksdb.ingest.external.file.micros"`), recording the end-to-end latency in microseconds of each `IngestExternalFile(s)` call. Ingestion timing was previously only available through per-thread `perf_context` counters, which require setting a `PerfLevel` and are not aggregated, so there was no process-wide latency distribution (p50/p99/max) for dashboards. It is recorded with an RAII `StopWatch` at the top of `DBImpl::IngestExternalFiles` -- one sample per call (not per column family), covering all return paths. It is null-safe and self-gating on the stats level, so there is no cost when statistics are off, and ingestion is not a hot path. Java bindings are kept in sync per the `statistics.h` requirement; the C API needs no change. Reviewed By: xingbowang Differential Revision: D107721260
75aae94 to
fa4d9ea
Compare
🟡 Codex Code ReviewAuto-triggered after CI passed — reviewing commit fa4d9ea ❌ Codex review failed before producing findings. ℹ️ About this responseGenerated by Codex CLI. Limitations:
Commands:
|
✅ Claude Code ReviewAuto-triggered after CI passed — reviewing commit fa4d9ea SummaryClean, well-scoped addition of two histogram metrics for No high-severity findings. Full review (click to expand)Findings🔴 HIGHNo high-severity findings. 🟡 MEDIUMM1. Missing test for
|
| Context | Affected? | Assessment |
|---|---|---|
| WritePreparedTxnDB | No | Ingestion path unchanged |
| ReadOnly DB | No | Ingestion not supported |
| User-defined timestamps | No | Timing is orthogonal |
| Concurrent ingestion | Safe | stats_ is thread-safe, RecordTimeToHistogram is thread-safe |
| Java bindings | Yes | 0x43/0x44 values correct, non-conflicting, within signed byte range |
Thread safety: stats_ is set once at DB open and never modified. Reading it without mutex is safe and follows established patterns throughout the codebase (e.g., DB_GET StopWatch at db_impl.cc:3144).
Performance: Three NowMicros() calls on the ingestion path (not a hot path). Self-gating when stats are disabled. No concern.
Positive Observations
- Clean separation of prepare vs run phases, providing actionable insight (prepare = no write blocking, run = write blocking under mutex)
- Proper null-safety and stats-level gating
- Only recording on success avoids skewing latency distributions with error paths
- Java bindings kept in sync with correct hex value assignment
- Release note is concise and informative
- Test covers both success and failure paths
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
/claude-review [context]— Request a code review/claude-query <question>— Ask about the PR or codebase
|
This pull request has been merged in 1ede57e. |
Summary:
Adds a public
Statisticshistogram,INGEST_EXTERNAL_FILE_TIME("rocksdb.ingest.external.file.micros"), recording the end-to-end latency in microseconds of eachIngestExternalFile(s)call. Ingestion timing was previously only available through per-threadperf_contextcounters, which require setting aPerfLeveland are not aggregated, so there was no process-wide latency distribution (p50/p99/max) for dashboards.It is recorded with an RAII
StopWatchat the top ofDBImpl::IngestExternalFiles-- one sample per call (not per column family), covering all return paths. It is null-safe and self-gating on the stats level, so there is no cost when statistics are off, and ingestion is not a hot path. Java bindings are kept in sync per thestatistics.hrequirement; the C API needs no change.Reviewed By: xingbowang
Differential Revision: D107721260