
feat(otel): safe tracing#13626

Merged
owenlin0 merged 3 commits into main from owen/implement_safe_tracing
Mar 6, 2026

owenlin0 (Collaborator) commented Mar 5, 2026

Motivation

Today config.toml has three different OTEL knobs under [otel]:

  • exporter controls where OTEL logs go
  • trace_exporter controls where OTEL traces go
  • metrics_exporter controls where metrics go

Those often (pretty much always?) serve different purposes.

For example, for OpenAI internal usage, the log exporter is already being used for IT/security telemetry, and that use case is intentionally content-rich: tool calls, arguments, outputs, MCP payloads, and in some cases user content are all useful there. log_user_prompt is a good example of that distinction. When it’s enabled, we include raw prompt text in OTEL logs, which is acceptable for the security use case.

The trace exporter is a different story. The goal there is to give OpenAI engineers visibility into latency and request behavior when they run Codex locally, without sending sensitive prompt or tool data as trace event data. In other words, traces should help answer “what was slow?” or “where did time go?”, not “what did the user say?” or “what did the tool return?”

The complication is that Rust’s tracing crate does not make a hard distinction between “logs” and “trace events.” It gives us one instrumentation API for logs and trace events (via tracing::event!), and subscribers decide what gets treated as logs, trace events, or both.

Before this change, our OTEL trace layer was effectively attached to the general tracing stream, which meant turning on trace_exporter could pick up content-rich events that were originally written with logging (and the log exporter) in mind. That made it too easy for sensitive data to end up in exported traces by accident.
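The difference between the old and new routing can be modeled in a few lines of plain Rust. This is a minimal sketch, not the PR's implementation: `Sink`, `Event`, the two routing functions, and the `sketch::...` target strings are all hypothetical, and it assumes routing is keyed on the event target, which the description does not spell out.

```rust
/// Which exporter a subscriber layer feeds (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sink {
    Logs,
    Traces,
}

/// Simplified stand-in for a `tracing` event: a target plus key/value fields.
#[derive(Debug, Clone)]
struct Event {
    target: &'static str,
    fields: Vec<(&'static str, String)>,
}

/// Old behavior in this model: every layer sees every event,
/// so a content-rich log event also reaches the trace exporter.
fn route_unfiltered(_event: &Event) -> Vec<Sink> {
    vec![Sink::Logs, Sink::Traces]
}

/// New behavior in this model: an event reaches a sink only if its
/// target explicitly opts in to that sink. Unknown targets go nowhere.
fn route_by_target(event: &Event) -> Vec<Sink> {
    match event.target {
        "sketch::log_only" => vec![Sink::Logs],
        "sketch::trace_safe" => vec![Sink::Traces],
        _ => Vec::new(),
    }
}

/// True if any field named `prompt` would be exported to traces.
fn leaks_prompt_to_traces(event: &Event, sinks: &[Sink]) -> bool {
    sinks.contains(&Sink::Traces) && event.fields.iter().any(|(key, _)| *key == "prompt")
}
```

In this model, a `sketch::log_only` event carrying a `prompt` field leaks under `route_unfiltered` but not under `route_by_target`.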

Concrete example

In otel_manager.rs, this tracing::event! call would be exported in both logs AND traces (as a trace event).

    pub fn user_prompt(&self, items: &[UserInput]) {
        let prompt = items
            .iter()
            .flat_map(|item| match item {
                UserInput::Text { text, .. } => Some(text.as_str()),
                _ => None,
            })
            .collect::<String>();

        let prompt_to_log = if self.metadata.log_user_prompts {
            prompt.as_str()
        } else {
            "[REDACTED]"
        };

        tracing::event!(
            tracing::Level::INFO,
            event.name = "codex.user_prompt",
            event.timestamp = %timestamp(),
            // ...
            prompt = %prompt_to_log,
        );
    }

Instead of tracing::event!, we should now use log_event! and trace_event! to indicate clearly which sink (logs vs. traces) an event should be exported to.
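To illustrate the idea of sink-specific macros, here is a std-only sketch. The names `sketch_log_event!`, `sketch_trace_event!`, `Record`, and `exported_to` are hypothetical and stand in for the real `log_event!` and `trace_event!`, whose definitions are not shown in this description; the real macros presumably wrap `tracing::event!` rather than pushing into a `Vec`.

```rust
/// Destination for a recorded event (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sink {
    Logs,
    Traces,
}

/// One recorded event: the sink it was emitted for and its stringified fields.
#[derive(Debug, PartialEq)]
struct Record {
    sink: Sink,
    fields: Vec<(String, String)>,
}

/// Records an event that is only ever exported to logs.
macro_rules! sketch_log_event {
    ($out:expr, $($key:ident = $value:expr),+ $(,)?) => {
        $out.push(Record {
            sink: Sink::Logs,
            fields: vec![$((stringify!($key).to_string(), $value.to_string())),+],
        })
    };
}

/// Records an event that is only ever exported to traces.
macro_rules! sketch_trace_event {
    ($out:expr, $($key:ident = $value:expr),+ $(,)?) => {
        $out.push(Record {
            sink: Sink::Traces,
            fields: vec![$((stringify!($key).to_string(), $value.to_string())),+],
        })
    };
}

/// Returns the records that a given exporter would receive.
fn exported_to(records: &[Record], sink: Sink) -> Vec<&Record> {
    records.iter().filter(|record| record.sink == sink).collect()
}
```

With this shape, the call site itself states the destination: `sketch_log_event!(records, prompt = text)` keeps the raw prompt in the log lane, while `sketch_trace_event!(records, prompt_length = text.len())` sends only a size to traces.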

What changed

This PR makes the log and trace export distinct instead of treating them as two sinks for the same data.

On the provider side, OTEL logs and traces now have separate routing/filtering policy. The log exporter keeps receiving the existing codex_otel events, while trace export is limited to spans and trace events.

On the event side, OtelManager now emits two flavors of telemetry where needed:

  • a log-only event with the current rich payloads
  • a tracing-safe event with summaries only

It also has a convenience log_and_trace_event! macro for emitting to both logs and traces when it's safe to do so; the macro also supports log-only and trace-only fields.

That means prompts, tool args, tool output, account email, MCP metadata, and similar content stay in the log lane, while traces get the pieces that are actually useful for performance work: durations, counts, sizes, status, token counts, tool origin, and normalized error classes.
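As a rough illustration of the "rich payload vs. summary" split, the sketch below derives a trace-safe summary from a tool result. The `ToolResult` and `TraceSafeToolResult` types and the `trace_safe_summary` function are hypothetical; the field names are assumptions chosen to match the kinds of data listed above (sizes, durations, status), not the PR's actual schema.

```rust
use std::time::Duration;

/// Rich payload, appropriate only for the log lane in this sketch.
struct ToolResult {
    arguments: String,
    output: String,
    duration: Duration,
    success: bool,
}

/// Summary with no raw content: only sizes, timing, and status.
#[derive(Debug, PartialEq)]
struct TraceSafeToolResult {
    /// Length of the raw arguments in bytes.
    arguments_length: usize,
    /// Length of the raw output in bytes.
    output_length: usize,
    duration_ms: u128,
    success: bool,
}

/// Builds the trace-lane summary; raw strings are measured, never copied.
fn trace_safe_summary(result: &ToolResult) -> TraceSafeToolResult {
    TraceSafeToolResult {
        arguments_length: result.arguments.len(),
        output_length: result.output.len(),
        duration_ms: result.duration.as_millis(),
        success: result.success,
    }
}
```

Because the summary type has no string fields, raw arguments or output cannot be attached to a trace event by accident in this model.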

This preserves current IT/security logging behavior while making it safe to turn on trace export for employees.

Full list of things removed from trace export

  • raw user prompt text from codex.user_prompt
  • raw tool arguments and output from codex.tool_result
  • MCP server metadata from codex.tool_result (mcp_server, mcp_server_origin)
  • account identity fields like user.email and user.account_id from trace-safe OTEL events
  • host.name from trace resources
  • generic codex.tool_decision events from traces
  • generic codex.sse_event events from traces
  • the full ToolCall debug payload from the handle_tool_call span
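One simple way to picture the removals above is a denylist applied to attributes before they reach the trace lane. This is only a sketch under that assumption: `TRACE_DENYLIST` and `trace_safe_attributes` are hypothetical names, and the PR may well achieve the same result by not emitting these fields to traces in the first place rather than filtering them afterwards.

```rust
/// Attribute keys dropped from the trace lane in this sketch,
/// taken from the list of removed fields above.
const TRACE_DENYLIST: &[&str] = &[
    "user.email",
    "user.account_id",
    "host.name",
    "mcp_server",
    "mcp_server_origin",
];

/// Returns the attributes that are allowed through to traces,
/// preserving the original order of the remaining entries.
fn trace_safe_attributes(attrs: &[(&str, &str)]) -> Vec<(String, String)> {
    attrs
        .iter()
        .filter(|(key, _)| !TRACE_DENYLIST.iter().any(|denied| *denied == *key))
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}
```

A denylist is the weaker of the two options: a new sensitive field passes through until someone adds it to the list, which is one reason to prefer deciding the sink explicitly at each call site.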

What traces now keep instead is mostly:

  • spans
  • trace-safe OTEL events
  • counts, lengths, durations, status, token counts, and tool origin summaries

owenlin0 force-pushed the owen/implement_safe_tracing branch from c3e83e8 to 85a72d5 on March 5, 2026 23:49
owenlin0 marked this pull request as ready for review on March 5, 2026 23:53

owenlin0 (Collaborator, Author) commented Mar 5, 2026

@codex review

chatgpt-codex-connector bot (Contributor) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 85a72d5361

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

owenlin0 (Collaborator, Author) commented Mar 6, 2026

@codex review

chatgpt-codex-connector (Contributor) commented

Codex Review: Didn't find any major issues. What shall we delve into next?

owenlin0 merged commit c3736cf into main on Mar 6, 2026
29 of 31 checks passed
owenlin0 deleted the owen/implement_safe_tracing branch on March 6, 2026 00:30
github-actions bot locked and limited conversation to collaborators on Mar 6, 2026
2 participants