[hooks] add non-streaming (non-stdin style) shell-only PostToolUse support#15531
[hooks] add non-streaming (non-stdin style) shell-only PostToolUse support#15531eternal-openai merged 15 commits intomainfrom
Conversation
21c16e1 to
266ecb6
Compare
f2b1367 to
285b9f8
Compare
Codex has been incrementally shipping hook support (PreToolUse, PostToolUse, UserPromptSubmit, SessionStart, Stop), but the Codex converter silently dropped all hooks during conversion -- the only converter that didn't even warn. This adds partial hook conversion for the 5 compatible events and specific warnings for everything unconvertible. - Add CodexHookEventName union, CodexHookCommand, CodexHookMatcher, CodexHooks types and hooks field to CodexBundle - Add CODEX_EVENTS map with toolScoped flag, isBashCompatibleMatcher() helper, and convertHooksForCodex() to the converter - Write .codex/hooks.json with backup support in the writer - Auto-enable [features] codex_hooks = true in config.toml when hooks present - Refactor renderCodexConfig() to options object for extensibility - Update Codex spec with hooks documentation - 18 new tests (12 converter + 6 writer) PostToolUse conversion depends on openai/codex#15531 merging.
…h-fix # Conflicts: # codex-rs/core/tests/suite/shell_snapshot.rs # codex-rs/exec/src/event_processor_with_human_output.rs # codex-rs/tui/tests/suite/model_availability_nux.rs # codex-rs/tui_app_server/tests/suite/model_availability_nux.rs
| } else { | ||
| None | ||
| }; | ||
| let hook_abort_error = dispatch_after_tool_use_hook(AfterToolUseHookDispatch { |
There was a problem hiding this comment.
Are post tool use and after tool use different hooks?
There was a problem hiding this comment.
Yeah, this was introduced by @gt-oai in the old version. I'm maintaining support for now, but it'll get deprecated/removed later
codex-rs/core/src/tools/registry.rs
Outdated
| .await; | ||
|
|
||
| if outcome.should_stop { | ||
| let replacement_text = outcome |
There was a problem hiding this comment.
both branches of this if look very similar
| .unwrap_or_else(|| "PostToolUse hook stopped execution".to_string()); | ||
| let mut guard = response_cell.lock().await; | ||
| if let Some(result) = guard.as_mut() { | ||
| result.result = Box::new(FunctionToolOutput::from_text( |
There was a problem hiding this comment.
what if the tool is CustomTool and needs custom response.
- Why are we highjacking the tool result?
Shouldn't the model still get the real tool result?
There was a problem hiding this comment.
CustomTool support out of scope for this PR, I'm trying to just handle "Bash" (i.e. the name for shell-like stuff in the spec)
Killing the tool result is intentional, if the hook "blocks" the output, it basically has the opportunity to rewrite it with its own context
|
|
||
| fn to_response_item(&self, call_id: &str, payload: &ToolPayload) -> ResponseInputItem; | ||
|
|
||
| fn post_tool_use_response(&self, _call_id: &str, _payload: &ToolPayload) -> Option<JsonValue> { |
There was a problem hiding this comment.
what should this property contain for a notmal tool output?
There was a problem hiding this comment.
By default it should be None. A tool only populates post_tool_use_response if it intentionally supports a stable PostToolUse payload shape
| exec_command_pre_tool_use_payload(&invocation.tool_name, &invocation.payload) | ||
| } | ||
|
|
||
| fn post_tool_use_payload(&self, result: &AnyToolResult) -> Option<PostToolUsePayload> { |
There was a problem hiding this comment.
AnyToolResult is what ToolResult is wrapped into at higher levels of execution to arase the concrete handler output type, it's a bit circular to pass it back in.
There was a problem hiding this comment.
Agreed. I cleaned this up so handlers no longer receive the AnyToolResult just to shape the hook payload. They now get call_id, payload, and &dyn ToolOutput, which keeps the boundary better
Small note, the spec is pretty bad around tool_response (and tool_input): "...The exact schema for both depends on the tool." ... with no further elaboration :)
So basically we have to handle "any" and I've been reading how this works in the wild as the source of truth ¯_(ツ)_/¯
| tool_name: &str, | ||
| payload: &ToolPayload, | ||
| ) -> Option<PreToolUsePayload> { | ||
| if tool_name != "exec_command" { |
There was a problem hiding this comment.
can this be shared with normal arg parsing?
There was a problem hiding this comment.
Yep, agreed, fixed
| return None; | ||
| }; | ||
|
|
||
| serde_json::from_str::<ShellCommandToolCallParams>(arguments) |
There was a problem hiding this comment.
is parse_arguments appropriate here?
There was a problem hiding this comment.
I think so, yes, fixed
| Ok(output) => { | ||
| let post_tool_use_response = | ||
| crate::tools::format_exec_output_str(&output, turn.truncation_policy); | ||
| let content = emitter.finish(event_ctx, Ok(output)).await?; |
There was a problem hiding this comment.
can let content = emitter.finish(event_ctx, Err(error)).await?;
line be pulled out of the if and only post_tool_use_response calculated separately?
There was a problem hiding this comment.
why don't we use content directly?
There was a problem hiding this comment.
Agree, simplified. emitter.finish now sits outside the branch, and only the post_tool_use_response calculation stays conditional
I kept that separate on purpose. content seems like more model-facing rendered shell text, which is more transcript/UI-oriented. tool_response for PostToolUse is meant to be the stable tool-specific result payload, so I didn’t want to couple the hook input to the transcript formatting
| Err(err) => (err.to_string(), false), | ||
| }; | ||
| emit_metric_for_tool_read(&invocation, success).await; | ||
| let post_tool_use_payload = if success { |
There was a problem hiding this comment.
for the future it would be nice to offload hook processing into it's own routine. non blocking
There was a problem hiding this comment.
Ok, noted for a future followup
codex-rs/core/src/tools/registry.rs
Outdated
| } | ||
| _ => None, | ||
| } | ||
| fn replace_result_with_feedback(result: &mut AnyToolResult, text: String) { |
| } | ||
|
|
||
| fn pre_tool_use_payload(&self, invocation: &ToolInvocation) -> Option<PreToolUsePayload> { | ||
| shell_pre_tool_use_payload(&invocation.payload) |
There was a problem hiding this comment.
why don't we inline these methods?
|
|
||
| let args = parse_arguments::<ExecCommandArgs>(arguments).ok()?; | ||
| if args.tty { | ||
| return None; |
There was a problem hiding this comment.
Was just playing it safe for now, I didn't handle streaming/interactive shell usage in this PR, and that mirrors the PreToolUse one. Definitely needs a followup, but it doesn't fit cleanly into the spec, so I've been thinking about what to do about it (maybe a spec extension)
| .map(|args| PreToolUsePayload { command: args.cmd }) | ||
| } | ||
|
|
||
| fn exec_command_post_tool_use_payload( |
CHAINED PR - note that base is eternal/hooks-pretooluse-bash, not main -- so the following PR should be first
Matching post-tool hook to the pre-tool functionality here: #15211
So, PreToolUse calls for plain shell calls, allows blocking. This PostToolUse call runs after the command executed
example run: