
start of hooks engine#13276

Merged
eternal-openai merged 25 commits into main from eternal/start-of-hooks on Mar 10, 2026

Conversation

@eternal-openai
Contributor

@eternal-openai eternal-openai commented Mar 2, 2026

(Experimental)

This PR adds a first MVP of the hooks engine, with SessionStart and Stop hooks.

The core design is:

  • hooks live in a dedicated engine under codex-rs/hooks
  • each hook type has its own event-specific file
  • hook execution is synchronous and blocks normal turn progression while running
  • matching hooks run in parallel, then their results are aggregated into a normalized HookRunSummary
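The fan-out/aggregate flow above can be sketched with stdlib threads (the engine itself uses async `join_all`, per the review thread below; all type and function names here are illustrative, not the PR's actual API):

```rust
use std::thread;

// Illustrative types; the real HookRunSummary fields may differ.
#[derive(Debug)]
struct HookResult {
    name: String,
    exit_code: i32,
}

#[derive(Debug)]
struct HookRunSummary {
    results: Vec<HookResult>,
}

// Spawn every matching hook at once, then block turn progression
// until all of them finish, aggregating into a single summary.
fn run_hooks_parallel(hooks: Vec<(String, fn() -> i32)>) -> HookRunSummary {
    let handles: Vec<_> = hooks
        .into_iter()
        .map(|(name, run)| {
            thread::spawn(move || HookResult {
                name,
                exit_code: run(),
            })
        })
        .collect();
    let results = handles
        .into_iter()
        .map(|h| h.join().expect("hook thread panicked"))
        .collect();
    HookRunSummary { results }
}
```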

On the AppServer side, hooks are exposed as operational metadata rather than transcript-native items:

  • new live notifications: hook/started, hook/completed
  • persisted/replayed hook results live on Turn.hookRuns
  • we intentionally did not add hook-specific ThreadItem variants

Hook messages are not persisted; they remain ephemeral. The context changes they add are persisted (they get appended to the user's prompt).

@github-actions
Contributor

github-actions bot commented Mar 2, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@eternal-openai
Contributor Author

I have read the CLA Document and I hereby sign the CLA

@etraut-openai etraut-openai added the oai PRs contributed by OpenAI employees label Mar 5, 2026
@eternal-openai eternal-openai force-pushed the eternal/start-of-hooks branch from ad59823 to dbeba0c Compare March 5, 2026 21:45
turn_id: Option<String>,
parse: fn(&ConfiguredHandler, CommandRunResult, Option<String>) -> ParsedHandler<T>,
) -> Vec<ParsedHandler<T>> {
let results = join_all(
Collaborator

this means you run everything in parallel. So the ordering of the hooks is not respected, even for "Sync" hooks

Contributor Author

That's on purpose based on my understanding of the existing spec. I'm unclear what sync hooks are

Collaborator

Would be good to make sure we don't miss something here

Collaborator

i have the same confusion. is sync vs. async whether they block the turn, rather than whether they're serialized? this should be clarified in docs/code, otherwise I'd think I could serialize hook execution (hook A formats, hook B runs tests given formatting finished).

Contributor Author

As far as I'm aware, the mechanics are correct here:

  • Hooks execute in parallel to maximize overall processing speed
  • The agent loop waits for the hook to complete in the typical case
  • The spec supports doing async (i.e. hooks run in the background - we'll do that later)
@eternal-openai
Contributor Author

@jif-oai Thank you so much for the thorough review! Super helpful. I was trying to keep this as minimal as I could, focused on laying the groundwork, so there's a bunch of stuff not fully implemented even for these two hooks. That'll come as a fast follow. But you've also identified a couple of issues here, so I'm about to correct them 🙏

@eternal-openai eternal-openai force-pushed the eternal/start-of-hooks branch 2 times, most recently from 14b625c to efcd98b Compare March 6, 2026 23:24
Collaborator

@jif-oai jif-oai left a comment


Looks better to me. I think we should discuss the sandboxing point

should_block = true;
block_reason = parsed.reason.clone();
block_message_for_model = parsed.reason.clone();
if let Some(reason) = parsed.reason {
Collaborator

How does it work if there is no reason? How is the model supposed to know what to do? What do you restart the turn on?

Collaborator

+1, restarting the turn with no additional context seems bad.

Contributor Author

Totally agree, great catch, making a change to enforce that reason has to be there if stop-block is true (which is also part of the spec)

Contributor Author

Stop hook (failed)
  warning: Wizard Tower Stop hook reviewed the completed reply (129 chars).
  error: hook returned decision "block" without a non-empty reason

Implemented. Also handling the exit-code-2-with-blank-stderr variant of the same issue.
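A minimal sketch of that invariant (types and names here are illustrative, not the PR's actual API): a "block" decision without a non-empty reason is rejected as an error instead of restarting the turn with no context.

```rust
// Illustrative types; the real parsed-output struct in the PR may differ.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Block,
}

#[derive(Debug)]
struct ParsedStop {
    decision: Decision,
    reason: Option<String>,
}

// Enforce the spec rule: decision "block" must carry a non-empty reason.
fn validate_stop(parsed: ParsedStop) -> Result<ParsedStop, String> {
    let reason_missing = parsed
        .reason
        .as_deref()
        .map_or(true, |r| r.trim().is_empty());
    if parsed.decision == Decision::Block && reason_missing {
        return Err(
            "hook returned decision \"block\" without a non-empty reason".to_string(),
        );
    }
    Ok(parsed)
}
```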

.stderr(Stdio::piped())
.kill_on_drop(true);

let mut child = match command.spawn() {
Collaborator

Discussion point on sandboxing here:
Hooks run outside of our sandboxing, which, I think, comes from the fact that a hook is user-defined. But Stop hooks, for example, take the assistant message as input. So in practice a hook can execute something coming from the model. That sounds a bit dangerous to me. Any thoughts on this @bolinfest ?

Contributor Author

Would love to get Bolin's thoughts, and yeah, hooks are inherently unsafe -- though I don't think it's on us to provide their security

}
None => match run_result.exit_code {
Some(0) => {
if let Some(parsed) = output_parser::parse_stop(&run_result.stdout) {
Collaborator

We shouldn't silently ignore if the JSON parsing fails

Collaborator

+1

Contributor Author

Great point, thank you, implemented

Collaborator

@sayan-oai sayan-oai left a comment


mostly behavioral questions. the overall structure makes sense to me.

turn_id: Option<String>,
parse: fn(&ConfiguredHandler, CommandRunResult, Option<String>) -> ParsedHandler<T>,
) -> Vec<ParsedHandler<T>> {
let results = join_all(
Collaborator

i have the same confusion. is sync vs. async whether they block the turn, rather than whether they're serialized? this should be clarified in docs/code, otherwise I'd think I could serialize hook execution (hook A formats, hook B runs tests given formatting finished).

});
}
}
} else if let Some(additional_context) = trimmed_non_empty(&run_result.stdout) {
Collaborator

do we want to inject stdout into model context if parsing fails? this also seems different than the behavior for stop hooks, which doesn't inject on parse failure (it seems to silently no-op).

Contributor Author

Thank you, now treating malformed JSON-like output as an error, while preserving intentional plain text output
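A sketch of that distinction, with the JSON parse stubbed out (a real implementation would attempt a full parse, e.g. with serde_json; the outcome type and function names are illustrative):

```rust
// Illustrative outcome type; not the PR's actual API.
#[derive(Debug, PartialEq)]
enum StdoutOutcome {
    Empty,
    AdditionalContext(String),
    MalformedJson(String),
}

// Stub standing in for the real JSON parse attempt. In this sketch every
// brace-shaped payload is treated as a failed parse so the error branch
// is exercised.
fn try_parse_json(_raw: &str) -> Option<()> {
    None
}

// Intentional plain text is kept as additional context; output that looks
// like JSON but fails to parse is surfaced as an error rather than being
// silently injected into the model context.
fn classify_stdout(stdout: &str) -> StdoutOutcome {
    let trimmed = stdout.trim();
    if trimmed.is_empty() {
        StdoutOutcome::Empty
    } else if trimmed.starts_with('{') {
        match try_parse_json(trimmed) {
            Some(_) => StdoutOutcome::AdditionalContext(trimmed.to_string()),
            None => StdoutOutcome::MalformedJson(trimmed.to_string()),
        }
    } else {
        StdoutOutcome::AdditionalContext(trimmed.to_string())
    }
}
```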

additional_context_for_model = Some(additional_context);
}
}
let _ = parsed.universal.suppress_output;
Collaborator

should we include suppress_output in the schema if we ignore it here and on Stop?

Collaborator

im assuming this is deferred, like clear. we should document this in code.

Contributor Author

Yep, exactly

.await;
}
if stop_outcome.should_block {
if stop_hook_active {
Collaborator

maybe this is aspirational/deferred, but is it possible to have a less coarse way of defending against infinite loops? instead of a bool, could we filter by unique reasons, or have a >1 limit? it seems plausible to want to have >1 stop hook run per turn. feel free to push back, not blocking.

Contributor Author

@eternal-openai eternal-openai Mar 9, 2026

Yes, that's coming in the next PR using the stop_hook_active mechanism

}

#[test]
fn select_handlers_keeps_duplicate_stop_handlers() {
Collaborator

do we want to keep duplicated handlers? why not dedupe?

Contributor Author

Yeah, I had it in an earlier version of the PR, but jif pointed out that it's probably a mistake. If the user's config has duplicates defined, it could be for a valid use case (the script does something different the second time), and generally I think respecting that is probably right. Though if you disagree, happy to put it back.

@eternal-openai
Contributor Author

Appreciate the hardening pass; feels like we're close! Changes are made, please review once more 🙏

# Conflicts:
#	codex-rs/app-server-protocol/schema/typescript/EventMsg.ts
#	codex-rs/app-server-protocol/schema/typescript/ServerNotification.ts
#	codex-rs/app-server/src/bespoke_event_handling.rs
#	codex-rs/core/src/features.rs
#	codex-rs/exec/src/event_processor_with_human_output.rs
@eternal-openai eternal-openai enabled auto-merge (squash) March 9, 2026 21:09
# Conflicts:
#	codex-rs/core/src/state/session.rs
@eternal-openai eternal-openai merged commit 244b2d5 into main Mar 10, 2026
31 checks passed
@eternal-openai eternal-openai deleted the eternal/start-of-hooks branch March 10, 2026 04:11
@github-actions github-actions bot locked and limited conversation to collaborators Mar 10, 2026