Skip to content

feat: activate plugin lifecycle hooks (pre/post_llm_call, session start/end)#3542

Merged
teknium1 merged 1 commit intomainfrom
hermes/hermes-135af169
Mar 28, 2026
Merged

feat: activate plugin lifecycle hooks (pre/post_llm_call, session start/end)#3542
teknium1 merged 1 commit intomainfrom
hermes/hermes-135af169

Conversation

@teknium1
Copy link
Copy Markdown
Contributor

Summary

Salvaged from PR #2823 by @nicoloboschi.

Activates the four lifecycle hooks that were defined in the plugin system but never invoked: on_session_start, pre_llm_call, post_llm_call, on_session_end.

This enables external plugins (e.g. memory systems like Hindsight) to integrate as pip-installable plugins that hook every conversation turn, without requiring core changes.

Hook semantics

Hook When Can return context?
on_session_start New session created (first turn) No
pre_llm_call Once per turn, before LLM loop Yes — {"context": "..."} injected into ephemeral system prompt
post_llm_call Once per turn, after LLM loop No
on_session_end End of every run_conversation() call No

Changes from original PR

  • Cherry-picked both contributor commits cleanly onto current main
  • conversation_history passed as a shallow copy (list(messages)) to prevent plugins from mutating the live conversation
  • Added model and platform kwargs to on_session_end for consistency with all other hooks
  • Updated features/plugins.md to remove *(planned)* markers now that all hooks are active
  • Contributor attribution preserved via --author

Files changed

  • hermes_cli/plugins.pyinvoke_hook() now returns List[Any] of non-None results
  • run_agent.py — invoke all four hooks at appropriate lifecycle points
  • tests/test_plugins.py — added tests for return value collection
  • website/docs/guides/build-a-hermes-plugin.md — updated hook reference table
  • website/docs/user-guide/features/plugins.md — removed (planned) markers

Test plan

  • All 19 plugin tests pass (including 2 new ones)
  • Full test suite: 4707 passed, 165 skipped, 0 failed
  • Live PTY test with real plugin exercising all 4 hooks — all fired correctly with correct kwargs
  • Verified tool-calling turns: pre_llm_call fires once before loop, post_llm_call once after
  • Prompt caching safe: plugin context goes into ephemeral system (rebuilt per API call), not cached system prompt
…rt/end)

The plugin system defined six lifecycle hooks but only pre_tool_call and
post_tool_call were invoked.  This activates the remaining four so that
external plugins (e.g. memory systems) can hook into the conversation
loop without touching core code.

Hook semantics:
- on_session_start: fires once when a new session is created
- pre_llm_call: fires once per turn before the tool-calling loop;
  plugins can return {"context": "..."} to inject into the ephemeral
  system prompt (not cached, not persisted)
- post_llm_call: fires once per turn after the loop completes, with
  user_message and assistant_response for sync/storage
- on_session_end: fires at the end of every run_conversation call

invoke_hook() now returns a list of non-None callback return values,
enabling pre_llm_call context injection while remaining backward
compatible (existing hooks that return None are unaffected).

Salvaged from PR #2823.
@teknium1 teknium1 merged commit 455bf2e into main Mar 28, 2026
5 checks passed
teknium1 added a commit that referenced this pull request Mar 28, 2026
… pages

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
teknium1 added a commit that referenced this pull request Mar 28, 2026
… pages (#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
crxssrazr93 added a commit to crxssrazr93/hermes-agent that referenced this pull request Mar 29, 2026
Example plugin demonstrating the lifecycle hooks activated in NousResearch#3542.
Auto-manages a local llama-server (or any OpenAI-compatible server) when
the active model matches a locally configured model name.

Features:
- pre_llm_call hook: auto-starts the correct server on first message
  when hermes is configured with a local model name
- on_session_end hook: kills the server on exit
- switch_local_llm tool: mid-session model switching — the agent swaps
  the server when asked ("switch to the code model")
- Declarative YAML config for model definitions (GGUF paths, context
  sizes, KV cache quantization, sampling params) replacing shell scripts

The plugin is self-contained in docs/llm-switch-plugin-example/ with a
README, example config, and full implementation. Users copy it to
~/.hermes/plugins/llm-switch/ to install.

Complements NousResearch#3360 and NousResearch#3548 which restore /model as a slash command —
once merged, /model custom:write would trigger the pre_llm_call hook
to auto-start the right server seamlessly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants