
feat: persist reasoning across gateway session turns (schema v6)#2974

Merged
teknium1 merged 8 commits into main from hermes/hermes-ac86d935 on Mar 25, 2026

Conversation

@teknium1
Contributor

Summary

Adds reasoning TEXT and reasoning_details TEXT columns to the messages table (schema v5→v6). This preserves assistant reasoning chains across gateway session reloads so providers that replay reasoning receive coherent multi-turn context.

Problem

Three reasoning fields exist on in-memory assistant messages:

  • msg["reasoning"] — plain text (DeepSeek, Qwen, Moonshot, Novita)
  • msg["reasoning_details"] — structured array from OpenRouter (opaque objects with signatures)
  • msg["codex_reasoning_items"] — encrypted blobs for OpenAI Codex Responses API

All three flow correctly within a single CLI session. The existing provider-compatibility code at run_agent.py:5673-5696 already converts reasoning → reasoning_content and preserves reasoning_details for the API.

None of these were persisted to the session DB. On gateway reload, all reasoning was lost. The messages table had no columns for any of them.
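The in-memory shape described above can be sketched as follows (field names come from the PR; the values and the persisted-field set are purely illustrative):

```python
# Illustrative sketch of the three in-memory reasoning fields described
# above. Field names are from the PR; values are made up.
assistant_msg = {
    "role": "assistant",
    "content": "The answer is 42.",
    # Plain-text reasoning (DeepSeek, Qwen, Moonshot, Novita)
    "reasoning": "First I considered...",
    # Structured array from OpenRouter (opaque objects with signatures)
    "reasoning_details": [
        {"type": "reasoning.text", "text": "...", "signature": "sig-abc"}
    ],
    # Encrypted blobs for the OpenAI Codex Responses API
    "codex_reasoning_items": ["gAAAAB..."],
}

# Before this PR, the messages table had no columns for any reasoning
# field, so a gateway reload kept only the conventional fields.
persisted_fields = {"role", "content"}
lost = set(assistant_msg) - persisted_fields
```

This is just the failure mode restated as data: everything in `lost` vanished on reload.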

Changes

hermes_state.py — Schema v6:

  • Add reasoning TEXT and reasoning_details TEXT columns to messages table
  • Auto-migration via ALTER TABLE ADD COLUMN (backward-compatible)
  • append_message() accepts reasoning and reasoning_details params
  • get_messages_as_conversation() restores them on assistant messages only
  • reasoning_details is JSON-serialized for storage
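A minimal sketch of what such a backward-compatible migration plus JSON round-trip looks like (table and column names are from the PR; the `migrate_to_v6` helper and the demo table are hypothetical, not the actual hermes_state.py code):

```python
import json
import sqlite3

def migrate_to_v6(conn: sqlite3.Connection) -> None:
    # Hypothetical sketch of the v5 -> v6 auto-migration: add the two new
    # columns only if they are missing, so re-running is a safe no-op.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    for column in ("reasoning", "reasoning_details"):
        if column not in existing:
            conn.execute(f"ALTER TABLE messages ADD COLUMN {column} TEXT")
    conn.commit()

# Demo against an in-memory v5-style table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, role TEXT, content TEXT)")
migrate_to_v6(conn)
migrate_to_v6(conn)  # idempotent: second run adds nothing

# reasoning_details is stored as JSON text, per the PR.
details = [{"type": "reasoning.text", "text": "step 1"}]
conn.execute(
    "INSERT INTO messages (role, content, reasoning, reasoning_details) VALUES (?, ?, ?, ?)",
    ("assistant", "hi", "thought...", json.dumps(details)),
)
row = conn.execute("SELECT reasoning, reasoning_details FROM messages").fetchone()
restored = (row[0], json.loads(row[1]))
```

ALTER TABLE ADD COLUMN leaves existing rows with NULL in the new columns, which is what makes the migration backward-compatible for pre-v6 sessions.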

run_agent.py — _flush_messages_to_session_db():

  • Pass reasoning and reasoning_details for assistant messages

gateway/run.py — agent_history builder:

  • Preserve reasoning fields on non-tool-calling assistant messages (tool-calling messages already passed through all fields via the {k: v for k, v in msg.items() if k != "timestamp"} path)
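The split described in this bullet can be sketched like so (the dict comprehension is quoted from the PR description; the surrounding function and field tuple are hypothetical):

```python
REASONING_FIELDS = ("reasoning", "reasoning_details")  # names from the PR

def to_agent_history(msg: dict) -> dict:
    # Tool-calling assistant messages already passed every field through
    # via this comprehension (quoted in the PR description).
    if msg.get("tool_calls"):
        return {k: v for k, v in msg.items() if k != "timestamp"}
    # Non-tool-calling messages previously kept only role/content; the fix
    # also carries the reasoning fields across for assistant messages.
    out = {"role": msg["role"], "content": msg.get("content", "")}
    if msg["role"] == "assistant":
        for field in REASONING_FIELDS:
            if msg.get(field) is not None:
                out[field] = msg[field]
    return out

kept = to_agent_history(
    {"role": "assistant", "content": "ok", "reasoning": "r", "timestamp": 1}
)
```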

gateway/session.py — append_to_transcript() and rewrite_transcript():

  • Pass reasoning fields through to the DB

Testing

  • 4 new tests: round-trip persistence, reasoning_details JSON serialization, non-leaking to user/tool messages, empty-string handling
  • Full test suite: 6120 passed, 0 failures
  • Live smoke test: agent starts and runs correctly with schema v6
  • Round-trip verified: reasoning + reasoning_details survive write→read cycle

Notes

  • codex_reasoning_items is NOT persisted to the DB — these are session-specific encrypted blobs that may not survive provider changes. They still flow through the in-memory JSONL transcript path.
  • Claude Opus uses <REASONING_SCRATCHPAD> tags embedded in content (not a separate API field), so this change doesn't affect Claude sessions. It benefits models that return structured reasoning: DeepSeek R1, Qwen QwQ, OpenAI o1/o3, Hermes-4 (when multi-turn is supported).
  • Supersedes PR #2941 ("fix: persist and restore assistant reasoning across gateway session turns (#2936)"), which only added reasoning TEXT and was based on an incorrect diagnosis about Hermes-4 multi-turn tool calling.
teknium1 and others added 8 commits March 24, 2026 18:34
… session:end event

The hooks page only documented gateway event hooks (HOOK.yaml system).
The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't
referenced from the hooks page, which was confusing.

Changes:
- hooks.md: Add overview table showing both hook systems
- hooks.md: Add Plugin Hooks section with available hooks, callback
  signatures, and example
- hooks.md: Add missing session:end gateway event (emitted but undocumented)
- hooks.md: Mark pre_llm_call, post_llm_call, on_session_start,
  on_session_end as planned (defined in VALID_HOOKS but not yet invoked)
- hooks.md: Update info box to cross-reference plugin hooks
- hooks.md: Fix heading hierarchy (gateway content as subsections)
- plugins.md: Add cross-reference to hooks page for full details
- plugins.md: Mark planned hooks as (planned)
When session_search is called without a query (or with an empty query),
it now returns metadata for the most recent sessions instead of erroring.
This lets the agent quickly see what was worked on recently without
needing specific keywords.

Returns for each session: session_id, title, source, started_at,
last_active, message_count, preview (first user message).
Zero LLM cost — pure DB query. Current session lineage and child
delegation sessions are excluded.

The agent can then keyword-search specific sessions if it needs
deeper context from any of them.
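The fallback behavior described above can be sketched as follows (the metadata fields are listed in the commit message; the function body, sort key, and limit are hypothetical, not the actual tool implementation):

```python
from dataclasses import dataclass

@dataclass
class SessionMeta:
    # Per-session metadata fields listed in the commit message.
    session_id: str
    title: str
    source: str
    started_at: float
    last_active: float
    message_count: int
    preview: str  # first user message

def session_search(query, sessions, limit=5):
    # Hypothetical sketch of the new fallback: with no query (or an empty
    # one), return the most recently active sessions instead of erroring.
    if not query:
        return sorted(sessions, key=lambda s: s.last_active, reverse=True)[:limit]
    return [s for s in sessions if query.lower() in s.title.lower()]

recent = session_search(None, [
    SessionMeta("a", "old task", "cli", 1.0, 10.0, 3, "hi"),
    SessionMeta("b", "new task", "gateway", 2.0, 20.0, 5, "hey"),
])
```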
- threshold: 0.80 → 0.50 (compress at 50%, not 80%)
- target_ratio: 0.40 → 0.20, now relative to threshold not total context
  (20% of 50% = 10% of context as tail budget)
- summary ceiling: 32K → 12K (Gemini can't output more than ~12K)
- Updated DEFAULT_CONFIG, config display, example config, and tests
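The tail-budget arithmetic in the second bullet works out as follows (numbers are from the commit message; the context-window size and variable names are illustrative):

```python
# Sketch of the compaction arithmetic described above.
context_window = 200_000   # tokens; illustrative size, not from the commit
threshold = 0.50           # compress once 50% of context is used
target_ratio = 0.20        # now relative to the threshold, not total context

# 20% of 50% = 10% of the full context kept as tail budget.
tail_budget = context_window * threshold * target_ratio
```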
* docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event

* fix: browser_vision ignores auxiliary.vision.timeout config

browser_vision called call_llm() without passing a timeout parameter,
so it always used the 30-second default in auxiliary_client.py. This
made vision analysis with local models (llama.cpp, ollama) impossible
since they typically need more than 30s for screenshot analysis.

Now browser_vision reads auxiliary.vision.timeout from config.yaml
(same config key that vision_analyze already uses) and passes it
through to call_llm().

Also bumped the default vision timeout from 30s to 120s in both
browser_vision and vision_analyze — 30s is too aggressive for local
models and the previous default silently failed for anyone running
vision locally.

Fixes user report from GamerGB1988.
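The config lookup described above can be sketched like this (the config key auxiliary.vision.timeout and the 120s default are from the commit message; the helper function is hypothetical):

```python
# Hypothetical sketch of the fix: read auxiliary.vision.timeout from the
# loaded config (the same key vision_analyze already uses) and fall back
# to the new 120s default instead of the old 30s.
DEFAULT_VISION_TIMEOUT = 120  # bumped from 30s, per the commit message

def resolve_vision_timeout(config: dict) -> int:
    return (
        config.get("auxiliary", {})
              .get("vision", {})
              .get("timeout", DEFAULT_VISION_TIMEOUT)
    )

timeout = resolve_vision_timeout({"auxiliary": {"vision": {"timeout": 300}}})
fallback = resolve_vision_timeout({})  # no key set: uses the new default
```

The resolved value would then be passed to call_llm() rather than letting the 30-second default in auxiliary_client.py apply.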
…ed community content

_resolve_trust_level() didn't handle 'agent-created' source, so it
fell through to 'community' trust level. Community policy blocks on
any caution or dangerous findings, which meant common patterns like
curl with env vars, systemctl, crontab, cloudflared references etc.
would block skill creation/patching.

The agent-created policy row already existed in INSTALL_POLICY with
permissive settings (allow caution, ask on dangerous) but was never
reached. Now it is.

Fixes reports of skill_manage being blocked by security scanner.
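The fall-through described above can be sketched as follows (the 'agent-created' source, the permissive policy row, and the community blocking behavior are from the commit message; the data shapes and function are hypothetical):

```python
# Hypothetical sketch of the _resolve_trust_level() fix: map the
# 'agent-created' source to its own policy row instead of falling through
# to the stricter 'community' default.
INSTALL_POLICY = {
    # Illustrative rows; the commit says agent-created allows caution-level
    # findings and asks on dangerous ones, while community blocks on both.
    "agent-created": {"caution": "allow", "dangerous": "ask"},
    "community": {"caution": "block", "dangerous": "block"},
}

def resolve_trust_level(source: str) -> str:
    if source == "agent-created":  # previously unhandled, so it fell through
        return "agent-created"
    return "community"

policy = INSTALL_POLICY[resolve_trust_level("agent-created")]
```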
… partial lines

Updated the reasoning output mechanism to emit complete lines and force-flush long partial lines, ensuring reasoning is visible in real-time even without newlines. This improves user experience during reasoning sessions.
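The emit-complete-lines-and-force-flush behavior can be sketched as a small generator (the flush threshold and function are hypothetical, not the actual output mechanism):

```python
# Hypothetical sketch of the streaming behavior described above: emit
# complete lines as they arrive, and force-flush a long partial line so
# reasoning stays visible even when the model sends no newline.
FLUSH_THRESHOLD = 80  # illustrative cutoff for a "long" partial line

def stream_reasoning(chunks):
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            yield line        # complete line: emit immediately
        if len(buffer) >= FLUSH_THRESHOLD:
            yield buffer      # force-flush a long partial line
            buffer = ""
    if buffer:
        yield buffer          # trailing partial when the stream ends

emitted = list(stream_reasoning(["think", "ing...\nnext ", "step"]))
```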
Add reasoning TEXT, reasoning_details TEXT, and codex_reasoning_items
TEXT columns to the messages table (schema v5->v6). This preserves
assistant reasoning chains across gateway session reloads so all
provider-specific reasoning formats survive the round-trip.

Three reasoning formats are now persisted:
- reasoning: plain text (DeepSeek, Qwen, Moonshot, Novita, OpenRouter)
- reasoning_details: structured array (OpenRouter multi-turn continuity)
- codex_reasoning_items: encrypted blobs (OpenAI Codex Responses API)

Previously, all three existed in-memory during a single session but
were lost on gateway reload.

Changes:
- hermes_state.py: schema v6 migration, append_message() accepts all
  three fields, get_messages_as_conversation() restores them on
  assistant messages
- run_agent.py: _flush_messages_to_session_db() passes all reasoning
  fields through for assistant messages
- gateway/run.py: agent_history builder preserves reasoning fields
  on non-tool-calling assistant messages
- gateway/session.py: append_to_transcript() and rewrite_transcript()
  pass all reasoning fields to the DB
- Tests: 5 new tests for round-trip persistence

Verified against:
- OpenAI Codex direct (codex_reasoning_items round-trip: 868 enc chars)
- OpenRouter -> Anthropic, Google, DeepSeek, Meta, Qwen, Mistral
- Anthropic adapter (strips extra fields by construction)
- Codex Responses API path (replays codex_reasoning_items correctly)
@teknium1 teknium1 force-pushed the hermes/hermes-ac86d935 branch from dfe3e6e to 9a19cd6 Compare March 25, 2026 16:21
@teknium1 teknium1 merged commit 42fec19 into main Mar 25, 2026
3 checks passed
InB4DevOps pushed a commit to InB4DevOps/hermes-agent that referenced this pull request Mar 25, 2026
…sResearch#2974)

feat: persist reasoning across gateway session turns (schema v6)

Tested against OpenAI Codex (direct), Anthropic (direct + OAI-compat), and OpenRouter → 6 backends. All reasoning field types (reasoning, reasoning_details, codex_reasoning_items) round-trip through the DB correctly.
outsourc-e pushed a commit to outsourc-e/hermes-agent that referenced this pull request Mar 26, 2026
StreamOfRon pushed a commit to StreamOfRon/hermes-agent that referenced this pull request Mar 29, 2026

Labels

None yet

1 participant