
feat: persist reasoning across gateway session turns (schema v6)#2974

Merged
teknium1 merged 8 commits into main from hermes/hermes-ac86d935 on Mar 25, 2026

Conversation

@teknium1
Contributor

Summary

Adds reasoning TEXT and reasoning_details TEXT columns to the messages table (schema v5→v6). This preserves assistant reasoning chains across gateway session reloads so providers that replay reasoning receive coherent multi-turn context.

Problem

Three reasoning fields exist on in-memory assistant messages:

  • msg["reasoning"] — plain text (DeepSeek, Qwen, Moonshot, Novita)
  • msg["reasoning_details"] — structured array from OpenRouter (opaque objects with signatures)
  • msg["codex_reasoning_items"] — encrypted blobs for OpenAI Codex Responses API

All three flow correctly within a single CLI session. The existing provider-compatibility code at run_agent.py:5673-5696 already converts reasoning → reasoning_content and preserves reasoning_details for the API.

None of these were persisted to the session DB. On gateway reload, all reasoning was lost. The messages table had no columns for any of them.
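The in-memory shape described above can be sketched as follows (field names come from the PR; the values and the persisted-field set are purely illustrative):

```python
# Illustrative sketch of the three in-memory reasoning fields described
# above. Field names are from the PR; values are made up.
assistant_msg = {
    "role": "assistant",
    "content": "The answer is 42.",
    # Plain-text reasoning (DeepSeek, Qwen, Moonshot, Novita)
    "reasoning": "First I considered...",
    # Structured array from OpenRouter (opaque objects with signatures)
    "reasoning_details": [
        {"type": "reasoning.text", "text": "...", "signature": "sig-abc"}
    ],
    # Encrypted blobs for the OpenAI Codex Responses API
    "codex_reasoning_items": ["gAAAAB..."],
}

# Before this PR, the messages table had no columns for any reasoning
# field, so a gateway reload kept only the conventional fields.
persisted_fields = {"role", "content"}
lost = set(assistant_msg) - persisted_fields
```

This is just the failure mode restated as data: everything in `lost` vanished on reload.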

Changes

hermes_state.py — Schema v6:

  • Add reasoning TEXT and reasoning_details TEXT columns to messages table
  • Auto-migration via ALTER TABLE ADD COLUMN (backward-compatible)
  • append_message() accepts reasoning and reasoning_details params
  • get_messages_as_conversation() restores them on assistant messages only
  • reasoning_details is JSON-serialized for storage
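A minimal sketch of what such a backward-compatible migration plus JSON round-trip looks like (table and column names are from the PR; the `migrate_to_v6` helper and the demo table are hypothetical, not the actual hermes_state.py code):

```python
import json
import sqlite3

def migrate_to_v6(conn: sqlite3.Connection) -> None:
    # Hypothetical sketch of the v5 -> v6 auto-migration: add the two new
    # columns only if they are missing, so re-running is a safe no-op.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(messages)")}
    for column in ("reasoning", "reasoning_details"):
        if column not in existing:
            conn.execute(f"ALTER TABLE messages ADD COLUMN {column} TEXT")
    conn.commit()

# Demo against an in-memory v5-style table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, role TEXT, content TEXT)")
migrate_to_v6(conn)
migrate_to_v6(conn)  # idempotent: second run adds nothing

# reasoning_details is stored as JSON text, per the PR.
details = [{"type": "reasoning.text", "text": "step 1"}]
conn.execute(
    "INSERT INTO messages (role, content, reasoning, reasoning_details) VALUES (?, ?, ?, ?)",
    ("assistant", "hi", "thought...", json.dumps(details)),
)
row = conn.execute("SELECT reasoning, reasoning_details FROM messages").fetchone()
restored = (row[0], json.loads(row[1]))
```

ALTER TABLE ADD COLUMN leaves existing rows with NULL in the new columns, which is what makes the migration backward-compatible for pre-v6 sessions.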

run_agent.py — _flush_messages_to_session_db():

  • Pass reasoning and reasoning_details for assistant messages

gateway/run.py — agent_history builder:

  • Preserve reasoning fields on non-tool-calling assistant messages (tool-calling messages already passed through all fields via the {k: v for k, v in msg.items() if k != "timestamp"} path)
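The split described in this bullet can be sketched like so (the dict comprehension is quoted from the PR description; the surrounding function and field tuple are hypothetical):

```python
REASONING_FIELDS = ("reasoning", "reasoning_details")  # names from the PR

def to_agent_history(msg: dict) -> dict:
    # Tool-calling assistant messages already passed every field through
    # via this comprehension (quoted in the PR description).
    if msg.get("tool_calls"):
        return {k: v for k, v in msg.items() if k != "timestamp"}
    # Non-tool-calling messages previously kept only role/content; the fix
    # also carries the reasoning fields across for assistant messages.
    out = {"role": msg["role"], "content": msg.get("content", "")}
    if msg["role"] == "assistant":
        for field in REASONING_FIELDS:
            if msg.get(field) is not None:
                out[field] = msg[field]
    return out

kept = to_agent_history(
    {"role": "assistant", "content": "ok", "reasoning": "r", "timestamp": 1}
)
```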

gateway/session.py — append_to_transcript() and rewrite_transcript():

  • Pass reasoning fields through to the DB

Testing

  • 4 new tests: round-trip persistence, reasoning_details JSON serialization, non-leaking to user/tool messages, empty-string handling
  • Full test suite: 6120 passed, 0 failures
  • Live smoke test: agent starts and runs correctly with schema v6
  • Round-trip verified: reasoning + reasoning_details survive write→read cycle

Notes

  • codex_reasoning_items is NOT persisted to the DB — these are session-specific encrypted blobs that may not survive provider changes. They still flow through the in-memory JSONL transcript path.
  • Claude Opus uses <REASONING_SCRATCHPAD> tags embedded in content (not a separate API field), so this change doesn't affect Claude sessions. It benefits models that return structured reasoning: DeepSeek R1, Qwen QwQ, OpenAI o1/o3, Hermes-4 (when multi-turn is supported).
  • Supersedes PR #2941 ("fix: persist and restore assistant reasoning across gateway session turns (#2936)"), which only added reasoning TEXT and was based on an incorrect diagnosis about Hermes-4 multi-turn tool calling.
teknium1 and others added 8 commits March 24, 2026 18:34
… session:end event

The hooks page only documented gateway event hooks (HOOK.yaml system).
The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't
referenced from the hooks page, which was confusing.

Changes:
- hooks.md: Add overview table showing both hook systems
- hooks.md: Add Plugin Hooks section with available hooks, callback
  signatures, and example
- hooks.md: Add missing session:end gateway event (emitted but undocumented)
- hooks.md: Mark pre_llm_call, post_llm_call, on_session_start,
  on_session_end as planned (defined in VALID_HOOKS but not yet invoked)
- hooks.md: Update info box to cross-reference plugin hooks
- hooks.md: Fix heading hierarchy (gateway content as subsections)
- plugins.md: Add cross-reference to hooks page for full details
- plugins.md: Mark planned hooks as (planned)
When session_search is called without a query (or with an empty query),
it now returns metadata for the most recent sessions instead of erroring.
This lets the agent quickly see what was worked on recently without
needing specific keywords.

Returns for each session: session_id, title, source, started_at,
last_active, message_count, preview (first user message).
Zero LLM cost — pure DB query. Current session lineage and child
delegation sessions are excluded.

The agent can then keyword-search specific sessions if it needs
deeper context from any of them.
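The fallback behavior described above can be sketched as follows (the metadata fields are listed in the commit message; the function body, sort key, and limit are hypothetical, not the actual tool implementation):

```python
from dataclasses import dataclass

@dataclass
class SessionMeta:
    # Per-session metadata fields listed in the commit message.
    session_id: str
    title: str
    source: str
    started_at: float
    last_active: float
    message_count: int
    preview: str  # first user message

def session_search(query, sessions, limit=5):
    # Hypothetical sketch of the new fallback: with no query (or an empty
    # one), return the most recently active sessions instead of erroring.
    if not query:
        return sorted(sessions, key=lambda s: s.last_active, reverse=True)[:limit]
    return [s for s in sessions if query.lower() in s.title.lower()]

recent = session_search(None, [
    SessionMeta("a", "old task", "cli", 1.0, 10.0, 3, "hi"),
    SessionMeta("b", "new task", "gateway", 2.0, 20.0, 5, "hey"),
])
```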
- threshold: 0.80 → 0.50 (compress at 50%, not 80%)
- target_ratio: 0.40 → 0.20, now relative to threshold not total context
  (20% of 50% = 10% of context as tail budget)
- summary ceiling: 32K → 12K (Gemini can't output more than ~12K)
- Updated DEFAULT_CONFIG, config display, example config, and tests
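The tail-budget arithmetic in the second bullet works out as follows (numbers are from the commit message; the context-window size and variable names are illustrative):

```python
# Sketch of the compaction arithmetic described above.
context_window = 200_000   # tokens; illustrative size, not from the commit
threshold = 0.50           # compress once 50% of context is used
target_ratio = 0.20        # now relative to the threshold, not total context

# 20% of 50% = 10% of the full context kept as tail budget.
tail_budget = context_window * threshold * target_ratio
```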
* docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event

* fix: browser_vision ignores auxiliary.vision.timeout config

browser_vision called call_llm() without passing a timeout parameter,
so it always used the 30-second default in auxiliary_client.py. This
made vision analysis with local models (llama.cpp, ollama) impossible
since they typically need more than 30s for screenshot analysis.

Now browser_vision reads auxiliary.vision.timeout from config.yaml
(same config key that vision_analyze already uses) and passes it
through to call_llm().

Also bumped the default vision timeout from 30s to 120s in both
browser_vision and vision_analyze — 30s is too aggressive for local
models and the previous default silently failed for anyone running
vision locally.

Fixes user report from GamerGB1988.
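The config lookup described above can be sketched like this (the config key auxiliary.vision.timeout and the 120s default are from the commit message; the helper function is hypothetical):

```python
# Hypothetical sketch of the fix: read auxiliary.vision.timeout from the
# loaded config (the same key vision_analyze already uses) and fall back
# to the new 120s default instead of the old 30s.
DEFAULT_VISION_TIMEOUT = 120  # bumped from 30s, per the commit message

def resolve_vision_timeout(config: dict) -> int:
    return (
        config.get("auxiliary", {})
              .get("vision", {})
              .get("timeout", DEFAULT_VISION_TIMEOUT)
    )

timeout = resolve_vision_timeout({"auxiliary": {"vision": {"timeout": 300}}})
fallback = resolve_vision_timeout({})  # no key set: uses the new default
```

The resolved value would then be passed to call_llm() rather than letting the 30-second default in auxiliary_client.py apply.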
…ed community content

_resolve_trust_level() didn't handle 'agent-created' source, so it
fell through to 'community' trust level. Community policy blocks on
any caution or dangerous findings, which meant common patterns like
curl with env vars, systemctl, crontab, cloudflared references etc.
would block skill creation/patching.

The agent-created policy row already existed in INSTALL_POLICY with
permissive settings (allow caution, ask on dangerous) but was never
reached. Now it is.

Fixes reports of skill_manage being blocked by security scanner.
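The fall-through described above can be sketched as follows (the 'agent-created' source, the permissive policy row, and the community blocking behavior are from the commit message; the data shapes and function are hypothetical):

```python
# Hypothetical sketch of the _resolve_trust_level() fix: map the
# 'agent-created' source to its own policy row instead of falling through
# to the stricter 'community' default.
INSTALL_POLICY = {
    # Illustrative rows; the commit says agent-created allows caution-level
    # findings and asks on dangerous ones, while community blocks on both.
    "agent-created": {"caution": "allow", "dangerous": "ask"},
    "community": {"caution": "block", "dangerous": "block"},
}

def resolve_trust_level(source: str) -> str:
    if source == "agent-created":  # previously unhandled, so it fell through
        return "agent-created"
    return "community"

policy = INSTALL_POLICY[resolve_trust_level("agent-created")]
```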
… partial lines

Updated the reasoning output mechanism to emit complete lines and force-flush long partial lines, ensuring reasoning is visible in real-time even without newlines. This improves user experience during reasoning sessions.
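The emit-complete-lines-and-force-flush behavior can be sketched as a small generator (the flush threshold and function are hypothetical, not the actual output mechanism):

```python
# Hypothetical sketch of the streaming behavior described above: emit
# complete lines as they arrive, and force-flush a long partial line so
# reasoning stays visible even when the model sends no newline.
FLUSH_THRESHOLD = 80  # illustrative cutoff for a "long" partial line

def stream_reasoning(chunks):
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            yield line        # complete line: emit immediately
        if len(buffer) >= FLUSH_THRESHOLD:
            yield buffer      # force-flush a long partial line
            buffer = ""
    if buffer:
        yield buffer          # trailing partial when the stream ends

emitted = list(stream_reasoning(["think", "ing...\nnext ", "step"]))
```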
Add reasoning TEXT, reasoning_details TEXT, and codex_reasoning_items
TEXT columns to the messages table (schema v5->v6). This preserves
assistant reasoning chains across gateway session reloads so all
provider-specific reasoning formats survive the round-trip.

Three reasoning formats are now persisted:
- reasoning: plain text (DeepSeek, Qwen, Moonshot, Novita, OpenRouter)
- reasoning_details: structured array (OpenRouter multi-turn continuity)
- codex_reasoning_items: encrypted blobs (OpenAI Codex Responses API)

Previously, all three existed in-memory during a single session but
were lost on gateway reload.

Changes:
- hermes_state.py: schema v6 migration, append_message() accepts all
  three fields, get_messages_as_conversation() restores them on
  assistant messages
- run_agent.py: _flush_messages_to_session_db() passes all reasoning
  fields through for assistant messages
- gateway/run.py: agent_history builder preserves reasoning fields
  on non-tool-calling assistant messages
- gateway/session.py: append_to_transcript() and rewrite_transcript()
  pass all reasoning fields to the DB
- Tests: 5 new tests for round-trip persistence

Verified against:
- OpenAI Codex direct (codex_reasoning_items round-trip: 868 enc chars)
- OpenRouter -> Anthropic, Google, DeepSeek, Meta, Qwen, Mistral
- Anthropic adapter (strips extra fields by construction)
- Codex Responses API path (replays codex_reasoning_items correctly)
@teknium1 teknium1 force-pushed the hermes/hermes-ac86d935 branch from dfe3e6e to 9a19cd6 Compare March 25, 2026 16:21
@teknium1 teknium1 merged commit 42fec19 into main Mar 25, 2026
3 checks passed
InB4DevOps pushed a commit to InB4DevOps/hermes-agent that referenced this pull request Mar 25, 2026
…sResearch#2974)

feat: persist reasoning across gateway session turns (schema v6)

Tested against OpenAI Codex (direct), Anthropic (direct + OAI-compat), and OpenRouter → 6 backends. All reasoning field types (reasoning, reasoning_details, codex_reasoning_items) round-trip through the DB correctly.
outsourc-e pushed a commit to outsourc-e/hermes-agent that referenced this pull request Mar 26, 2026
StreamOfRon pushed a commit to StreamOfRon/hermes-agent that referenced this pull request Mar 29, 2026

Labels

None yet

1 participant