feat(gateway): surface session config on /new, /reset, and auto-reset#3321
Merged
When a new session starts in the gateway (via /new, /reset, or auto-reset), send the user a summary of the detected configuration:

    ✨ Session reset! Starting fresh.
    ◆ Model: qwen3.5:27b-q4_K_M
    ◆ Provider: custom
    ◆ Context: 8K tokens (config)
    ◆ Endpoint: http://localhost:11434/v1

This makes a misconfigured context length immediately visible. A user running a local 8K model that falls to the 128K default will see:

    ◆ Context: 128K tokens (default — set model.context_length in config to override)

instead of silently getting no compression and degrading responses.

- _format_session_info() resolves model, provider, context length, and endpoint from config + runtime, matching the hygiene code's resolution chain
- Local/custom endpoints shown; cloud endpoints hidden (not useful)
- Context source annotated: config, detected, or default with hint
- Appended to /new and /reset responses, and auto-reset notifications
- 9 tests covering all formatting paths and failure resilience

Addresses the user-facing side of #2708: instead of trying to fix every edge case in context detection, surface the values so users can immediately see when something is wrong.
This was referenced Mar 27, 2026
StreamOfRon pushed a commit to StreamOfRon/hermes-agent that referenced this pull request on Mar 29, 2026
Summary
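When a new session starts in the gateway (via /new, /reset, or auto-reset), the user now sees a summary of the detected configuration:

    ✨ Session reset! Starting fresh.
    ◆ Model: qwen3.5:27b-q4_K_M
    ◆ Provider: custom
    ◆ Context: 8K tokens (config)
    ◆ Endpoint: http://localhost:11434/v1

Why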
Issue #2708 reported that gateway hygiene compression never fires for local models because context length detection silently falls to the 128K default. Two PRs (#2826, #2720) tried to fix edge cases in the detection logic, but live testing showed neither PR fixes the fundamental probe-failure case — when the local server isn't responding, context detection always falls back to 128K regardless.
Instead of chasing edge cases, this surfaces the detected values so the user immediately sees what's wrong:

    ◆ Context: 128K tokens (default — set model.context_length in config to override)
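The fallback behaviour described above can be illustrated with a minimal sketch. This is not the gateway's actual code: `resolve_context_length`, the `probe` callable, and the exact numeric value of the 128K default are assumptions made for illustration. Only the three source labels (config, detected, default) and the `model.context_length` config key come from this PR.

```python
from typing import Callable, Optional, Tuple

# Assumption: the PR only says "128K default"; the exact number may differ.
DEFAULT_CONTEXT_LENGTH = 128_000


def resolve_context_length(
    config: dict,
    probe: Optional[Callable[[], Optional[int]]] = None,
) -> Tuple[int, str]:
    """Return (context_length, source); source is 'config', 'detected', or 'default'."""
    model_cfg = config.get("model")
    if not isinstance(model_cfg, dict):
        model_cfg = {}

    # 1. An explicit model.context_length in the config wins.
    configured = model_cfg.get("context_length")
    if isinstance(configured, int) and configured > 0:
        return configured, "config"

    # 2. Otherwise ask the (hypothetical) probe, e.g. a query to the local server.
    if probe is not None:
        try:
            detected = probe()
        except Exception:
            detected = None  # server not responding: treat as "nothing detected"
        if detected:
            return detected, "detected"

    # 3. Probe missing or failed: fall back to the default. Labelling the
    #    source is what lets the user notice this case.
    return DEFAULT_CONTEXT_LENGTH, "default"
```

With a probe that raises (the local server is down) and no configured value, the sketch returns the default together with the "default" label, which is the case the session summary is meant to expose.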
Changes
- _format_session_info() on GatewayRunner: resolves model, provider, context length, and endpoint from config + runtime (same resolution chain as hygiene code)
- Appended to /new and /reset response messages
- Context source annotated: config, detected, or default with actionable hint

Tests
9 tests in tests/gateway/test_session_info.py covering model name, provider, config context, fallback hint, local vs cloud endpoint visibility, million-token formatting, missing config resilience, and runtime resolution failure.

All 1526 gateway tests pass.
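
For readers who want a concrete picture of the formatting paths listed above, here is a rough standalone sketch. It is not the `_format_session_info()` method from this PR: the function names, signatures, rounding rules, and the rule used to decide when an endpoint counts as local are all assumptions. Only the output line shapes and the fallback hint text are taken from the PR description.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse


def format_tokens(n: int) -> str:
    """Compact token count, e.g. 8192 -> '8K' (rounding rule is an assumption)."""
    if n >= 1_000_000:
        text = f"{n / 1_000_000:.1f}".rstrip("0").rstrip(".")
        return f"{text}M"
    if n >= 1_000:
        return f"{round(n / 1_000)}K"
    return str(n)


def is_local_endpoint(url: str) -> bool:
    """Assumed heuristic: localhost, loopback, or private-range addresses."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a regular DNS name: treat as non-local
    return ip.is_loopback or ip.is_private


def format_session_info(
    model: str,
    provider: str,
    context_length: int,
    context_source: str,
    endpoint: Optional[str] = None,
) -> str:
    """Build the summary lines appended to the session-reset message."""
    if context_source == "default":
        # Hint text copied from the PR description.
        source = "default — set model.context_length in config to override"
    else:
        source = context_source
    lines = [
        f"◆ Model: {model}",
        f"◆ Provider: {provider}",
        f"◆ Context: {format_tokens(context_length)} tokens ({source})",
    ]
    # Assumption: show the endpoint for custom providers or local hosts only.
    if endpoint and (provider == "custom" or is_local_endpoint(endpoint)):
        lines.append(f"◆ Endpoint: {endpoint}")
    return "\n".join(lines)
```

Calling `format_session_info("qwen3.5:27b-q4_K_M", "custom", 8192, "config", "http://localhost:11434/v1")` yields the four ◆ lines from the Summary, while a cloud provider with a public endpoint URL gets no Endpoint line.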