feat(gateway): surface session config on /new, /reset, and auto-reset #3321

Merged
teknium1 merged 1 commit into main from hermes/hermes-64c3ceb2
Mar 27, 2026
@teknium1 (Contributor)

Summary

When a new session starts in the gateway (via /new, /reset, or auto-reset), the user now sees a summary of the detected configuration:

✨ Session reset! Starting fresh.

◆ Model: `qwen3.5:27b-q4_K_M`
◆ Provider: custom
◆ Context: 8K tokens (config)
◆ Endpoint: http://localhost:11434/v1

Why

Issue #2708 reported that gateway hygiene compression never fires for local models because context length detection silently falls back to the 128K default. Two PRs (#2826, #2720) tried to fix edge cases in the detection logic, but live testing showed that neither fixes the fundamental probe-failure case: when the local server isn't responding, context detection always falls back to 128K.

Instead of chasing edge cases, this PR surfaces the detected values so the user immediately sees what's wrong:

◆ Context: 128K tokens (default — set model.context_length in config to override)
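
The fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: the function name `resolve_context_length`, the `probe` callable, the config-first ordering, and the exact value `128_000` are all assumptions made for the example. Only the three source labels (config, detected, default) come from this PR.

```python
from typing import Callable, Optional, Tuple

# Assumed numeric value for the "128K default" mentioned in the PR.
DEFAULT_CONTEXT_LENGTH = 128_000


def resolve_context_length(
    config: dict,
    probe: Callable[[], Optional[int]],
) -> Tuple[int, str]:
    """Return (context_length, source), where source is one of
    'config', 'detected', or 'default'.

    `probe` is a hypothetical stand-in for whatever asks the model
    server for its context window; it may return None or raise.
    """
    # 1. An explicit model.context_length in config takes priority.
    configured = (config.get("model") or {}).get("context_length")
    if isinstance(configured, int) and configured > 0:
        return configured, "config"

    # 2. Otherwise try to detect it; a dead local server must not crash us.
    try:
        detected = probe()
    except Exception:
        detected = None
    if detected:
        return detected, "detected"

    # 3. Probe failed or returned nothing: fall back to the default.
    return DEFAULT_CONTEXT_LENGTH, "default"
```

When the probe fails and no value is configured, the caller gets `"default"` back as the source, which is the case the hint line above is meant to make visible.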

Changes

  • _format_session_info() on GatewayRunner — resolves model, provider, context length, and endpoint from config + runtime (same resolution chain as hygiene code)
  • Appended to /new and /reset response messages
  • Appended to auto-reset notifications (idle timeout, daily reset)
  • Local/custom endpoints shown; cloud endpoints hidden
  • Context source annotated: config, detected, or default with actionable hint
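
To make the formatting rules in this list concrete, here is a hedged sketch of what such a helper might look like. It is not the real `_format_session_info()` method: the standalone function signature, the helpers `format_context_length` and `is_local_endpoint`, the 1000-based K/M rounding, and the loopback/private-address heuristic for deciding what counts as "local" are all assumptions for illustration.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlparse


def format_context_length(tokens: int) -> str:
    """Compact token count, e.g. 8000 -> '8K' (assumed 1000-based units)."""
    if tokens >= 1_000_000:
        return f"{round(tokens / 1_000_000, 1):g}M"
    if tokens >= 1_000:
        return f"{round(tokens / 1_000, 1):g}K"
    return str(tokens)


def is_local_endpoint(url: str) -> bool:
    """Assumed heuristic: localhost, loopback, or private-range addresses."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name such as a cloud API host
    return ip.is_loopback or ip.is_private


def format_session_info(
    model: str,
    provider: str,
    context_length: int,
    context_source: str,  # 'config', 'detected', or 'default'
    endpoint: Optional[str] = None,
) -> str:
    if context_source == "default":
        # \u2014 is the dash used in the hint text shown in this PR.
        note = "default \u2014 set model.context_length in config to override"
    else:
        note = context_source
    lines = [
        f"◆ Model: `{model}`",
        f"◆ Provider: {provider}",
        f"◆ Context: {format_context_length(context_length)} tokens ({note})",
    ]
    # Only local/custom endpoints are shown; cloud endpoints are omitted.
    if endpoint and is_local_endpoint(endpoint):
        lines.append(f"◆ Endpoint: {endpoint}")
    return "\n".join(lines)
```

Calling `format_session_info("qwen3.5:27b-q4_K_M", "custom", 8000, "config", "http://localhost:11434/v1")` yields the four lines shown in the Summary, while passing a cloud URL drops the Endpoint line.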

Tests

9 tests in tests/gateway/test_session_info.py covering model name, provider, config context, fallback hint, local vs cloud endpoint visibility, million-token formatting, missing config resilience, and runtime resolution failure.

All 1526 gateway tests pass.

When a new session starts in the gateway (via /new, /reset, or
auto-reset), send the user a summary of the detected configuration:

  ✨ Session reset! Starting fresh.

  ◆ Model: qwen3.5:27b-q4_K_M
  ◆ Provider: custom
  ◆ Context: 8K tokens (config)
  ◆ Endpoint: http://localhost:11434/v1

This makes a misconfigured context length immediately visible. A user
running a local 8K model that falls back to the 128K default will see:

  ◆ Context: 128K tokens (default — set model.context_length in config to override)

Previously, that user would silently get no compression and degraded responses.

- _format_session_info() resolves model, provider, context length,
  and endpoint from config + runtime, matching the hygiene code's
  resolution chain
- Local/custom endpoints shown; cloud endpoints hidden (not useful)
- Context source annotated: config, detected, or default with hint
- Appended to /new and /reset responses, and auto-reset notifications
- 9 tests covering all formatting paths and failure resilience

Addresses the user-facing side of #2708 — instead of trying to fix
every edge case in context detection, surface the values so users
can immediately see when something is wrong.
teknium1 merged commit 58ca875 into main on Mar 27, 2026
1 of 2 checks passed
StreamOfRon pushed a commit to StreamOfRon/hermes-agent that referenced this pull request Mar 29, 2026
…NousResearch#3321)
