fix(status): surface degraded gateway health for Telegram runtime failures by ajmeese7 · Pull Request #4393 · NousResearch/hermes-agent

ajmeese7 · 2026-04-01T02:58:26Z

Summary

Fix gateway status reporting so Hermes no longer appears healthy when the gateway process is alive but Telegram runtime health is degraded.

This addresses the case where Telegram polling breaks behind the scenes, the gateway service stays running, and hermes status / hermes gateway status misleadingly suggest everything is fine.

Problem

Previously, Hermes could end up in a bad-but-running state:

the gateway process stayed alive
Telegram polling was degraded or dead behind the scenes
hermes status still looked effectively healthy
stale runtime entries could also poison later health output

That made failures on a primary messaging platform hard to detect and forced manual investigation.

What changed

Runtime health tracking

Mark Telegram polling recovery as reconnecting while reconnect attempts are in progress
Return to connected when polling recovery succeeds
Preserve fatal behavior for real unrecoverable failures
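
As a rough illustration of the transitions listed above, here is a minimal sketch. It is not the code in this PR: `PlatformState`, `UnrecoverableError`, `recover_polling`, and the `start_polling` / `record_state` callables are hypothetical names chosen for the example, and the retry count is an assumption.

```python
from enum import Enum


class PlatformState(str, Enum):
    # State names mirror the ones used in this description.
    CONNECTED = "connected"
    RECONNECTING = "reconnecting"
    DISCONNECTED = "disconnected"
    FATAL = "fatal"


class UnrecoverableError(Exception):
    """Hypothetical marker for failures that retrying cannot fix."""


def recover_polling(start_polling, record_state, max_attempts=3):
    """Retry start_polling, recording the runtime state at each transition."""
    record_state(PlatformState.RECONNECTING)
    for _ in range(max_attempts):
        try:
            start_polling()
        except UnrecoverableError:
            # Real unrecoverable failures keep their fatal behavior.
            record_state(PlatformState.FATAL)
            raise
        except ConnectionError:
            continue  # transient failure: stay in RECONNECTING and retry
        record_state(PlatformState.CONNECTED)
        return True
    return False  # still RECONNECTING; the caller decides whether to reschedule
```

The point of the sketch is that the persisted state says `reconnecting` for the whole recovery window, so a status command run during that window has something truthful to report.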

Runtime state hygiene

Reset persisted platform runtime state on gateway startup
Record the enabled platform set for the current process
Prevent stale platform entries from previous runs from affecting current status
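
One way to picture the startup reset is sketched below. The file layout, the key names (`enabled_platforms`, `platforms`), and the `reset_runtime_state` function are assumptions made for illustration; the actual persistence format used by Hermes may differ.

```python
import json
from pathlib import Path


def reset_runtime_state(path, enabled_platforms):
    """Overwrite persisted runtime state with a fresh record for this process."""
    state = {
        # Record which platforms this process actually enabled.
        "enabled_platforms": sorted(enabled_platforms),
        # Entries from earlier runs are dropped rather than merged.
        "platforms": {},
    }
    Path(path).write_text(json.dumps(state, indent=2))
    return state
```

Overwriting instead of merging is the design choice that matters here: a platform that was `fatal` in a previous run cannot leak into the current run's status.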

Status classification

Distinguish service liveness from runtime health
Only degrade overall status for relevant configured messaging platforms
Treat reconnecting / fatal as meaningful runtime problems
Do not degrade overall health for plain disconnected entries
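
The classification rules above can be sketched as a small pure function. This is an illustrative sketch, not the PR's implementation: `classify_health`, `PROBLEM_STATES`, and the `"stopped"` / `"healthy"` labels are hypothetical, while `reconnecting`, `fatal`, `disconnected`, and `degraded` come from this description.

```python
PROBLEM_STATES = {"reconnecting", "fatal"}


def classify_health(process_alive, enabled_platforms, platform_states):
    """Return (overall, problems), keeping liveness and runtime health separate."""
    if not process_alive:
        return "stopped", {}
    problems = {
        name: state
        for name, state in platform_states.items()
        # Ignore platforms that are not enabled for this process, and
        # ignore plain "disconnected" entries.
        if name in enabled_platforms and state in PROBLEM_STATES
    }
    return ("degraded" if problems else "healthy"), problems
```

For example, a running gateway with Telegram in `reconnecting` would classify as degraded, while a stale `fatal` entry for a platform that is not enabled would not affect the result.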

CLI output

Show platform runtime details in hermes status
Show degraded runtime state in hermes status / hermes gateway status
Render warning icons in yellow at the CLI layer
Keep runtime status helpers presentation-agnostic
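
To show what "presentation-agnostic" means in practice, here is a hedged sketch that splits a plain-text helper from the CLI rendering. The function names, the `use_color` flag, and the exact output format are assumptions for illustration; only the yellow warning icon at the CLI layer comes from this description.

```python
YELLOW = "\033[33m"  # standard ANSI escape for yellow foreground
RESET = "\033[0m"


def runtime_summary(problems):
    """Presentation-agnostic helper: plain text lines, no colors or icons."""
    return [f"{name}: {state}" for name, state in sorted(problems.items())]


def render_status(problems, use_color=True):
    """CLI layer: adds the warning icon and, optionally, yellow coloring."""
    icon = "⚠"
    if use_color:
        icon = f"{YELLOW}{icon}{RESET}"
    return "\n".join(f"{icon} {line}" for line in runtime_summary(problems))
```

Keeping color codes out of the helper means the same summary can be reused in logs or tests without stripping escape sequences.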

Example outcome

Before:

Gateway process running
Telegram broken
Status output still looked healthy, which was misleading

After:

Gateway process can be running
Runtime can independently be degraded
Status clearly surfaces Telegram reconnect/failure state when it matters

Tests

Targeted tests added/updated for:

runtime health classification
stale state reset on startup
Telegram reconnect state reporting
CLI degraded status rendering
filtering irrelevant/stale platforms out of health output

Example targeted run:

source venv/bin/activate && python -m pytest \
  tests/gateway/test_status.py \
  tests/gateway/test_telegram_runtime_health.py \
  tests/hermes_cli/test_gateway_runtime_health.py \
  tests/hermes_cli/test_status.py \
  tests/hermes_cli/test_gateway.py \
  tests/hermes_cli/test_status_model_provider.py \
  tests/gateway/test_runner_startup_failures.py -q

All targeted tests passed locally.

Related work

This complements prior Telegram gateway reliability fixes, especially:

fix(telegram): auto-reconnect polling after network interruption #2517 — Telegram polling auto-reconnect after network interruption
Gateway crashes on Telegram Bad Gateway (502) — reconnect loop fails #3173 / fix(telegram): self-reschedule reconnect when start_polling fails after 502 #3268 — Telegram reconnect loop could fail after 502/start_polling errors
Telegram message delivery failure not surfaced to user - appears as 'hang/crash' #2910 / fix(gateway): retry transient send failures and notify user on exhaustion #3288 — send failures were not surfaced clearly to users

This PR focuses on a separate but related problem: making hermes status and hermes gateway status accurately reflect degraded runtime health when the gateway process is still alive.

ajmeese7 added 2 commits March 31, 2026 22:54

fix(status): surface degraded gateway health for Telegram runtime failures

52966f2

Merge remote-tracking branch 'origin/main' into fix/telegram-health-status

a9652e2

ajmeese7 wants to merge 2 commits into NousResearch:main from ajmeese7:fix/telegram-health-status
