Telegram message delivery failure not surfaced to user - appears as 'hang/crash' #2910

@jxtomxu


🐛 Bug Description

When sending a Telegram message fails because of a network error (e.g., "httpx.ConnectError"), the failure is logged but never surfaced to the user. The user then assumes the agent has "hung" or "crashed", when in fact the response was generated and only its delivery failed.

Real-world Impact

  • User waited 1+ hour for a response that was already generated but failed to send
  • User eventually sent "?卡住了?" ("Hung?") because no feedback was provided
  • The agent had actually responded within seconds, but the user never received it

📋 Reproduction Steps

  1. Start a conversation with Hermes via Telegram
  2. Trigger a network interruption during response delivery (e.g., transient connectivity issue)
  3. Observe that:
    • Agent generates response (visible in logs)
    • "[Telegram] Sending response..." appears in logs
    • "[Telegram] Failed to send response: httpx.ConnectError" is logged
    • User receives no notification of the failure
    • User sees no retry attempt

📜 Log Evidence

From ~/.hermes/logs/gateway.log:

# Response was generated and attempt was made to send
2026-03-25 09:50:19,203 INFO gateway.platforms.base: [Telegram] Sending response (222 chars) to 5570584365

# But connection failed
[Telegram] Failed to send response: httpx.ConnectError: 

According to the session transcript, the response was generated at 08:45:37, but the user received nothing until they asked at 09:50:19, more than an hour later.


🔍 Root Cause Analysis

In gateway/platforms/telegram.py (and likely other adapters), when send() fails:

  1. Exception is caught and logged
  2. No retry mechanism is triggered
  3. No user notification is sent
  4. The failure is effectively silent from user's perspective
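
The pattern described above can be sketched as follows. This is a minimal illustration only, not the actual code in gateway/platforms/telegram.py: the function name `silent_send`, the logger name, and the use of the built-in `ConnectionError` as a stand-in for "httpx.ConnectError" are all assumptions made for the example.

```python
import asyncio
import logging

logger = logging.getLogger("example.gateway")  # hypothetical logger name


async def silent_send(send, text: str) -> None:
    """Hypothetical adapter method showing the 'catch and log only' pattern."""
    logger.info("[Telegram] Sending response (%d chars)", len(text))
    try:
        await send(text)
    except ConnectionError as exc:  # stand-in for the adapter's network error
        # The failure goes to the log file only. Nothing is returned or
        # raised, so the caller cannot tell a failure from a success.
        logger.error("[Telegram] Failed to send response: %r", exc)
```

Because the function returns None on both paths, no code further up the stack has a signal it could use to retry or to notify the user.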

💡 Proposed Solutions

Option A: Automatic Retry with Exponential Backoff

  • Retry failed sends 2-3 times with exponential backoff
  • Only surface failure to user after retries exhausted
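
A minimal sketch of Option A, assuming the adapter's send call can be wrapped in a zero-argument coroutine function. The helper name `send_with_retry` and its parameters are hypothetical, and the built-in `ConnectionError` is only a default placeholder: a real adapter would pass the exception type it actually sees (the logs above show "httpx.ConnectError") via `retry_on`.

```python
import asyncio
from typing import Awaitable, Callable


async def send_with_retry(
    send: Callable[[], Awaitable[None]],
    *,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: tuple = (ConnectionError,),  # placeholder; pass the real error type
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> bool:
    """Try `send` up to `max_attempts` times; return True once it succeeds."""
    for attempt in range(1, max_attempts + 1):
        try:
            await send()
            return True
        except retry_on:
            if attempt == max_attempts:
                break  # retries exhausted; the caller should surface the failure
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await sleep(base_delay * 2 ** (attempt - 1))
    return False
```

The boolean return value gives the caller something to act on: a False result is the point at which the failure would be surfaced to the user. The `sleep` parameter exists only so the backoff can be tested without real delays.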

Option B: Immediate User Notification

  • If send fails, immediately notify user:
    "⚠️ Message delivery failed. Retrying..."
  • Then retry or offer manual retry command
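
Option B could look roughly like the sketch below. The function `deliver_or_notify` and its two callables are hypothetical names, and `ConnectionError` again stands in for whatever exception the adapter raises.

```python
import asyncio
import logging

logger = logging.getLogger("example.gateway")  # hypothetical logger name

FAILURE_NOTICE = "⚠️ Message delivery failed. Retrying..."


async def deliver_or_notify(send_response, send_notice,
                            retry_on=(ConnectionError,)) -> bool:
    """Send the response; if that fails, try to tell the user about it."""
    try:
        await send_response()
        return True
    except retry_on as exc:
        logger.error("Failed to send response: %r", exc)
    try:
        # The notice uses the same channel, so it can fail for the same reason.
        await send_notice(FAILURE_NOTICE)
    except retry_on as exc:
        logger.error("Failed to send failure notice: %r", exc)
    return False
```

One caveat with this option: when the network is down, the short notice is likely to fail too, so it probably works best combined with a retry or a queue rather than on its own.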

Option C: Queue for Later Delivery

  • Store failed sends in a retry queue
  • Attempt redelivery when connection recovers
  • Notify user of queued messages
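
A minimal in-memory sketch of Option C. The `RetryQueue` and `PendingMessage` names are hypothetical, the queue is not persisted across restarts, and `send` is assumed to be a coroutine function taking a chat ID and text and raising `ConnectionError` (a stand-in for the real error type) while the connection is down.

```python
import asyncio
from collections import deque
from dataclasses import dataclass


@dataclass
class PendingMessage:
    chat_id: int
    text: str


class RetryQueue:
    """Holds messages whose delivery failed, in original order."""

    def __init__(self) -> None:
        self._pending = deque()

    def add(self, chat_id: int, text: str) -> None:
        self._pending.append(PendingMessage(chat_id, text))

    def __len__(self) -> int:
        return len(self._pending)

    async def flush(self, send) -> int:
        """Redeliver queued messages in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            msg = self._pending[0]
            try:
                await send(msg.chat_id, msg.text)
            except ConnectionError:
                break  # still offline; keep this and later messages queued
            self._pending.popleft()  # remove only after a successful send
            delivered += 1
        return delivered
```

A message is removed only after its send succeeds, so a failure in the middle of a flush leaves the remaining messages queued in order. The count returned by `flush` could feed the "notify user of queued messages" step.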

🎯 Priority

Medium-High: this significantly degrades UX and can lead users to abandon sessions, believing the service is broken.


📝 Environment

  • Hermes Agent version: Latest (commit 2233f76)
  • Platform: Telegram
  • Python: 3.11
  • OS: macOS (from logs)

Related Issues

This may be related to general gateway reliability improvements and error handling enhancements.
