fix(gateway): persist watcher metadata in checkpoint for crash recovery (salvaged #1573) #1706

Merged
teknium1 merged 1 commit into main from hermes/hermes-5a9e8a78
Mar 17, 2026

@teknium1
Contributor

Summary

Salvaged from PR #1573 by @eren-karakus0. Cherry-picked cleanly onto current main with authorship preserved.

Fixes #1143 — Background process notifications were lost after a gateway restart because the checkpoint file didn't persist watcher metadata (platform, chat_id, thread_id, check_interval).

What changed

  • Add watcher_platform, watcher_chat_id, watcher_thread_id, watcher_interval fields to ProcessSession
  • Persist these in _write_checkpoint() and restore in recover_from_checkpoint()
  • Re-enqueue recovered watchers into pending_watchers when watcher_interval > 0
  • Drain pending_watchers at gateway startup (after adapters connect)
  • Store watcher metadata on ProcessSession when watcher is created in terminal_tool.py
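
The persistence and recovery steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the JSON checkpoint layout, the `session_id` and `command` fields, and the shape of the `pending_watchers` entries are assumptions made for the example. Only the four `watcher_*` field names, the `recover_from_checkpoint` name, and the `watcher_interval > 0` condition come from the description.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ProcessSession:
    # Hypothetical minimal shape; the real class carries more state.
    session_id: str
    command: str = ""
    # Watcher metadata named in the PR description.
    watcher_platform: str = ""
    watcher_chat_id: str = ""
    watcher_thread_id: str = ""
    watcher_interval: int = 0  # 0 means no watcher is attached


def write_checkpoint(path, sessions):
    """Serialize every session, including watcher metadata (assumed JSON format)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump([asdict(s) for s in sessions], fh)


def recover_from_checkpoint(path):
    """Rebuild sessions and collect watchers that must be restarted."""
    with open(path, encoding="utf-8") as fh:
        entries = json.load(fh)

    sessions, pending_watchers = [], []
    for entry in entries:
        # .get() with defaults keeps older checkpoints without watcher fields readable.
        session = ProcessSession(
            session_id=entry["session_id"],
            command=entry.get("command", ""),
            watcher_platform=entry.get("watcher_platform", ""),
            watcher_chat_id=entry.get("watcher_chat_id", ""),
            watcher_thread_id=entry.get("watcher_thread_id", ""),
            watcher_interval=entry.get("watcher_interval", 0),
        )
        sessions.append(session)
        # Only sessions that had a watcher are re-enqueued.
        if session.watcher_interval > 0:
            pending_watchers.append({
                "session_id": session.session_id,
                "platform": session.watcher_platform,
                "chat_id": session.watcher_chat_id,
                "thread_id": session.watcher_thread_id,
                "check_interval": session.watcher_interval,
            })
    return sessions, pending_watchers
```

Reading the watcher fields with defaults means a checkpoint written before this change would still load, with such sessions simply recovered without a watcher.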

Test plan

  • All 46 tests in test_background_process_notifications.py + test_process_registry.py pass
  • 5 new tests: thread_id forwarding, checkpoint persistence, watcher recovery, no-watcher skip

Attribution

Original work by @eren-karakus0 in PR #1573.

Background process watchers lost their platform/chat_id/thread_id context
after a gateway restart because the checkpoint file did not store watcher
metadata. As a result, notifications were never delivered for recovered
processes (issue #1143).

- Add watcher_platform/chat_id/thread_id/interval fields to ProcessSession
- Persist watcher metadata in checkpoint write/recovery
- Re-enqueue recovered watchers into pending_watchers on startup
- Drain pending_watchers at gateway startup after adapters are ready
- Store watcher metadata on ProcessSession when watcher is created
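
As a rough illustration of the startup drain, assuming an asyncio-based gateway: the function name `drain_pending_watchers`, the `adapters` mapping, the `start_watcher` callback, and the policy of skipping watchers whose platform has no connected adapter are all hypothetical and chosen for the sketch. The commit only states that `pending_watchers` is drained once adapters are ready.

```python
import asyncio


async def drain_pending_watchers(pending_watchers, adapters, start_watcher):
    """Start one task per recovered watcher; call only after adapters are connected."""
    tasks = []
    while pending_watchers:
        watcher = pending_watchers.pop(0)
        # Assumed policy: a watcher whose platform has no adapter is dropped.
        if watcher["platform"] not in adapters:
            continue
        tasks.append(asyncio.create_task(start_watcher(watcher)))
    return tasks
```

Draining after the adapters connect matters because a watcher that fires earlier would have no channel through which to deliver its notification.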
@teknium1 teknium1 merged commit d87655a into main Mar 17, 2026
1 check failed