fix(agent): prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode #2701

Merged
teknium1 merged 1 commit into NousResearch:main from ctlst:fix/async-tool-crossloop-deadlock
Mar 26, 2026

Conversation

ctlst (Contributor) commented Mar 24, 2026

Summary

  • Fix async tool deadlock in gateway mode where vision_analyze, web_extract, and session_search hang forever because cached AsyncOpenAI clients are reused across different event loops
  • Include event loop identity in the async client cache key so each loop gets its own client instance
  • Replace session_search_tool.py's manual asyncio.run() in ThreadPoolExecutor with the centralized _run_async() bridge

Fixes #2681. Related to #2338.

Relationship to #2682

PR #2682 fixes the same issue but only for vision_analyze by switching to sync call_llm. This PR fixes the root cause in the client cache layer, so all async tools are fixed without modifying each tool individually:

Tool               Uses async_call_llm  Fixed by #2682  Fixed here
vision_analyze     Yes                  Yes             Yes
web_extract        Yes                  No              Yes
session_search     Yes                  No              Yes
mixture_of_agents  Yes                  No              Yes

Both approaches are compatible — #2682's sync switch is a reasonable defense-in-depth for vision specifically, while this PR prevents the class of bug from affecting any current or future async tool.

Root Cause

In gateway mode, _run_async() spawns a new thread that calls asyncio.run(), which creates a fresh event loop. But _get_cached_client() returns an AsyncOpenAI client that was created on (and bound to) a different loop. Since httpx.AsyncClient cannot operate across event loop boundaries, await client.chat.completions.create() hangs indefinitely.
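The loop mismatch described above can be reproduced without any network calls. The following is a minimal sketch, not the project's code: FakeAsyncClient, get_client_naive, and run_in_thread are hypothetical stand-ins, and the fake client only records which loop it was created on rather than hanging the way a real httpx-backed client would.

```python
import asyncio
import threading


class FakeAsyncClient:
    """Hypothetical stand-in for AsyncOpenAI: remembers its creation loop."""

    def __init__(self):
        self.loop = asyncio.get_running_loop()


_cache = {}  # naive cache: the key ignores the event loop


async def get_client_naive():
    # Create the client once, then hand the same instance to every caller.
    if "client" not in _cache:
        _cache["client"] = FakeAsyncClient()
    return _cache["client"]


async def client_matches_current_loop():
    client = await get_client_naive()
    return client.loop is asyncio.get_running_loop()


def run_in_thread(coro_fn):
    """Mimic the gateway path: asyncio.run() inside a fresh thread."""
    result = {}

    def worker():
        result["value"] = asyncio.run(coro_fn())

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result["value"]


# First call: the client is created on the loop that is using it.
first_matches = asyncio.run(client_matches_current_loop())
# Second call: a new loop in a new thread receives the old loop's client.
second_matches = run_in_thread(client_matches_current_loop)
print(first_matches, second_matches)
```

In this sketch the second caller receives a client whose loop is not the one it is running on, which is the condition under which the real client hangs.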

session_search_tool.py had the same bug independently — its own asyncio.run() in a ThreadPoolExecutor created the same cross-loop conflict.

Changes

agent/auxiliary_client.py — Add id(current_loop) to the async client cache key so each event loop gets its own AsyncOpenAI instance. Sync clients (no loop binding) are unaffected.
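A loop-aware cache key along these lines might look like the sketch below. It is an illustration under assumptions, not the code in agent/auxiliary_client.py: FakeClient, get_cached_client, and the (provider, async_mode, loop id) key shape are hypothetical. The cache keeps a reference to each loop so that an id() value cannot be reused while its entry exists, and it drops entries whose loop has closed.

```python
import asyncio


class FakeClient:
    """Hypothetical stand-in for an OpenAI / AsyncOpenAI client."""

    def __init__(self, provider, async_mode):
        self.provider = provider
        self.async_mode = async_mode


# key -> (client, loop or None); assumed layout, not the project's.
# A real implementation would likely guard this dict with a lock.
_client_cache = {}


def get_cached_client(provider, async_mode=False):
    loop = None
    if async_mode:
        try:
            loop = asyncio.get_running_loop()
        except RuntimeError:
            loop = None  # called outside any running loop

    # Discard clients whose event loop has been closed.
    for key, (_client, cached_loop) in list(_client_cache.items()):
        if cached_loop is not None and cached_loop.is_closed():
            del _client_cache[key]

    # Sync clients share one entry; async clients get one entry per loop.
    key = (provider, async_mode, id(loop) if loop is not None else None)
    if key not in _client_cache:
        _client_cache[key] = (FakeClient(provider, async_mode), loop)
    return _client_cache[key][0]
```

With this shape, two lookups on the same running loop return the same instance, while a lookup from a different loop builds a new client.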

tools/session_search_tool.py — Replace manual asyncio.run() in ThreadPoolExecutor with _run_async() which properly handles loop lifecycle across CLI, gateway, and worker-thread contexts.
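For readers unfamiliar with the pattern, a sync-to-async bridge of this kind can be sketched as follows. This is an assumption about the general technique, not the body of the project's _run_async(); the name run_async and its structure are hypothetical.

```python
import asyncio
import concurrent.futures


def run_async(coro):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread (CLI-like case): run directly.
        return asyncio.run(coro)

    # A loop is already running in this thread (gateway-like case).
    # asyncio.run() would raise here, so use a fresh loop in a worker thread.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def add(a, b):
    await asyncio.sleep(0)
    return a + b


async def caller_inside_loop():
    # Synchronous call made while a loop is running; it blocks that loop
    # until the worker thread finishes.
    return run_async(add(2, 3))


direct = run_async(add(1, 2))
nested = asyncio.run(caller_inside_loop())
print(direct, nested)
```

Because the worker thread always runs its own loop, any async client it uses must have been created on that loop, which is why the bridge and the loop-keyed cache need to work together.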

tests/test_crossloop_client_cache.py — 5 new tests:

  • Same loop reuses cached client
  • Different loops get separate clients
  • Sync clients shared globally (not affected)
  • Gateway simulation (asyncio.run in thread gets fresh client)
  • Closed loop client is discarded
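Two of these cases can be illustrated with a short unittest sketch. It is not the content of tests/test_crossloop_client_cache.py: get_async_client is a hypothetical loop-keyed cache, and a plain object() stands in for the AsyncOpenAI client.

```python
import asyncio
import threading
import unittest

_cache = {}  # hypothetical cache keyed by event loop identity


def get_async_client():
    """Return one placeholder client per running event loop."""
    key = id(asyncio.get_running_loop())
    return _cache.setdefault(key, object())


async def fetch_client():
    return get_async_client()


def fetch_in_thread():
    """Gateway simulation: asyncio.run() inside a separate thread."""
    box = []
    thread = threading.Thread(
        target=lambda: box.append(asyncio.run(fetch_client()))
    )
    thread.start()
    thread.join()
    return box[0]


class CrossLoopCacheTest(unittest.TestCase):
    def setUp(self):
        _cache.clear()  # avoid stale entries from earlier, closed loops

    def test_same_loop_reuses_client(self):
        async def twice():
            return get_async_client() is get_async_client()

        self.assertTrue(asyncio.run(twice()))

    def test_gateway_thread_gets_fresh_client(self):
        async def main():
            mine = get_async_client()
            theirs = fetch_in_thread()  # runs on a different loop
            return mine is not theirs

        self.assertTrue(asyncio.run(main()))
```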

How to Test

  1. Run hermes in gateway mode (Telegram) with a multimodal model
  2. Send an image and ask the bot to describe it (triggers vision_analyze)
  3. Ask the bot to search past sessions (triggers session_search)
  4. Both should complete without timeout — previously both would deadlock

Tested On

  • Linux (Docker, Python 3.11) — Telegram gateway with Qwen3.5-27B via llama.cpp
  • macOS (Python 3.14) — unit tests
…mode

In gateway mode, async tools (vision_analyze, web_extract, session_search)
deadlock because _run_async() spawns a thread with asyncio.run(), creating
a new event loop, but _get_cached_client() returns an AsyncOpenAI client
bound to a different loop. httpx.AsyncClient cannot work across event loop
boundaries, causing await client.chat.completions.create() to hang forever.

Fix: include the event loop identity in the async client cache key so each
loop gets its own AsyncOpenAI instance. Also fix session_search_tool.py
which had its own broken asyncio.run()-in-thread pattern — now uses the
centralized _run_async() bridge.
teknium1 merged commit 281100e into NousResearch:main on Mar 26, 2026
outsourc-e pushed a commit to outsourc-e/hermes-agent that referenced this pull request Mar 26, 2026
…mode (NousResearch#2701)

StreamOfRon pushed a commit to StreamOfRon/hermes-agent that referenced this pull request Mar 29, 2026
…mode (NousResearch#2701)
