
feat: eager fallback to backup model on rate-limit errors #1730

Merged
teknium1 merged 1 commit into main from hermes/hermes-835076ca on Mar 17, 2026

@teknium1
Contributor

Summary

Salvage of PR #1413 by @usvimal, cherry-picked onto current main.

When a fallback model is configured, the agent now switches to it immediately upon detecting rate-limit conditions instead of exhausting all retries with exponential backoff.

Two eager-fallback checks are added to run_agent.py:

  1. Invalid/empty API responses — common rate-limit symptom. Fallback attempted immediately after first failure, before retry loop.
  2. HTTP 429 / rate-limit errors — detected via status code and error message keywords (rate limit, too many requests, quota, usage limit). Fallback attempted before backoff.
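
A minimal sketch of the second check, assuming the detection boils down to a status-code test plus a case-insensitive keyword scan. The keywords are the ones listed above; the helper name `is_rate_limit_error` and its signature are hypothetical and are not taken from run_agent.py.

```python
from typing import Optional

# Keywords quoted from the PR description; the matching logic is an assumption.
RATE_LIMIT_KEYWORDS = ("rate limit", "too many requests", "quota", "usage limit")


def is_rate_limit_error(status_code: Optional[int], message: str) -> bool:
    """Hypothetical helper: return True if an error looks like a rate limit."""
    # An explicit HTTP 429 is treated as a rate limit regardless of the message.
    if status_code == 429:
        return True
    # Otherwise fall back to a case-insensitive substring match on the message.
    lowered = message.lower()
    return any(keyword in lowered for keyword in RATE_LIMIT_KEYWORDS)
```

Substring matching is deliberately loose in this sketch: providers that report quota exhaustion under a non-429 status would still be caught, at the cost of occasional false positives.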

Both paths are guarded by _fallback_activated to preserve one-shot semantics: the switch to the fallback model happens at most once.
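
The one-shot guard and the "fallback before backoff" ordering can be sketched as follows. Only the flag name `_fallback_activated` comes from this PR; `FallbackState`, `try_activate_fallback`, `call_with_eager_fallback`, and the `send` callback are hypothetical names used for illustration, and the real retry loop in run_agent.py may be structured differently.

```python
import time
from typing import Callable, Optional


class FallbackState:
    """Hypothetical holder for the active model and the one-shot flag."""

    def __init__(self, primary_model: str, fallback_model: Optional[str] = None):
        self.model = primary_model
        self.fallback_model = fallback_model
        self._fallback_activated = False

    def try_activate_fallback(self) -> bool:
        """Switch to the fallback model at most once; return True if switched."""
        if self._fallback_activated or self.fallback_model is None:
            return False
        self.model = self.fallback_model
        self._fallback_activated = True
        return True


def call_with_eager_fallback(
    state: FallbackState,
    send: Callable[[str], Optional[str]],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Retry `send`, preferring a model switch over exponential backoff."""
    for attempt in range(max_retries):
        response = send(state.model)
        if response:  # a non-empty response counts as success in this sketch
            return response
        # Empty response: try the fallback first, without waiting.
        if state.try_activate_fallback():
            continue
        # No fallback left (or none configured): back off exponentially.
        sleep(base_delay * 2 ** attempt)
    return None
```

In this sketch the switch consumes one retry attempt, and once the flag is set every later failure goes through the ordinary backoff path.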

Test results

5130 passed, 5 pre-existing failures (in test_anthropic_adapter.py, unrelated to this change), 200 skipped.

Closes #1413.

When a fallback model is configured, switch to it immediately upon
detecting rate-limit conditions (429, quota exhaustion, empty/malformed
responses) instead of exhausting all retries with exponential backoff.

Two eager-fallback checks:
1. Invalid/empty API responses — fallback attempted before retry loop
2. HTTP 429 / rate-limit keyword detection — fallback before backoff

Both guarded by _fallback_activated for one-shot semantics.

Cherry-picked from PR #1413 by usvimal.
teknium1 merged commit 9f81c11 into main on Mar 17, 2026
1 check passed