feat: add route-aware pricing estimates (salvaged from #1563) #1695

Merged
teknium1 merged 4 commits into main from hermes/hermes-5a9e8a78
Mar 17, 2026

@teknium1
Contributor

Summary

Salvaged from PR #1563 by @kshitijk4poor. Cherry-picked their 3 commits onto current main with conflict resolution and follow-up fixes.

Replaces the static MODEL_PRICING dict + keyword heuristic cost calculation with a proper route-aware pricing architecture:

  • Canonical usage normalization: normalize_usage() handles Anthropic, OpenAI, and Codex API shapes, splitting tokens into input/output/cache_read/cache_write/reasoning buckets
  • Route-aware pricing: resolve_billing_route() determines pricing based on provider+model pair, not just model name heuristics
  • Cache-aware billing: separate per-million rates for cache read/write tokens instead of lumping them into input
  • Cost status tracking: CostStatus (actual/estimated/included/unknown) and CostSource for transparency
  • OpenRouter live pricing: fetches from the OpenRouter models API and converts per-token rates to per-million rates
  • Subscription-included routes: marks openai-codex as included (cost $0)
  • Schema migration v4→v5: adds cache_read_tokens, cache_write_tokens, billing_provider, billing_base_url, estimated_cost_usd, actual_cost_usd, cost_status, cost_source, pricing_version columns
  • Backward-compatible wrappers: get_pricing() and estimate_cost_usd() still work for callers that haven't been updated
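
For reviewers who want a feel for the flow described in the list above, here is a minimal sketch of usage normalization followed by cache-aware estimation. It is not the code in this PR: `normalize_usage_sketch`, `estimate_cost_sketch`, `per_token_to_per_million`, `Usage`, and `Rates` are hypothetical names, the rates in any example call are illustrative only, and the sketch assumes just two payload shapes (OpenAI-style `prompt_tokens` with cached tokens counted inside it, and Anthropic-style `input_tokens` with cache counts reported separately). The Codex shape and the reasoning bucket are left out.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum
from typing import Optional, Tuple


class CostStatus(str, Enum):
    # Status values taken from the PR description.
    ACTUAL = "actual"
    ESTIMATED = "estimated"
    INCLUDED = "included"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Usage:
    # Hypothetical canonical buckets (reasoning omitted for brevity).
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0


@dataclass(frozen=True)
class Rates:
    # Hypothetical container: USD per million tokens for each bucket.
    input: float
    output: float
    cache_read: float
    cache_write: float


def normalize_usage_sketch(raw: dict) -> Usage:
    """Map a raw usage payload onto the canonical buckets (two shapes only)."""
    if "prompt_tokens" in raw:
        # OpenAI-style: cached tokens are a subset of prompt_tokens,
        # so subtract them to avoid billing them twice.
        cached = (raw.get("prompt_tokens_details") or {}).get("cached_tokens", 0)
        return Usage(
            input_tokens=raw["prompt_tokens"] - cached,
            output_tokens=raw.get("completion_tokens", 0),
            cache_read_tokens=cached,
        )
    # Anthropic-style: cache counts are reported next to input_tokens.
    return Usage(
        input_tokens=raw.get("input_tokens", 0),
        output_tokens=raw.get("output_tokens", 0),
        cache_read_tokens=raw.get("cache_read_input_tokens", 0),
        cache_write_tokens=raw.get("cache_creation_input_tokens", 0),
    )


def per_token_to_per_million(per_token: str) -> float:
    """Convert a per-token price (assumed to arrive as a decimal string)."""
    return float(Decimal(per_token) * 1_000_000)


def estimate_cost_sketch(
    usage: Usage, rates: Optional[Rates], included: bool = False
) -> Tuple[Optional[float], CostStatus]:
    """Return (cost in USD, status); unknown pricing yields no number."""
    if included:
        return 0.0, CostStatus.INCLUDED  # subscription-included route
    if rates is None:
        return None, CostStatus.UNKNOWN  # never invent a price
    total = (
        usage.input_tokens * rates.input
        + usage.output_tokens * rates.output
        + usage.cache_read_tokens * rates.cache_read
        + usage.cache_write_tokens * rates.cache_write
    ) / 1_000_000
    return total, CostStatus.ESTIMATED
```

Returning `None` with an `unknown` status, rather than a guessed number, mirrors the intent stated in this PR of showing 'unknown' when no trustworthy price exists.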

Follow-up changes (on top of cherry-pick)

  • Removed speculative forward-looking pricing entries (claude-opus-4.6, claude-sonnet-4.6, gpt-5, gpt-5.4, o4-mini); these now show 'unknown' instead of inventing prices
  • Removed cost $$ display from CLI status bar entirely
  • Made OpenRouter metadata pre-warm non-blocking (threaded)
  • Fixed duplicate fetch_model_metadata import
  • Resolved conflicts in hermes_state.py (preserved with self._lock) and run_agent.py (normalize_usage replaces inline Anthropic cache handling)
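
The non-blocking pre-warm mentioned in the list above can be pictured roughly as follows. This is a sketch under assumptions, not the change made in this PR: `prewarm_in_background` is a hypothetical helper, and `fetch` stands in for whatever callable actually loads the OpenRouter metadata.

```python
import threading
from typing import Callable


def prewarm_in_background(
    fetch: Callable[[], object], name: str = "metadata-prewarm"
) -> threading.Thread:
    """Run `fetch` on a daemon thread so the caller's __init__ returns at once."""

    def _run() -> None:
        try:
            fetch()
        except Exception:
            # Best-effort warm-up: a failed fetch must not crash startup.
            # Later lookups can retry or fall back to 'unknown'.
            pass

    # daemon=True so a slow network call never keeps the process alive at exit.
    thread = threading.Thread(target=_run, name=name, daemon=True)
    thread.start()
    return thread
```

Returning the thread lets tests call `join()` on it, while normal callers can simply ignore the return value.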

Test plan

  • python -m pytest tests/ -n0 -q — 4992 passed (16 pre-existing failures unrelated to this PR)
  • Specifically verified: tests/agent/test_usage_pricing.py, tests/test_cli_status_bar.py, tests/test_insights.py, tests/test_hermes_state.py, tests/gateway/test_session.py, tests/gateway/test_status_command.py — all 229 pass

Attribution

Original work by @kshitijk4poor in PR #1563 (3 commits cherry-picked with authorship preserved).

kshitijk4poor and others added 4 commits March 17, 2026 03:23
Cherry-picked from PR #1563 by kshitijk4poor.
Conflicts resolved in hermes_state.py (with self._lock) and run_agent.py (normalize_usage replaces Anthropic cache fix).
…eway tests

Cherry-picked from PR #1563 by kshitijk4poor.
Cherry-picked from PR #1563 by kshitijk4poor.
- Remove speculative forward-looking pricing entries (claude-opus-4.6,
  claude-sonnet-4.6, gpt-5, gpt-5.4, o4-mini) — show 'unknown' instead
  of inventing prices
- Remove cost $$ display from CLI status bar entirely
- Thread the OpenRouter metadata pre-warm (was blocking in __init__)
- Remove duplicate fetch_model_metadata import
- Fix tests for removed models
@teknium1 merged commit d417ba2 into main on Mar 17, 2026
1 check failed
2 participants