feat: use endpoint metadata for custom model context and pricing #1906
Merged
…nfig(), hoist set constant

run_agent.py:
- Add base_url property that auto-caches _base_url_lower on every assignment, eliminating 12+ redundant .lower() calls per API cycle across __init__, _build_api_kwargs, _supports_reasoning_extra_body, and the main conversation loop
- Consolidate three separate load_config() disk reads in __init__ (memory, skills, compression) into a single call, reusing the result dict for all three config sections

model_tools.py:
- Hoist _READ_SEARCH_TOOLS set to module level (was rebuilt inside handle_function_call on every tool invocation)
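The property-based caching and the hoisted constant described in this commit message could look roughly like the sketch below. It is not the code from run_agent.py or model_tools.py: the `Agent` class, the `is_openrouter()` helper, the `is_read_or_search()` helper, and the tool names inside the set are all hypothetical stand-ins. Only `base_url`, `_base_url_lower`, and `_READ_SEARCH_TOOLS` are names taken from the commit message.

```python
# Module-level constant: built once at import time rather than on every call.
# The member names here are made up for illustration.
_READ_SEARCH_TOOLS = frozenset({"read_file", "search_files"})


def is_read_or_search(tool_name: str) -> bool:
    """Hypothetical helper that consults the hoisted set."""
    return tool_name in _READ_SEARCH_TOOLS


class Agent:
    """Hypothetical stand-in for the agent class in run_agent.py."""

    def __init__(self, base_url: str = "") -> None:
        # This assignment goes through the setter, so the cache is filled here too.
        self.base_url = base_url

    @property
    def base_url(self) -> str:
        return self._base_url

    @base_url.setter
    def base_url(self, value: str) -> None:
        self._base_url = value
        # Lowercase once per assignment instead of once per comparison.
        self._base_url_lower = value.lower() if value else ""

    def is_openrouter(self) -> bool:
        # Hypothetical check that reads the cached value, with no .lower() call.
        return "openrouter" in self._base_url_lower
```

Because every write to `base_url` passes through the setter, the cached lowercase copy cannot drift out of sync with the original value, which is the main reason to prefer a property over a manually maintained second attribute.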
Salvage of PR #1875 by @kshitijk4poor (cherry-picked with authorship preserved, 2 commits).
Summary
Custom endpoints (Chutes, local llama.cpp, etc.) were getting wrong context lengths because `get_model_context_length()` fell through to fuzzy name-matching against hardcoded defaults. For example, `zai-org/GLM-5-TEE` on Chutes would match the unrelated `glm-5` entry.

This PR queries the endpoint's own `/models` API for real metadata instead of guessing.

Changes
Commit 1 (perf cleanup):
- Cache `base_url.lower()` via a property setter (`_base_url_lower`), which eliminates ~15 repeated `.lower()` calls throughout run_agent.py
- Consolidate three `load_config()` calls in `__init__` into one
- Hoist the `_READ_SEARCH_TOOLS` set to module level in model_tools.py

Commit 2 (endpoint metadata):
- Add `fetch_endpoint_model_metadata()` in model_metadata.py: queries `/models` on custom OpenAI-compatible endpoints, cached 5 min per base URL
- Context-length lookup checks the endpoint's `/models` before fuzzy name-matching; unknown third-party endpoints skip fuzzy matching entirely (falling back to probe tiers)
- Endpoints that report pricing via `/models` get accurate cost estimates
- `provider/model-name` entries also get a bare `model-name` alias in the cache

Test plan
`pytest tests/agent/test_model_metadata.py tests/agent/test_usage_pricing.py tests/agent/test_context_compressor.py` (100 passed)
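For readers who want a concrete picture of the Commit 2 behaviour, here is a minimal sketch of a `/models` fetch with a five-minute per-URL cache and bare-name aliases. It is an illustration under assumptions, not the code in model_metadata.py: the real function's signature is not shown in this PR, so the `fetch` and `now` parameters, the `_endpoint_cache` layout, and the `lookup_context_length()` helper are hypothetical. The `context_length`, `max_model_len`, and `pricing` field names are also assumptions, since OpenAI-compatible servers differ in which metadata fields they expose.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, Optional, Tuple

_CACHE_TTL_SECONDS = 300  # 5 minutes, as stated in the PR description
# Normalised base URL -> (fetch timestamp, {model id -> metadata})
_endpoint_cache: Dict[str, Tuple[float, Dict[str, dict]]] = {}


def _http_get_json(url: str) -> dict:
    """Default fetcher: plain GET that returns the parsed JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def fetch_endpoint_model_metadata(
    base_url: str,
    fetch: Callable[[str], dict] = _http_get_json,  # injectable for tests (assumption)
    now: Callable[[], float] = time.monotonic,      # injectable clock (assumption)
) -> Dict[str, dict]:
    root = base_url.rstrip("/")
    key = root.lower()
    cached = _endpoint_cache.get(key)
    if cached is not None and now() - cached[0] < _CACHE_TTL_SECONDS:
        return cached[1]

    try:
        payload = fetch(root + "/models")
    except Exception:
        return {}  # unreachable or no /models route: the caller falls back

    entries = payload.get("data", []) if isinstance(payload, dict) else []
    models: Dict[str, dict] = {}
    for entry in entries:
        model_id = entry.get("id")
        if not model_id:
            continue
        meta = {
            # Field names vary by server; these two are assumptions.
            "context_length": entry.get("context_length") or entry.get("max_model_len"),
            "pricing": entry.get("pricing"),
        }
        models[model_id] = meta
        # provider/model-name also gets a bare model-name alias.
        models.setdefault(model_id.rsplit("/", 1)[-1], meta)

    _endpoint_cache[key] = (now(), models)
    return models


def lookup_context_length(model: str, base_url: str, **kwargs) -> Optional[int]:
    """Hypothetical helper: endpoint metadata first, None means 'fall back'."""
    meta = fetch_endpoint_model_metadata(base_url, **kwargs).get(model)
    if meta and meta.get("context_length"):
        return int(meta["context_length"])
    return None
```

In this sketch a `None` result is the signal for the caller to move on to whatever fallback applies (the PR mentions probe tiers for unknown third-party endpoints), so a model such as `zai-org/GLM-5-TEE` is never matched against an unrelated hardcoded entry just because the names look similar.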