fix: enable json_mode for vllm provider and add explicit JSON schema to deriver prompt (#466)
Conversation
Two related fixes for using OpenAI-compatible providers (vllm, custom) with `json_mode=True`:

1. clients.py: vllm was excluded from `json_object` response_format injection when `json_mode=True`. This caused providers like MiniMax M2.7 (routed via the vllm provider config) to return plain text instead of JSON, failing the `PromptRepresentation` pydantic validation. Fix: remove vllm from the exclusion list when no `response_model` is set.

2. prompts.py: the minimal deriver prompt had no explicit JSON schema instruction. OpenAI and Anthropic infer the correct format from `json_mode` + the pydantic schema, but other providers (MiniMax, Mistral, Llama variants) need the schema stated explicitly in the prompt. Fix: append the exact expected JSON structure to the prompt.

Together, these fixes enable the deriver to work correctly with non-OpenAI/Anthropic providers, including MiniMax M2.7, Ollama-served models, and other OpenAI-compatible endpoints routed via the vllm provider.
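The prompt-side fix (point 2) can be sketched roughly like this. A minimal illustration, not the actual deriver code: `MINIMAL_DERIVER_PROMPT`, `build_prompt`, and the `observations` key are hypothetical stand-ins for whatever the real prompt and schema look like.

```python
# Illustrative sketch -- names below are hypothetical, not the actual deriver code.
MINIMAL_DERIVER_PROMPT = "Extract observations from the conversation below."

# The exact JSON structure the model must return, stated explicitly so
# providers that do not infer it from response_format still emit parseable JSON.
JSON_SCHEMA_INSTRUCTION = """
Respond with ONLY a JSON object in exactly this format:
{"observations": ["<observation 1>", "<observation 2>", ...]}
Do not include any text outside the JSON object.
"""

def build_prompt(conversation: str) -> str:
    """Append the explicit schema instruction to the minimal prompt."""
    return f"{MINIMAL_DERIVER_PROMPT}\n\n{conversation}\n{JSON_SCHEMA_INSTRUCTION}"

prompt = build_prompt("user: I moved to Berlin last month.")
```

Providers that honor `response_format` simply see a redundant instruction; providers that ignore it now have the schema in-band.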
Walkthrough: This PR tightens JSON output constraints: the minimal deriver prompt now requires responses that are strictly a JSON object matching the expected schema.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/utils/clients.py`:
- Around line 1898-1899: The current conditional prevents injecting JSON mode
into vllm-backed providers: modify the check around the openai_params assignment
so that when json_mode is True and there is no response_model the code sets
openai_params["response_format"] = {"type": "json_object"} regardless of
provider (remove the provider != "vllm" exclusion); update the logic near the
variables json_mode, provider, response_model and the openai_params assignment
so vllm-routed OpenAI-compatible models receive the JSON response_format.
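The corrected selection logic described in the comment above can be sketched as a small pure helper. This is a sketch under assumptions: the real code assigns into `openai_params` inline rather than returning a value, and the `json_schema` payload is assumed to follow the OpenAI convention of a `name` plus a schema built via the pydantic `model_json_schema()` method.

```python
from typing import Any, Optional

def build_response_format(
    json_mode: bool,
    provider: str,
    response_model: Optional[type],
) -> Optional[dict[str, Any]]:
    """Sketch of the fixed response_format selection logic."""
    # Generic path: json_mode with no response_model gets json_object,
    # for ALL providers (the old `provider != "vllm"` exclusion is gone).
    if json_mode and not response_model:
        return {"type": "json_object"}
    # vllm path: a response_model yields a strict json_schema payload,
    # assumed here to follow the OpenAI json_schema shape.
    if provider == "vllm" and response_model:
        return {
            "type": "json_schema",
            "json_schema": {
                "name": response_model.__name__,
                "schema": response_model.model_json_schema(),
            },
        }
    return None
```

Because the first branch requires `response_model` to be unset and the second requires it to be set, the two paths cannot both fire for the same call.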
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 152b7093-f03a-4734-89af-abc5f8f3c433
📒 Files selected for processing (2)
src/deriver/prompts.py, src/utils/clients.py
Remove the provider != 'vllm' exclusion so vllm-routed OpenAI-compatible backends (e.g. MiniMax via vllm) receive response_format: json_object when json_mode=True. The existing vllm response_model path (json_schema) is unaffected since it is gated on response_model being set. Co-authored-by: CodeRabbit <coderabbit@coderabbit.ai>
Applied the fix from @coderabbitai — removed the `provider != "vllm"` exclusion. The existing vllm `response_model` path (json_schema) is unaffected. Context: this was found while running the deriver against MiniMax M2.7 via a vllm-compatible endpoint. Without this fix, the deriver falls back to unstructured output and observation extraction breaks silently.
🧩 Analysis chain: Let me quickly verify the actual change in src/utils/clients.py.
What the code now does (Lines 1898–1915):

```python
# Generic json_object path — now applies to ALL providers including vllm
if json_mode and not response_model:
    openai_params["response_format"] = {"type": "json_object"}

# vllm-specific json_schema path — only fires when response_model is set
if provider == "vllm" and response_model:
    openai_params["response_format"] = {
        "type": "json_schema",
        "json_schema": { ... },
    }
```

The two branches are cleanly mutually exclusive: the first fires only when no `response_model` is set, the second only when one is.
The fix is well-structured, and the silent failure mode (the deriver returning unstructured text and observation extraction breaking without error) is resolved.
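That failure mode is easy to reproduce in isolation. A minimal stdlib-only illustration: the real code validates against the `PromptRepresentation` pydantic model, whose fields are not shown in this PR, so the `observations` key below is a hypothetical stand-in.

```python
import json

def parse_representation(raw: str) -> list[str]:
    """Parse model output into a list of observations, or raise ValueError.

    Loosely mirrors what a pydantic validation would do: plain prose fails
    at json.loads, and a JSON object missing the expected key fails the
    structural check.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model returned non-JSON output: {exc}") from exc
    if not isinstance(data, dict) or "observations" not in data:
        raise ValueError("JSON does not match the expected structure")
    return list(data["observations"])

# With json_object enforced, parsing succeeds:
ok = parse_representation('{"observations": ["user moved to Berlin"]}')

# Without it, a prose answer fails loudly instead of silently:
try:
    parse_representation("Sure! Here are the observations I found: ...")
except ValueError as err:
    failure = str(err)
```

The point of the PR is that with `response_format` injected, the first path is the one providers like MiniMax actually take.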
Problem
Two related issues when using the `vllm` provider (OpenAI-compatible endpoints like MiniMax, Mistral, Llama variants):

1. vllm excluded from json_object mode

The vllm provider is excluded from `json_object` response format injection. When `json_mode=True` with no `response_model`, vllm providers receive no JSON format instruction and return plain text — causing the deriver's `PromptRepresentation` validation to fail.

2. Deriver prompt has no explicit JSON schema

The minimal deriver prompt instructs the model to extract observations but doesn't specify the expected JSON structure. OpenAI and Anthropic infer this from `response_format`, but other providers need the schema stated explicitly in the prompt itself.

Fix

- clients.py: inject the `json_object` response format when `json_mode=True` and no `response_model` is set
- prompts.py: append the exact expected JSON structure to the minimal deriver prompt

Result
The deriver now works correctly with:
- MiniMax M2.7 (via https://api.minimax.io/v1)
- Ollama-served models and other OpenAI-compatible endpoints via the `vllm` provider

Tested: the deriver successfully extracts and stores observations using MiniMax M2.7.
Summary by CodeRabbit

Bug Fixes

- vllm-routed, OpenAI-compatible providers now receive the `json_object` response format when `json_mode` is enabled.

Improvements

- The minimal deriver prompt now states the expected JSON structure explicitly.