
Fix: [ISSUES-453] deriver qwen json schema#455

Open
veedaisme wants to merge 2 commits into plastic-labs:main from veedaisme:fix/deriver-qwen-json-schema-clean

Conversation


@veedaisme veedaisme commented Mar 27, 2026

Problem

The deriver was failing with Pydantic validation errors when using Qwen models (e.g., qwen3.5-plus) via custom OpenAI-compatible APIs like ModelScope.

Error

pydantic_core._pydantic_core.ValidationError: 3 validation errors for PromptRepresentation
explicit.0
  Input should be an object [type=model_type, input_value="veeda's name is Veeda", input_type=str]

Root Cause

The deriver prompt instructed the LLM to return observations as an array of strings, but the PromptRepresentation schema expects an array of objects with a content field.
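The mismatch can be reproduced in isolation. The sketch below assumes a simplified two-field schema; the real PromptRepresentation in the codebase may carry more fields, and the Observation model name here is illustrative:

```python
# Minimal repro of the validation failure, assuming a simplified schema.
from pydantic import BaseModel, ValidationError

class Observation(BaseModel):
    content: str

class PromptRepresentation(BaseModel):
    explicit: list[Observation]

# An array of plain strings, as the old prompt requested, fails validation:
try:
    PromptRepresentation.model_validate({"explicit": ["veeda's name is Veeda"]})
except ValidationError as e:
    error_type = e.errors()[0]["type"]  # "model_type", matching the traceback

# An array of objects with a `content` field, as the schema expects, passes:
parsed = PromptRepresentation.model_validate(
    {"explicit": [{"content": "veeda's name is Veeda"}]}
)
```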

Solution

1. Updated Deriver Prompt (src/deriver/prompts.py)

Changed the OUTPUT FORMAT section to instruct the LLM to return objects with a content field.

2. Added Qwen-Specific JSON Mode Shim (src/utils/clients.py)

When using Qwen models with json_mode=True, a system message is prepended to ensure valid JSON output.
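The shim's decision logic can be sketched as a standalone helper. The actual code lives inline in honcho_llm_call_inner(), so the function name below is hypothetical; the provider/model checks and the hint text mirror the PR diff:

```python
# Illustrative sketch of the Qwen JSON-mode shim; helper name is hypothetical.
def maybe_prepend_json_hint(
    messages: list[dict], provider: str, model: str, json_mode: bool
) -> list[dict]:
    """Prepend a JSON-only system message for Qwen models on custom providers."""
    if not (json_mode and provider == "custom" and "qwen" in model.lower()):
        return messages
    # Skip when any message already mentions JSON, to avoid duplicate hints.
    if any(
        isinstance(m.get("content"), str) and "json" in m["content"].lower()
        for m in messages
    ):
        return messages
    return [
        {
            "role": "system",
            "content": "Return valid JSON only. The output must be a JSON object.",
        },
        *messages,
    ]

shimmed = maybe_prepend_json_hint(
    [{"role": "user", "content": "Summarize the session."}],
    provider="custom",
    model="qwen3.5-plus",
    json_mode=True,
)
```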

Testing

Added comprehensive test coverage in tests/utils/test_clients.py:

  • test_qwen_json_mode_adds_system_message
  • test_qwen_json_mode_not_added_when_hint_exists
  • test_non_qwen_custom_provider_no_system_message

Impact

| Aspect | Details |
| --- | --- |
| Affected users | All users running Qwen models via custom OpenAI-compatible endpoints |
| Breaking changes | None; other providers unaffected |
| Resolves | RetryError after 3 failed validation attempts in the deriver |
| Backward compatible | Yes |

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced JSON output format validation and explicit format requirements for AI-generated responses to improve reliability
    • Added automatic JSON compliance instructions for specific model configurations to ensure valid output parsing
  • Tests

    • Expanded test coverage for JSON mode behavior across different AI model types and configurations
Problem:
- Deriver was failing with Pydantic validation errors when using Qwen models
  via custom OpenAI-compatible APIs (e.g., ModelScope)
- Error: 'Input should be an object [type=model_type, input_value=string]'
- Root cause: Prompt instructed LLM to return {"explicit": ["fact1", "fact2"]}
  but PromptRepresentation schema expects {"explicit": [{"content": "fact1"}, ...]}

Fixes:
1. src/deriver/prompts.py: Update OUTPUT FORMAT to specify array of objects
   with 'content' field instead of array of strings
2. src/utils/clients.py: Add Qwen-specific JSON mode handling - prepend
   system message instructing model to return valid JSON only

Impact:
- Fixes deriver for all users running Qwen models via custom OpenAI-compatible endpoints
- No breaking changes for other providers (OpenAI, Anthropic, etc.)
- Resolves RetryError after 3 failed validation attempts
Add test coverage for the Qwen-specific JSON mode handling:
- test_qwen_json_mode_adds_system_message: Verifies Qwen models get
  a system message instructing valid JSON output
- test_qwen_json_mode_not_added_when_hint_exists: Ensures no
  duplicate system message when JSON hint already present
- test_non_qwen_custom_provider_no_system_message: Confirms other
  custom providers don't receive Qwen-specific handling

These tests validate the fix for Pydantic validation errors when
using Qwen models via custom OpenAI-compatible APIs (ModelScope).

coderabbitai bot commented Mar 27, 2026

Walkthrough

The changes add explicit JSON output format requirements to the deriver prompt template and implement conditional system message injection for Qwen custom provider models when JSON mode is enabled, with comprehensive test coverage for the new behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Prompt template update: src/deriver/prompts.py | Added an explicit OUTPUT FORMAT specification to minimal_deriver_prompt() requiring a valid JSON structure with keys {"explicit": [...]}, including empty-case behavior. Minor whitespace/formatting adjustments. |
| Client JSON mode enhancement: src/utils/clients.py | Implemented conditional prepending of a "Return valid JSON only" system message in honcho_llm_call_inner() and handle_streaming_response() for Qwen models when json_mode=True and provider="custom", but only when JSON guidance is not already present in the prompt. |
| Test coverage: tests/utils/test_clients.py | Added a new TestCustomProviderQwen test class (135 lines) covering JSON hint injection for Qwen models, prevention of duplicate JSON hints, and verification that non-Qwen custom models do not receive Qwen-specific behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hop, hop! JSON formats now sing so clear,
Qwen models whisper hints the models hear,
With prompts explicit and tests comprehensive too,
The deriver now knows exactly what to do! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly summarizes the main change: fixing a Qwen JSON schema issue in the deriver module, directly aligned with the changeset's purpose of addressing Pydantic validation failures with Qwen models. |
| Docstring coverage | ✅ Passed | Docstring coverage is 87.50%, which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | Check skipped; CodeRabbit's high-level summary is enabled. |


@veedaisme veedaisme changed the title Fix: [ISSUES-453] deriver qwen json schema clean Mar 27, 2026
@veedaisme veedaisme marked this pull request as ready for review March 27, 2026 18:49

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/utils/test_clients.py (1)

1179-1311: Add a streaming Qwen JSON-mode test to close coverage gap.

These tests validate non-streaming behavior, but they don’t cover handle_streaming_response(...) for provider="custom" + Qwen + json_mode=True, which is where this PR introduced a regression.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/clients.py`:
- Around line 1900-1904: The JSON-hint detection in the has_json_hint check only
inspects m.get("content") when it's a plain string, which misses custom message
content blocks (e.g., list of blocks like [{type:"text", text:...}]) and can
cause duplicate hints; update the logic that computes has_json_hint (iterating
openai_params["messages"] and each m) to normalize/extract text from content
whether it's a str, list of block dicts, or a dict (concatenate text fields or
inner "content" fields), then lower-case and search for "json" across the joined
text; keep the same variable names (has_json_hint, m, openai_params["messages"])
and ensure the check returns True if any message contains a JSON hint in any
supported content shape.
- Around line 2470-2484: The streaming JSON path references undefined provider
and model variables and will raise NameError; update the caller to include
"provider" and "model" in the params dict passed into
handle_streaming_response(), then inside handle_streaming_response() read
provider = params.get("provider") and model = params.get("model") (or use
params["provider"]/params["model"]) where the current checks exist (the block
manipulating openai_params and inserting the system "Return valid JSON only..."
message) so the same Qwen custom-model logic as in honcho_llm_call_inner() uses
params rather than undefined globals.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e219895f-de05-4a25-bbfe-fd24bc35aba7

📥 Commits

Reviewing files that changed from the base of the PR and between 7275372 and dec39bb.

📒 Files selected for processing (3)
  • src/deriver/prompts.py
  • src/utils/clients.py
  • tests/utils/test_clients.py
Comment on lines +1900 to +1904
has_json_hint = any(
    isinstance(m.get("content"), str)
    and "json" in m["content"].lower()
    for m in openai_params["messages"]
)

⚠️ Potential issue | 🟡 Minor

JSON-hint detection is too narrow for custom message content blocks.

Line 1900 only checks content when it is a string. In the custom path, system messages are often converted to list blocks ([{type:"text", text:...}]), so existing JSON hints can be missed and duplicated.

♻️ Suggested hardening
-                    has_json_hint = any(
-                        isinstance(m.get("content"), str)
-                        and "json" in m["content"].lower()
-                        for m in openai_params["messages"]
-                    )
+                    def _has_json_hint(msg: dict[str, Any]) -> bool:
+                        content = msg.get("content")
+                        if isinstance(content, str):
+                            return "json" in content.lower()
+                        if isinstance(content, list):
+                            return any(
+                                isinstance(block, dict)
+                                and isinstance(block.get("text"), str)
+                                and "json" in block["text"].lower()
+                                for block in content
+                            )
+                        return False
+
+                    has_json_hint = any(_has_json_hint(m) for m in openai_params["messages"])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/clients.py` around lines 1900 - 1904, The JSON-hint detection in
the has_json_hint check only inspects m.get("content") when it's a plain string,
which misses custom message content blocks (e.g., list of blocks like
[{type:"text", text:...}]) and can cause duplicate hints; update the logic that
computes has_json_hint (iterating openai_params["messages"] and each m) to
normalize/extract text from content whether it's a str, list of block dicts, or
a dict (concatenate text fields or inner "content" fields), then lower-case and
search for "json" across the joined text; keep the same variable names
(has_json_hint, m, openai_params["messages"]) and ensure the check returns True
if any message contains a JSON hint in any supported content shape.
Comment on lines +2470 to 2484
if provider == "custom" and "qwen" in model.lower():
    has_json_hint = any(
        isinstance(m.get("content"), str)
        and "json" in m["content"].lower()
        for m in openai_params["messages"]
    )
    if not has_json_hint:
        openai_params["messages"] = [
            {
                "role": "system",
                "content": "Return valid JSON only. The output must be a JSON object.",
            },
            *openai_params["messages"],
        ]
openai_params["response_format"] = {"type": "json_object"}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify undefined-name errors in this file
ruff check --select F821 src/utils/clients.py

# Show the problematic streaming block context
rg -n -C3 'if provider == "custom"|model.lower\(\)' src/utils/clients.py

Repository: plastic-labs/honcho

Length of output: 1239


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the function containing line 2470 and its parameters
echo "=== Function signature and context around line 2470 ==="
sed -n '2420,2500p' src/utils/clients.py | head -n 80

# Find the function definition containing line 2470
echo -e "\n=== Finding function definition ==="
rg -n 'def .*streaming' src/utils/clients.py | head -5

# Get more context on params dict and how provider/model are used
echo -e "\n=== Checking params dict structure ==="
rg -n -B5 -A2 '"provider":|"model":' src/utils/clients.py | head -40

Repository: plastic-labs/honcho

Length of output: 5531


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full function signature of handle_streaming_response
echo "=== Function signature ==="
sed -n '2362,2380p' src/utils/clients.py

# Check the non-streaming path for Qwen handling (around line 1899-1912 mentioned in original)
echo -e "\n=== Non-streaming Qwen JSON handling ==="
sed -n '1890,1920p' src/utils/clients.py

# Find where handle_streaming_response is called
echo -e "\n=== Call to handle_streaming_response ==="
rg -n 'handle_streaming_response' src/utils/clients.py | head -10

# Get the outer function that defines provider and model
echo -e "\n=== Outer function context (lines 1640-1670) ==="
sed -n '1640,1670p' src/utils/clients.py

Repository: plastic-labs/honcho

Length of output: 3798


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full function signature of the outer function
echo "=== Outer function signature ==="
rg -n 'async def honcho_llm_call_inner' src/utils/clients.py -A 25 | head -35

Repository: plastic-labs/honcho

Length of output: 1543


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the complete non-streaming AsyncOpenAI path for json_mode with custom Qwen
echo "=== Non-streaming AsyncOpenAI path ==="
sed -n '1875,1925p' src/utils/clients.py

Repository: plastic-labs/honcho

Length of output: 2476


Undefined names provider and model in streaming Qwen JSON handling will crash at runtime.

Lines 2470-2472 reference provider and model directly, but these variables don't exist in the handle_streaming_response() function scope. This will raise NameError when streaming is enabled with json_mode=True for custom Qwen models.

The non-streaming AsyncOpenAI path (lines 1899-1912) has the same logic but works because provider and model are available as parameters in the outer honcho_llm_call_inner() function. The fix is to add "provider" to the params dict before passing it to the streaming function, then access both values via params.

Proposed fix
diff --git a/src/utils/clients.py b/src/utils/clients.py
index ..
--- a/src/utils/clients.py
+++ b/src/utils/clients.py
@@ -1655,6 +1655,7 @@ async def honcho_llm_call_inner(
 
     params: dict[str, Any] = {
+        "provider": provider,
         "model": model,
         "max_tokens": max_tokens,
         "messages": messages,
@@ -2450,7 +2451,8 @@ async def handle_streaming_response(
             }
 
             model_name = params["model"]
+            provider_name = cast(str, params.get("provider", ""))
             if "gpt-5" in model_name:
                 openai_params["max_completion_tokens"] = params["max_tokens"]
                 if reasoning_effort:
@@ -2468,7 +2470,7 @@ async def handle_streaming_response(
             if response_model:
                 openai_params["response_format"] = response_model
             elif json_mode:
-                if provider == "custom" and "qwen" in model.lower():
+                if provider_name == "custom" and "qwen" in model_name.lower():
                     has_json_hint = any(
                         isinstance(m.get("content"), str)
                         and "json" in m["content"].lower()
🧰 Tools
🪛 Ruff (0.15.7)

[error] 2470-2470: Undefined name provider

(F821)


[error] 2470-2470: Undefined name model

(F821)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/clients.py` around lines 2470 - 2484, The streaming JSON path
references undefined provider and model variables and will raise NameError;
update the caller to include "provider" and "model" in the params dict passed
into handle_streaming_response(), then inside handle_streaming_response() read
provider = params.get("provider") and model = params.get("model") (or use
params["provider"]/params["model"]) where the current checks exist (the block
manipulating openai_params and inserting the system "Return valid JSON only..."
message) so the same Qwen custom-model logic as in honcho_llm_call_inner() uses
params rather than undefined globals.
