
Fix: [ISSUES-453] deriver qwen json schema#455

Open
veedaisme wants to merge 2 commits into plastic-labs:main from veedaisme:fix/deriver-qwen-json-schema-clean

Conversation


@veedaisme veedaisme commented Mar 27, 2026

Problem

The deriver was failing with Pydantic validation errors when using Qwen models (e.g., qwen3.5-plus) via custom OpenAI-compatible APIs like ModelScope.

Error

pydantic_core._pydantic_core.ValidationError: 3 validation errors for PromptRepresentation
explicit.0
  Input should be an object [type=model_type, input_value="veeda's name is Veeda", input_type=str]

Root Cause

The deriver prompt instructed the LLM to return observations as an array of strings, but the PromptRepresentation schema expects an array of objects with a content field.
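The mismatch can be reproduced in isolation. The sketch below assumes a simplified two-field schema; the real PromptRepresentation in the codebase may carry more fields, and the Observation model name here is illustrative:

```python
# Minimal repro of the validation failure, assuming a simplified schema.
from pydantic import BaseModel, ValidationError

class Observation(BaseModel):
    content: str

class PromptRepresentation(BaseModel):
    explicit: list[Observation]

# An array of plain strings, as the old prompt requested, fails validation:
try:
    PromptRepresentation.model_validate({"explicit": ["veeda's name is Veeda"]})
except ValidationError as e:
    error_type = e.errors()[0]["type"]  # "model_type", matching the traceback

# An array of objects with a `content` field, as the schema expects, passes:
parsed = PromptRepresentation.model_validate(
    {"explicit": [{"content": "veeda's name is Veeda"}]}
)
```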

Solution

1. Updated Deriver Prompt (src/deriver/prompts.py)

Changed the OUTPUT FORMAT section to instruct the LLM to return objects with a content field.

2. Added Qwen-Specific JSON Mode Shim (src/utils/clients.py)

When using Qwen models with json_mode=True, a system message is prepended to ensure valid JSON output.
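The shim's decision logic can be sketched as a standalone helper. The actual code lives inline in honcho_llm_call_inner(), so the function name below is hypothetical; the provider/model checks and the hint text mirror the PR diff:

```python
# Illustrative sketch of the Qwen JSON-mode shim; helper name is hypothetical.
def maybe_prepend_json_hint(
    messages: list[dict], provider: str, model: str, json_mode: bool
) -> list[dict]:
    """Prepend a JSON-only system message for Qwen models on custom providers."""
    if not (json_mode and provider == "custom" and "qwen" in model.lower()):
        return messages
    # Skip when any message already mentions JSON, to avoid duplicate hints.
    if any(
        isinstance(m.get("content"), str) and "json" in m["content"].lower()
        for m in messages
    ):
        return messages
    return [
        {
            "role": "system",
            "content": "Return valid JSON only. The output must be a JSON object.",
        },
        *messages,
    ]

shimmed = maybe_prepend_json_hint(
    [{"role": "user", "content": "Summarize the session."}],
    provider="custom",
    model="qwen3.5-plus",
    json_mode=True,
)
```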

Testing

Added comprehensive test coverage in tests/utils/test_clients.py:

  • test_qwen_json_mode_adds_system_message
  • test_qwen_json_mode_not_added_when_hint_exists
  • test_non_qwen_custom_provider_no_system_message

Impact

| Aspect | Details |
| --- | --- |
| Affected users | All users running Qwen models via custom OpenAI-compatible endpoints |
| Breaking changes | None; other providers unaffected |
| Resolves | RetryError after 3 failed validation attempts in the deriver |
| Backward compatible | Yes |

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced JSON output format validation and explicit format requirements for AI-generated responses to improve reliability
    • Added automatic JSON compliance instructions for specific model configurations to ensure valid output parsing
  • Tests

    • Expanded test coverage for JSON mode behavior across different AI model types and configurations
Problem:
- Deriver was failing with Pydantic validation errors when using Qwen models
  via custom OpenAI-compatible APIs (e.g., ModelScope)
- Error: 'Input should be an object [type=model_type, input_value=string]'
- Root cause: Prompt instructed LLM to return {"explicit": ["fact1", "fact2"]}
  but PromptRepresentation schema expects {"explicit": [{"content": "fact1"}, ...]}

Fixes:
1. src/deriver/prompts.py: Update OUTPUT FORMAT to specify array of objects
   with 'content' field instead of array of strings
2. src/utils/clients.py: Add Qwen-specific JSON mode handling - prepend
   system message instructing model to return valid JSON only

Impact:
- Fixes deriver for all users running Qwen models via custom OpenAI-compatible endpoints
- No breaking changes for other providers (OpenAI, Anthropic, etc.)
- Resolves RetryError after 3 failed validation attempts
Add test coverage for the Qwen-specific JSON mode handling:
- test_qwen_json_mode_adds_system_message: Verifies Qwen models get
  a system message instructing valid JSON output
- test_qwen_json_mode_not_added_when_hint_exists: Ensures no
  duplicate system message when JSON hint already present
- test_non_qwen_custom_provider_no_system_message: Confirms other
  custom providers don't receive Qwen-specific handling

These tests validate the fix for Pydantic validation errors when
using Qwen models via custom OpenAI-compatible APIs (ModelScope).

coderabbitai bot commented Mar 27, 2026

Walkthrough

The changes add explicit JSON output format requirements to the deriver prompt template and implement conditional system message injection for Qwen custom provider models when JSON mode is enabled, with comprehensive test coverage for the new behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Prompt template update: src/deriver/prompts.py | Added an explicit OUTPUT FORMAT specification to minimal_deriver_prompt() requiring a valid JSON structure with keys {"explicit": [...]}, including empty-case behavior. Minor whitespace/formatting adjustments. |
| Client JSON mode enhancement: src/utils/clients.py | Implemented conditional prepending of a "Return valid JSON only" system message in honcho_llm_call_inner() and handle_streaming_response() for Qwen models when json_mode=True and provider="custom", but only when JSON guidance is not already present in the prompt. |
| Test coverage: tests/utils/test_clients.py | Added a new TestCustomProviderQwen test class (135 lines) covering JSON hint injection for Qwen models, prevention of duplicate JSON hints, and verification that non-Qwen custom models do not receive Qwen-specific behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hop, hop! JSON formats now sing so clear,
Qwen models whisper hints the models hear,
With prompts explicit and tests comprehensive too,
The deriver now knows exactly what to do! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly summarizes the main change: fixing a Qwen JSON schema issue in the deriver module, directly aligned with the changeset's purpose of addressing Pydantic validation failures with Qwen models. |
| Docstring coverage | ✅ Passed | Docstring coverage is 87.50%, which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | Check skipped; CodeRabbit's high-level summary is enabled. |


@veedaisme veedaisme changed the title Fix: [ISSUES-453] deriver qwen json schema clean Mar 27, 2026
@veedaisme veedaisme marked this pull request as ready for review March 27, 2026 18:49

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/utils/test_clients.py (1)

1179-1311: Add a streaming Qwen JSON-mode test to close coverage gap.

These tests validate non-streaming behavior, but they don’t cover handle_streaming_response(...) for provider="custom" + Qwen + json_mode=True, which is where this PR introduced a regression.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/utils/clients.py`:
- Around line 1900-1904: The JSON-hint detection in the has_json_hint check only
inspects m.get("content") when it's a plain string, which misses custom message
content blocks (e.g., list of blocks like [{type:"text", text:...}]) and can
cause duplicate hints; update the logic that computes has_json_hint (iterating
openai_params["messages"] and each m) to normalize/extract text from content
whether it's a str, list of block dicts, or a dict (concatenate text fields or
inner "content" fields), then lower-case and search for "json" across the joined
text; keep the same variable names (has_json_hint, m, openai_params["messages"])
and ensure the check returns True if any message contains a JSON hint in any
supported content shape.
- Around line 2470-2484: The streaming JSON path references undefined provider
and model variables and will raise NameError; update the caller to include
"provider" and "model" in the params dict passed into
handle_streaming_response(), then inside handle_streaming_response() read
provider = params.get("provider") and model = params.get("model") (or use
params["provider"]/params["model"]) where the current checks exist (the block
manipulating openai_params and inserting the system "Return valid JSON only..."
message) so the same Qwen custom-model logic as in honcho_llm_call_inner() uses
params rather than undefined globals.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e219895f-de05-4a25-bbfe-fd24bc35aba7

📥 Commits

Reviewing files that changed from the base of the PR and between 7275372 and dec39bb.

📒 Files selected for processing (3)
  • src/deriver/prompts.py
  • src/utils/clients.py
  • tests/utils/test_clients.py
Comment on lines +1900 to +1904
has_json_hint = any(
    isinstance(m.get("content"), str)
    and "json" in m["content"].lower()
    for m in openai_params["messages"]
)

⚠️ Potential issue | 🟡 Minor

JSON-hint detection is too narrow for custom message content blocks.

Line 1900 only checks content when it is a string. In the custom path, system messages are often converted to list blocks ([{type:"text", text:...}]), so existing JSON hints can be missed and duplicated.

♻️ Suggested hardening
-                    has_json_hint = any(
-                        isinstance(m.get("content"), str)
-                        and "json" in m["content"].lower()
-                        for m in openai_params["messages"]
-                    )
+                    def _has_json_hint(msg: dict[str, Any]) -> bool:
+                        content = msg.get("content")
+                        if isinstance(content, str):
+                            return "json" in content.lower()
+                        if isinstance(content, list):
+                            return any(
+                                isinstance(block, dict)
+                                and isinstance(block.get("text"), str)
+                                and "json" in block["text"].lower()
+                                for block in content
+                            )
+                        return False
+
+                    has_json_hint = any(_has_json_hint(m) for m in openai_params["messages"])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/clients.py` around lines 1900 - 1904, The JSON-hint detection in
the has_json_hint check only inspects m.get("content") when it's a plain string,
which misses custom message content blocks (e.g., list of blocks like
[{type:"text", text:...}]) and can cause duplicate hints; update the logic that
computes has_json_hint (iterating openai_params["messages"] and each m) to
normalize/extract text from content whether it's a str, list of block dicts, or
a dict (concatenate text fields or inner "content" fields), then lower-case and
search for "json" across the joined text; keep the same variable names
(has_json_hint, m, openai_params["messages"]) and ensure the check returns True
if any message contains a JSON hint in any supported content shape.
Comment on lines +2470 to 2484
if provider == "custom" and "qwen" in model.lower():
    has_json_hint = any(
        isinstance(m.get("content"), str)
        and "json" in m["content"].lower()
        for m in openai_params["messages"]
    )
    if not has_json_hint:
        openai_params["messages"] = [
            {
                "role": "system",
                "content": "Return valid JSON only. The output must be a JSON object.",
            },
            *openai_params["messages"],
        ]
openai_params["response_format"] = {"type": "json_object"}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify undefined-name errors in this file
ruff check --select F821 src/utils/clients.py

# Show the problematic streaming block context
rg -n -C3 'if provider == "custom"|model.lower\(\)' src/utils/clients.py

Repository: plastic-labs/honcho

Length of output: 1239


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the function containing line 2470 and its parameters
echo "=== Function signature and context around line 2470 ==="
sed -n '2420,2500p' src/utils/clients.py | head -n 80

# Find the function definition containing line 2470
echo -e "\n=== Finding function definition ==="
rg -n 'def .*streaming' src/utils/clients.py | head -5

# Get more context on params dict and how provider/model are used
echo -e "\n=== Checking params dict structure ==="
rg -n -B5 -A2 '"provider":|"model":' src/utils/clients.py | head -40

Repository: plastic-labs/honcho

Length of output: 5531


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full function signature of handle_streaming_response
echo "=== Function signature ==="
sed -n '2362,2380p' src/utils/clients.py

# Check the non-streaming path for Qwen handling (around line 1899-1912 mentioned in original)
echo -e "\n=== Non-streaming Qwen JSON handling ==="
sed -n '1890,1920p' src/utils/clients.py

# Find where handle_streaming_response is called
echo -e "\n=== Call to handle_streaming_response ==="
rg -n 'handle_streaming_response' src/utils/clients.py | head -10

# Get the outer function that defines provider and model
echo -e "\n=== Outer function context (lines 1640-1670) ==="
sed -n '1640,1670p' src/utils/clients.py

Repository: plastic-labs/honcho

Length of output: 3798


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full function signature of the outer function
echo "=== Outer function signature ==="
rg -n 'async def honcho_llm_call_inner' src/utils/clients.py -A 25 | head -35

Repository: plastic-labs/honcho

Length of output: 1543


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the complete non-streaming AsyncOpenAI path for json_mode with custom Qwen
echo "=== Non-streaming AsyncOpenAI path ==="
sed -n '1875,1925p' src/utils/clients.py

Repository: plastic-labs/honcho

Length of output: 2476


Undefined names provider and model in streaming Qwen JSON handling will crash at runtime.

Lines 2470-2472 reference provider and model directly, but these variables don't exist in the handle_streaming_response() function scope. This will raise NameError when streaming is enabled with json_mode=True for custom Qwen models.

The non-streaming AsyncOpenAI path (lines 1899-1912) has the same logic but works because provider and model are available as parameters in the outer honcho_llm_call_inner() function. The fix is to add "provider" to the params dict before passing it to the streaming function, then access both values via params.

Proposed fix
diff --git a/src/utils/clients.py b/src/utils/clients.py
index ..
--- a/src/utils/clients.py
+++ b/src/utils/clients.py
@@ -1655,6 +1655,7 @@ async def honcho_llm_call_inner(
 
     params: dict[str, Any] = {
+        "provider": provider,
         "model": model,
         "max_tokens": max_tokens,
         "messages": messages,
@@ -2450,7 +2451,8 @@ async def handle_streaming_response(
             }
 
             model_name = params["model"]
+            provider_name = cast(str, params.get("provider", ""))
             if "gpt-5" in model_name:
                 openai_params["max_completion_tokens"] = params["max_tokens"]
                 if reasoning_effort:
@@ -2468,7 +2470,7 @@ async def handle_streaming_response(
             if response_model:
                 openai_params["response_format"] = response_model
             elif json_mode:
-                if provider == "custom" and "qwen" in model.lower():
+                if provider_name == "custom" and "qwen" in model_name.lower():
                     has_json_hint = any(
                         isinstance(m.get("content"), str)
                         and "json" in m["content"].lower()
🧰 Tools
🪛 Ruff (0.15.7)

[error] 2470-2470: Undefined name provider

(F821)


[error] 2470-2470: Undefined name model

(F821)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/clients.py` around lines 2470 - 2484, The streaming JSON path
references undefined provider and model variables and will raise NameError;
update the caller to include "provider" and "model" in the params dict passed
into handle_streaming_response(), then inside handle_streaming_response() read
provider = params.get("provider") and model = params.get("model") (or use
params["provider"]/params["model"]) where the current checks exist (the block
manipulating openai_params and inserting the system "Return valid JSON only..."
message) so the same Qwen custom-model logic as in honcho_llm_call_inner() uses
params rather than undefined globals.
