- Overview
- Supported Models
- Prerequisites
- Installation Steps
- Configuration Files
- Transformer Files
- StatusLine Scripts
- Usage Instructions
- Troubleshooting
- Quick Guide: Thinking/Reasoning Issues
- Thinking Not Working
- Claude Code Not Displaying Thinking
- Model Still Thinking Despite Disabling It
- reasoning_content Not Visible in Debug Logs
- Empty Thinking Blocks in Claude Code UI
- CCR Won't Start
- Z.ai API Key Invalid
- Tools Not Working Correctly
- Debug Logs Accumulation
- Connection Fails with Specific IP
- Using CCR Built-in Transformers Only
- Additional Resources
This guide explains how to configure Claude Code (CC) to work with Z.ai's OpenAI-Compatible Endpoint using Claude Code Router (CCR) with a custom transformer.
Key Features:
- ✓ Resolves Claude Code's token limitations - Automatically applies the correct `max_tokens` for each model (GLM 4.6: 128K, GLM 4.5: 96K, GLM 4.5v: 16K)
- ✓ Sampling control guaranteed - Sets `do_sample=true` to ensure `temperature` and `top_p` always work
- ✓ Native reasoning control - Transformer can pass control to Claude Code (Tab key / `alwaysThinkingEnabled` setting)
- ✓ Force Permanent Thinking mode - Nuclear option: forces thinking on EVERY message (ignores all other settings)
- ✓ User tags for manual control - `<Thinking:On|Off>`, `<Effort:Low|Medium|High>` for precise per-message control
- ✓ Keyword detection - Auto-detects analytical keywords (analyze, calculate, explain, etc.) and enhances prompts
- ✓ "Ultrathink" mode - Type "ultrathink" anywhere in your message for maximum reasoning (highlighted in rainbow colors)
- ✓ Global Configuration Overrides - Apply settings across ALL models (max_tokens, temperature, reasoning, keywords, etc.)
- ✓ Cross-platform support - Windows, macOS, Linux
- ✓ Enhanced StatusLine - Git integration with branch tracking and file changes
The transformer supports all Z.ai GLM models with optimized configurations:
| Model | Max Output Tokens | Context Window | Temperature | Top-P | Reasoning |
|---|---|---|---|---|---|
| GLM 4.6 | 128K (131,072) | 200K (204,800) | 1.0 | 0.95 | ✓ |
| GLM 4.6v | 32K (32,768) | 128K (131,072) | 0.6 | 0.95 | ✓ |
| GLM 4.5 | 96K (98,304) | 128K (131,072) | 0.6 | 0.95 | ✓ |
| GLM 4.5-air | 96K (98,304) | 128K (131,072) | 0.6 | 0.95 | ✓ |
| GLM 4.5v | 16K (16,384) | 128K (131,072) | 0.6 | 0.95 | ✓ |
Notes:
- Max Output Tokens: Maximum tokens the model can generate in a single response
- Context Window: Maximum tokens the model can process as input (prompt + history)
- Temperature: Controls randomness (higher = more creative, lower = more focused)
- Top-P: Nucleus sampling parameter (controls diversity)
- Reasoning: All models support native thinking/reasoning capabilities
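The per-model defaults in the table above can be represented as a simple lookup, which is roughly how a transformer would apply them. The sketch below is illustrative only — the object shape and function name are assumptions, not the actual `zai.js` internals:

```javascript
// Illustrative lookup of the per-model defaults from the table above.
const MODEL_DEFAULTS = {
  "glm-4.6":     { maxTokens: 131072, contextWindow: 204800, temperature: 1.0, topP: 0.95 },
  "glm-4.6v":    { maxTokens: 32768,  contextWindow: 131072, temperature: 0.6, topP: 0.95 },
  "glm-4.5":     { maxTokens: 98304,  contextWindow: 131072, temperature: 0.6, topP: 0.95 },
  "glm-4.5-air": { maxTokens: 98304,  contextWindow: 131072, temperature: 0.6, topP: 0.95 },
  "glm-4.5v":    { maxTokens: 16384,  contextWindow: 131072, temperature: 0.6, topP: 0.95 },
};

// Apply the model's defaults to an outgoing request, mirroring the
// "[OVERRIDE] Original max_tokens: 32769 → Override to 131072" step
// visible in the debug transformer's logs.
function applyModelDefaults(request) {
  const defaults = MODEL_DEFAULTS[request.model];
  if (!defaults) return request; // unknown model: leave the request untouched
  return {
    ...request,
    max_tokens: defaults.maxTokens,
    temperature: defaults.temperature,
    top_p: defaults.topP,
    do_sample: true, // ensures temperature and top_p always take effect
  };
}

console.log(applyModelDefaults({ model: "glm-4.6", max_tokens: 32769 }).max_tokens);
// → 131072
```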
- Node.js (v18.0.0 or higher)
  - Download: https://nodejs.org/en/download
  - Verify installation: `node --version`
- Claude Code
  - Official documentation: https://docs.claude.com/en/docs/claude-code/setup
  - Verify installation: `claude --version`
- Git (for StatusLine git integration)
  - Download: https://git-scm.com/install
  - Verify installation: `git --version`
- PowerShell 7.0+ (for statusline.ps1)
  - Download: https://github.com/PowerShell/PowerShell/releases
  - Verify installation: `pwsh --version`
- jq (macOS/Linux only, for statusline.sh)
  - macOS: `brew install jq`
  - Linux: `sudo apt install jq` or `sudo yum install jq`
  - Verify installation: `jq --version`
- Z.ai API Key
- Sign up for an account at https://z.ai
- Navigate to API Management → API Keys (https://z.ai/manage-apikey/apikey-list)
- Create a new API key
- Copy and save it securely
Standard installation (npm):

```shell
npm install -g @anthropic-ai/claude-code
```

Native binary installation (recommended):

Windows (PowerShell):

```shell
irm https://claude.ai/install.ps1 | iex
```

Windows (CMD):

```shell
curl -fsSL https://claude.ai/install.cmd -o install.cmd && install.cmd && del install.cmd
```

macOS/Linux:

```shell
curl -fsSL https://claude.ai/install.sh | bash
```

Verify installation:

```shell
claude --version
```

Install Claude Code Router:

```shell
npm install -g @musistudio/claude-code-router
```

Verify installation:

```shell
ccr version
```

File Location:

- Windows: `%USERPROFILE%\.claude-code-router\config.json`
- macOS: `~/.claude-code-router/config.json`
- Linux: `~/.claude-code-router/config.json`
Windows Example:
{
"LOG": false,
"LOG_LEVEL": "debug",
"HOST": "127.0.0.1",
"PORT": 3456,
"APIKEY": "",
"API_TIMEOUT_MS": "600000",
"PROXY_URL": "",
"CUSTOM_ROUTER_PATH": "",
"transformers": [
{
"path": "C:\\Users\\<Your-Username>\\.claude-code-router\\plugins\\zai.js",
"options": {}
},
{
"path": "C:\\Users\\<Your-Username>\\.claude-code-router\\plugins\\zai-debug.js",
"options": {}
}
],
"Providers": [
{
"name": "ZAI",
"api_base_url": "https://api.z.ai/api/coding/paas/v4/chat/completions",
"api_key": "YOUR_ZAI_API_KEY_HERE",
"models": [
"glm-4.6",
"glm-4.5",
"glm-4.5v",
"glm-4.5-air"
],
"transformer": {
"use": [
"zai", // ← Change to "zai-debug" to enable debug logging
"reasoning" // ← Converts OpenAI's reasoning_content to Anthropic's thinking format
// ← Generates signatures to display thinking in Claude Code
]
}
}
],
"Router": {
"default": "ZAI,glm-4.6",
"background": "ZAI,glm-4.6",
"think": "ZAI,glm-4.6",
"longContext": "ZAI,glm-4.6",
"longContextThreshold": 204800,
"webSearch": "ZAI,glm-4.6",
"image": "ZAI,glm-4.6v"
}
}
```

macOS Path Example:
"transformers": [
{
"path": "/Users/<Your-Username>/.claude-code-router/plugins/zai.js",
"options": {}
},
{
"path": "/Users/<Your-Username>/.claude-code-router/plugins/zai-debug.js",
"options": {}
}
]
```

Linux Path Example:
"transformers": [
{
"path": "/home/<Your-Username>/.claude-code-router/plugins/zai.js",
"options": {}
},
{
"path": "/home/<Your-Username>/.claude-code-router/plugins/zai-debug.js",
"options": {}
}
]
```

To switch between transformers:

Simply change the "use" value in the provider configuration:

- Use `"zai"` for production (no logging)
- Use `"zai-debug"` for debugging (with detailed logs)
Important Notes:

- Replace `YOUR_ZAI_API_KEY_HERE` with your actual Z.ai API key
- Reasoning Control Issues: If you experience problems with thinking/reasoning not working as expected, or want absolute control over all options (including `<Thinking:Off>`), see Quick Guide: Thinking/Reasoning Issues for the solution.
- Authentication & Network Access:

  Localhost only (default):

  - Option A: Using `ccr code` - No configuration needed
    - `APIKEY` can be empty
    - `ANTHROPIC_AUTH_TOKEN` is NOT needed (autoconfigured by `ccr code`)
    - CCR and Claude Code CLI run in a single process
  - Option B: Using `ccr start` + `claude`
  - Option C: Using `ccr start` + VS Code Extension

    Claude Code Router config (`~/.claude-code-router/config.json`):

    ```json
    {
      "APIKEY": "",        // ← Empty = No authentication required
      "HOST": "127.0.0.1", // ← Forced automatically when APIKEY is empty
      "PORT": 3456
    }
    ```

    Claude Code config (`~/.claude/settings.json`) or OS environment variables:

    ```json
    {
      "env": {
        "ANTHROPIC_BASE_URL": "http://127.0.0.1:3456",
        "ANTHROPIC_AUTH_TOKEN": "any-value" // ← Can be ANY value when APIKEY is empty; needed for ccr start + claude and ccr start + VS Code Extension
      }
    }
    ```

    Requirements:

    - When `APIKEY` is empty, `ANTHROPIC_AUTH_TOKEN` can have ANY value (but MUST be present)
    - Claude Code Router only accepts connections from localhost (127.0.0.1)
    - Claude Code Router will display `⚠️ API key is not set. HOST is forced to 127.0.0.1.`

    Reference: This behavior is implemented in `src/index.ts` and `src/middleware/auth.ts` - https://github.com/musistudio/claude-code-router

  Network exposure:

  - Option A: Using `ccr start` + `claude`
  - Option B: Using `ccr start` + VS Code Extension

    CCR config (`~/.claude-code-router/config.json`):

    ```json
    {
      "APIKEY": "your-secret-key", // ← REQUIRED for network exposure
      "HOST": "0.0.0.0",           // ← Listen on all network interfaces (or use your LAN IP instead of 0.0.0.0)
      "PORT": 3456
    }
    ```

    Claude Code config (`~/.claude/settings.json`) or OS environment variables:

    ```json
    {
      "env": {
        // https://domain.tld or https://subdomain.domain.tld
        "ANTHROPIC_BASE_URL": "http://<LAN-IP>:<CCR-PORT>",
        "ANTHROPIC_AUTH_TOKEN": "your-secret-key" // ← MUST match APIKEY in CCR config
      }
    }
    ```

    Requirements:

    - `APIKEY` in CCR config is mandatory when exposing to the network
    - `ANTHROPIC_AUTH_TOKEN` in Claude Code or an OS environment variable MUST match the `APIKEY` value
    - `HOST: "0.0.0.0"` listens on all network interfaces
    - Alternatively, use your specific LAN IP (e.g., `10.10.10.10` or `192.168.1.10`)

    Use cases:

    - Access CCR from other computers
    - Expose via reverse proxy (IIS, Nginx, etc.)

- Usage Scenarios:

  | Scenario | APIKEY in CCR | ANTHROPIC_AUTH_TOKEN | Works? |
  |---|---|---|---|
  | `ccr code` | Empty | Not needed | Yes |
  | `ccr code` | Any | Not needed | Yes |
  | `ccr start` + `claude` | Empty | Any value | Yes |
  | `ccr start` + `claude` | Empty | Not configured | No |
  | `ccr start` + `claude` | Configured | MUST match APIKEY | Yes |
  | `ccr start` + `claude` | Configured | Different value | 401 Error |
  | `ccr start` + VS Code Extension | Configured | Different value | 401 Error |
  | `ccr start` + VS Code Extension | Empty | Any value | Yes |
  | `ccr start` + VS Code Extension | Configured | MUST match APIKEY | Yes* |

  *Note: VS Code Extension may show the "How do you want to log in" screen when APIKEY is configured. Click the "Anthropic Console" button to bypass.
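The token-matching rule behind these scenarios can be sketched as follows. This is an illustrative approximation of the behavior documented for CCR's auth middleware, not the actual code from `src/middleware/auth.ts`:

```javascript
// Illustrative sketch of the authentication rule described above.
function isRequestAuthorized(apiKey, authToken) {
  // Empty APIKEY → no authentication enforced; any non-empty token passes.
  if (!apiKey) return Boolean(authToken);
  // Configured APIKEY → the client's token must match exactly, else 401.
  return authToken === apiKey;
}

console.log(isRequestAuthorized("", "any-value"));       // true
console.log(isRequestAuthorized("secret", "secret"));    // true
console.log(isRequestAuthorized("secret", "different")); // false → 401
```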
- Connection Configuration:
  - The `HOST` and `PORT` in CCR's `config.json` must match EXACTLY the `ANTHROPIC_BASE_URL` in Claude Code's `settings.json` or your OS environment variables.
  - Important: When CCR is configured with a specific IP in HOST (e.g., `"HOST": "10.10.10.10"`), the server will ONLY listen on that exact IP address.
    - What this means: CCR will NOT respond to requests sent to `127.0.0.1` or `localhost`, even if Claude Code is running on the same computer.
    - Impact: ALL connection methods will fail:
      - `ccr code` command → Will not work
      - `ccr start` + `claude` → Will not work
      - `ccr start` + VS Code Extension → Will not work

Examples (with `"PORT": 3456` in CCR):

| CCR `HOST` | `ANTHROPIC_BASE_URL` | Works? |
|---|---|---|
| `"127.0.0.1"` | `http://127.0.0.1:3456` | ✓ Yes |
| `"0.0.0.0"` | `http://127.0.0.1:3456` | ✓ Yes (0.0.0.0 listens on all interfaces) |
| `"0.0.0.0"` | `http://10.10.10.10:3456` | ✓ Yes (0.0.0.0 listens on all interfaces) |
| `"10.10.10.10"` | `http://10.10.10.10:3456` | ✓ Yes |
| `"10.10.10.10"` | `http://127.0.0.1:3456` | ✗ No (CCR not listening on 127.0.0.1) |
- `"LOG": false` - CCR's own logging (default is `true`, but set to `false` in this example for better performance)
  - Important: The debug transformer (`zai-debug.js`) works independently and will still log to `~/.claude-code-router/logs/` even with `LOG: false`
  - CCR's logging is useful if you want to see raw response chunks from Z.ai, but it's separate from the transformer's debug logs
  - To enable CCR logs, set `"LOG": true` - logs will appear in CCR's console output
- Transformer name matching:
  - The value in `"use": ["zai"]` MUST match the `name` property in the transformer class
    - `zai.js` exports `name = "zai"`
    - `zai-debug.js` exports `name = "zai-debug"`
  - If you rename the transformer class's `name` property, you must update this value accordingly
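A minimal skeleton illustrating the name-matching rule. The class shape and export style are assumptions based on CCR's transformer convention (the hook names `transformRequestIn`/`transformResponseOut` appear in the debug logs later in this guide); this is NOT the actual `zai.js` source:

```javascript
// Hypothetical minimal CCR transformer sketch - NOT the actual zai.js.
// The one hard requirement described above: the exported `name` must
// match the string listed in the provider's "use" array.
class ZaiTransformer {
  name = "zai"; // ← must match "use": ["zai"] in config.json

  // Called on the request before it is sent to the provider.
  async transformRequestIn(request) {
    return request; // a real transformer would adjust max_tokens, reasoning, etc.
  }

  // Called on the provider's response before CCR converts it for Claude Code.
  async transformResponseOut(response) {
    return response;
  }
}

module.exports = ZaiTransformer;
```

If you rename `name` here (say, to `"zai-custom"`), the provider's `"use"` array must be updated to match, or CCR will not find the transformer.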
The transformer supports optional configuration parameters in the "options" object. These global overrides apply to ALL models and take the highest priority over model-specific settings.
Basic configuration (default):
"transformers": [
{
"path": "C:\\Users\\<Your-Username>\\.claude-code-router\\plugins\\zai.js",
"options": {}
}
]
```

Advanced configuration with global overrides:
"transformers": [
{
"path": "C:\\Users\\<Your-Username>\\.claude-code-router\\plugins\\zai.js",
"options": {
"overrideMaxTokens": 100000,
"overrideTemperature": 0.7,
"overrideTopP": 0.9,
"overrideReasoning": true,
"overrideKeywordDetection": true,
"customKeywords": ["design", "plan", "architect"],
"overrideKeywords": false
}
}
]
```

Available Options:
| Option | Type | Default | Description |
|---|---|---|---|
| `forcePermanentThinking` | boolean | `false` | Level 0 - Maximum Priority. Nuclear option for forcing thinking permanently (see note 5 below). |
| `overrideMaxTokens` | number | `null` | Apply max_tokens for ALL models (replaces model defaults) |
| `overrideTemperature` | number | `null` | Apply temperature for ALL models (replaces model defaults) |
| `overrideTopP` | number | `null` | Apply top_p for ALL models (replaces model defaults) |
| `overrideReasoning` | boolean | `null` | Toggle reasoning on/off for ALL models (replaces model defaults) |
| `overrideKeywordDetection` | boolean | `null` | Toggle keyword detection on/off for ALL models (replaces model defaults) |
| `customKeywords` | array | `[]` | Additional keywords to trigger automatic reasoning enhancement |
| `overrideKeywords` | boolean | `false` | If true, ONLY customKeywords are used (default list ignored). If false, customKeywords are ADDED to the default list |
Important Notes:
- Global Overrides Priority: When set, these options override ALL model-specific configurations
- Custom Keywords Behavior:
  - `overrideKeywords: false` (default) → Adds custom keywords to the default list (51 keywords)
  - `overrideKeywords: true` → Replaces ALL default keywords with ONLY your custom keywords
- Keyword Detection Requirements: Custom keywords only work when BOTH `reasoning=true` AND `keywordDetection=true`
- Null vs False: Use `null` (omit the option) to use model defaults; use `false` to explicitly disable
- forcePermanentThinking: Forces `reasoning=true` + `effort=high` on EVERY message. Overrides ALL other settings (Ultrathink, User Tags, Global Overrides). User Tags like `<Thinking:Off>`, `<Effort:Low>`, `<Effort:Medium>` are completely ignored. Use ONLY when you want thinking 100% of the time. (Note: Inline user tags normally have HIGHER priority than global overrides, except when this option is active)
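The `overrideKeywords` merge rule can be sketched as follows. This is an illustrative approximation — the function name is hypothetical, and `DEFAULT_KEYWORDS` stands in for the transformer's real 51-entry list:

```javascript
// Illustrative sketch of the customKeywords / overrideKeywords merge rule.
const DEFAULT_KEYWORDS = ["analyze", "calculate", "explain"]; // truncated for illustration

function effectiveKeywords(options) {
  const custom = options.customKeywords || [];
  // overrideKeywords: true → ONLY the custom list is used.
  if (options.overrideKeywords) return custom;
  // overrideKeywords: false (default) → custom keywords are ADDED to the defaults.
  return [...DEFAULT_KEYWORDS, ...custom];
}

console.log(effectiveKeywords({ customKeywords: ["design"] }));
// → ["analyze", "calculate", "explain", "design"]
console.log(effectiveKeywords({ customKeywords: ["think"], overrideKeywords: true }));
// → ["think"]
```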
Examples:
// Example 1: Force thinking permanently (Nuclear Option)
"options": {
"forcePermanentThinking": true
}
// Example 2: Apply lower temperature for ALL models
"options": {
"overrideTemperature": 0.5
}
// Example 3: Disable reasoning for ALL models
"options": {
"overrideReasoning": false
}
// Example 4: Add custom keywords to default list
"options": {
"customKeywords": ["blueprint", "strategy", "roadmap"]
}
// Example 5: Use ONLY custom keywords (ignore defaults)
"options": {
"customKeywords": ["think", "analyze", "reason"],
"overrideKeywords": true
}
// Example 6: Complete custom setup
"options": {
"forcePermanentThinking": false,
"overrideMaxTokens": 90000,
"overrideTemperature": 0.8,
"overrideTopP": 0.95,
"overrideReasoning": true,
"overrideKeywordDetection": true,
"customKeywords": ["design", "plan"],
"overrideKeywords": false
}
```

File Location:
- Windows: `%USERPROFILE%\.claude\settings.json`
- macOS: `~/.claude/settings.json`
- Linux: `~/.claude/settings.json`
Example (Windows):
{
"env": {
"ANTHROPIC_BASE_URL": "http://127.0.0.1:3456",
"ANTHROPIC_AUTH_TOKEN": "Dummy",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "glm-4.6",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "glm-4.6",
"ANTHROPIC_DEFAULT_OPUS_MODEL": "glm-4.6",
"MAX_THINKING_TOKENS": "32768",
"MAX_MCP_OUTPUT_TOKENS": "32768"
},
"model": "sonnet",
"enableAllProjectMcpServers": true,
"statusLine": {
"type": "command",
"command": "pwsh -NoProfile -ExecutionPolicy Bypass -Command \"& { $p = Join-Path $env:USERPROFILE '.claude\\statusline.ps1'; & $p }\""
},
"alwaysThinkingEnabled": true
}
```

StatusLine Example (macOS/Linux):
"statusLine": {
"type": "command",
"command": "~/.claude/statusline.sh"
}
```

Don't forget to make the script executable:

```shell
chmod +x ~/.claude/statusline.sh
```

For the complete code, see the StatusLine Scripts section.
This project includes two transformer versions. You can install both and choose which one to use per model.
Transformer name: zai
Purpose: Optimized for production use. Use when you want maximum performance and don't need logging overhead.
Features:
- Optimized for production use
- No logging overhead
- Minimal memory footprint
- All reasoning features included (keywords, User Tags, Ultrathink, 6-level hierarchy with priorities 0-5)
When to use this transformer:
- ✓ Production models where performance matters
- ✓ Everything is working correctly, and you don't need logs
- ✓ You want the smallest memory footprint
- ✓ Stable models that don't require troubleshooting
Installation Location:
- Windows: `%USERPROFILE%\.claude-code-router\plugins\zai.js`
- macOS/Linux: `~/.claude-code-router/plugins/zai.js`
Transformer name: zai-debug
Purpose: Troubleshooting and debugging transformer. Use when you need to understand what's happening inside the reasoning system or diagnose problems with model behavior.
Features:
- Complete logging to `~/.claude-code-router/logs/zai-transformer-[timestamp].log`
- Automatic log rotation at 10 MB
- Records all decisions, transformations, and reasoning detection
- Shows request/response flow with detailed annotations
- Tracking of keyword detection, Ultrathink mode, and prompt enhancements
- Helps diagnose: why reasoning isn't triggering, tool calling issues, configuration problems
When to use this transformer:
- ✓ Model not entering think mode when expected
- ✓ Debugging keyword detection (is my keyword being recognized?)
- ✓ Investigating Ultrathink behavior
- ✓ Verifying User Tags are working (`<Thinking:On>`, `<Effort:High>`)
- ✓ Checking if global config overrides are applied correctly
- ✓ Analyzing why certain prompts behave differently
- ✓ Preparing detailed logs for bug reports or support
Installation Location:
- Windows: `%USERPROFILE%\.claude-code-router\plugins\zai-debug.js`
- macOS/Linux: `~/.claude-code-router/plugins/zai-debug.js`
You can install both transformers and call them individually per model. This allows you to:
- Use the production transformer (`zai`) for models you trust and want maximum performance
- Use the debug transformer (`zai-debug`) for models where you're troubleshooting issues, testing new configurations, or need detailed logs
When to use zai (production):
- ✓ Stable models in production environment
- ✓ Performance is critical
- ✓ You don't need logging overhead
- ✓ Everything works as expected
When to use zai-debug (debug):
- ✓ Troubleshooting reasoning issues (why model isn't using think mode)
- ✓ Debugging tool calling problems (tools not being invoked correctly)
- ✓ Testing new keyword detection or Ultrathink mode
- ✓ Investigating unexpected model behavior
- ✓ Analyzing request/response transformations
- ✓ Verifying configuration changes are working
- ✓ Log files needed for issue reporting
Configuration example:
{
"transformers": [
{
"path": "C:\\Users\\<Your-Username>\\.claude-code-router\\plugins\\zai.js",
"options": {}
},
{
"path": "C:\\Users\\<Your-Username>\\.claude-code-router\\plugins\\zai-debug.js",
"options": {}
}
],
"Providers": [
{
"name": "ZAI",
"transformer": {
"glm-4.6": {
"use": [
"zai", // ← Uses production transformer (no logging)
"reasoning"
]
},
"glm-4.5": {
"use": [
"zai-debug", // ← Uses debug transformer (with logging)
"reasoning"
]
}
}
}
]
}
```

Important: The transformer name in "use" must match the `name` property in the transformer class:

- `zai.js` exports `name = "zai"`
- `zai-debug.js` exports `name = "zai-debug"`
Typical setup strategies:
- Performance-first: Use `"zai"` on all models except when actively debugging
- Debug-first: Use `"zai-debug"` on experimental models, `"zai"` on proven ones
- Hybrid: Use `"zai-debug"` on one model for testing, `"zai"` on others for comparison
Log Structure Example:
❯ ccr start
Loaded JSON config from: C:\Users\Bedolla\.claude-code-router\config.json
[START] Z.ai Transformer (Debug) initialized
[CONFIG] Log file: C:\Users\Bedolla\.claude-code-router\logs\zai-transformer-2025-10-25T05-25-55.log
[CONFIG] Maximum size per file: 10.0 MB
╔═══════════════════════════════════════════════════════════════════════════════════════════════════╗
[STAGE 1/3] INPUT: Claude Code → CCR → transformRequestIn() [Request #1]
Request RECEIVED from Claude Code, BEFORE sending to provider
╚═══════════════════════════════════════════════════════════════════════════════════════════════════╝
[PROVIDER] LLM destination information:
name: "ZAI"
baseUrl: "https://api.z.ai/api/coding/paas/v4/chat/completions"
models: [
"glm-4.6",
"glm-4.5",
"glm-4.5v",
"glm-4.5-air"
]
transformer: [
"zai-debug"
]
[CONTEXT] HTTP request from client:
req: [_Request]
[INPUT] Request received from Claude Code:
model: "glm-4.6"
max_tokens: 32769
stream: true
messages: 2 messages
[0] system: 14154 chars - "You are Claude Code, Anthropic's official CLI for ..."
[1] user: 3426 chars - "<system-reminder> This is a reminder that your tod..."
tools: 53 tools
└─ [Task, Bash, Glob, Grep, ExitPlanMode, Read, Edit, Write, NotebookEdit, WebFetch, ... +43 more]
tool_choice: undefined
reasoning: {
"effort": "high",
"enabled": true
}
[OVERRIDE] Original max_tokens: 32769 → Override to 131072
[CUSTOM TAGS] Searching for tags in user messages...
[SYSTEM] Message 1 ignored (system-reminder)
[INFO] No custom tags detected in messages
[REASONING] Determining effective configuration...
[PRIORITY 4] Model config: reasoning=true → reasoning=true, effort=high
[RESULT] Effective reasoning=true, effort level=high
[CLEANUP] Removing tags from messages...
[INFO] No tags found to remove
[REASONING FIELD] Adding reasoning field to request...
[INFO] Active conditions detected, overriding reasoning
reasoning.enabled = true
reasoning.effort = "high"
[THINKING] Z.AI format applied
[KEYWORDS] Checking for analytical keywords in ALL user messages...
[MESSAGE 1] system-reminder ignored
[RESULT] No keywords detected in any message
╚═══════════════════════════════════════════════════════════════════════════════════════════════════╝
╔═══════════════════════════════════════════════════════════════════════════════════════════════════╗
[STAGE 2/3] OUTPUT: transformRequestIn() → CCR → LLM Provider [Request #1]
OPTIMIZED request to be sent to provider
╚═══════════════════════════════════════════════════════════════════════════════════════════════════╝
[OUTPUT] Body to be sent to provider:
model: "glm-4.6"
max_tokens: 131072
temperature: 1
top_p: 0.95
do_sample: true
stream: true
messages: 2 messages
[0] system: 14154 chars - "You are Claude Code, Anthropic's official CLI for ..."
[1] user: 3426 chars - "<system-reminder> This is a reminder that your tod..."
tools: 53 tools
└─ [Task, Bash, Glob, Grep, ExitPlanMode, Read, Edit, Write, NotebookEdit, WebFetch, ... +43 more]
tool_choice: undefined
thinking: {
"type": "enabled"
}
[EXTRAS]: reasoning
└─ reasoning: {
"enabled": true,
"effort": "high"
}
╚═══════════════════════════════════════════════════════════════════════════════════════════════════╝
╔═══════════════════════════════════════════════════════════════════════════════════════════════════╗
[STAGE 3/3] LLM Provider → CCR → transformResponseOut() [Request #1]
Response RECEIVED from provider, BEFORE sending to Claude Code
╚═══════════════════════════════════════════════════════════════════════════════════════════════════╝
[INFO] Response for Request #1 | Response Object ID: 1762148421903-3
[RESPONSE OBJECT DETECTED]
Response.ok: true
Response.status: 200 OK
Response.url: https://api.z.ai/api/coding/paas/v4/chat/completions
Response.bodyUsed: false
Content-Type: text/event-stream;charset=UTF-8
NOTE: This is the original Response BEFORE CCR parsing.
CCR will read the stream and convert it to Anthropic format for Claude Code.
╚═══════════════════════════════════════════════════════════════════════════════════════════════════╝
╔═══════════════════════════════════════════════════════════════════════════════════════════════════╗
[STREAMING] Reading first chunks from Response
RAW stream content BEFORE CCR parses it
╚═══════════════════════════════════════════════════════════════════════════════════════════════════╝
[CHUNK 1] 177 bytes [THINKING] → {role:"assistant", content:"", reasoning_content:"↵"}
[CHUNK 2] 165 bytes [THINKING] → {role:"assistant", reasoning_content:"The"}
[CHUNK 3] 335 bytes [THINKING] → {role:"assistant", reasoning_content:" user"}
[CHUNK 4] 165 bytes [THINKING] → {role:"assistant", reasoning_content:" me"}
[CHUNK 5] 165 bytes [THINKING] → {role:"assistant", reasoning_content:" to"}
[CHUNK 6] 170 bytes [THINKING] → {role:"assistant", reasoning_content:" analyze"}
[CHUNK 7] 166 bytes [THINKING] → {role:"assistant", reasoning_content:" the"}
[CHUNK 8] 333 bytes [THINKING] → {role:"assistant", reasoning_content:" code"}
[CHUNK 9] 334 bytes [THINKING] → {role:"assistant", reasoning_content:" find"}
[CHUNK 10] 163 bytes [THINKING] → {role:"assistant", reasoning_content:"."}
[CHUNK 11] 167 bytes [THINKING] → {role:"assistant", reasoning_content:" They"}
[CHUNK 12] 167 bytes [THINKING] → {role:"assistant", reasoning_content:" also"}
[CHUNK 13] 172 bytes [THINKING] → {role:"assistant", reasoning_content:" mentioned"}
[CHUNK 14] 165 bytes [THINKING] → {role:"assistant", reasoning_content:" ""}
[CHUNK 15] 328 bytes [THINKING] → {role:"assistant", reasoning_content:"U"}
[CHUNK 16] 165 bytes [THINKING] → {role:"assistant", reasoning_content:"ath"}
[CHUNK 17] 165 bytes [THINKING] → {role:"assistant", reasoning_content:"ink"}
[CHUNK 18] 164 bytes [THINKING] → {role:"assistant", reasoning_content:"""}
[CHUNK 19] 336 bytes [THINKING] → {role:"assistant", reasoning_content:" which"}
[CHUNK 20] 165 bytes [THINKING] → {role:"assistant", reasoning_content:" be"}
[STREAM] Limit of 20 chunks reached (more data exists)
[SUCCESS] Reading completed - Original Response was NOT consumed
╚═══════════════════════════════════════════════════════════════════════════════════════════════════╝
Note: If you use the built-in reasoning transformer with zai-debug.js, you may not see reasoning_content in logs because the reasoning transformer processes and removes it first. See solution at: reasoning_content Not Visible in Debug Logs
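To illustrate what the built-in reasoning transformer does conceptually — turning OpenAI-style `reasoning_content` deltas into Anthropic-style thinking deltas for Claude Code — here is a simplified sketch. It is an assumption about the shape of the conversion, not CCR's actual implementation:

```javascript
// Simplified sketch of reasoning_content → thinking conversion.
// OpenAI-compatible stream chunks carry delta.reasoning_content, while
// Claude Code expects Anthropic-style thinking deltas.
function convertDelta(openAiDelta) {
  if (openAiDelta.reasoning_content !== undefined) {
    // Route reasoning text into a thinking delta...
    return { type: "thinking_delta", thinking: openAiDelta.reasoning_content };
  }
  // ...and ordinary content into a text delta.
  return { type: "text_delta", text: openAiDelta.content || "" };
}

console.log(convertDelta({ role: "assistant", reasoning_content: "The" }));
// → { type: "thinking_delta", thinking: "The" }
console.log(convertDelta({ role: "assistant", content: "Answer:" }));
// → { type: "text_delta", text: "Answer:" }
```

This also explains the troubleshooting note above: once the reasoning transformer has consumed and converted `reasoning_content`, the field no longer exists for a downstream logger to record.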
File Location: %USERPROFILE%\.claude\statusline.ps1
Notes:
- ✓ Compatible with PowerShell 7+
- ✓ Git status with detailed indicators (staged, modified, untracked, etc.)
- ✓ Branch tracking with ahead/behind commits
- ✓ Model name formatting (GLM 4.6, Claude, GPT, etc.)
- ✓ Cost tracking in USD
- ✓ Session duration formatting
- ✓ Code lines added/removed with net change
- ✓ Cross-platform emoji support
Output Example:
🗂️ ~/Projects/MyApp | 🍃 main ✅2 ✏️1 ⬆️1 | 🤖 GLM 4.6
💵 $0.15 USD | ⏱️ 5m 23s | ✏️ +127/-45 (Net: 82)
For the complete code, refer to statusline.ps1 in the repository.
File Location:

- macOS/Linux: `~/.claude/statusline.sh`
Notes:
- ✓ Same features as the PowerShell version
- ✓ Compatible with bash 4.0+ and zsh 5.0+
Output Example:
🗂️ ~/Projects/MyApp | 📦 No Git | 🤖 GLM 4.6
💵 $0.00 USD | ⏱️ 1s | ✏️ +0/-0 (Net: 0)
Make script executable:
```shell
chmod +x ~/.claude/statusline.sh
```

For the complete code, refer to statusline.sh in the repository.
There are three ways to use Claude Code with CCR. Choose the method that best fits your workflow:
Single command - CCR handles everything automatically:
```shell
cd your-project-directory
ccr code
```

Advantages:
- No need to set `ANTHROPIC_AUTH_TOKEN`
- CCR automatically configures authentication
- Works immediately after installation
- Simplest method for local development
- No need to configure `APIKEY` in CCR config
- Can run multiple Claude Code CLI instances using the `ccr code` command
How it works:

- `ccr code` starts the CCR server in the background
- Automatically configures `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN`
- Launches Claude Code CLI with the correct settings
- Everything runs in a single process
Important: This method ONLY works with Claude Code CLI. It does NOT work with the VS Code Extension. For VS Code Extension, use Method 3.
Two separate processes - More control and flexibility:
Terminal 1 (Run CCR Server):

```shell
ccr start
```

Expected output:

```
Loaded JSON config from: C:\Users\<Your-Username>\.claude-code-router\config.json
```

Terminal 2 (Run Claude Code):

```shell
cd your-project-directory
claude
```

Requirements:

- Must configure `ANTHROPIC_AUTH_TOKEN` in `~/.claude/settings.json` or OS environment variables
- If `APIKEY` is set, `ANTHROPIC_AUTH_TOKEN` must match `APIKEY` in the CCR config
- If `APIKEY` is empty, `ANTHROPIC_AUTH_TOKEN` can be set to any value
Advantages:
- CCR runs independently (can restart Claude Code without restarting CCR)
- Better for debugging (separate logs)
- Can run multiple Claude Code CLI instances using the `claude` command
Configuration needed in ~/.claude/settings.json or OS environment variables:
{
"env": {
"ANTHROPIC_BASE_URL": "http://127.0.0.1:3456",
"ANTHROPIC_AUTH_TOKEN": "your-value-here" // If APIKEY is set, must match APIKEY in CCR config
}
}
```

Important: The `ccr code` command does NOT work with the VS Code Extension. You MUST use `ccr start` separately.
CCR + VS Code Extension - GUI experience:
Step 1: Start CCR Server

```shell
ccr start
```

Step 2: Configure VS Code Extension

- Install the Claude Code Extension in VS Code
- Configure in `~/.claude/settings.json` or OS environment variables:

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://127.0.0.1:3456",
    "ANTHROPIC_AUTH_TOKEN": "your-value-here" // If APIKEY is not empty, must match APIKEY in CCR config
  }
}
```
Step 3: Bypass Login Screen (if shown)
If the extension shows "How do you want to log in" screen:
- Click the "Anthropic Console" button
- The extension will use the configured `ANTHROPIC_AUTH_TOKEN`
- This bypasses the normal login flow
Note: The login screen only appears when CCR has APIKEY configured. With empty APIKEY, the extension connects directly.
Advantages:
- Full VS Code integration
- Visual interface
- Side-by-side code and chat
Claude Code is also available for JetBrains IDEs (IntelliJ IDEA, PyCharm, WebStorm, Rider, etc.)
Plugin Link: Claude Code [Beta] - JetBrains Marketplace
How it works:
- Install the plugin from JetBrains Marketplace
- A Claude Code button appears in your IDE toolbar
- Click the button to launch Claude Code in the IDE's integrated terminal
- The extension automatically:
- Configures Claude Code for the current workspace
  - Pre-configures the IDE integration (no need for the `/ide` command)
  - Opens Claude Code ready to work with your project
Advantages:
- Pre-configured Claude Code experience
- Integrated terminal within your JetBrains IDE
- Automatic workspace detection
- No manual IDE configuration needed
- One-click launch
Setup:
- Start the CCR server: `ccr start`
- Configure `~/.claude/settings.json` or OS environment variables (same as Method 2/3)
- Install the JetBrains plugin
- Click the Claude Code button in your IDE
Note: This is similar to opening your JetBrains IDE's integrated terminal, running `claude`, and configuring the IDE with the `/ide` command, but fully automated.
| Feature | `ccr code` | `ccr start` + `claude` | `ccr start` + VS Code Extension |
|---|---|---|---|
| Configuration | None needed | Manual | Manual |
| ANTHROPIC_AUTH_TOKEN | Auto-configured | Required | Required |
The transformer implements a 6-level reasoning hierarchy with clear priorities. Higher levels override lower levels.
Configuration-based permanent override - Set forcePermanentThinking: true in transformer options.
Behavior:
- ✓ Forces `reasoning = true`
- ✓ Forces `effort = high`
- ✓ Overrides EVERYTHING (Ultrathink, User Tags, Global Overrides, Model Config, Default)
- ✓ No way to disable thinking once enabled (not even Ultrathink can override it)
- ✗ User Tags like `<Thinking:Off>`, `<Effort:Low>`, `<Effort:Medium>` are completely ignored
- ✗ No method can disable or reduce thinking when this option is active
Configuration example:
"transformers": [
{
"path": "~/.claude-code-router/plugins/zai.js",
"options": {
"forcePermanentThinking": true // ← Maximum priority, overrides ALL other settings
}
}
]
```

When to use:
- Development/testing environments where you always want full reasoning traces
- Research scenarios where consistent reasoning is required
- Debugging complex problems where thinking should never be skipped
When NOT to use:
- Production environments (no flexibility)
- When you need to toggle thinking on/off
- When you want User Tags or Ultrathink to work
- When you want control over reasoning behavior
Implementation: Applied immediately in transformer logic, before any other checks.
User-triggered via Claude Code CLI - Type "ultrathink" anywhere in your message. Claude Code highlights this keyword with rainbow colors to indicate activation.
Behavior:
- ✓ Forces `reasoning = true`
- ✓ Forces `effort = high`
- ✓ Adds reasoning instructions to the prompt
- ✓ Overrides ALL other settings (global overrides, model config, default, user tags)
- ✓ Keyword is kept in the message (visible to the model)
- ✗ Cannot override Force Permanent Thinking (Level 0)
Examples:
ultrathink: How many letters are in "strawberry"?
How many distinct letters are in "Mississippi"? Ultrathink
Please ULTRATHINK this problem before answering.
Implementation: The keyword is kept in the message (visible to the model).
Manual control via inline tags - Add simple tags anywhere in your message to control reasoning behavior.
Available tags:
```
<Thinking:On>    <!-- Enables reasoning -->
<Thinking:Off>   <!-- Disables reasoning -->
<Effort:Low>     <!-- Low effort -->
<Effort:Medium>  <!-- Medium effort -->
<Effort:High>    <!-- High effort -->
```
Key points:
- Tags are case-insensitive (`<Thinking:On>`, `<thinking:on>`, `<THINKING:ON>` are equivalent)
- Tags are inline and self-contained (they don't wrap content)
- Can be placed anywhere in your message (beginning, middle, or end)
- Can be combined in any order
- Are removed from the message before sending to the model (not visible to the model)
Important Hierarchy Within Level 2:
- `<Effort>` tags have HIGHER priority than `<Thinking:Off>`
- If you combine `<Thinking:Off>` + any `<Effort>` tag, the Effort tag wins and reasoning is enabled
- This allows fine-grained control: disable thinking by default but override it with an effort level
- Examples of this hierarchy:
  - `<Thinking:Off>` alone → reasoning disabled ✗
  - `<Thinking:Off> <Effort:High>` together → reasoning enabled ✓ (Effort overrides Thinking:Off)
  - `<Thinking:Off> <Effort:Low>` together → reasoning enabled ✓ (Effort overrides Thinking:Off)
  - `<Effort:High>` alone → reasoning enabled ✓
Examples:
```
<Thinking:On> Analyze the performance of this sorting algorithm.
Calculate the anagrams for "Mississippi" <Effort:High>
<Effort:Low><Thinking:On> Quick summary of this code please.
<Thinking:On> What's the time complexity? <Effort:Medium>
Just give me a quick answer <Thinking:Off>
```
Behavior:
- ✓ Overrides global configuration (Level 3)
- ✓ Overrides model-specific config (Level 4)
- ✓ Overrides default behavior (Level 5)
- ✗ Cannot override Force Permanent Thinking (Level 0)
- ✗ Cannot override Ultrathink (Level 1)
Tag Removal Behavior:
- User Tags are removed: `<Thinking:On>`, `<Thinking:Off>`, and `<Effort:Low|Medium|High>` are stripped from the message before it is sent to the model
- The Ultrathink keyword is kept: the word "ultrathink" remains in the message and is visible to the model
Configuration in config.json - Apply settings across ALL models.
Example:
```json
"transformers": [
  {
    "path": "~/.claude-code-router/plugins/zai.js",
    "options": {
      "overrideReasoning": true,
      "overrideKeywordDetection": true
    }
  }
]
```

Behavior:
- ✓ Overrides Model Config (Level 4)
- ✗ Cannot override Force Permanent Thinking (Level 0)
- ✗ Cannot override Ultrathink (Level 1)
- ✗ Cannot override User Tags (Level 2)
Per-model settings - Configure reasoning per model in transformer code.
Default configurations for Z.AI GLM models:
```javascript
modelConfigurations = {
  "glm-4.6":     { reasoning: true, keywordDetection: true },
  "glm-4.5":     { reasoning: true, keywordDetection: true },
  "glm-4.5-air": { reasoning: true, keywordDetection: true },
  "glm-4.5v":    { reasoning: true, keywordDetection: true }
}
```

Behavior:
- ✗ Cannot override Force Permanent Thinking (Level 0)
- ✗ Cannot override Ultrathink (Level 1)
- ✗ Cannot override User Tags (Level 2)
- ✗ Cannot override Global Configuration (Level 3)
Note: With reasoning: true (default), the transformer always enables thinking. To allow Claude Code's native toggle to work, set reasoning: false for all models.
Passthrough mode - Transformer passes control to Claude Code when no other levels are active.
Behavior:
- ✓ Transformer passes `request.reasoning` from Claude Code unchanged
- ✓ Claude Code's Tab key works (toggles thinking on/off)
- ✓ The `alwaysThinkingEnabled` setting in `~/.claude/settings.json` works
- ✓ Model decides when to use reasoning based on Claude Code's request
- ✗ Cannot override Force Permanent Thinking (Level 0)
- ✗ Cannot override Ultrathink (Level 1)
- ✗ Cannot override User Tags (Level 2)
- ✗ Cannot override Global Configuration (Level 3)
- ✗ Cannot override Model Config (Level 4)
Important: With default configuration (reasoning: true for all models), Level 4 is always active, so Level 5 is never reached. The transformer always controls reasoning, and Claude Code's native toggle does not work.
To enable Level 5 (Native Control):
Edit transformer code and set reasoning: false for all models:
```javascript
'glm-4.6': {
  reasoning: false,  // ← Change from true to false
  keywordDetection: true,
  // ...
}
```

Trade-offs:
- ✓ Claude Code's Tab key and settings work
- ✗ Lose automatic reasoning activation for keywords and Ultrathink
- ✗ Lose User Tags control (`<Thinking:On>`, `<Effort:High>`)
- ✗ Lose Global Override control
Note: With the default configuration (reasoning: true for all models), Level 4 is always active, preventing Level 5 from being reached. Claude Code's native toggle (Tab key / alwaysThinkingEnabled setting) has no effect unless you set reasoning: false for all models.
| Priority | Level | Source | Overrides | Can be overridden by | Active by default? |
|---|---|---|---|---|---|
| 0 (Maximum) | Force Permanent Thinking | `forcePermanentThinking: true` in options | All (1-5) | None (nuclear option; ignores all tags) | No |
| 1 (Highest) | Ultrathink | Message keyword "ultrathink" | 2-5 | 0 | No (user-triggered) |
| 2 | User Tags | `<Thinking:On\|Off>`, `<Effort:...>` | 3-5 | 0-1 | No (user-triggered) |
| 3 | Global Override | `config.json` options | 4-5 | 0-2 | No (optional config) |
| 4 | Model Config | Transformer code (`reasoning: true` by default) | 5 | 0-3 | YES (always active) |
| 5 (Lowest) | Native Control | Claude Code's `request.reasoning` | None | 0-4 | NO (unreachable by default) |
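The precedence table can be expressed as a single resolution function. This is a hypothetical sketch (names and shapes are illustrative, not the actual transformer code), but it encodes the documented rules: each level returns early, and `<Effort>` tags beat `<Thinking:Off>` within Level 2:

```javascript
// Hypothetical sketch of the 6-level precedence resolution.
function resolveReasoning({ options = {}, messageText = "", modelConfig = {}, request = {} }) {
  // Level 0: Force Permanent Thinking overrides everything.
  if (options.forcePermanentThinking) return { reasoning: true, effort: "high", level: 0 };

  // Level 1: "ultrathink" anywhere in the message.
  if (/ultrathink/i.test(messageText)) return { reasoning: true, effort: "high", level: 1 };

  // Level 2: User Tags; <Effort:...> beats <Thinking:Off>.
  const effort = messageText.match(/<effort:(low|medium|high)>/i);
  if (effort) return { reasoning: true, effort: effort[1].toLowerCase(), level: 2 };
  const thinking = messageText.match(/<thinking:(on|off)>/i);
  if (thinking) return { reasoning: thinking[1].toLowerCase() === "on", level: 2 };

  // Level 3: Global Override from config.json options.
  if (typeof options.overrideReasoning === "boolean")
    return { reasoning: options.overrideReasoning, level: 3 };

  // Level 4: Model Config; reasoning: true (the default) keeps Level 5 unreachable.
  if (modelConfig.reasoning === true) return { reasoning: true, level: 4 };

  // Level 5: Native Control; pass Claude Code's request.reasoning through.
  return { reasoning: Boolean(request.reasoning), level: 5 };
}
```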
Note: This is NOT a priority level, but a feature that activates when conditions are met.
Activation requirements (ALL must be true):
- ✓ `reasoning = true` (from any level 0-5)
- ✓ `keywordDetection = true` (from any level 0-5)
- ✓ Keywords detected in the message text
When activated: Transformer automatically enhances prompt with reasoning instructions.
Keywords that trigger enhancement - the complete list of all 51 English keywords implemented in the transformer code:
| Category | Keywords |
|---|---|
| Counting questions | how many, how much, count, number of, total of, amount of |
| Analysis and reasoning | analyze, analysis, reason, reasoning, think, thinking, deduce, deduction, infer, inference |
| Calculations and problem-solving | calculate, calculation, solve, solution, determine |
| Detailed explanations | explain, explanation, demonstrate, demonstration, detail, detailed, step by step, step-by-step |
| Identification and search | identify, find, search, locate, enumerate, list |
| Precision-requiring words | letters, characters, digits, numbers, figures, positions, position, index, indices |
| Comparisons and evaluations | compare, comparison, evaluate, evaluation, verify, verification, check |
Example:
Can you analyze this code and explain how it works?
Transformer detects "analyze" and "explain" → Adds reasoning instructions to prompt.
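The detection above can be sketched as a simple case-insensitive substring scan. This is a hedged illustration using only a small subset of the 51 keywords, and the helper names and the injected instruction text are assumptions, not the transformer's actual wording:

```javascript
// Hypothetical sketch of keyword detection (subset of the 51 keywords).
const KEYWORDS = ["how many", "analyze", "calculate", "explain", "step by step", "compare"];

function detectKeywords(text) {
  const lower = text.toLowerCase();
  return KEYWORDS.filter((kw) => lower.includes(kw));
}

function maybeEnhancePrompt(text, { reasoning, keywordDetection }) {
  // Activates only when reasoning AND keywordDetection are enabled AND a keyword matches.
  if (!reasoning || !keywordDetection) return text;
  if (detectKeywords(text).length === 0) return text;
  return `${text}\n\n[Reasoning requested: think step by step before answering.]`;
}
```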
The transformer automatically ensures sampling parameters work correctly:
Automatic Configuration:
- Sets `do_sample=true` on every request (the Z.ai API default is `true`, but it is set explicitly to guarantee consistent behavior)
  - Reference: Z.ai Chat Completion API Documentation
- Applies model-specific temperature:
- GLM 4.6: 1.0 (more creative)
- GLM 4.5 variants: 0.6 (balanced)
- Applies top_p: 0.95 (the model considers only the most probable word choices that collectively represent 95% likelihood, ignoring very unlikely options)
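Putting the three rules together, the per-model sampling setup might look like the following sketch (the function and table names are hypothetical; only the values come from the documentation above):

```javascript
// Hypothetical sketch of per-model sampling defaults.
const SAMPLING = {
  "glm-4.6":     { temperature: 1.0, top_p: 0.95 },
  "glm-4.5":     { temperature: 0.6, top_p: 0.95 },
  "glm-4.5-air": { temperature: 0.6, top_p: 0.95 },
  "glm-4.5v":    { temperature: 0.6, top_p: 0.95 },
};

function applySampling(request) {
  const defaults = SAMPLING[request.model] || {};
  return {
    ...request,
    do_sample: true, // Explicitly set so temperature/top_p are always honored.
    temperature: request.temperature ?? defaults.temperature,
    top_p: request.top_p ?? defaults.top_p,
  };
}
```

Explicit values already present on the request are kept; defaults only fill the gaps.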
If you experience any problems with thinking or reasoning:
| Problem | See Section |
|---|---|
| Model not thinking when expected | Thinking Not Working |
| Thinking blocks not visible in Claude Code | Claude Code Not Displaying Thinking |
| Model keeps thinking despite disabling it | Model Still Thinking Despite Disabling It |
Common Solution for Mixed Transformer Issues:
If you're using both zai and reasoning transformers together and experiencing conflicts, use only zai:
```json
"transformer": {
  "glm-4.6": {
    "use": [
      "zai"  // ← Only use the custom transformer (removes reasoning conflicts)
    ]
  }
}
```

Result: Everything works perfectly (reasoning control, keywords, Ultrathink, User Tags, all features).
Trade-off: Thinking blocks won't display in Claude Code UI (but thinking still happens normally).
Alternative: If you must see thinking blocks displayed, use CCR Built-in Transformers Only instead of custom transformers.
Problem: Model isn't thinking/reasoning when you expect it to.
Common Causes & Solutions:

1. Model has `reasoning: true` enabled (default)
   - With the default configuration, reasoning is always active
   - The transformer enables thinking automatically
   - Solution: This is expected behavior; reasoning should work by default.
2. Using `overrideReasoning: false` in global options
   - Check the `config.json` transformer options
   - If `overrideReasoning: false` is set, reasoning is disabled globally
   - Solution: Remove this option or set it to `true`
3. Transformer not loaded properly
   - Check the CCR startup log for the transformer loading confirmation
   - Verify the path in `config.json` points to the correct transformer file
   - Solution: Fix the path and restart CCR
4. Using the `<Thinking:Off>` tag
   - User Tags override the model configuration
   - Solution: Remove the tag or use `<Thinking:On>`
5. Force activation methods:
   - Use the `ultrathink` keyword in your message
   - Use the `<Thinking:On>` tag
   - Use the `<Effort:High>` tag
Problem: Claude Code doesn't show the model's reasoning/thinking process, even when reasoning is active.
Cause: OpenAI-compatible providers (like Z.ai) send reasoning_content but don't include the signature field that Claude Code requires to display thinking blocks.
Solution: Add CCR's internal "reasoning" transformer to the provider configuration.
Configuration in config.json:
```json
{
  "name": "ZAI",
  "api_base_url": "https://api.z.ai/api/coding/paas/v4/chat/completions",
  "api_key": "YOUR_ZAI_API_KEY_HERE",
  "models": [
    "glm-4.6",
    "glm-4.5",
    "glm-4.5v",
    "glm-4.5-air"
  ],
  "transformer": {
    "use": [
      "zai",
      "reasoning"  // ← Converts OpenAI's reasoning_content to Anthropic's thinking format
                   //   and generates signatures to display thinking in Claude Code
    ]
  }
}
```

What does ReasoningTransformer do?
- Reads `reasoning_content` blocks from the provider's response
- Auto-generates signatures (`signature = Date.now().toString()`)
  - Reference: source code at https://github.com/musistudio/llms/blob/main/src/transformer/reasoning.transformer.ts
- Converts OpenAI format to Anthropic format with signatures
- Enables Claude Code to display: `∴ Thought for 1s (ctrl+o to show thinking)`
Result: Thinking blocks are now visible in Claude Code.
References:
- ReasoningTransformer (source code): https://github.com/musistudio/llms/blob/main/src/transformer/reasoning.transformer.ts
- Anthropic Extended Thinking: https://anthropic.mintlify.app/en/docs/build-with-claude/extended-thinking
Problem: You want to disable thinking but the model continues to use reasoning.
Most Common Cause: Using CCR's built-in reasoning transformer alongside zai
If you have this configuration:
```json
"transformer": {
  "glm-4.6": {
    "use": [
      "zai",
      "reasoning"  // ← This can override your reasoning settings
    ]
  }
}
```

What CCR's reasoning transformer does:
- Converts reasoning responses to display thinking blocks in Claude Code
- Side effect: When `enable: true` (the default), it forces `reasoning: { enabled: true }` on every request
- This overrides the custom transformer's reasoning settings
Quick Solution: Remove "reasoning" from the use array:
```json
"transformer": {
  "glm-4.6": {
    "use": [
      "zai"  // ← Only use the custom transformer
    ]
  }
}
```

Result:
- ✓ All reasoning control works perfectly (Level hierarchy, keywords, Ultrathink, User Tags)
- ✓ Model respects your reasoning settings correctly
- ✗ You won't see thinking blocks displayed in Claude Code UI (but thinking still happens if enabled)
Other Causes and Solutions:
Root Cause: The custom transformer (zai.js / zai-debug.js) has reasoning: true enabled by default for all GLM models:
```javascript
modelConfigurations = {
  "glm-4.6":     { reasoning: true, ... },
  "glm-4.5":     { reasoning: true, ... },
  "glm-4.5-air": { reasoning: true, ... },
  "glm-4.5v":    { reasoning: true, ... }
}
```

This makes the transformer always control reasoning (Level 4 always active), preventing Level 5 (Native Control) from being reached. Claude Code's native toggle (Tab key / alwaysThinkingEnabled setting) does not work.
Why Claude Code's toggle doesn't work:
When reasoning: true is set in model configuration (Level 4), the transformer always has "active conditions":
```javascript
const hasActiveConditions = ... || config.reasoning === true; // Always true
```

This causes the transformer to always control reasoning and prevents Level 5 (Native Control) from activating. The transformer never passes `request.reasoning` from Claude Code to the model.
Alternative Solutions:
Option 1: Use User Tags to disable thinking per message
Use <Thinking:Off> in your messages:
```
<Thinking:Off> Just give me a quick answer without deep reasoning
```
User Tags (Level 2) override Model Config (Level 4).
Option 2: Set global override to disable reasoning
In config.json, add overrideReasoning: false:
```json
"transformers": [
  {
    "path": "~/.claude-code-router/plugins/zai.js",
    "options": {
      "overrideReasoning": false  // ← Disables reasoning for ALL models
    }
  }
]
```

Global Override (Level 3) takes precedence over Model Config (Level 4).
Option 3: Modify transformer code to enable Level 5 (Native Control)
Edit zai.js and change all models to reasoning: false:
```javascript
'glm-4.6': {
  reasoning: false,  // ← Change from true to false
  // ...
}
```

Result: Level 4 becomes inactive, allowing Level 5 (Native Control) to activate. Claude Code's Tab key and alwaysThinkingEnabled setting will work.
Trade-offs:
- ✓ Claude Code's native toggle works
- ✗ Lose automatic reasoning activation for keywords and Ultrathink mode
- ✗ Lose User Tags control (`<Thinking:On>`, `<Effort:High>`)
- ✗ Lose Global Override control
Note: The model can still think/reason internally based on your settings. You just won't see the visual thinking blocks in the Claude Code interface.
Problem: When using zai-debug.js for debugging, the reasoning_content field doesn't appear in logs, only processed thinking blocks are visible.
Cause: CCR's built-in reasoning transformer processes and removes the original reasoning_content field before your custom transformers can see it.
How does the transformer flow work?
During the response phase, CCR executes transformers in this order:
```
LLM Response (with original reasoning_content)
        ↓
Model-specific transformers (reverse order)
        ↓
Provider-level transformers (reverse order)
        ↓
Final response to Claude Code
```
The reasoning transformer (implemented in ReasoningTransformer, DeepseekTransformer, etc.) does this during transformResponseOut:
- Extracts the `reasoning_content` field from the stream
- Converts it to standardized `thinking` blocks with a signature
- REMOVES the original `reasoning_content` field
Why doesn't it appear in your logs?
If the reasoning transformer is configured in:
- `provider.transformer.use` (provider-level) → executes BEFORE for all models
- `provider.transformer.model_name.use` (model-specific) → executes BEFORE for that model
When zai-debug.js executes afterward, the reasoning_content has already been consumed and removed, leaving only processed thinking blocks.
Example of problematic configuration:
```json
"transformer": {
  "use": [
    "zai-debug",
    "reasoning"  // ← Listed after zai-debug, but on responses it runs first
                 //   (reverse order) and has already processed the response
  ]
}
```

Solution:
To see original reasoning_content in zai-debug.js logs:
Option 1 (Recommended): Don't use the reasoning transformer
```json
"transformer": {
  "glm-4.6": {
    "use": [
      "zai-debug"  // ← Only use the debug transformer
    ]
  }
}
```

Result:
- ✓ You'll see the original `reasoning_content` in the logs
- ✓ Complete debugging of the data flow
- ✗ Won't see thinking blocks in Claude Code interface
Option 2: Use the production transformer with reasoning if you don't need detailed logs
```json
"transformer": {
  "glm-4.6": {
    "use": [
      "zai",       // ← Production version (no logs)
      "reasoning"  // ← Shows thinking in Claude Code
    ]
  }
}
```

Result:
- ✓ Thinking blocks visible in Claude Code
- ✗ Won't see `reasoning_content` in the logs (already processed)
Option 3: Debug only when needed
Alternate between configurations as needed:
```json
// For debugging: only zai-debug
"use": ["zai-debug"]

// For production: zai + reasoning
"use": ["zai", "reasoning"]
```

Technical references:
- Transformer pipeline: musistudio/llms - Transformer Architecture
- ReasoningTransformer code: reasoning.transformer.ts
- Provider configuration: musistudio/claude-code-router - Provider Configuration
Problem: Claude Code shows "Thought for 1s" but when you open the thinking block, it appears empty or shows only whitespace.
Cause: The LLM provider (Z.ai) sends reasoning_content chunks that contain only newline characters ("\n") or whitespace. The reasoning transformer processes these chunks as valid thinking content and creates thinking blocks containing only those characters.
Why does this happen?
The ReasoningTransformer processes streaming responses and accumulates reasoning_content:
- When a chunk like `{"choices":[{"delta":{"reasoning_content":"\n"}}]}` arrives
- The transformer appends `"\n"` to the accumulated `reasoningContent` buffer
- It immediately creates a `thinking` block with `content: "\n"`
- This block is sent to Claude Code
- Claude Code displays the thinking indicator ("Thought for 1s"), but the content is empty/whitespace
Example from logs:
```
[CHUNK 1] 177 bytes [THINKING] → {role:"assistant", content:"", reasoning_content:"↵"}
```
This shows the LLM sent a reasoning_content chunk containing only a newline character.
Why doesn't the transformer filter these out?
The ReasoningTransformer does NOT validate or trim reasoning_content before creating thinking blocks. It directly uses whatever the LLM sends:
```javascript
// From reasoning.transformer.ts
reasoningContent += data.choices?.[0]?.delta?.reasoning_content;
// No trim() or validation - directly creates the thinking block
```

The transformer only trims whitespace when processing leftover buffer data at stream end, not during the main reasoning_content transformation.
Is this a bug?
No, it's expected behavior. The transformer faithfully converts whatever reasoning_content the LLM sends. If the LLM sends whitespace/newlines, those become thinking blocks.
Solutions:
Option 1: Use only custom transformers (no reasoning)
```json
"transformer": {
  "glm-4.6": {
    "use": ["zai"]  // ← Only the custom transformer
  }
}
```

Result:
- ✓ No empty thinking blocks (custom transformers don't create them)
- ✗ Won't see thinking blocks in Claude Code UI
Option 2: Accept the behavior
Some LLMs send these "breathing" newlines during thinking. It's part of their output pattern. The empty thinking blocks are harmless - they indicate the model is processing but hasn't generated meaningful reasoning content yet.
Option 3: Filter in Claude Code (not possible)
Claude Code doesn't filter thinking block content. It displays whatever it receives. This would require changes to Claude Code itself.
Technical explanation:
The issue occurs in the ReasoningTransformer's stream processing:
```javascript
// Simplified from reasoning.transformer.ts
if (data.choices?.[0]?.delta?.reasoning_content) {
  reasoningContent += data.choices[0].delta.reasoning_content; // ← Accumulates "\n"
  const thinkingChunk = {
    choices: [{
      delta: {
        thinking: { content: reasoningContent } // ← Creates a block with "\n"
      }
    }]
  };
  controller.enqueue(thinkingChunk); // ← Sent to Claude Code
}
```

No validation happens, so "\n" becomes a valid thinking block.
Related transformers:
Similar behavior exists in:
- `DeepseekTransformer` - also processes `reasoning_content` without trimming
- `OpenrouterTransformer` - also converts the `reasoning` field without validation
Technical references:
- ReasoningTransformer source: reasoning.transformer.ts
- DeepseekTransformer source: deepseek.transformer.ts
Problem: ccr: command not found
Solution:
```shell
npm install -g @musistudio/claude-code-router
```

Verify: `ccr version`
Problem: 401 Unauthorized or Invalid API Key
Solution:
- Verify API key at https://z.ai/manage-apikey/apikey-list
- Update `config.json` with the correct key (`"api_key": "YOUR_ZAI_API_KEY_HERE"`)
- Restart CCR
Problem: Model doesn't call tools when it should, gets stuck in tool-calling loops, or generates "invalid JSON" errors in tool arguments.
Root Causes:
- Model doesn't understand when to use tools
- Model doesn't know how to exit tool mode
- Model generates malformed JSON for tool arguments
- Streaming responses fragment tool call data
Solution: Use CCR's built-in tooluse and enhancetool transformers.
tooluse (Tool Mode Manager):
- Problem it solves: Models sometimes don't call tools when they should, or get stuck in infinite tool-calling loops.
- Solution: Injects system instructions for tool mode, forces the model to call a tool (`tool_choice: required`), and automatically adds a special `ExitTool` that allows the model to exit tool mode when it completes its task.
enhancetool (JSON Argument Fixer):
- Problem it solves: Models sometimes generate malformed JSON in tool arguments (missing quotes, extra commas, etc.), causing parsing errors.
- Solution: Implements a 3-level cascade parsing system:
- Standard JSON: Tries normal parsing
- JSON5: If it fails, uses JSON5 which tolerates relaxed syntax (trailing commas, comments, etc.)
- jsonrepair: If it still fails, attempts to automatically repair JSON (adds missing quotes, fixes structures, etc.)
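The cascade above can be sketched in plain JavaScript. Note this is only illustrative: the real `enhancetool` uses the JSON5 and jsonrepair libraries, while this self-contained version stands in naive regex fixes (trailing-comma removal, single-to-double quote conversion) for levels 2 and 3:

```javascript
// Illustrative cascade parser; the real transformer uses JSON5 and jsonrepair.
function parseToolArguments(raw) {
  // Level 1: strict JSON.
  try { return JSON.parse(raw); } catch {}

  // Level 2: tolerate relaxed syntax (naive trailing-comma removal,
  // standing in for JSON5).
  try { return JSON.parse(raw.replace(/,\s*([}\]])/g, "$1")); } catch {}

  // Level 3: naive repair (single → double quotes), standing in for jsonrepair.
  try { return JSON.parse(raw.replace(/'/g, '"').replace(/,\s*([}\]])/g, "$1")); } catch {}

  // Fallback: log an error and return an empty object, mirroring the
  // documented enhancetool behavior.
  console.error("parseToolArguments: could not repair arguments:", raw);
  return {};
}
```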
Add to your config.json:
```json
{
  "Providers": [
    {
      "name": "ZAI",
      "transformer": {
        "glm-4.6": {
          "use": [
            "zai",         // ← The custom transformer
            "reasoning",
            "tooluse",     // ← Manages the tool mode lifecycle
            "enhancetool"  // ← Parses tool arguments robustly
          ]
        }
      }
    }
  ]
}
```

Order matters: `tooluse` should come before `enhancetool`.
- `ExitTool` depends on the model: The effectiveness of `tooluse` depends on the model correctly understanding and using `ExitTool`. Most models handle this well, but some may occasionally fail to invoke it or invoke it prematurely.
- `tool_choice: required` forces calls: The `tooluse` transformer forces the model to call a tool. This is crucial for tool mode but could lead to unexpected behavior if the model struggles to find a suitable tool.
- Parsing fallback: While `enhancetool` is very robust, extremely malformed arguments might still fail. In such cases, the system logs an error and returns an empty JSON object as a fallback.
- Universal compatibility: These transformers work with any OpenAI-compatible provider (Z.AI, nVidia, OpenRouter, etc.).
Use these transformers when:
- Model doesn't call tools when it should
- Model gets stuck in infinite tool-calling loops
- You see "invalid JSON in tool arguments" errors
- Tool calls work inconsistently across different models
- You want a guaranteed tool mode exit mechanism
- CCR Repository: https://github.com/musistudio/llms
- Source Code:
Problem: Multiple rotated log files accumulating in ~/.claude-code-router/logs/ directory
Cause: The debug transformer automatically rotates logs when a file reaches 10 MB. Each session creates files named:
- `zai-transformer-[timestamp].log` (current, up to 10 MB)
- `zai-transformer-[timestamp]-part1.log` (rotated, 10 MB)
- `zai-transformer-[timestamp]-part2.log` (rotated, 10 MB)
- And so on...
Solutions:
Option 1: Delete old logs manually
```shell
# Windows (PowerShell)
Remove-Item "$env:USERPROFILE\.claude-code-router\logs\zai-transformer-*.log"

# macOS/Linux
rm ~/.claude-code-router/logs/zai-transformer-*.log
```

Option 2: Reduce rotation size (creates smaller files, but more frequently)
- Edit `zai-debug.js`
- Find the line: `this.maxLogSize = this.options.maxLogSize || 10 * 1024 * 1024;`
- Change `10` to the desired MB limit (e.g., `5` for 5 MB)
- Restart CCR
Note: Reducing the rotation size does NOT prevent accumulation, it only creates smaller files more frequently.
Problem: Claude Code can't connect to CCR even though CCR is running
Cause: CCR is bound to a specific IP address and Claude Code is trying to connect via 127.0.0.1
Solution:
- Check CCR's `config.json` for the `HOST` and `PORT` values
- If `HOST` uses a specific IP (e.g., `"HOST": "10.10.10.10"`) and `"PORT": 3456`, update `ANTHROPIC_BASE_URL` to match: `"ANTHROPIC_BASE_URL": "http://10.10.10.10:3456"`
- Never use `127.0.0.1` in `ANTHROPIC_BASE_URL` if CCR is bound to a specific LAN IP
- Alternatively, set CCR's `HOST` to `"0.0.0.0"` to listen on all interfaces
Example:
- ✗ CCR: `"HOST": "10.10.10.10"` and `"PORT": 3456` + `"ANTHROPIC_BASE_URL": "http://127.0.0.1:3456"` → Fails
- ✓ CCR: `"HOST": "10.10.10.10"` and `"PORT": 3456` + `"ANTHROPIC_BASE_URL": "http://10.10.10.10:3456"` → Works
Problem: You want to use Z.ai with Claude Code but don't want to use the custom transformer (zai.js).
Solution: Use Claude Code Router's built-in transformers (maxtoken, sampling, reasoning) with model-specific configurations.
Configuration in config.json:
```json
{
  "LOG": false,
  "LOG_LEVEL": "debug",
  "HOST": "127.0.0.1",
  "PORT": 3456,
  "APIKEY": "",
  "API_TIMEOUT_MS": "600000",
  "PROXY_URL": "",
  "CUSTOM_ROUTER_PATH": "",
  "transformers": [],
  "Providers": [
    {
      "name": "ZAI",
      "api_base_url": "https://api.z.ai/api/coding/paas/v4/chat/completions",
      "api_key": "YOUR_ZAI_API_KEY_HERE",
      "models": [
        "glm-4.6",
        "glm-4.5",
        "glm-4.5v",
        "glm-4.5-air"
      ],
      "transformer": {
        "glm-4.6": {
          "use": [
            ["reasoning", {"enable": true}],
            ["maxtoken", {"max_tokens": 131072}],
            ["sampling", {"temperature": 1.0, "top_p": 0.95}]
          ]
        },
        "glm-4.5": {
          "use": [
            ["reasoning", {"enable": true}],
            ["maxtoken", {"max_tokens": 98304}],
            ["sampling", {"temperature": 0.6, "top_p": 0.95}]
          ]
        },
        "glm-4.5-air": {
          "use": [
            ["reasoning", {"enable": true}],
            ["maxtoken", {"max_tokens": 98304}],
            ["sampling", {"temperature": 0.6, "top_p": 0.95}]
          ]
        },
        "glm-4.5v": {
          "use": [
            ["reasoning", {"enable": true}],
            ["maxtoken", {"max_tokens": 16384}],
            ["sampling", {"temperature": 0.6, "top_p": 0.95}]
          ]
        }
      }
    }
  ],
  "Router": {
    "default": "ZAI,glm-4.6",
    "background": "ZAI,glm-4.6",
    "think": "ZAI,glm-4.6",
    "longContext": "ZAI,glm-4.6",
    "longContextThreshold": 204800,
    "webSearch": "ZAI,glm-4.6",
    "image": "ZAI,glm-4.5v"
  }
}
```

What this configuration does:
- Provider level (applies to GLM 4.6 by default):
  - `maxtoken`: sets max output to 128K (131,072 tokens)
  - `sampling`: sets temperature to 1.0 and top_p to 0.95
  - `reasoning`: generates signatures for thinking blocks
- Model-specific overrides:
  - GLM 4.5 / 4.5-air: 96K output (98,304 tokens), temperature 0.6
  - GLM 4.5v: 16K output (16,384 tokens), temperature 0.6
- Claude Code Router: https://github.com/musistudio/claude-code-router
- Claude Code Setup: https://docs.claude.com/en/docs/claude-code/setup
- Z.ai API Reference: https://docs.z.ai/api-reference/llm/chat-completion