bug: _strip_think_blocks regex strips visible content when model discusses <think> tags literally #786
Description
Bug
_strip_think_blocks and _has_content_after_think_block in run_agent.py use a global re.sub that strips all occurrences of <think>.*?</think> in the content, not just leading reasoning wrappers.
This causes a false positive when the model's visible response contains the literal text <think> and </think> as part of its answer — for example, when discussing how reasoning models work.
Reproduction
With a reasoning model like Kimi K2.5, ask it a question where the answer includes the phrase "<think> blocks" (e.g. "what issues should I watch for with reasoning models?"). The model may respond with visible text like:
"watch for responses that only contain <think> blocks with no visible output"
The stripping regex matches the literal <think> in that sentence to the next </think> in the text, silently eating a chunk of the visible response. The user receives a truncated reply starting mid-sentence.
Root cause
    # run_agent.py line ~643
    return re.sub(r'<think>.*?</think>', '', content, flags=re.DOTALL)

This is applied globally across the full content string, not anchored to the start where actual reasoning blocks appear.
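The false positive can be reproduced outside the agent with a few lines. This is a minimal sketch, not the code in run_agent.py: the function name strip_think_global and the sample string are hypothetical, and only the regex is taken from the line quoted above.

```python
import re


def strip_think_global(content: str) -> str:
    # Same pattern as the quoted line: applied to every match in the string.
    return re.sub(r'<think>.*?</think>', '', content, flags=re.DOTALL)


# Hypothetical model output: one real leading reasoning block, followed by a
# visible answer that mentions the tags literally.
content = (
    "<think>User wants pitfalls.</think>"
    "Watch for responses that only contain <think> blocks with no visible "
    "output, and check that every block is closed with </think> before parsing."
)

# The leading block is removed as intended, but so is everything between the
# literal <think> and the literal </think> in the visible answer.
print(strip_think_global(content))
```

The printed text jumps from "only contain" straight to "before parsing.", which is the truncation described in the reproduction.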
Fix
Anchor the strip to the beginning of the content only, and loop to handle multiple leading blocks:
    def _strip_think_blocks(self, content: str) -> str:
        if not content:
            return ""
        result = content
        while True:
            stripped = re.sub(r'^\s*<think>.*?</think>\s*', '', result, flags=re.DOTALL)
            if stripped == result:
                break
            result = stripped
        return result

This correctly strips <think>...</think> at the top of the response (the model's reasoning wrapper) while leaving any mid-content references to <think> tags untouched.
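To check the anchored behavior in isolation, the proposed method can be copied into a standalone function. The sketch below assumes nothing about the rest of run_agent.py; the name strip_leading_think_blocks and the sample strings are hypothetical, and the body mirrors the proposed fix with self removed.

```python
import re


def strip_leading_think_blocks(content: str) -> str:
    """Standalone copy of the proposed fix, for experimentation only."""
    if not content:
        return ""
    result = content
    while True:
        # Without re.MULTILINE, ^ matches only at the very start of the
        # string, so each pass removes at most one leading block.
        stripped = re.sub(r'^\s*<think>.*?</think>\s*', '', result, flags=re.DOTALL)
        if stripped == result:
            break
        result = stripped
    return result


# Hypothetical inputs covering the cases discussed in this issue.
two_blocks = "<think>plan</think>\n<think>revise</think>\nFinal answer."
literal_only = "Watch for <think> blocks that never close with </think> in the output."
mixed = "<think>reasoning</think>The tags <think> and </think> wrap the reasoning."

print(strip_leading_think_blocks(two_blocks))
print(strip_leading_think_blocks(literal_only))
print(strip_leading_think_blocks(mixed))
```

One limitation of anchoring, as far as this sketch goes: a visible answer that itself begins with a literal <think>...</think> pair would still be stripped, since it is indistinguishable from a leading reasoning block.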