feat: pre-call sanitization and post-call tool guardrails #1732
Merged
Conversation
Summary
Salvage of PR #1321 by @alireza78a — the concept was cherry-picked and reimplemented against current main.
Phase 1 — Pre-call message sanitization
`_sanitize_api_messages()` now runs unconditionally before every LLM call. Previously it was gated on `context_compressor` being present (line 4998), so sessions loaded from disk or running without compression could silently accumulate dangling tool_call/tool_result pairs, causing "No tool call found for call_id" API errors.
Phase 2a — Delegate task cap
`_cap_delegate_task_calls()` truncates excess `delegate_task` calls per turn to `MAX_CONCURRENT_CHILDREN`. The existing cap in `delegate_tool.py` only limits the task array within a single call; this catches multiple separate `delegate_task` tool_calls in one turn.
Phase 2b — Tool call deduplication
`_deduplicate_tool_calls()` drops duplicate `(tool_name, arguments)` pairs within a single turn when models stutter. All three are static methods on `AIAgent`, independently testable.
Tests
29 tests in `tests/test_agent_guardrails.py` cover all three phases: orphaned result removal, stub injection, mixed orphans, delegate cap with interleaved ordering, dedup first-occurrence preservation, input mutation safety, empty-list edge cases, and SDK object vs dict format handling.

Closes #626