fix: detect, warn, and block file re-read/search loops after context compression #705
Conversation
When context compression summarizes conversation history, the agent loses track of which files it already read and re-reads them in a loop. Users report the agent reading the same files endlessly without writing.

Root cause: context compression is lossy — file contents and read history are lost in the summary. After compression, the model thinks it hasn't examined the files yet and reads them again.

Fix (two-part):
1. Track file reads per task in `file_tools.py`. When the same file region is read again, include a `_warning` in the response telling the model to stop re-reading and use existing information.
2. After context compression, inject a structured message listing all files already read in the session with an explicit "do NOT re-read" instruction, preserving read history across compression boundaries.

Adds 16 tests covering warning detection, task isolation, summary accuracy, tracker cleanup, and compression history injection.
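The per-task read tracking described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the tracker structure, `record_read` name, and response shape are assumptions; only the `_warning` key and per-task isolation come from the description.

```python
# Hypothetical sketch of per-task read tracking with a _warning on repeat reads.
from collections import defaultdict

# Maps task_id -> set of (path, start_line, end_line) regions already read.
_read_tracker: dict[str, set[tuple[str, int, int]]] = defaultdict(set)

def record_read(task_id: str, path: str, start: int, end: int) -> dict:
    """Return a tool response, adding a _warning if this region was read before."""
    key = (path, start, end)
    response = {"content": f"<contents of {path}:{start}-{end}>"}
    if key in _read_tracker[task_id]:
        response["_warning"] = (
            "You already read this file region in this task. "
            "Do not re-read it; use the information you already have."
        )
    _read_tracker[task_id].add(key)
    return response
```

Keying the tracker by task keeps one task's reads from triggering warnings in another, which is what the test suite's "task isolation" cases would exercise.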
…ted todos
- Block file reads after 3+ re-reads of the same region (no content returned)
- Track `search_files` calls and block repeated identical searches
- Filter completed/cancelled todos from post-compression injection to prevent the agent from re-doing finished work
- Add 10 new tests covering all three fixes
Completed/cancelled items are now filtered from `format_for_injection()` output. Update the existing test to verify that active items appear and completed items are excluded.
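The filtering in `format_for_injection()` could look like the sketch below. The todo dict shape, the status strings, and the standalone-function signature are assumptions for illustration; the PR only specifies that completed/cancelled items are excluded and pending/in_progress items are kept.

```python
# Minimal sketch: only active todos are rendered for post-compression injection.
ACTIVE_STATUSES = {"pending", "in_progress"}

def format_for_injection(todos: list[dict]) -> str:
    """Render only active todos, so finished work is never re-injected."""
    active = [t for t in todos if t.get("status") in ACTIVE_STATUSES]
    return "\n".join(f"- [{t['status']}] {t['title']}" for t in active)
```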
Combine read/search loop detection with main's redact_sensitive_text and truncation hint features. Add tracker reset to TestSearchHints to prevent cross-test state leakage.
_FakeReadResult and _FakeSearchResult now expose the attributes that read_file_tool/search_tool access after the redact_sensitive_text integration from main.
…ds, fix bugs

Follow-up to PR #705 (merged from 0xbyt4). Addresses several issues:
1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to warn/block only on truly consecutive identical calls. Any other tool call in between (write, patch, terminal, etc.) resets the counter via `notify_other_tool_call()`, called from `handle_function_call()` in `model_tools.py`. This prevents false blocks in read→edit→verify flows.
2. THRESHOLD ADJUSTMENT: Warn on the 3rd consecutive call (was 2nd), block on the 4th+ (was 3rd+). Gives the model more room before intervening.
3. TUPLE UNPACKING BUG: Fixed `get_read_files_summary()`, which crashed on search keys (5-tuples) when trying to unpack them as 3-tuples. Now uses a separate `read_history` set that only tracks file reads.
4. WEB_EXTRACT DOCSTRING: Reverted the incorrect removal of 'title' from the `web_extract` return docs in `code_execution_tool.py` — the field IS returned by `web_tools.py`.
5. TESTS: Rewrote `test_read_loop_detection.py` (35 tests) to cover consecutive-only behavior, `notify_other_tool_call`, interleaved read/search, and summaries unaffected by searches.
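The consecutive-only behavior with the adjusted thresholds can be sketched as a single counter that grows only on back-to-back identical calls and resets on any other tool call. Function and variable names here are illustrative, not the follow-up commit's actual API; the thresholds (warn on 3rd, block on 4th+) come from the commit message.

```python
# Hedged sketch of consecutive-only read/search loop detection.
_last_key = None  # key of the most recent read/search call
_count = 0        # consecutive identical calls seen so far

def on_read_or_search(key) -> str:
    """Return 'ok', 'warn' (3rd identical call), or 'block' (4th+)."""
    global _last_key, _count
    if key == _last_key:
        _count += 1
    else:
        _last_key, _count = key, 1
    if _count >= 4:
        return "block"
    if _count == 3:
        return "warn"
    return "ok"

def notify_other_tool_call() -> None:
    """Any non-read/search tool call (write, patch, terminal) resets the counter."""
    global _last_key, _count
    _last_key, _count = None, 0
```

Resetting on unrelated tool calls is what keeps legitimate read→edit→verify flows from being flagged as loops.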
Merged! Thanks for the contribution @0xbyt4 — the read-loop detection and todo injection filtering are great additions. I pushed a follow-up commit (a458b53) on top with several improvements:
Applied Karpathy's autoresearch pattern to autonomously optimize the context compressor. 50 experiments run, 8 improvements kept:
- _SUMMARY_RATIO 0.20 → 0.30 (more budget for summaries)
- _MIN_SUMMARY_TOKENS 2000 → 500 (no inflation on short conversations)
- _MAX_SUMMARY_TOKENS 8000 → 4000 (tighter cap)
- _DEFAULT_TAIL_TOKEN_BUDGET 20000 → 8000 (more aggressive compression)
- Truncation 3000 → 4500 chars (retains more tool output)
- Regex file-path pre-extraction with "MUST appear in summary"
- Template restructured: Relevant Files + Critical Context moved up
- MANDATORY PRESERVATION RULES added to both prompts

Addresses NousResearch#705, NousResearch#1273, and context drift from lossy summarization. Score improved 3.6% (0.6346 → 0.6572).
Summary
Fixes the issue where the agent gets stuck in an infinite reading loop after context compression, re-reading the same files endlessly without writing or responding.
Root cause: Context compression summarizes conversation history but loses track of which files were already read. After compression, the model thinks it hasn't examined the files yet and reads them again. Combined with todo re-injection of completed items, this creates an infinite loop.
Fix (multi-layered):
1. **Read tracking with escalation** (`tools/file_tools.py`): Track file reads per task. 2nd read returns a soft warning with content. 3rd+ read blocks — returns an error with no content, forcing the model to stop.
2. **Search tracking** (`tools/file_tools.py`): Same mechanism for `search_files` — identical searches are warned, then blocked after 3 repeats.
3. **File history injection** (`run_agent.py`): After context compression, inject a structured message listing all files already read with a "do NOT re-read" instruction.
4. **Todo re-injection filtering** (`tools/todo_tool.py`): `format_for_injection()` now filters out completed/cancelled todos. Only pending/in_progress items are re-injected after compression, preventing the model from re-doing finished work.

Design decisions:
- `_read_tracker_lock`

Test plan
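The file history injection step could be sketched as a small message builder like the one below. The function name, message shape, and role field are assumptions for illustration; the PR only specifies a structured message listing already-read files with a "do NOT re-read" instruction.

```python
# Sketch of the post-compression read-history injection.
def build_read_history_message(read_files: set[str]) -> dict:
    """Build a structured message reminding the model which files it already read."""
    listing = "\n".join(f"- {path}" for path in sorted(read_files))
    return {
        "role": "user",
        "content": (
            "Context was compressed. You have ALREADY read these files; "
            "do NOT re-read them:\n" + listing
        ),
    }
```

Injecting this right after compression is what carries read history across the compression boundary, so the summarized context no longer looks like a fresh session.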