fix: detect, warn, and block file re-read/search loops after context compression#705

Merged
teknium1 merged 5 commits into NousResearch:main from 0xbyt4:fix/reading-loop-detection
Mar 10, 2026
Conversation

Contributor

@0xbyt4 0xbyt4 commented Mar 8, 2026

Summary

Fixes the issue where the agent gets stuck in an infinite reading loop after context compression, re-reading the same files endlessly without writing or responding.

Root cause: Context compression summarizes conversation history but loses track of which files were already read. After compression, the model thinks it hasn't examined the files yet and reads them again. Combined with todo re-injection of completed items, this creates an infinite loop.

Fix (multi-layered):

  1. Read tracking with escalation (tools/file_tools.py): Track file reads per task. 2nd read returns a soft warning with content. 3rd+ read blocks — returns error with no content, forcing the model to stop.

  2. Search tracking (tools/file_tools.py): Same mechanism for search_files — identical searches are warned then blocked after 3 repeats.

  3. File history injection (run_agent.py): After context compression, inject a structured message listing all files already read, with an explicit "do NOT re-read" instruction.

  4. Todo re-injection filtering (tools/todo_tool.py): format_for_injection() now filters out completed/cancelled todos. Only pending/in_progress items are re-injected after compression, preventing the model from re-doing finished work.
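As a rough illustration of fix 4, the filtering in format_for_injection() might look like the sketch below. The TodoItem shape and status strings are assumptions for illustration, not the actual todo_tool.py types:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class TodoItem:
    content: str
    status: str  # assumed values: "pending", "in_progress", "completed", "cancelled"

# Only active items survive re-injection after compression.
_ACTIVE_STATUSES = {"pending", "in_progress"}

def format_for_injection(todos: List[TodoItem]) -> str:
    """Render only pending/in_progress todos for post-compression injection."""
    active = [t for t in todos if t.status in _ACTIVE_STATUSES]
    if not active:
        return ""
    lines = ["Current todo list (completed/cancelled items omitted):"]
    for t in active:
        lines.append(f"- [{t.status}] {t.content}")
    return "\n".join(lines)
```

Filtering at render time (rather than deleting completed items from storage) keeps the full task history available while ensuring the model never sees finished work as actionable.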

Design decisions:

  • Escalating response: warn (2nd) → block (3rd+) — gives the model one chance before hard-stopping
  • Thread-safe tracking with _read_tracker_lock
  • Task-isolated — different tasks have separate trackers
  • Pagination-aware — different offsets of the same file don't trigger false warnings
  • Search tracking keyed on (pattern, target, path, file_glob) — different queries are independent
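A minimal sketch of the escalating tracker described above, combining the design decisions (thread safety, task isolation, pagination-aware keys). Function names and signatures here are assumptions; only _read_tracker_lock is named in the PR:

```python
import threading
from collections import defaultdict

_read_tracker_lock = threading.Lock()
# Pagination-aware key: same file at a different offset/limit is a new region.
_read_counts = defaultdict(int)  # (task_id, path, offset, limit) -> count

def record_read(task_id: str, path: str, offset: int = 0, limit: int = -1):
    """Return (allowed, warning) for a read attempt, escalating on repeats."""
    key = (task_id, path, offset, limit)
    with _read_tracker_lock:
        _read_counts[key] += 1
        count = _read_counts[key]
    if count == 1:
        return True, None  # first read: pass through silently
    if count == 2:
        # soft warning, content still returned
        return True, f"WARNING: {path} was already read. Use the existing information."
    # 3rd+ read: hard block, no content
    return False, f"BLOCKED: {path} re-read {count} times. Stop re-reading and act."
```

Keying on task_id gives the task isolation noted above; including offset and limit in the key is what prevents false warnings when the model legitimately pages through a large file.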

Test plan

  • 26 unit tests: read warning/blocking, search warning/blocking, task isolation, different file/region/pattern, summary accuracy, tracker cleanup, compression history injection, todo filtering
  • Manual bash verification of all 3 fixes with real function calls
0xbyt4 added 2 commits March 8, 2026 20:44
When context compression summarizes conversation history, the agent
loses track of which files it already read and re-reads them in a loop.
Users report the agent reading the same files endlessly without writing.

Root cause: context compression is lossy — file contents and read history
are lost in the summary. After compression, the model thinks it hasn't
examined the files yet and reads them again.

Fix (two-part):
1. Track file reads per task in file_tools.py. When the same file region
   is read again, include a _warning in the response telling the model
   to stop re-reading and use existing information.
2. After context compression, inject a structured message listing all
   files already read in the session with explicit "do NOT re-read"
   instruction, preserving read history across compression boundaries.

Adds 16 tests covering warning detection, task isolation, summary
accuracy, tracker cleanup, and compression history injection.
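The injected history message from part 2 could be built along these lines. This is a hypothetical helper for illustration; the real injection lives in run_agent.py and its message format may differ:

```python
def build_read_history_message(read_files):
    """Build the post-compression reminder listing files already read."""
    if not read_files:
        return None
    lines = [
        "[context restored after compression]",
        "You have ALREADY read the following files. Do NOT re-read them:",
    ]
    lines += [f"- {path}" for path in sorted(read_files)]
    return {"role": "user", "content": "\n".join(lines)}
```

Injecting this as a structured message preserves read history across the compression boundary, so the model no longer believes the files are unexamined.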
…ted todos

- Block file reads after 3+ re-reads of same region (no content returned)
- Track search_files calls and block repeated identical searches
- Filter completed/cancelled todos from post-compression injection
  to prevent agent from re-doing finished work
- Add 10 new tests covering all three fixes
@0xbyt4 0xbyt4 changed the title from fix: detect and warn on file re-read loops after context compression to fix: detect, warn, and block file re-read/search loops after context compression Mar 8, 2026
0xbyt4 added 3 commits March 8, 2026 23:07
Completed/cancelled items are now filtered from format_for_injection()
output. Update the existing test to verify active items appear and
completed items are excluded.
Combine read/search loop detection with main's redact_sensitive_text
and truncation hint features. Add tracker reset to TestSearchHints
to prevent cross-test state leakage.
_FakeReadResult and _FakeSearchResult now expose the attributes
that read_file_tool/search_tool access after the redact_sensitive_text
integration from main.
@teknium1 teknium1 merged commit b53d5da into NousResearch:main Mar 10, 2026
1 check passed
teknium1 added a commit that referenced this pull request Mar 10, 2026
…ds, fix bugs

Follow-up to PR #705 (merged from 0xbyt4). Addresses several issues:

1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only
   warn/block on truly consecutive identical calls. Any other tool call
   in between (write, patch, terminal, etc.) resets the counter via
   notify_other_tool_call(), called from handle_function_call() in
   model_tools.py. This prevents false blocks in read→edit→verify flows.

2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on
   4th+ consecutive (was 3rd+). Gives the model more room before
   intervening.

3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on
   search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a
   separate read_history set that only tracks file reads.

4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from
   web_extract return docs in code_execution_tool.py — the field IS
   returned by web_tools.py.

5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover
   consecutive-only behavior, notify_other_tool_call, interleaved
   read/search, and summary-unaffected-by-searches.
@teknium1
Contributor

Merged! Thanks for the contribution @0xbyt4 — the read-loop detection and todo injection filtering are great additions.

I pushed a follow-up commit (a458b53) on top with several improvements:

  1. Consecutive-only tracking — The counter now resets whenever any other tool is called in between (write, patch, terminal, etc.), so only truly back-to-back identical reads/searches trigger warnings. This prevents false blocks in legitimate read→edit→verify workflows.

  2. Adjusted thresholds — Warn on 3rd consecutive (was 2nd), block on 4th+ (was 3rd+).

  3. Fixed tuple unpacking bug: get_read_files_summary() was crashing on search keys (5-tuples) when trying to unpack them as 3-tuples. It now uses a separate read_history set that only tracks file reads, so searches don't corrupt the summary.

  4. Reverted web_extract docstring — The title field IS returned by web_tools.py, so restored it in the docs.

  5. Tests updated — 35 tests covering the new consecutive-only behavior, notify_other_tool_call, interleaved operations, etc.
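The consecutive-only redesign in point 1 could be sketched as below. The streak logic and thresholds follow the description (warn on 3rd, block on 4th+); the function bodies are assumptions, with only notify_other_tool_call named in the commit:

```python
import threading

_lock = threading.Lock()
_last_key = None  # identity of the most recent read/search call
_streak = 0       # consecutive identical calls

def record_call(key):
    """key identifies a read/search call; returns 'ok', 'warn', or 'block'."""
    global _last_key, _streak
    with _lock:
        if key == _last_key:
            _streak += 1
        else:
            _last_key, _streak = key, 1
        if _streak >= 4:
            return "block"  # 4th+ consecutive identical call
        if _streak == 3:
            return "warn"   # 3rd consecutive: one warning before blocking
        return "ok"

def notify_other_tool_call():
    """Called by the dispatcher for any other tool (write, patch, terminal, ...).

    Resets the streak so read -> edit -> verify flows are never blocked."""
    global _last_key, _streak
    with _lock:
        _last_key, _streak = None, 0
```

Resetting on any intervening tool call is the key change: only a model that is genuinely stuck issuing back-to-back identical reads ever hits the warning or block.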

Death-Incarnate added a commit to Death-Incarnate/hermes-agent that referenced this pull request Mar 24, 2026
Applied Karpathy's autoresearch pattern to autonomously optimize the
context compressor. 50 experiments run, 8 improvements kept.

- _SUMMARY_RATIO 0.20 → 0.30 (more budget for summaries)
- _MIN_SUMMARY_TOKENS 2000 → 500 (no inflation on short conversations)
- _MAX_SUMMARY_TOKENS 8000 → 4000 (tighter cap)
- _DEFAULT_TAIL_TOKEN_BUDGET 20000 → 8000 (more aggressive compression)
- Truncation 3000 → 4500 chars (retains more tool output)
- Regex file path pre-extraction with "MUST appear in summary"
- Template restructured: Relevant Files + Critical Context moved up
- MANDATORY PRESERVATION RULES added to both prompts

Addresses NousResearch#705, NousResearch#1273, and context drift from lossy summarization.
Score improved 3.6% (0.6346 → 0.6572).