Skip to content

feat: replace inline nudges with background memory/skill review#2235

Merged
teknium1 merged 1 commit intomainfrom
hermes/hermes-3369cdb1
Mar 21, 2026
Merged

feat: replace inline nudges with background memory/skill review#2235
teknium1 merged 1 commit intomainfrom
hermes/hermes-3369cdb1

Conversation

@teknium1
Copy link
Copy Markdown
Contributor

Summary

Replaces the inline memory/skill nudges — which polluted 43% of user messages with backward-looking system instructions — with a background agent that reviews and saves independently after the main response is delivered.

The problem

Memory and skill nudges were appended directly to the user's message content:

User's actual message: "fix this bug"
What the model saw:    "fix this bug

[System: The previous task involved many tool calls. Save the approach as a skill...]
[System: You've had several exchanges. Consider: has the user shared preferences...]"

The model had to choose between the user's forward-looking task and the backward-looking system directives. In 2 confirmed cases, the agent spent 1-3 tool calls on memory/skill work before starting the user's task. The nudges were also permanently stored in conversation history, polluting session transcripts.

The solution

When nudge conditions are met, a background review agent spawns after the main response completes:

# After the agent finishes responding to the user:
threading.Thread(target=_run_review, daemon=True).start()

The review agent:

  • Uses the main model (same quality, not auxiliary — skills/memory are high-precision)
  • Gets a read-only snapshot of the conversation
  • Has only memory + skill_manage tools (5 iteration budget)
  • Shares the memory store so writes persist immediately
  • Runs with quiet_mode=True, skip_context_files=True
  • Never modifies the main conversation or produces user-visible output
  • All exceptions caught — can never affect the main session

What doesn't change

  • Trigger conditions: still every 10 user turns (memory) and after 10+ tool iterations (skills)
  • Token cost: same context processed either way, just on a separate track
  • Memory/skill quality: actually better — dedicated prompt for review vs a hint appended to an unrelated message

Changes

run_agent.py:

  • Remove nudge injection from run_conversation() (lines 5225-5249 → tracking only, no user_message +=)
  • Add _spawn_background_review() method with _BACKGROUND_REVIEW_PROMPT
  • Add background fork trigger after response delivery (before return)
  • Net: +77 lines, -15 lines

Test plan

5670 passed, 200 skipped, 23 deselected

Closes #2227.

@teknium1 teknium1 force-pushed the hermes/hermes-3369cdb1 branch 3 times, most recently from 42ae0be to eaf3eec Compare March 20, 2026 23:59
Remove the memory and skill nudges that were appended directly to user
messages, causing backward-looking system instructions to compete with
forward-looking user tasks. Found in 43% of user messages across 15
sessions, with confirmed cases of the agent spending tool calls on
nudge responses before starting the user's actual request.

Replace with a background review agent that runs AFTER the main agent
finishes responding:
- Spawns a background thread with a snapshot of the conversation
- Uses the main model (not auxiliary) for high-precision memory/skill work
- Only has memory + skill_manage tools (5 iteration budget)
- Shares the memory store for direct writes
- Never modifies the main conversation history
- Never competes with the user's task for model attention
- Zero latency impact (runs after response is delivered)
- Same token cost (processes the same context, just on a separate track)

The trigger conditions are unchanged (every 10 user turns for memory,
after 10+ tool iterations for skills). Only the execution path changes:
from inline injection to background fork.

Closes #2227.
@teknium1 teknium1 force-pushed the hermes/hermes-3369cdb1 branch from eaf3eec to 470d89c Compare March 21, 2026 01:28
@teknium1 teknium1 merged commit 45058b4 into main Mar 21, 2026
1 check passed
outsourc-e pushed a commit to outsourc-e/hermes-agent that referenced this pull request Mar 26, 2026
…Research#2235)

Remove the memory and skill nudges that were appended directly to user
messages, causing backward-looking system instructions to compete with
forward-looking user tasks. Found in 43% of user messages across 15
sessions, with confirmed cases of the agent spending tool calls on
nudge responses before starting the user's actual request.

Replace with a background review agent that runs AFTER the main agent
finishes responding:
- Spawns a background thread with a snapshot of the conversation
- Uses the main model (not auxiliary) for high-precision memory/skill work
- Only has memory + skill_manage tools (5 iteration budget)
- Shares the memory store for direct writes
- Never modifies the main conversation history
- Never competes with the user's task for model attention
- Zero latency impact (runs after response is delivered)
- Same token cost (processes the same context, just on a separate track)

The trigger conditions are unchanged (every 10 user turns for memory,
after 10+ tool iterations for skills). Only the execution path changes:
from inline injection to background fork.

Closes NousResearch#2227.

Co-authored-by: Test <test@test.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant