
security: redact secrets from execute_code output and all tool results#4364

Open
0xbyt4 wants to merge 3 commits into NousResearch:main from 0xbyt4:fix/secret-exfil-redaction

Conversation

@0xbyt4 (Contributor) commented Mar 31, 2026

Summary

This PR closes multiple secret exfiltration vectors through which API keys and tokens from .env could leak into LLM context or persistent storage, bypassing existing redaction.

Vulnerabilities Fixed

1. execute_code (PTC) — raw stdout returned without redaction

execute_code runs Python scripts and returns stdout to LLM context. No redaction was applied:

import os; print(os.environ["ANTHROPIC_API_KEY"])

This bypasses terminal_tool's redaction entirely.

Fix: Apply redact_sensitive_text to stdout and stderr in code_execution_tool.py.
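The fix amounts to passing both streams through the redaction helper before the result is returned. A minimal sketch, assuming hypothetical pattern shapes and a hypothetical result structure (the real `redact_sensitive_text` lives in `agent.redact` and its pattern set may differ):

```python
import re

# Stand-in for agent.redact.redact_sensitive_text -- the real pattern
# set lives upstream; these shapes are assumptions for illustration.
SECRET_PATTERNS = [
    re.compile(r"sk-(?:ant|or)-[A-Za-z0-9_-]{10,}"),                 # Anthropic / OpenRouter style keys
    re.compile(r"\b[A-Z][A-Z0-9_]*(?:KEY|TOKEN|SECRET)\s*=\s*\S+"),  # KEY=value env assignments
]

def redact_sensitive_text(text: str) -> str:
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

def build_execution_result(stdout_text: str, stderr_text: str) -> dict:
    # Hypothetical shape of the execute_code result: redact both
    # streams before they ever reach LLM context.
    return {
        "stdout": redact_sensitive_text(stdout_text),
        "stderr": redact_sensitive_text(stderr_text),
    }
```

With this in place, the `print(os.environ["ANTHROPIC_API_KEY"])` payload above yields a `[REDACTED]` marker instead of the key.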

2. All tool results — no defense-in-depth redaction

Only terminal_tool and file_tools had redaction. Browser, MCP, vision, delegate, and all other tools returned raw output to LLM context.

Fix: Add redact_sensitive_text to both sequential and concurrent tool result paths in run_agent.py — catches any tool that doesn't redact internally.
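Because tool results can be strings or nested structures, the defense-in-depth pass has to walk the whole result. A sketch of the idea, with a stand-in redaction helper (the actual `run_agent.py` wiring and result shapes are assumptions):

```python
import re

API_KEY_RE = re.compile(r"sk-[A-Za-z0-9_-]{12,}")

def redact_sensitive_text(text: str) -> str:
    # Stand-in for the shared redaction helper (an assumption here).
    return API_KEY_RE.sub("[REDACTED]", text)

def redact_tool_result(result):
    """Walk an arbitrary tool result (string, dict, or list) and
    redact every string leaf before it enters LLM context."""
    if isinstance(result, str):
        return redact_sensitive_text(result)
    if isinstance(result, dict):
        return {key: redact_tool_result(value) for key, value in result.items()}
    if isinstance(result, list):
        return [redact_tool_result(item) for item in result]
    return result
```

Applied in both the sequential and concurrent result paths, this catches browser, MCP, vision, delegate, and any future tool without per-tool changes.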

3. Memory persistence — secrets writable to memory files

Memory entries are injected into the system prompt on every session. A prompt injection that saves ANTHROPIC_API_KEY=sk-ant-... to memory would exfiltrate it across all future sessions.

Fix: Add API key and KEY=value pattern scanning to _scan_memory_content in memory_tool.py.
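The memory check is a write-time scan rather than a read-time redaction: content matching a secret pattern is rejected before it can be persisted. A sketch under assumed pattern shapes (the upstream `_scan_memory_content` list may differ):

```python
import re

# Shapes assumed for illustration; the upstream pattern list may differ.
SECRET_SCAN_PATTERNS = [
    re.compile(r"sk-(?:ant|or)-[A-Za-z0-9_-]{10,}"),                    # known API key shapes
    re.compile(r"\b[A-Z][A-Z0-9_]{2,}(?:KEY|TOKEN|SECRET)\s*=\s*\S+"),  # KEY=value assignments
]

def scan_memory_content(content: str) -> None:
    """Refuse to persist a memory entry that appears to contain a secret."""
    for pattern in SECRET_SCAN_PATTERNS:
        if pattern.search(content):
            raise ValueError("memory entry appears to contain a secret")
```

Blocking at write time means a prompt-injected "save this key to memory" fails once, rather than leaking on every future session.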

4. Skill persistence — secrets writable to skill files

Skill files are loaded into context when referenced. Same exfiltration vector as memory.

Fix: Add _scan_skill_for_secrets to skill_manage — blocks create, edit, patch, and write_file actions containing secrets.
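The skill-side guard wires the same scan into every content-bearing action. A hypothetical dispatcher sketch (action names come from the PR description; the dispatch structure itself is an assumption):

```python
import re

SECRET_RE = re.compile(
    r"sk-(?:ant|or)-[A-Za-z0-9_-]{10,}"                       # known API key shapes
    r"|\b[A-Z][A-Z0-9_]{2,}(?:KEY|TOKEN|SECRET)\s*=\s*\S+"    # env assignments
)

# Actions that write skill content, per the PR description.
GUARDED_ACTIONS = {"create", "edit", "patch", "write_file"}

def skill_manage(action: str, content: str = "") -> str:
    # Hypothetical dispatcher: scan every content-bearing action
    # before a skill file is written or modified.
    if action in GUARDED_ACTIONS and SECRET_RE.search(content):
        return "BLOCKED: skill content appears to contain a secret"
    return f"{action}: ok"
```

Read-only actions pass through unscanned, since they cannot persist new content.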

What was vulnerable

| Vector | Had protection? | Fix |
|---|---|---|
| terminal_tool output | Yes (existing) | |
| file_tools read | Yes (existing) | |
| execute_code stdout | No | redact_sensitive_text on stdout/stderr |
| All tool results → LLM | No | Defense-in-depth in run_agent.py |
| Memory persistence | No | Secret pattern scanning |
| Skill persistence | No | Secret pattern scanning |

Test plan

  • 48 redaction tests passing (38 existing + 10 new)
  • execute_code: env var, OpenRouter key, multi-key dump, non-secret passthrough
  • memory: blocks API key, blocks env assignment, allows normal content
  • skill: blocks API key, blocks env assignment, allows normal content
0xbyt4 added 3 commits April 1, 2026 01:10
execute_code (PTC) returned script stdout/stderr to LLM context without
redaction. An agent could exfiltrate .env secrets via:
  import os; print(os.environ["ANTHROPIC_API_KEY"])
bypassing terminal_tool's redaction entirely.

Fix:
- Apply redact_sensitive_text to execute_code stdout/stderr
- Add defense-in-depth redaction in the sequential tool result path
  (run_agent.py) so ALL tool outputs are redacted before entering
  LLM context, regardless of whether individual tools redact

Added 4 tests verifying secret redaction in script output.

The concurrent (parallel) tool execution path was missing the
defense-in-depth redaction added to the sequential path. Tool
results from parallel execution could enter LLM context unredacted.

Memory and skill files are injected into the system prompt on every
session. A prompt injection that saves a secret to memory/skill would
exfiltrate it across all future sessions.

- Add secret pattern scanning (API keys, KEY=value assignments) to
  memory_tool's _scan_memory_content
- Add _scan_skill_for_secrets to skill_manage — blocks create, edit,
  patch, and write_file actions containing secrets
- 6 new tests for memory/skill secret blocking
# Without this, execute_code can exfiltrate .env secrets via
# `import os; print(os.environ)` bypassing terminal redaction.
from agent.redact import redact_sensitive_text
stdout_text = redact_sensitive_text(stdout_text)

Pattern matching is not enough. It is not possible to know the pattern of every secret value, and the KEY=VALUE pattern is already defeated by the sample I shared.
