security: block secret exfiltration via browser URLs and LLM responses #4371
Open
0xbyt4 wants to merge 2 commits into NousResearch:main
Conversation
…M calls

Three exfiltration vectors closed:

1. Browser URL exfil: the agent could embed secrets in URL params and navigate to an attacker-controlled server. Now scans URLs for known API key patterns before navigating (browser_navigate, web_extract).
2. Browser snapshot leak: a page displaying env vars or API keys would send secrets to the auxiliary LLM via _extract_relevant_content before run_agent.py's redaction layer sees the result. Now redacts snapshot text before the auxiliary call.
3. Camofox annotation leak: accessibility tree text sent to the vision LLM could contain secrets visible on screen. Now redacts annotation context before the vision call.

10 new tests covering URL blocking, snapshot redaction, and annotation redaction for both browser and camofox backends.
LLM responses from browser snapshot extraction and vision analysis could echo back secrets that appeared on screen or in page content. Input redaction alone is insufficient: the LLM may reproduce secrets it read from screenshots (which cannot be text-redacted). Now redact outputs from:

- _extract_relevant_content (auxiliary LLM response)
- browser_vision (vision LLM response)
- camofox_vision (vision LLM response)
Summary
Closes remaining secret exfiltration vectors through browser navigation, web extraction, and auxiliary/vision LLM calls.
Companion to #4364, which covers execute_code, tool results, memory, and skills.
Vulnerabilities Fixed
1. Browser URL exfiltration
The agent could embed secrets in URL parameters and navigate to an attacker-controlled server.
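A minimal sketch of the kind of pre-navigation scan that blocks this (the regex patterns and function name are illustrative, not the PR's actual code):

```python
import re

# Illustrative patterns for well-known API key formats; a real
# implementation would likely cover more providers.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),   # OpenAI-style keys
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key IDs
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal tokens
]

def url_contains_secret(url: str) -> bool:
    """Return True if the URL appears to carry a known API key pattern."""
    return any(p.search(url) for p in SECRET_PATTERNS)

# An exfiltration attempt like this would be refused before navigation:
evil = "https://attacker.example/collect?k=sk-abcdefghijklmnopqrstuvwxyz123456"
assert url_contains_secret(evil)
assert not url_contains_secret("https://example.com/docs")
```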
Fix: Scan URLs for known API key patterns before navigating. Applies to both browser_navigate and web_extract_tool.

2. Browser snapshot leak to auxiliary LLM
A page displaying env vars or API keys would send secrets to the auxiliary extraction model via _extract_relevant_content before run_agent.py's general redaction layer ever sees the tool result.

Fix: Redact snapshot text before the auxiliary LLM call, and redact the auxiliary LLM's response.
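The input-side redaction can be sketched as follows (the helper name, secret source, and placeholder string are assumptions for illustration):

```python
def redact_secrets(text: str, secrets: list[str]) -> str:
    """Replace any occurrence of a known secret value with a placeholder."""
    for secret in secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    return text

# Hypothetical flow: scrub the page snapshot before the auxiliary LLM
# ever sees it, then scrub the model's response on the way back out.
secrets = ["sk-test-123456"]  # e.g. values collected from the environment
snapshot = "Page shows: OPENAI_API_KEY=sk-test-123456"
safe_input = redact_secrets(snapshot, secrets)
# response = aux_llm(safe_input)            # auxiliary extraction call (not shown)
# safe_output = redact_secrets(response, secrets)
print(safe_input)  # Page shows: OPENAI_API_KEY=[REDACTED]
```

Redacting both directions matters because even a fully scrubbed input cannot stop the model from echoing a secret it obtained some other way.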
3. Vision LLM response leak
Vision analysis of screenshots containing secrets (e.g. a terminal showing env vars) could echo those secrets back in the analysis text. Screenshots themselves cannot be text-redacted, but the LLM's text response can.
Fix: Redact vision LLM responses in both browser_vision and camofox_vision.

Test plan
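A test for the output-side redaction might look like this (helper and test names are hypothetical; the PR's actual tests may differ):

```python
def redact_secrets(text: str, secrets: list[str]) -> str:
    """Replace any occurrence of a known secret value with a placeholder."""
    for secret in secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    return text

def test_vision_response_is_redacted():
    # Simulate a vision LLM echoing a secret it read off a screenshot.
    secrets = ["AKIAIOSFODNN7EXAMPLE"]
    vision_response = "The terminal shows AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE"
    redacted = redact_secrets(vision_response, secrets)
    assert "AKIAIOSFODNN7EXAMPLE" not in redacted
    assert "[REDACTED]" in redacted

test_vision_response_is_redacted()
```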