🤖 feat: add browser sidebar tab for live agent-browser viewing#2951
Conversation
@codex review |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 62add94c74
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
@codex review Addressed the P1 review comment: added a |
💡 Codex Review
Reviewed commit: cc063e0722
@codex review Addressed the P2 review comment: |
💡 Codex Review
Reviewed commit: 60f8420a8c
Add the Browser tab shell, live browser-session subscription UI, and workspace-layout migration so the new tab appears in existing sidebars.
60f8420 to f873fdc
@codex review Fixed |
💡 Codex Review
Reviewed commit: 3e81cb06fa
@codex review Changes since last review:
💡 Codex Review
Reviewed commit: ce580631e5
@codex review Added a |
💡 Codex Review
Reviewed commit: a1bfde2640
Review thread on src/browser/features/RightSidebar/BrowserTab/useBrowserSessionSubscription.ts (outdated, resolved)
@codex review Fixed both P2s: (1) concurrent start guard now only returns nonterminal sessions, (2) action timeline clears when a new session ID arrives on restart. |
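For readers following the thread, the two fixes can be pictured roughly as below. This is a minimal sketch, not the code in the PR: the status names come from the tab states listed in the description, while treating "error" and "ended" as the terminal ones, and the names `reusableSession`, `applySessionUpdate`, `SessionSnapshot`, and `TimelineState`, are assumptions made for illustration.

```typescript
// Hypothetical shapes; only the status names are taken from the PR description.
type SessionStatus = "starting" | "live" | "error" | "ended";

interface SessionSnapshot {
  id: string;
  status: SessionStatus;
}

interface TimelineState {
  sessionId: string | null;
  actions: string[];
}

// Assumption: "error" and "ended" are the terminal statuses.
function isTerminal(status: SessionStatus): boolean {
  return status === "error" || status === "ended";
}

// Fix (1): a concurrent start may reuse an existing session only if it is nonterminal.
function reusableSession(existing: SessionSnapshot | null): SessionSnapshot | null {
  return existing !== null && !isTerminal(existing.status) ? existing : null;
}

// Fix (2): when a different session ID arrives (restart), drop the old timeline.
function applySessionUpdate(state: TimelineState, session: SessionSnapshot): TimelineState {
  if (state.sessionId !== session.id) {
    return { sessionId: session.id, actions: [] };
  }
  return state;
}
```

With this shape, a restart that yields a new session ID resets the timeline to empty, while updates for the same session leave the recorded actions untouched.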
💡 Codex Review
Reviewed commit: b033cc8933
Review thread on src/browser/features/RightSidebar/BrowserTab/useBrowserSessionSubscription.ts (outdated, resolved)
@codex review Fixed final P1+P2: (1) concurrent startSession calls now return the same in-flight promise via |
Codex Review: Didn't find any major issues. Delightful!
Summary
Add a Browser tab to Mux's right sidebar that shows what agent-browser is doing in real time — live screenshots, URL/title tracking, action timeline, and session lifecycle controls.
Background
When an agent uses agent-browser to automate web pages, users currently have no visibility into what's happening. This feature adds a workspace-scoped Browser tab in the right sidebar so users can watch the agent navigate, see which page it's on, and understand the sequence of actions without leaving the Mux interface.
Implementation
Architecture
Backend (4 new files + 6 modified)
- src/common/types/browserSession.ts: shared types (BrowserSession, BrowserAction, BrowserSessionEvent)
- src/common/orpc/schemas/api.ts: Zod schemas for getActive, start, stop, subscribe
- src/node/services/browserSessionBackend.ts: CLI adapter managing the agent-browser subprocess, screenshot polling, metadata extraction, external-close detection
- src/node/services/browserSessionService.ts: EventEmitter service with workspace-scoped events (mirrors DevToolsService pattern)
Frontend (4 new files + 5 modified)
- src/browser/features/RightSidebar/BrowserTab/BrowserTab.tsx: main tab component with idle/starting/live/error/ended states, screenshot viewer, action timeline, Start/Stop/Restart controls
- src/browser/features/RightSidebar/BrowserTab/useBrowserSessionSubscription.ts: ORPC subscription hook (mirrors useDevToolsSubscription)
Key design decisions
- sharp lazy-loaded: prevents Bun test env crashes from the native dependency
- […] about:blank, the backend infers the browser was closed externally and surfaces an error
Validation
- make static-check passes (typecheck + lint + shellcheck + prettier + docs)
- bun test src/browser/utils/rightSidebarLayout.test.ts: 23/23 pass
- bun test src/cli/cli.test.ts src/cli/server.test.ts: 28/28 pass
Risks
- sharp in production Electron: works in Node.js but may need packaging attention; fallback to raw PNG if unavailable
📋 Implementation Plan
Integrate agent-browser into Mux right sidebar — implementation plan
Objective
Ship a workspace-scoped Browser tab in Mux’s right sidebar so users can watch an agent-driven browser session live, understand the agent’s current step, and eventually take over input when needed.
The implementation should:
Verified repo and product constraints
agent-browser capabilities confirmed from official research
Relevant Mux integration points already in the repo
- src/browser/features/RightSidebar/RightSidebar.tsx
- src/browser/types/rightSidebar.ts
- src/browser/features/RightSidebar/Tabs/registry.ts
- src/browser/features/RightSidebar/TerminalTab.tsx
- src/browser/features/RightSidebar/DevToolsTab/useDevToolsSubscription.ts
- src/common/types/
- src/common/orpc/schemas/api.ts
- src/node/orpc/router.ts
- src/node/orpc/context.ts
- src/node/services/streamManager.ts, src/node/services/mcpServerManager.ts
- src/desktop/main.ts
Recommended delivery strategy
Approach A — CLI-backed Browser tab behind a stable service interface
Summary: Mux launches agent-browser as a managed subprocess, uses the documented stream port/WebSocket viewer, and renders the live viewport in a new Browser tab.
Approach B — BrowserManager-backed Mux-native browser session service
Summary: Mux owns a BrowserSessionService and a native agent-browser backend adapter, with structured session state, action events, and a right-sidebar viewer.
Approach C — Human takeover, recording, and replay-friendly session history
Summary: Add explicit user takeover, input arbitration, recording hooks, and lightweight persisted session/action history.
Recommended execution decision
Execute a 1-day spike first, then branch:
- […] BrowserSessionBackend using the BrowserManager/API path.
- […] BrowserSessionBackend interface, so the UI and ORPC contracts remain stable.
This keeps the team moving while avoiding a throwaway UI.
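The backend seam mentioned above could look something like the following. This is a sketch under stated assumptions rather than the interface in the PR: the five method names come from the Phase 2 task list later in this plan, but every signature, the `BackendSession` and `BackendAction` shapes, the placeholder viewer URL, and the `FakeBackend` stand-in are hypothetical.

```typescript
type Unsubscribe = () => void;

// Hypothetical payload shapes, invented for this sketch.
interface BackendSession {
  sessionId: string;
  workspaceId: string;
}

interface BackendAction {
  sessionId: string;
  type: string; // e.g. "navigate" or "click"
}

// Method names follow the plan; parameter and return types are assumptions.
interface BrowserSessionBackend {
  startSession(workspaceId: string): Promise<BackendSession>;
  stopSession(sessionId: string): Promise<void>;
  getViewerEndpoint(sessionId: string): string | null;
  onAction(listener: (action: BackendAction) => void): Unsubscribe;
  onSessionUpdate(listener: (session: BackendSession) => void): Unsubscribe;
}

// In-memory stand-in: lets the UI and ORPC layers be built before a real backend exists.
class FakeBackend implements BrowserSessionBackend {
  private nextId = 1;
  private live: string[] = [];
  private actionListeners: Array<(a: BackendAction) => void> = [];
  private updateListeners: Array<(s: BackendSession) => void> = [];

  async startSession(workspaceId: string): Promise<BackendSession> {
    const session = { sessionId: `fake-${this.nextId++}`, workspaceId };
    this.live.push(session.sessionId);
    this.updateListeners.forEach((listener) => listener(session));
    return session;
  }

  async stopSession(sessionId: string): Promise<void> {
    this.live = this.live.filter((id) => id !== sessionId);
  }

  getViewerEndpoint(sessionId: string): string | null {
    // Placeholder URL; a real backend would report its own stream endpoint.
    return this.live.indexOf(sessionId) >= 0 ? `ws://127.0.0.1:9223/${sessionId}` : null;
  }

  onAction(listener: (action: BackendAction) => void): Unsubscribe {
    this.actionListeners.push(listener);
    return () => {
      this.actionListeners = this.actionListeners.filter((l) => l !== listener);
    };
  }

  onSessionUpdate(listener: (session: BackendSession) => void): Unsubscribe {
    this.updateListeners.push(listener);
    return () => {
      this.updateListeners = this.updateListeners.filter((l) => l !== listener);
    };
  }

  // Test helper: simulate an action reported by the browser.
  emitAction(action: BackendAction): void {
    this.actionListeners.forEach((listener) => listener(action));
  }
}
```

Because callers depend only on the interface, swapping the fake for a BrowserManager-backed or CLI-backed implementation should not require changes in the tab or the ORPC contracts.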
Non-negotiable architectural invariants
- […] dangerouslySetInnerHTML, webview, or loose iframe-based browsing surface for remote pages.
- […] browser:${sessionId} multi-instance tabs.
- […] src/common/types/browserSession.ts.
- […] src/common/orpc/schemas/api.ts.
Proposed architecture
Top-level model
Core components/services
- BrowserSessionService (src/node/services/browserSessionService.ts)
  - […] DevToolsService.
- BrowserSessionBackend (src/node/services/browserSessionBackends/BrowserSessionBackend.ts)
- BrowserTab (src/browser/features/RightSidebar/BrowserTab.tsx)
Transport split
Why split transport instead of streaming frames through ORPC?
Mux already has a clean ORPC event-stream pattern for subscription-style data, but browser frames are much higher-frequency than devtools/tool state updates. Sending base64 JPEG frames through ORPC would increase serialization pressure, trigger avoidable rerenders, and tie the viewer’s frame rate to the app’s control plane.
A dedicated viewer socket keeps the control plane small and typed while letting the Browser tab own the rendering loop.
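One way the Browser tab might own its rendering loop is to keep only the newest frame between paints, so a fast stream never queues stale images. The sketch below assumes that behaviour; `LatestFrameBuffer` and the `ViewerFrame` shape are hypothetical names and not part of this plan.

```typescript
// Hypothetical frame shape for the dedicated viewer socket.
interface ViewerFrame {
  seq: number;
  data: string; // e.g. a base64-encoded JPEG
}

// Latest-frame-wins buffer: the socket handler pushes, the paint loop takes.
class LatestFrameBuffer {
  private pending: ViewerFrame | null = null;
  private dropped = 0;

  // Called for every frame received; an unpainted older frame is discarded.
  push(frame: ViewerFrame): void {
    if (this.pending !== null) {
      this.dropped += 1;
    }
    this.pending = frame;
  }

  // Called once per paint; returns the newest frame, or null if nothing new arrived.
  take(): ViewerFrame | null {
    const frame = this.pending;
    this.pending = null;
    return frame;
  }

  get droppedCount(): number {
    return this.dropped;
  }
}
```

In the renderer, `take()` would typically be called from a `requestAnimationFrame` callback, which caps painting at the display rate regardless of how quickly frames arrive.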
Execution phases and agent workstreams
Phase 0 — 1-day architecture spike and runtime decision
Owner: Backend/platform agent
Parallelizable: no; everything else depends on this answer
Goal: choose the backend implementation path without blocking the rest of the team for more than 1 day
Tasks
Deliverables
Exit criteria
Quality gate
Phase 1 — Shared contracts, service skeleton, and ORPC surface
Owner: Shared-contracts agent
Parallelizable with: frontend shell work once types stabilize
Primary files:
- src/common/types/browserSession.ts (new)
- src/common/orpc/schemas/api.ts
- src/node/orpc/context.ts
- src/node/services/browserSessionService.ts (new)
- src/node/orpc/router.ts
Tasks
- […] src/common/types/browserSession.ts:
  - BrowserSession
  - BrowserAction
  - BrowserSessionEvent
- […] src/common/orpc/schemas/api.ts for:
  - browserSession.getActive
  - browserSession.start
  - browserSession.stop
  - browserSession.subscribe
  - browserSession.clearRecentActions if the UI needs it later
- […] browserSessionService to ORPCContext in src/node/orpc/context.ts.
- […] BrowserSessionService as a workspace-scoped EventEmitter service.
- […] DevToolsService / useDevToolsSubscription model:
Acceptance criteria
Defensive programming requirements
- […] viewerUrl is null unless the session is starting/live/paused.
Estimated product code
Phase 2 — Backend adapter implementation and lifecycle management
Owner: Backend integration agent
Parallelizable with: Phase 3 UI shell once contracts are stable
Primary files:
- src/node/services/browserSessionBackends/BrowserSessionBackend.ts (new)
- src/node/services/browserSessionBackends/AgentBrowserManagerBackend.ts (new, preferred)
- src/node/services/browserSessionBackends/AgentBrowserCliBackend.ts (new, fallback)
- src/node/services/browserSessionService.ts
- src/node/services/streamManager.ts (only if shared run lifecycle hooks are needed)
- src/node/services/mcpServerManager.ts (only if a bridge is needed later; avoid coupling MVP to this)
Tasks
- startSession(...)
- stopSession(sessionId)
- getViewerEndpoint(sessionId)
- onAction(...)
- onSessionUpdate(...)
- […] BrowserAction entries.
Explicit scope control
Acceptance criteria
Defensive programming requirements
Estimated product code
Phase 3 — Right-sidebar Browser tab shell and layout integration
Owner: Frontend/right-sidebar agent
Parallelizable with: Phase 2 once contracts are stable enough to mock
Primary files:
- src/browser/types/rightSidebar.ts
- src/browser/features/RightSidebar/Tabs/registry.ts
- src/browser/features/RightSidebar/Tabs/TabLabels.tsx
- src/browser/features/RightSidebar/RightSidebar.tsx
- src/browser/features/RightSidebar/BrowserTab.tsx (new)
- src/browser/utils/rightSidebarLayout.ts
- src/browser/utils/uiLayouts.ts (only if layout presets need updating)
Tasks
- […] "browser" to the right-sidebar base tab model.
- […] BROWSER_TAB_CONFIG in the right-sidebar registry.
- […] BrowserTabLabel showing:
- […] BrowserTab.tsx with the following UI regions:
- […] BrowserTab:
UX requirements
Acceptance criteria
Defensive programming requirements
Estimated product code
Phase 4 — Viewer transport, frame rendering, and performance hardening
Owner: Frontend performance/interaction agent
Parallelizable with: late Phase 2 / late Phase 3
Primary files:
- src/browser/features/RightSidebar/BrowserTab.tsx
- src/browser/features/RightSidebar/BrowserViewer.tsx (new, only if extraction materially reduces complexity)
- src/common/types/browserSession.ts for frame metadata types
Tasks
- […] <img> or canvas update loop,
- […] requestAnimationFrame.
Acceptance criteria
Estimated product code
Phase 5 — Action timeline and tool synchronization
Owner: Tooling/instrumentation agent
Parallelizable with: Phase 4 once the action model is stable
Primary files:
- src/common/types/browserSession.ts
- src/node/services/browserSessionService.ts
- src/browser/features/RightSidebar/BrowserTab.tsx
- src/browser/features/RightSidebar/DevToolsTab/* if cross-linking is added later
Tasks
- […] BrowserAction model for user-facing steps:
  - navigate
  - click
  - fill
  - type
  - scroll
  - snapshot
  - wait
  - error
Acceptance criteria
Estimated product code
Phase 6 — Human takeover and collaboration controls (follow-up)
Owner: Interaction/UX agent
Parallelizable with: after Phases 2–5 stabilize
Primary files:
- src/browser/features/RightSidebar/BrowserTab.tsx
- src/common/types/browserSession.ts
- src/common/orpc/schemas/api.ts (only if extra control procedures are needed)
Tasks
- […] ownership from agent to user,
Acceptance criteria
Estimated product code
Phase 7 — Testing, stories, and rollout hardening
Owner: QA/verification agent
Parallelizable with: all later phases
Primary files:
- tests/ipc/browserSession.test.ts (new)
- tests/ui/browserTab.test.ts (new)
- src/browser/stories/App.BrowserTab.stories.tsx or the nearest existing full-app story file that should absorb the new states
Test plan
- […] tests/ipc)
- […] tests/ui)
- […] tests/e2e) only if happy-dom is insufficient for validating the viewer transport or takeover behavior.
Validation commands
- make typecheck
- make static-check
Rollout posture
- […] log helper on the backend.
Acceptance criteria
Estimated product code
Cross-cutting design decisions the team should follow
1. Keep browser integration separate from generic MCP integration for the first delivery
Mux already has MCPServerManager, but the first delivery should not try to unify every possible browser MCP server under one viewer abstraction. Build a Mux-owned browser session service first; if future MCP tools want to publish into it, add a bridge later.
2. Treat the Browser tab as a first-class right-sidebar resident
The Browser tab should live beside Terminal, Output, and Debug; it should not open in a separate window for the MVP unless the spike proves the in-sidebar viewer is impossible.
3. Never persist raw frames
Persist, at most, lightweight metadata and redacted action history. Raw image streams are too large and too risky to store casually.
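As an illustration of what "lightweight metadata and redacted action history" might mean in practice, the sketch below strips everything sensitive before an action is stored. The field names (`url`, `text`, `screenshot`) and the `toPersistedAction` helper are assumptions; only the action type names are taken from the Phase 5 task list.

```typescript
// Hypothetical in-memory action, as the live timeline might hold it.
interface LiveAction {
  type: "navigate" | "click" | "fill" | "type" | "scroll" | "snapshot" | "wait" | "error";
  timestamp: number;
  url?: string;        // may carry query strings or tokens
  text?: string;       // text typed or filled by the agent
  screenshot?: string; // raw frame data, never persisted
}

// What is allowed to reach disk in this sketch.
interface PersistedAction {
  type: LiveAction["type"];
  timestamp: number;
  origin?: string;     // scheme + host only, no path or query
  textLength?: number; // length instead of content
}

function toPersistedAction(action: LiveAction): PersistedAction {
  const persisted: PersistedAction = { type: action.type, timestamp: action.timestamp };
  if (action.url !== undefined) {
    try {
      persisted.origin = new URL(action.url).origin;
    } catch {
      // Unparsable URL: persist nothing rather than the raw string.
    }
  }
  if (action.text !== undefined) {
    persisted.textLength = action.text.length;
  }
  // action.screenshot is intentionally not copied.
  return persisted;
}
```

Building the persisted record field by field, rather than copying the live one and deleting keys, means a newly added sensitive field stays out of storage unless someone opts it in.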
4. Prefer local viewer transport over backend relaying
If Electron/network policy allows it, the renderer should connect directly to the locally managed viewer socket. Only add a relay if direct connection is blocked or unsafe.
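A small guard can help decide whether a direct connection is safe. The sketch below assumes the viewer endpoint is a WebSocket URL and that "local" means a loopback host; `isLoopbackViewerUrl` is a hypothetical helper, not something this plan defines.

```typescript
// Returns true only for ws:/wss: URLs whose host is a loopback address.
function isLoopbackViewerUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parsable URL
  }
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    return false;
  }
  // WHATWG URL keeps the brackets around IPv6 literals in `hostname`.
  return url.hostname === "127.0.0.1" || url.hostname === "localhost" || url.hostname === "[::1]";
}
```

The renderer could connect directly when this returns true and fall back to a relay, or refuse to connect, otherwise.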
5. Avoid hook proliferation
Colocate live viewer logic with BrowserTab.tsx. Extract only the pieces that are genuinely reusable or become too complex to read.
Parallelization map for a team of agents
Dogfooding plan (required)
Dogfooding principles to follow
This plan should absorb the core discipline from the repo’s dogfood and agent-browser skills:
- […] agent-browser directly, never npx agent-browser.
- […] wait --load networkidle or element/url waits; only use sleeps to make repro videos human-watchable.
Dogfooding harness and setup
Each dogfood run should create an isolated run ID, session name, and evidence directory.
- […] make dev or the team’s standard desktop/Electron dev path).
- ./dogfood-output/browser-tab/<run-id>/screenshots
- ./dogfood-output/browser-tab/<run-id>/videos
- ./dogfood-output/browser-tab/<run-id>/report.md
Recommended command pattern for browser-side dogfooding
Use the agent-browser skill’s proven workflow for the site being driven inside the Browser tab.
For authenticated or recurring scenarios, prefer --session-name, --profile, or saved state so reruns are fast and reproducible.
Structured dogfooding workflow
1. Initialize
2. Authenticate (if needed)
3. Orient
- […] snapshot -i.
4. Explore systematically
Test the feature like a real user, page by page and workflow by workflow.
At a minimum, cover:
During exploration, use the agent-browser workflow rigorously:
- snapshot -i before discovering refs,
- wait --load networkidle or element/url waits after major actions,
- errors and console periodically,
- diff snapshot when validating that an action changed the page as expected.
Repro-first issue documentation rules
When a bug is found, stop exploring and document it immediately.
Interactive / behavioral issues
Examples: wrong action log, frozen stream, mismatched viewport, takeover race, session cleanup bug, visible console error after an action.
Required evidence:
- […] report.md immediately with:
  - […] ISSUE-001, etc.),
When typing is part of the observable repro, prefer type over fill so the video is understandable.
Static / visible-on-load issues
Examples: clipped text, wrong icon/state, bad empty state copy, layout overlap, stale title/url, immediately visible console error.
Required evidence:
- […] report.md immediately.
- […] N/A.
Evidence requirements per milestone
For every milestone review, provide both broad milestone evidence and issue-specific evidence.
Broad milestone evidence
Issue-specific evidence
Where practical, capture both:
Wrap-up procedure
At the end of each dogfood run:
Phase quality gates tied to dogfooding
Parallel-team guidance
If multiple agents dogfood simultaneously:
Aim for the depth of coverage that would normally yield 5–10 well-documented findings’ worth of exploration. If fewer issues are found, state explicitly that no additional reproducible issues were observed rather than inventing weak findings.
Final milestone definitions
Milestone M1 — Visible viewer MVP
Includes Phases 0–4 and the Phase 7 test/story minimums.
Success means:
Milestone M2 — “See what the agent is doing” product pass
Adds Phase 5 and expands verification.
Success means:
Milestone M3 — Human collaboration pass
Adds Phase 6.
Success means:
Recommended first implementation order
What not to do in the first pass
Generated with mux • Model: anthropic:claude-opus-4-6 • Thinking: xhigh • Cost: $52.11