Understudy
Open Source · Bring Your Own Model

One instruction. Your entire computer.

Understudy is an open-source AI agent that lives on your computer. Give it a task and it researches, browses the web, clicks through desktop apps, manages files, and replies through your existing channels. Teach it once and it learns. Use it daily and it gets faster.

Operate any app: Browse, click, type across your desktop
Learn from you: Show it once, it remembers forever
Grow over time: Gets faster the more you use it

See it in action.

Four demos, each showing a different side of what Understudy can do.

One message, multi-tool execution

The agent researches the web, controls your browser, invokes skills, and delivers a polished result — all from a single instruction. No staging, no multi-step prompting. One local runtime handles everything.

Example: "Research Cowork and build a tech-style landing page in my downloads folder"

Phone message in, desktop result out

Send a message from your phone via Telegram. Understudy receives it on your Mac, converts a file to PDF, opens desktop Telegram, finds the right contact, and sends it — all through GUI automation. Phone view and desktop view shown side by side.

Example: "Convert the Cowork webpage to PDF and send it to Alex on Telegram"

Show once, refine, replay with generalization

Demonstrate a workflow once — Understudy watches, understands the intent, and publishes a reusable skill. Interactively refine the generated skill, then invoke it with natural language. On replay, the agent automatically generalizes: Google Image search becomes browser automation, downloads become shell commands, while Pixelmator Pro stays GUI-controlled.

Example: "Find a photo of [person], remove the background, and send it to [contact] on Telegram"

One prompt. Real iPhone. Published on YouTube.

A six-stage pipeline browses the real App Store, installs an app via iPhone Mirroring, explores it autonomously — discovering features it's never seen — composes a narrated review video locally, uploads it to YouTube, and cleans up the device. The middle stage is genuinely agentic: 51 quality-gate rules guide the agent, but it navigates an unfamiliar app freely and makes its own editorial decisions. About one hour, zero human intervention.

Example: "Make a Snapseed iPhone app review video from scratch — capture proof-first clips, add narration and subtitles, upload to YouTube, and clean up"

It operates your computer like you do.

One local agent that can see your screen, open apps, browse the web, run commands, and send messages — all from a single instruction.

💻

Desktop Agent

One message and it researches, creates files, opens apps, and delivers the result. No staging, no multi-step prompting. Just say what you need.

📱

Remote Control

Send a message from your phone via Telegram, Slack, or any of the 8 built-in channels. Understudy works on your Mac and sends back the result while you're away.

💬

8 Channels Built In

Telegram, Slack, Discord, WhatsApp, Signal, LINE, iMessage, and Web. Control your agent through the messaging apps you already use.

It keeps getting better.

Like a new colleague who grows into the role — Understudy starts by following instructions, then gradually learns your routines and finds better ways to get things done.

Day 1: Operates. Does what you say.
Week 1: Learns. Watches and remembers.
Month 1: Remembers. Does it independently.
Month 3: Optimizes. Finds faster ways.
Month 6: Anticipates. Acts before you ask.

"In theater, an understudy watches the lead, learns the role, and steps in when needed."

Open source. Bring your own model.

No subscriptions, no lock-in. Run it locally with full control over your data and your choice of AI provider.

🔓

Open Source

MIT license. Full source code on GitHub. Inspect, modify, and contribute freely.

🧰

Bring Your Own Model

Anthropic, OpenAI, Google, MiniMax, and more. Use your own API key — no bundled subscription required.

🔒

Local-first

Runs on your machine. Screenshots, recordings, and task data stay on your computer by default.

💻

macOS Today

Full native GUI automation on macOS. Linux and Windows desktop support are planned — contributions welcome.

Get started in minutes.

Quick Start

Install from npm and let the wizard walk you through setup.

# Install
npm install -g @understudy-ai/understudy
understudy wizard

# Start
understudy daemon --start
understudy chat
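
Because the package is installed from npm, a quick pre-flight check can confirm the tooling is present before you run the commands above. The sketch below is an assumption, not part of Understudy: `check_prereqs` is a hypothetical helper, and it only assumes that `node` and `npm` need to be on your PATH.

```shell
# Hypothetical pre-flight helper (not shipped with Understudy).
# Assumption: a global npm install needs both node and npm on PATH.
check_prereqs() {
  missing=0
  for tool in node npm; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  # Non-zero exit status if any tool is missing.
  return "$missing"
}
```

Run `check_prereqs` first. If it reports both tools as found, continue with the install command above.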

Roadmap

Five layers of capability, built progressively. No shortcuts.

Operate Software (Implemented): See, click, type across any macOS app
Learn from Demos (Implemented): Teach by demonstration, publish skills
Remember (In Progress): Lock in successful paths automatically
Get Faster (In Progress): Discover and promote quicker routes
Anticipate (Vision): Act proactively without disrupting you

Supported models: Anthropic (Claude), OpenAI (GPT / Codex), Google (Gemini), MiniMax, and more via configurable providers.