MASTER CLASS · COMPUTER USE · INTERMEDIATE

Claude Computer Use: an agent that drives your Mac.

Anthropic's Computer Use API lets Claude take screenshots, click, type, scroll — like a human operator. Wire it up, sandbox it, and ship the highest-leverage agent on your stack.

▸ When you're done

A working Computer Use agent on your laptop that opens apps, fills forms, scrapes UIs without APIs, and screenshots its work. Sandboxed via Docker so it can't touch anything you don't want it to.

22 min walkthrough
3 tools · all free tier
Copy-paste ready · no theory
◢ The build · 4 steps · 22 min

Follow these in order. Don't skip.

Step 01 / 04

Get an Anthropic API key with Computer Use access

The Pro plan includes Claude Code, but the Computer Use API is billed separately through pay-as-you-go API credits.

  • console.anthropic.com → Sign up (or log in)
  • Settings → Billing → add a card. Computer Use is available on Sonnet 4.5+ and Opus 4+.
  • Settings → API Keys → Create Key. Name it computer-use-dev.
  • Save it as ANTHROPIC_API_KEY in your .env.
Watch out
Computer Use can click anything visible. Treat the API key like nuclear codes — never commit it, never run agents on your real desktop without sandboxing.
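If you'd rather not add a dotenv dependency, here is a minimal sketch of loading the key from .env and checking that the file is git-ignored. The helper names load_env and is_ignored are made up for this example, and the parser only handles plain KEY=VALUE lines, not the full dotenv syntax.

```python
import os
from pathlib import Path


def load_env(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values: dict[str, str] = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw in env_file.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def is_ignored(gitignore: str = ".gitignore", name: str = ".env") -> bool:
    """True if `name` appears as its own line in .gitignore (exact match only)."""
    ignore_file = Path(gitignore)
    if not ignore_file.exists():
        return False
    return name in (line.strip() for line in ignore_file.read_text().splitlines())


if __name__ == "__main__":
    # Values already set in the shell win over values in .env.
    for key, value in load_env().items():
        os.environ.setdefault(key, value)
    if "ANTHROPIC_API_KEY" not in os.environ:
        raise SystemExit("ANTHROPIC_API_KEY is not set")
    if not is_ignored():
        print("warning: .env is not listed in .gitignore")
```

Run it once from the project root before anything else. The .gitignore check is an exact line match, so a pattern like *.env will not be recognized.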
Step 02 / 04

Run the official sandbox in Docker (the safe way)

Anthropic ships a Docker image with a virtual display, browser, and the agent loop. You watch through a browser. Your real machine never gets touched.

Terminal
# Pull and run the official quickstart image
export ANTHROPIC_API_KEY=sk-ant-...

docker run \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  -v $HOME/.anthropic:/home/computeruse/.anthropic \
  -p 5900:5900 \
  -p 8501:8501 \
  -p 6080:6080 \
  -p 8080:8080 \
  -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
  • Open http://localhost:8080 for the combined interface: the Streamlit chat next to the desktop view. (Port 8501 serves the chat UI on its own.)
  • Open http://localhost:6080/vnc.html if you only want the live screen the agent sees
  • Tell it something like: "Open Firefox, search for the Anthropic blog, and summarize the latest post."
  • Watch the screen update in real time.
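If one of those pages won't load, a quick check is whether the published ports are accepting connections. A minimal sketch using only the standard library; the port_open helper and the labels in PORTS are this example's own, and the port numbers are the ones from the docker run command above.

```python
import socket

# Ports published by the docker run command above.
PORTS = {
    5900: "VNC",
    6080: "noVNC",
    8080: "web UI",
    8501: "Streamlit",
}


def port_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for port, label in PORTS.items():
        state = "up" if port_open(port) else "DOWN"
        print(f"{port:>5}  {label:<10} {state}")
```

A port that reports DOWN usually means the container is still starting or another process already holds that port on your machine.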
Step 03 / 04

Build your own agent loop in Python

Terminal
pip install anthropic pillow
agent/computer_use.py
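import base64
import io
import os

from anthropic import Anthropic
from PIL import ImageGrab

client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# Confirm the model ID against Anthropic's current model list before running.
MODEL = "claude-sonnet-4-5"
# Declared display size. Screenshots are resized to match it, so pick
# dimensions with the same aspect ratio as your real screen.
WIDTH, HEIGHT = 1920, 1080


def screenshot_b64() -> str:
    """Capture the screen, resize to the declared size, return a base64 PNG."""
    img = ImageGrab.grab().resize((WIDTH, HEIGHT))
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return base64.b64encode(buf.getvalue()).decode()


def execute(action: dict) -> None:
    """Perform one action (click, type, key, scroll, ...).

    Stub: wire this to pyautogui or AppleScript. The quickstart repo
    has a full implementation.
    """
    if action["action"] != "screenshot":
        raise NotImplementedError(f"no handler for {action['action']!r}")


def run(task: str):
    messages = [{
        "role": "user",
        "content": [
            {"type": "text", "text": task},
            {"type": "image", "source": {
                "type": "base64", "media_type": "image/png", "data": screenshot_b64(),
            }},
        ],
    }]

    while True:
        resp = client.beta.messages.create(
            model=MODEL,
            max_tokens=4096,
            tools=[{"type": "computer_20250124", "name": "computer",
                    "display_width_px": WIDTH, "display_height_px": HEIGHT,
                    "display_number": 1}],
            messages=messages,
            betas=["computer-use-2025-01-24"],
        )
        # Keep the assistant turn in the history so tool_use ids resolve.
        messages.append({"role": "assistant", "content": resp.content})

        # Anything other than tool_use (end_turn, max_tokens, ...) ends the loop.
        if resp.stop_reason != "tool_use":
            return resp

        # Answer every tool_use block with a tool_result carrying a fresh screenshot.
        results = []
        for block in resp.content:
            if block.type != "tool_use":
                continue
            execute(block.input)
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": [{"type": "image", "source": {
                    "type": "base64", "media_type": "image/png", "data": screenshot_b64(),
                }}],
            })
        messages.append({"role": "user", "content": results})


if __name__ == "__main__":
    run("Open my browser and find the cheapest flight from NYC to SF next Friday.")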
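The tool definition declares a 1920×1080 display, and the model returns click coordinates in that space. Your real screen may be a different size, and on Retina Macs a screenshot may come back at a higher pixel resolution than the coordinate space your automation library uses. Here is a minimal sketch of mapping a model coordinate back to the real screen. The to_screen helper is hypothetical, and you supply the actual screen size from whatever library performs the clicks.

```python
def to_screen(
    x: int,
    y: int,
    declared: tuple[int, int],
    actual: tuple[int, int],
) -> tuple[int, int]:
    """Map a point from the declared tool display space to real screen space."""
    dw, dh = declared
    aw, ah = actual
    sx = round(x * aw / dw)
    sy = round(y * ah / dh)
    # Clamp so a point on the far edge still lands inside the screen.
    return min(max(sx, 0), aw - 1), min(max(sy, 0), ah - 1)


if __name__ == "__main__":
    # Centre of the declared 1920x1080 display on a 1440x900 screen.
    print(to_screen(960, 540, (1920, 1080), (1440, 900)))  # (720, 450)
```

Scaling only lines up if the screenshot you send has the declared dimensions. If the aspect ratios differ, the image is distorted and clicks will drift.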
Heads up
Use the Docker quickstart code as your reference implementation: github.com/anthropics/anthropic-quickstarts. Its action dispatch is compact (roughly 80 lines) and handles every action the model can call.
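To show what the elided dispatch step in the loop above might look like, here is a sketch that routes one tool_use input to a backend object. The Backend interface is hypothetical: you would implement it with pyautogui or AppleScript. The action and field names (left_click, coordinate, scroll_direction, and so on) are what this sketch assumes for the computer_20250124 tool, so verify them against the tool documentation. It covers only a subset of actions and is not the quickstart's implementation.

```python
from typing import Any, Protocol


class Backend(Protocol):
    """Hypothetical interface; implement it with pyautogui, AppleScript, etc."""

    def move(self, x: int, y: int) -> None: ...
    def click(self, x: int, y: int, button: str, count: int) -> None: ...
    def type_text(self, text: str) -> None: ...
    def press(self, keys: str) -> None: ...
    def scroll(self, x: int, y: int, direction: str, amount: int) -> None: ...


# action name -> (mouse button, click count)
CLICKS = {
    "left_click": ("left", 1),
    "right_click": ("right", 1),
    "middle_click": ("middle", 1),
    "double_click": ("left", 2),
    "triple_click": ("left", 3),
}


def dispatch(action: dict[str, Any], backend: Backend) -> str:
    """Execute one tool_use input dict; return a short status string."""
    name = action["action"]
    if name == "screenshot":
        return "ok"  # nothing to do: the caller sends a fresh screenshot back
    if name in CLICKS:
        x, y = action["coordinate"]
        button, count = CLICKS[name]
        backend.click(x, y, button, count)
    elif name == "mouse_move":
        backend.move(*action["coordinate"])
    elif name == "type":
        backend.type_text(action["text"])
    elif name == "key":
        backend.press(action["text"])
    elif name == "scroll":
        x, y = action["coordinate"]
        backend.scroll(x, y, action["scroll_direction"], action["scroll_amount"])
    else:
        return f"unsupported action: {name}"
    return "ok"
```

Keeping the backend behind an interface lets you test the dispatch with a recording stub before it can move a real mouse. Returning a status string instead of raising lets the loop report unsupported actions back to the model.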
Step 04 / 04

When to use Computer Use vs an API call

  • USE Computer Use when: the target has no API (legacy ERP, internal tools, login-walled SaaS), or you need a screenshot of what happened.
  • DON'T use Computer Use when: an API exists. As a rule of thumb, a direct API call is around 10× faster and 50× cheaper, and it doesn't break when the UI is redesigned.
  • Hybrid pattern: API for the boring 90%, Computer Use only for the 10% the API can't reach.
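The hybrid pattern can be sketched as a small router: take the API path when one exists, and hand the task to the Computer Use agent only when it doesn't. Task, run_hybrid, and the run_agent callable are illustrative names, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    name: str
    api_call: Optional[Callable[[], str]] = None  # None when no API covers the task
    ui_prompt: str = ""  # instruction for the Computer Use agent


def run_hybrid(task: Task, run_agent: Callable[[str], str]) -> tuple[str, str]:
    """Prefer the API; fall back to Computer Use only when no API path exists."""
    if task.api_call is not None:
        return "api", task.api_call()
    return "computer_use", run_agent(task.ui_prompt)


if __name__ == "__main__":
    fake_agent = lambda prompt: f"agent did: {prompt}"
    print(run_hybrid(Task("invoices", api_call=lambda: "42 rows"), fake_agent))
    print(run_hybrid(Task("legacy ERP", ui_prompt="export the monthly report"), fake_agent))
```

The returned label records which path handled each task, which makes it easy to see how much of your workload actually needs the agent.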
Ship-it checklist
5 CHECKS
  • Anthropic API key with billing enabled
  • Docker quickstart running locally — you saw the agent click around in the noVNC viewer
  • You ran one custom task end-to-end (e.g., "open browser, search X, summarize")
  • You understand the screenshot → tool_use → action → screenshot loop
  • You know which 3 problems on your stack should be Computer Use vs API