File: computer-use.md | Updated: 11/15/2025
Computer use tool
Claude can interact with computer environments through the computer use tool, which provides screenshot capabilities and mouse/keyboard control for autonomous desktop interaction.
Computer use is currently in beta and requires a beta header: "computer-use-2025-01-24" (Claude 4 models and Claude Sonnet 3.7 (deprecated)).
Computer use is a beta feature that enables Claude to interact with desktop environments. This tool provides:
While computer use can be augmented with other tools like bash and text editor for more comprehensive automation workflows, computer use specifically refers to the computer use tool’s capability to see and control desktop environments.
Computer use is available for the following Claude models:
| Model | Tool Version | Beta Flag |
| --- | --- | --- |
| Claude 4 models | computer_20250124 | computer-use-2025-01-24 |
| Claude Sonnet 3.7 (deprecated) | computer_20250124 | computer-use-2025-01-24 |
Claude 4 models use updated tool versions optimized for the new architecture. Claude Sonnet 3.7 (deprecated) introduces additional capabilities including the thinking feature for more insight into the model’s reasoning process.
Older tool versions are not guaranteed to be backwards-compatible with newer models. Always use the tool version that corresponds to your model version.
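The table above can be captured as a small lookup so that the beta flag always matches the tool version. This is a minimal sketch: `BETA_FLAGS` and `beta_flag_for` are hypothetical names (not part of the Anthropic SDK), and the mapping contains only the pairing listed in the table.

```python
# Hypothetical helper: maps a computer use tool version to its beta flag.
# Only the pairing documented in the table above is included.
BETA_FLAGS = {
    "computer_20250124": "computer-use-2025-01-24",
}


def beta_flag_for(tool_type: str) -> str:
    """Return the beta flag for a tool version, or fail loudly if unknown."""
    try:
        return BETA_FLAGS[tool_type]
    except KeyError:
        raise ValueError(f"No known beta flag for tool version {tool_type!r}") from None
```

Failing on an unknown version, rather than guessing a default, follows the advice above to always pair the tool version with your model version.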
Computer use is a beta feature with unique risks distinct from standard API features. These risks are heightened when interacting with the internet. To minimize risks, consider taking precautions such as:
In some circumstances, Claude will follow commands found in content even if they conflict with the user’s instructions. For example, instructions on webpages or contained in images may override the user’s instructions or cause Claude to make mistakes. We suggest taking precautions to isolate Claude from sensitive data and actions to avoid risks related to prompt injection.
We’ve trained the model to resist these prompt injections and have added an extra layer of defense. If you use our computer use tools, we’ll automatically run classifiers on your prompts to flag potential instances of prompt injections. When these classifiers identify potential prompt injections in screenshots, they will automatically steer the model to ask for user confirmation before proceeding with the next action. We recognize that this extra protection won’t be ideal for every use case (for example, use cases without a human in the loop), so if you’d like to opt out and turn it off, please contact us.
We still suggest taking precautions to isolate Claude from sensitive data and actions to avoid risks related to prompt injection.
Finally, please inform end users of relevant risks and obtain their consent prior to enabling computer use in your own products.
Please use this form to provide feedback on the quality of the model responses, the API itself, or the quality of the documentation - we cannot wait to hear from you!
Here’s how to get started with computer use:
import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-sonnet-4-5",  # or another compatible model
    max_tokens=1024,
    tools=[
        {
            "type": "computer_20250124",
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
            "display_number": 1,
        },
        {
            "type": "text_editor_20250124",
            "name": "str_replace_editor"
        },
        {
            "type": "bash_20250124",
            "name": "bash"
        }
    ],
    messages=[{"role": "user", "content": "Save a picture of a cat to my desktop."}],
    betas=["computer-use-2025-01-24"]
)
print(response)
A beta header is only required for the computer use tool. The example above shows all three tools being used together, which requires the beta header because it includes the computer use tool.
1. Provide Claude with the computer use tool and a user prompt
2. Claude decides to use the computer use tool
The API response has a stop_reason of tool_use, signaling Claude’s intent.
3. Extract tool input, evaluate the tool on a computer, and return results
Continue the conversation with a new user message containing a tool_result content block.
4. Claude continues calling computer use tools until it's completed the task
If Claude needs another action, it responds with another tool_use stop_reason and you should return to step 3.
We refer to the repetition of steps 3 and 4 without user input as the “agent loop”: Claude responds with a tool use request, and your application responds to Claude with the results of evaluating that request.
The computing environment
Computer use requires a sandboxed computing environment where Claude can safely interact with applications and the web. This environment includes:
When you use computer use, Claude doesn’t directly connect to this environment. Instead, your application:
For security and isolation, the reference implementation runs all of this inside a Docker container with appropriate port mappings for viewing and interacting with the environment.
Start with our reference implementation
We have built a reference implementation that includes everything you need to get started quickly with computer use:
A containerized environment suitable for computer use with Claude
Implementations of the computer use tools
An agent loop that interacts with the Claude API and executes the computer use tools
A web interface to interact with the container, agent loop, and tools.
Understanding the agent loop
The core of computer use is the “agent loop” - a cycle where Claude requests tool actions, your application executes them, and returns results to Claude. Here’s a simplified example:
from anthropic import Anthropic


async def sampling_loop(
    *,
    model: str,
    messages: list[dict],
    api_key: str,
    max_tokens: int = 4096,
    tool_version: str,
    thinking_budget: int | None = None,
    max_iterations: int = 10,  # Add iteration limit to prevent infinite loops
):
    """
    A simple agent loop for Claude computer use interactions.

    This function handles the back-and-forth between:
    1. Sending user messages to Claude
    2. Claude requesting to use tools
    3. Your app executing those tools
    4. Sending tool results back to Claude
    """
    # Set up tools and API parameters
    client = Anthropic(api_key=api_key)
    beta_flag = "computer-use-2025-01-24" if "20250124" in tool_version else "computer-use-2024-10-22"

    # Configure tools - you should already have these initialized elsewhere
    tools = [
        {"type": f"computer_{tool_version}", "name": "computer", "display_width_px": 1024, "display_height_px": 768},
        {"type": f"text_editor_{tool_version}", "name": "str_replace_editor"},
        {"type": f"bash_{tool_version}", "name": "bash"}
    ]

    # Set up optional thinking parameter (for Claude Sonnet 3.7).
    # It is only passed when a budget is given, instead of sending thinking=None.
    extra_params = {}
    if thinking_budget:
        extra_params["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}

    # Main agent loop (with iteration limit to prevent runaway API costs)
    for _ in range(max_iterations):
        # Call the Claude API
        response = client.beta.messages.create(
            model=model,
            max_tokens=max_tokens,
            messages=messages,
            tools=tools,
            betas=[beta_flag],
            **extra_params,
        )

        # Add Claude's response to the conversation history
        response_content = response.content
        messages.append({"role": "assistant", "content": response_content})

        # Check if Claude used any tools
        tool_results = []
        for block in response_content:
            if block.type == "tool_use":
                # In a real app, you would execute the tool here
                # For example: result = run_tool(block.name, block.input)
                # tool_result content must be a string or a list of content blocks
                result = "Tool executed successfully"

                # Format the result for Claude
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result
                })

        # If no tools were used, Claude is done - return the final messages
        if not tool_results:
            return messages

        # Add tool results to messages for the next iteration with Claude
        messages.append({"role": "user", "content": tool_results})

    # Iteration limit reached - return the conversation so far
    return messages
The loop continues until either Claude responds without requesting any tools (task completion) or the maximum iteration limit is reached. This safeguard prevents potential infinite loops that could result in unexpected API costs.
When using the computer use tool, you must include the appropriate beta flag for your model version:
Claude 4 models
When using computer_20250124, include this beta flag:
"betas": ["computer-use-2025-01-24"]
Claude Sonnet 3.7
When using computer_20250124, include this beta flag:
"betas": ["computer-use-2025-01-24"]
We recommend trying the reference implementation out before reading the rest of this documentation.
Optimize model performance with prompting
Here are some tips on how to get the best quality outputs:
After each step, take a screenshot and carefully evaluate if you have achieved the right outcome. Explicitly show your thinking: "I have evaluated step X..." If not correct, try again. Only when you confirm a step was executed correctly should you move on to the next one.
<robot_credentials>. Using computer use within applications that require login increases the risk of bad outcomes as a result of prompt injection. Please review our guide on mitigating prompt injections before providing the model with login credentials.
If you repeatedly encounter a clear set of issues or know in advance the tasks Claude will need to complete, use the system prompt to provide Claude with explicit tips or instructions on how to do the tasks successfully.
System prompts
When one of the Anthropic-defined tools is requested via the Claude API, a computer use-specific system prompt is generated. It’s similar to the tool use system prompt but starts with:
You have access to a set of functions you can use to answer the user’s question. This includes access to a sandboxed computing environment. You do NOT currently have the ability to inspect files or interact with external resources, except by invoking the below functions.
As with regular tool use, the user-provided system_prompt field is still respected and used in the construction of the combined system prompt.
Available actions
The computer use tool supports these actions:
Basic actions (all versions)
[x, y]
Enhanced actions (computer_20250124), available in Claude 4 models and Claude Sonnet 3.7:
Example actions
// Take a screenshot
{
  "action": "screenshot"
}

// Click at position
{
  "action": "left_click",
  "coordinate": [500, 300]
}

// Type text
{
  "action": "type",
  "text": "Hello, world!"
}

// Scroll down (Claude 4/3.7)
{
  "action": "scroll",
  "coordinate": [500, 400],
  "scroll_direction": "down",
  "scroll_amount": 3
}
Tool parameters
| Parameter | Required | Description |
| --- | --- | --- |
| type | Yes | Tool version (computer_20250124 or computer_20241022) |
| name | Yes | Must be “computer” |
| display_width_px | Yes | Display width in pixels |
| display_height_px | Yes | Display height in pixels |
| display_number | No | Display number for X11 environments |
Keep display resolution at or below 1280x800 (WXGA) for best performance. Higher resolutions may cause accuracy issues due to image resizing.
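If your physical display is larger than the recommended size, one common approach is to downscale screenshots before sending them and then map the coordinates Claude returns back to the real screen. The sketch below assumes uniform scaling with the same aspect ratio; `scale_to_screen` and its parameters are illustrative names, not part of any API.

```python
def scale_to_screen(x, y, shot_size, screen_size):
    """Map a coordinate from (downscaled) screenshot space to real screen space.

    shot_size and screen_size are (width, height) tuples in pixels.
    """
    shot_w, shot_h = shot_size
    screen_w, screen_h = screen_size
    return round(x * screen_w / shot_w), round(y * screen_h / shot_h)


# Example: Claude clicks the center of a 1280x800 screenshot
# taken from a 2560x1600 display.
print(scale_to_screen(640, 400, (1280, 800), (2560, 1600)))  # (1280, 800)
```

With this approach, `display_width_px` and `display_height_px` in the tool definition would describe the downscaled size that Claude actually sees.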
Important: The computer use tool must be explicitly executed by your application - Claude cannot execute it directly. You are responsible for implementing the screenshot capture, mouse movements, keyboard inputs, and other actions based on Claude’s requests.
Enable thinking capability in Claude 4 models and Claude Sonnet 3.7
Claude Sonnet 3.7 introduced a new “thinking” capability that allows you to see the model’s reasoning process as it works through complex tasks. This feature helps you understand how Claude is approaching a problem and can be particularly valuable for debugging or educational purposes. To enable thinking, add a thinking parameter to your API request:
"thinking": {
  "type": "enabled",
  "budget_tokens": 1024
}
The budget_tokens parameter specifies how many tokens Claude can use for thinking. This is subtracted from your overall max_tokens budget. When thinking is enabled, Claude will return its reasoning process as part of the response, which can help you:
Here’s an example of what thinking output might look like:
[Thinking]
I need to save a picture of a cat to the desktop. Let me break this down into steps:
1. First, I'll take a screenshot to see what's on the desktop
2. Then I'll look for a web browser to search for cat images
3. After finding a suitable image, I'll need to save it to the desktop
Let me start by taking a screenshot to see what's available...
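Because `budget_tokens` counts against `max_tokens`, it can help to check the two values together before sending a request. The following is a minimal sketch; `thinking_params`, `visible_token_budget`, and `MIN_THINKING_BUDGET` are hypothetical names, and the 1024-token minimum is an assumption you should verify against the current API documentation.

```python
MIN_THINKING_BUDGET = 1024  # assumption: smallest thinking budget the API accepts


def thinking_params(max_tokens: int, budget_tokens: int) -> dict:
    """Build request parameters with thinking enabled, after basic sanity checks."""
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(f"budget_tokens must be at least {MIN_THINKING_BUDGET}")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


def visible_token_budget(max_tokens: int, budget_tokens: int) -> int:
    """Tokens left for the non-thinking part of the response."""
    return max_tokens - budget_tokens


params = thinking_params(max_tokens=2000, budget_tokens=1024)
print(visible_token_budget(2000, 1024))  # 976
```

The returned dictionary can be unpacked into a `messages.create(...)` call alongside the model, tools, and messages.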
Augmenting computer use with other tools
The computer use tool can be combined with other tools to create more powerful automation workflows. This is particularly useful when you need to:
curl https://api.anthropic.com/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: computer-use-2025-01-24" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 2000,
    "tools": [
      {
        "type": "computer_20250124",
        "name": "computer",
        "display_width_px": 1024,
        "display_height_px": 768,
        "display_number": 1
      },
      {
        "type": "text_editor_20250124",
        "name": "str_replace_editor"
      },
      {
        "type": "bash_20250124",
        "name": "bash"
      },
      {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "input_schema": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"],
              "description": "The unit of temperature, either 'celsius' or 'fahrenheit'"
            }
          },
          "required": ["location"]
        }
      }
    ],
    "messages": [
      {
        "role": "user",
        "content": "Find flights from San Francisco to a place with warmer weather."
      }
    ],
    "thinking": {
      "type": "enabled",
      "budget_tokens": 1024
    }
  }'
Build a custom computer use environment
The reference implementation is meant to help you get started with computer use. It includes all of the components needed to have Claude use a computer. However, you can build your own environment for computer use to suit your needs. You’ll need:
tool_use results using your tool implementations
Implement the computer use tool
The computer use tool is implemented as a schema-less tool. When using this tool, you don’t need to provide an input schema as with other tools; the schema is built into Claude’s model and can’t be modified.
1. Set up your computing environment
Create a virtual display or connect to an existing display that Claude will interact with. This typically involves setting up Xvfb (X Virtual Framebuffer) or similar technology.
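As one possible way to create such a virtual display from Python, the sketch below builds and launches an Xvfb command. It assumes Xvfb is installed and on the `PATH`; `xvfb_command` and `start_virtual_display` are hypothetical helper names, and the default size mirrors the 1024x768 display used in the examples on this page.

```python
import subprocess


def xvfb_command(display_number=1, width=1024, height=768, depth=24):
    """Build the argument list for an Xvfb server on the given display."""
    return ["Xvfb", f":{display_number}", "-screen", "0", f"{width}x{height}x{depth}"]


def start_virtual_display(**kwargs):
    """Launch Xvfb in the background (requires Xvfb to be installed)."""
    return subprocess.Popen(xvfb_command(**kwargs))


print(" ".join(xvfb_command()))  # Xvfb :1 -screen 0 1024x768x24
```

The display number here would correspond to the optional `display_number` tool parameter, and applications you want Claude to see need `DISPLAY=:1` in their environment.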
2. Implement action handlers
Create functions to handle each action type that Claude might request:
def handle_computer_action(action_type, params):
    if action_type == "screenshot":
        return capture_screenshot()
    elif action_type == "left_click":
        x, y = params["coordinate"]
        return click_at(x, y)
    elif action_type == "type":
        return type_text(params["text"])
    # ... handle other actions
3. Process Claude's tool calls
Extract and execute tool calls from Claude’s responses:
for content in response.content:
    if content.type == "tool_use":
        action = content.input["action"]
        result = handle_computer_action(action, content.input)

        # Return result to Claude
        tool_result = {
            "type": "tool_result",
            "tool_use_id": content.id,
            "content": result
        }
4. Implement the agent loop
Create a loop that continues until Claude completes the task:
while True:
    response = client.beta.messages.create(...)

    # Check if Claude used any tools
    tool_results = process_tool_calls(response)

    if not tool_results:
        # No more tool use, task complete
        break

    # Continue conversation with tool results
    messages.append({"role": "user", "content": tool_results})
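In step 3 above, `result` becomes the `content` of the tool result. For a screenshot action, that content is typically an image block rather than plain text. The sketch below shows one way to package PNG bytes as a base64 image inside a `tool_result`; `screenshot_tool_result` is a hypothetical helper name, and how you capture the PNG bytes is left to your environment.

```python
import base64


def screenshot_tool_result(tool_use_id: str, png_bytes: bytes) -> dict:
    """Wrap raw PNG bytes in a tool_result block containing a base64 image."""
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": base64.b64encode(png_bytes).decode("ascii"),
                },
            }
        ],
    }
```

The returned dictionary can be appended to the list of tool results that is sent back to Claude in the next user message.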
Handle errors
When implementing the computer use tool, various errors may occur. Here’s how to handle them:
Screenshot capture failure
If screenshot capture fails, return an appropriate error message:
{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
      "content": "Error: Failed to capture screenshot. Display may be locked or unavailable.",
      "is_error": true
    }
  ]
}
Invalid coordinates
If Claude provides coordinates outside the display bounds:
{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
      "content": "Error: Coordinates (1200, 900) are outside display bounds (1024x768).",
      "is_error": true
    }
  ]
}
Action execution failure
If an action fails to execute:
{
  "role": "user",
  "content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01A09q90qw90lq917835lq9",
      "content": "Error: Failed to perform click action. The application may be unresponsive.",
      "is_error": true
    }
  ]
}
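The three error cases above share one shape, so they can be produced by a single wrapper around your action handlers. This is a minimal sketch under that assumption; `run_action_safely` is a hypothetical helper, and `action_fn` stands for whatever function executes the requested action in your environment.

```python
def run_action_safely(tool_use_id, action_fn, *args, **kwargs):
    """Run an action and convert any exception into an is_error tool_result."""
    try:
        content = action_fn(*args, **kwargs)
    except Exception as exc:  # report the failure to Claude instead of crashing the loop
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"Error: {exc}",
            "is_error": True,
        }
    return {"type": "tool_result", "tool_use_id": tool_use_id, "content": content}


def failing_click(x, y):
    # Stand-in for a real click handler that fails
    raise RuntimeError("Failed to perform click action.")


print(run_action_safely("toolu_example", failing_click, 10, 20)["content"])
# Error: Failed to perform click action.
```

Returning the error as a tool result lets Claude see what went wrong and decide whether to retry or try a different action.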
Follow implementation best practices
Use appropriate display resolution
Set display dimensions that match your use case while staying within recommended limits:
Implement proper screenshot handling
When returning screenshots to Claude:
Add action delays
Some applications need time to respond to actions:
import time

def click_and_wait(x, y, wait_time=0.5):
    click_at(x, y)  # click_at is your own click implementation
    time.sleep(wait_time)  # Allow UI to update
Validate actions before execution
Check that requested actions are safe and valid:
def validate_action(action_type, params, display_width=1024, display_height=768):
    # display_width and display_height should match the values in your tool definition
    if action_type == "left_click":
        x, y = params.get("coordinate", (0, 0))
        if not (0 <= x < display_width and 0 <= y < display_height):
            return False, "Coordinates out of bounds"
    return True, None
Log actions for debugging
Keep a log of all actions for troubleshooting:
import logging

def log_action(action_type, params, result):
    logging.info(f"Action: {action_type}, Params: {params}, Result: {result}")
The computer use functionality is in beta. While Claude’s capabilities are cutting edge, developers should be aware of its limitations:
left_mouse_down, left_mouse_up, and new modifier key support. Cell selection can be more reliable by using these fine-grained controls and combining modifier keys with clicks.
Always carefully review and verify Claude’s computer use actions and logs. Do not use Claude for tasks requiring perfect precision or sensitive user information without human oversight.
Computer use follows the standard tool use pricing. When using the computer use tool:
System prompt overhead: The computer use beta adds 466-499 tokens to the system prompt
Computer use tool token usage:
| Model | Input tokens per tool definition |
| --- | --- |
| Claude 4.x models | 735 tokens |
| Claude Sonnet 3.7 (deprecated) | 735 tokens |
Additional token consumption:
If you’re also using bash or text editor tools alongside computer use, those tools have their own token costs as documented in their respective pages.
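As a rough worked example, the fixed overhead per request can be estimated by adding the system prompt range to the tool definition cost quoted above. The sketch below uses only those two figures; the constant and function names are illustrative, and the estimate deliberately excludes screenshots, tool results, and any bash or text editor tool tokens.

```python
SYSTEM_PROMPT_OVERHEAD = (466, 499)    # token range quoted above
COMPUTER_TOOL_DEFINITION_TOKENS = 735  # per tool definition, from the table above


def computer_use_overhead():
    """Return the (low, high) fixed input-token overhead per request."""
    low, high = SYSTEM_PROMPT_OVERHEAD
    return (
        low + COMPUTER_TOOL_DEFINITION_TOKENS,
        high + COMPUTER_TOOL_DEFINITION_TOKENS,
    )


print(computer_use_overhead())  # (1201, 1234)
```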
Reference implementation
------------------------
Get started quickly with our complete Docker-based implementation
Tool documentation
------------------
Learn more about tool use and creating custom tools