File: computer-use.md | Updated: 11/15/2025
With the release of Computer Use in Claude 3.5 Sonnet, you can now direct AI models to interact with computers the way humans do: moving cursors, clicking buttons, and typing text. This capability enables automation of complex tasks while leveraging Claude's advanced reasoning abilities.
The AI SDK is a powerful TypeScript toolkit for building AI applications with large language models (LLMs) like Anthropic's Claude alongside popular frameworks like React, Next.js, Vue, Svelte, Node.js, and more. In this guide, you will learn how to integrate Computer Use into your AI SDK applications.
Computer Use is currently in beta with some limitations. The feature may be error-prone at times. Anthropic recommends starting with low-risk tasks and implementing appropriate safety measures.
Anthropic recently released a new version of the Claude 3.5 Sonnet model which is capable of 'Computer Use'. This allows the model to interact with computer interfaces through basic actions such as moving the cursor, clicking buttons, and typing text.
Computer Use enables the model to read and interact with on-screen content through a series of coordinated steps. Here's how the process works:
1. Start with a prompt and tools. Add Anthropic-defined Computer Use tools to your request and provide a task (prompt) for the model. For example: "save an image to your downloads folder."
2. Select the right tool. The model evaluates which computer tools can help accomplish the task. It then sends a formatted tool_call to use the appropriate tool.
3. Execute the action and return results. The AI SDK processes Claude's request by running the selected tool. The results can then be sent back to Claude through a tool_result message.
4. Complete the task through iterations. Claude analyzes each result to determine if more actions are needed. It continues requesting tool use and processing results until it completes your task or requires additional input.
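The four steps above can be sketched as a plain loop. This is a minimal illustration, not the AI SDK's internals: the `Model`, `ModelTurn`, `ToolResult`, and `runTask` names are hypothetical, and the "model" is any async function that either requests a tool or returns a final answer.

```typescript
// Hypothetical shapes for illustration; these are not the AI SDK's types.
type ToolCall = { toolName: string; input: unknown };
type ModelTurn =
  | { kind: 'tool_call'; call: ToolCall }
  | { kind: 'final'; text: string };
type ToolResult = { toolName: string; output: string };

type Model = (task: string, history: ToolResult[]) => Promise<ModelTurn>;
type Tools = Record<string, (input: unknown) => Promise<string>>;

async function runTask(
  model: Model,
  tools: Tools,
  task: string,
  maxSteps = 10,
): Promise<{ text: string; history: ToolResult[] }> {
  const history: ToolResult[] = [];
  for (let step = 0; step < maxSteps; step++) {
    // Steps 1-2: the model sees the task plus earlier results and either
    // requests a tool or finishes.
    const turn = await model(task, history);
    if (turn.kind === 'final') {
      return { text: turn.text, history };
    }
    // Step 3: run the requested tool and record its result.
    const tool = tools[turn.call.toolName];
    const output = tool
      ? await tool(turn.call.input)
      : `Unknown tool: ${turn.call.toolName}`;
    history.push({ toolName: turn.call.toolName, output });
    // Step 4: loop so the model can react to the new result.
  }
  // Guard against runaway loops by capping the number of steps.
  return { text: 'Stopped: step limit reached', history };
}
```

The step cap plays the same role as a stop condition: without it, a model that keeps requesting tools would never return.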
There are three main tools available in the Computer Use API: the computer tool, the bash tool, and the text editor tool.
Computer Use tools in the AI SDK are predefined interfaces that require your own implementation of the execution layer. While the SDK provides the type definitions and structure for these tools, you need to implement the code that actually performs each action, such as capturing screenshots and controlling the mouse and keyboard.
The recommended approach is to start with Anthropic's reference implementation, which serves as a foundation for understanding the requirements before you build your own custom solution.
Getting Started with the AI SDK
If you have never used the AI SDK before, start by following the Getting Started guide.
For a working example of Computer Use implementation with Next.js and the AI SDK, check out our AI SDK Computer Use Template.
First, ensure you have the AI SDK and Anthropic AI SDK provider installed:
pnpm add ai @ai-sdk/anthropic
You can add Computer Use to your AI SDK applications using provider-defined-client tools. These tools accept various input parameters (like display height and width in the case of the computer tool) and then require that you define an execute function.
Here's how you could set up the Computer Tool with the AI SDK:
```ts
import { anthropic } from '@ai-sdk/anthropic';
import { getScreenshot, executeComputerAction } from '@/utils/computer-use';

const computerTool = anthropic.tools.computer_20250124({
  displayWidthPx: 1920,
  displayHeightPx: 1080,
  execute: async ({ action, coordinate, text }) => {
    switch (action) {
      case 'screenshot': {
        return {
          type: 'image',
          data: getScreenshot(),
        };
      }
      default: {
        return executeComputerAction(action, coordinate, text);
      }
    }
  },
  toModelOutput(result) {
    return typeof result === 'string'
      ? [{ type: 'text', text: result }]
      : [{ type: 'image', data: result.data, mediaType: 'image/png' }];
  },
});
```
The computerTool handles two main actions: taking screenshots via getScreenshot() and executing computer actions like mouse movements and clicks through executeComputerAction(). Remember, you have to implement this execution logic (e.g. the getScreenshot and executeComputerAction functions) to handle the actual computer interactions. The execute function should handle all low-level interactions with the operating system.
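As a rough idea of what such execution logic might look like, here is a minimal sketch of an action dispatcher. Everything in it is an assumption made for illustration: the `DesktopDriver` interface and `createExecutor` factory are hypothetical, and the action names (`mouse_move`, `left_click`, `type`) are only a subset that you should check against the tool version you use.

```typescript
// Hypothetical driver interface: back it with whatever automation layer
// you run inside your sandbox.
interface DesktopDriver {
  moveMouse(x: number, y: number): void;
  click(): void; // clicks at the current cursor position
  typeText(text: string): void;
}

// Should match the displayWidthPx / displayHeightPx given to the tool.
const DISPLAY = { width: 1920, height: 1080 };

// Validate an [x, y] pair against the configured display size.
function requirePoint(coordinate?: number[]): [number, number] {
  if (!coordinate || coordinate.length !== 2) {
    throw new Error('Expected coordinate as [x, y]');
  }
  const [x, y] = coordinate as [number, number];
  if (x < 0 || y < 0 || x >= DISPLAY.width || y >= DISPLAY.height) {
    throw new Error(`Coordinate (${x}, ${y}) is outside the display`);
  }
  return [x, y];
}

// Build an executeComputerAction function bound to a specific driver.
function createExecutor(driver: DesktopDriver) {
  return function executeComputerAction(
    action: string,
    coordinate?: number[],
    text?: string,
  ): string {
    switch (action) {
      case 'mouse_move': {
        const [x, y] = requirePoint(coordinate);
        driver.moveMouse(x, y);
        return `Moved cursor to (${x}, ${y})`;
      }
      case 'left_click': {
        // Move first when a coordinate is supplied, then click in place.
        if (coordinate) {
          const [x, y] = requirePoint(coordinate);
          driver.moveMouse(x, y);
        }
        driver.click();
        return 'Clicked left mouse button';
      }
      case 'type': {
        if (text === undefined) throw new Error('Expected text to type');
        driver.typeText(text);
        return `Typed ${text.length} characters`;
      }
      default:
        throw new Error(`Unsupported action: ${action}`);
    }
  };
}
```

Returning a short text description for each action gives the model something to reason about on its next step. Keeping the driver behind an interface also lets you swap in a recording stub for tests.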
Finally, to send tool results back to the model, use the toModelOutput() function to convert text and image responses into a format the model can process. The AI SDK includes experimental support for these multi-modal tool results when using Anthropic's models.
Computer Use requires appropriate safety measures like using virtual machines, limiting access to sensitive data, and implementing human oversight for critical actions.
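One way to add human oversight is to wrap a tool's execute function in an approval gate. The sketch below is an assumption-laden illustration, not an AI SDK feature: `withApproval`, `isCritical`, `approve`, and `onDenied` are hypothetical names, and how you actually ask a person for approval (UI prompt, chat message, ticket) is left to you.

```typescript
type Execute<I, O> = (input: I) => Promise<O>;

// Wrap an execute function so that critical inputs need explicit approval.
function withApproval<I, O>(
  execute: Execute<I, O>,
  isCritical: (input: I) => boolean,
  approve: (input: I) => Promise<boolean>,
  onDenied: (input: I) => O,
): Execute<I, O> {
  return async (input) => {
    // Only interrupt the human for actions flagged as critical.
    if (isCritical(input) && !(await approve(input))) {
      // Return a result instead of throwing so the model learns the action
      // was refused and can choose a different approach.
      return onDenied(input);
    }
    return execute(input);
  };
}

// Example: require approval before any typing action.
type ActionInput = { action: string; text?: string };

const guardedExecute = withApproval<ActionInput, string>(
  async (input) => `executed ${input.action}`,
  (input) => input.action === 'type',
  async () => false, // stand-in for a real approval prompt; always denies
  (input) => `denied ${input.action}`,
);
```

The wrapped function has the same signature as the original, so it can be dropped in wherever the unguarded execute function was used.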
Once your tool is defined, you can use it with both the generateText and streamText functions.
For one-shot text generation, use generateText:
```ts
import { generateText } from 'ai';

const result = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: 'Move the cursor to the center of the screen and take a screenshot',
  tools: { computer: computerTool },
});

console.log(result.text);
```
For streaming responses, use streamText to receive updates in real-time:
```ts
import { streamText } from 'ai';

const result = streamText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: 'Open the browser and navigate to vercel.com',
  tools: { computer: computerTool },
});

// Print each text chunk as it arrives
for await (const chunk of result.textStream) {
  console.log(chunk);
}
```
To allow the model to perform multiple steps without user intervention, use the stopWhen parameter. This will automatically send any tool results back to the model to trigger a subsequent generation:
```ts
import { streamText, stepCountIs } from 'ai';

const stream = streamText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt: 'Open the browser and navigate to vercel.com',
  tools: { computer: computerTool },
  stopWhen: stepCountIs(10), // experiment with this value based on your use case
});
```
You can combine multiple tools in a single request to enable more complex workflows. The AI SDK supports all three of Claude's Computer Use tools:
```ts
import { execSync } from 'node:child_process';
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const computerTool = anthropic.tools.computer_20250124({ ... });

const bashTool = anthropic.tools.bash_20250124({
  // Runs the command directly on the host; only do this inside a sandbox
  execute: async ({ command, restart }) => execSync(command).toString(),
});

const textEditorTool = anthropic.tools.textEditor_20250124({
  execute: async ({
    command,
    path,
    file_text,
    insert_line,
    new_str,
    old_str,
    view_range,
  }) => {
    // Handle file operations based on command.
    // executeTextEditorFunction is a helper you implement yourself.
    return executeTextEditorFunction({
      command,
      path,
      fileText: file_text,
      insertLine: insert_line,
      newStr: new_str,
      oldStr: old_str,
      viewRange: view_range,
    });
  },
});

const response = await generateText({
  model: anthropic('claude-sonnet-4-20250514'),
  prompt:
    "Create a new file called example.txt, write 'Hello World' to it, and run 'cat example.txt' in the terminal",
  tools: {
    computer: computerTool,
    bash: bashTool,
    str_replace_editor: textEditorTool,
  },
});
```
Always implement appropriate security measures and obtain user consent before enabling Computer Use in production applications.
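For the bash tool in particular, one possible security measure is to validate commands against an allow-list before running them. The following is a deliberately strict sketch under stated assumptions: `checkCommand`, `ALLOWED_BINARIES`, and the chosen list of binaries are hypothetical, and the check rejects quotes and shell metacharacters outright rather than trying to parse them.

```typescript
// Hypothetical allow-list; tailor it to the tasks you actually need.
const ALLOWED_BINARIES = new Set(['ls', 'cat', 'echo', 'pwd']);

// Reject anything that could chain, redirect, substitute, or quote.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\\'"\n]/;

type CommandCheck =
  | { ok: true; argv: string[] }
  | { ok: false; reason: string };

function checkCommand(command: string): CommandCheck {
  const trimmed = command.trim();
  if (trimmed === '') {
    return { ok: false, reason: 'empty command' };
  }
  if (SHELL_METACHARACTERS.test(trimmed)) {
    return { ok: false, reason: 'shell metacharacters are not allowed' };
  }
  // Naive whitespace split; safe only because quoting was rejected above.
  const argv = trimmed.split(/\s+/);
  const binary = argv[0] as string;
  if (!ALLOWED_BINARIES.has(binary)) {
    return { ok: false, reason: `binary not allowed: ${binary}` };
  }
  return { ok: true, argv };
}
```

When a command passes, the resulting `argv` can be handed to `execFileSync(argv[0], argv.slice(1))` from `node:child_process`, which avoids invoking a shell at all. When it fails, returning the `reason` as the tool result lets the model adjust its next attempt. An allow-list like this complements, and does not replace, running inside an isolated environment.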
To get the best results when using Computer Use:
Remember, Computer Use is a beta feature. Please be aware that it poses unique risks that are distinct from standard API features or chat interfaces. These risks are heightened when using Computer Use to interact with the internet. To minimize risks, consider taking precautions such as running inside a virtual machine, limiting access to sensitive data, and requiring human oversight for critical actions.