File: vision.md | Updated: 11/15/2025
Vision
This guide describes how to work with images in Claude, including best practices, code examples, and limitations to keep in mind.
Use Claude’s vision capabilities via:
Basics and Limits
You can include multiple images in a single request (up to 20 for claude.ai and 100 for API requests). Claude will analyze all provided images when formulating its response. This can be helpful for comparing or contrasting images. Images larger than 8000x8000 px are rejected. For API requests that contain more than 20 images, the limit is 2000x2000 px per image.
While the API supports 100 images per request, there is a 32MB request size limit for standard endpoints.
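The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any SDK: the function name `check_image_limits`, the `(width, height)` tuple input, the per-edge reading of the pixel limits, and the reading of 32MB as 32 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
MAX_IMAGES_PER_API_REQUEST = 100
MAX_EDGE_PX = 8000
MAX_EDGE_PX_MANY_IMAGES = 2000  # applies when a request has more than 20 images
MANY_IMAGES_THRESHOLD = 20
MAX_REQUEST_BYTES = 32 * 1024 * 1024  # assumption: 32MB interpreted as 32 MiB


def check_image_limits(dimensions, request_bytes):
    """Return a list of problems found; an empty list means no limit was hit.

    dimensions: list of (width_px, height_px) tuples, one per image.
    request_bytes: size of the serialized request body in bytes.
    """
    problems = []
    count = len(dimensions)
    if count > MAX_IMAGES_PER_API_REQUEST:
        problems.append(f"{count} images exceeds the {MAX_IMAGES_PER_API_REQUEST}-image limit")

    # The per-image pixel limit tightens once the request has more than 20 images.
    edge_limit = MAX_EDGE_PX_MANY_IMAGES if count > MANY_IMAGES_THRESHOLD else MAX_EDGE_PX
    for index, (width, height) in enumerate(dimensions, start=1):
        if width > edge_limit or height > edge_limit:
            problems.append(f"image {index} is {width}x{height} px; limit is {edge_limit}x{edge_limit} px")

    if request_bytes > MAX_REQUEST_BYTES:
        problems.append(f"request is {request_bytes} bytes; limit is {MAX_REQUEST_BYTES} bytes")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violation at once.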
Evaluate image size
For optimal performance, we recommend resizing images before uploading if they are too large. If your image’s long edge is more than 1568 pixels, or your image is more than ~1,600 tokens, it will first be scaled down, preserving aspect ratio, until it’s within the size limits. If your input image is too large and needs to be resized, the resizing increases time-to-first-token latency without giving you any additional model performance. Very small images, under 200 pixels on any given edge, may degrade performance.
To improve time-to-first-token, we recommend resizing images to no more than 1.15 megapixels (and within 1568 pixels in both dimensions).
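The resizing recommendation can be turned into a small calculation of target dimensions. This is a sketch under stated assumptions: the function name `recommended_size` is hypothetical, 1.15 megapixels is taken as 1,150,000 pixels, and results are rounded down so they never exceed either limit. It only computes dimensions; the actual resampling would be done with an image library of your choice.

```python
import math

MAX_LONG_EDGE_PX = 1568
MAX_PIXELS = 1_150_000  # assumption: 1.15 megapixels as 1,150,000 pixels


def recommended_size(width, height):
    """Return (width, height) scaled down, if needed, to fit both limits.

    The aspect ratio is preserved (up to rounding) and images that already
    fit are returned unchanged.
    """
    scale = min(
        1.0,                                       # never upscale
        MAX_LONG_EDGE_PX / max(width, height),     # long-edge limit
        math.sqrt(MAX_PIXELS / (width * height)),  # total-pixel limit
    )
    # Round down so the result stays within both limits.
    return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))
```

For example, `recommended_size(800, 600)` leaves the image as it is, while a 3000x2000 px image is reduced because it exceeds both the long-edge and the megapixel recommendation.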
Here is a table of maximum image sizes accepted by our API that will not be resized for common aspect ratios. With the Claude Sonnet 3.7 model, these images use approximately 1,600 tokens and around $4.80/1K images.
| Aspect ratio | Image size |
| --- | --- |
| 1:1 | 1092x1092 px |
| 3:4 | 951x1268 px |
| 2:3 | 896x1344 px |
| 9:16 | 819x1456 px |
| 1:2 | 784x1568 px |
Calculate image costs
Each image you include in a request to Claude counts towards your token usage. To calculate the approximate cost, multiply the approximate number of image tokens by the per-token price of the model you’re using.
If your image does not need to be resized, you can estimate the number of tokens used through this algorithm:
tokens = (width px * height px)/750
Here are examples of approximate tokenization and costs for different image sizes within our API’s size constraints, based on the Claude Sonnet 3.7 per-token price of $3 per million input tokens:
| Image size | # of Tokens | Cost / image | Cost / 1K images |
| --- | --- | --- | --- |
| 200x200 px (0.04 megapixels) | ~54 | ~$0.00016 | ~$0.16 |
| 1000x1000 px (1 megapixel) | ~1334 | ~$0.004 | ~$4.00 |
| 1092x1092 px (1.19 megapixels) | ~1590 | ~$0.0048 | ~$4.80 |
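The estimate can be expressed as two short functions. This is an illustrative sketch: the function names are hypothetical, and rounding the token count up is an assumption chosen because it reproduces the approximate figures in the table. The price is passed in as a parameter because it depends on the model you use.

```python
import math

PIXELS_PER_TOKEN = 750


def estimate_image_tokens(width_px, height_px):
    """Approximate token count for an image that does not need resizing."""
    return math.ceil(width_px * height_px / PIXELS_PER_TOKEN)


def estimate_image_cost(width_px, height_px, usd_per_million_input_tokens):
    """Approximate cost in USD of one image at the given input-token price."""
    tokens = estimate_image_tokens(width_px, height_px)
    return tokens * usd_per_million_input_tokens / 1_000_000
```

For example, `estimate_image_tokens(1092, 1092)` gives 1590 tokens, and at $3 per million input tokens `estimate_image_cost(1092, 1092, 3.0)` comes to roughly $0.0048 per image.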
Ensuring image quality
When providing images to Claude, keep the following in mind for best results:
Many of the prompting techniques that work well for text-based interactions with Claude can also be applied to image-based prompts. These examples demonstrate best practice prompt structures involving images.
Just as with document-query placement, Claude works best when images come before text. Images placed after text or interpolated with text will still perform well, but if your use case allows it, we recommend an image-then-text structure.
About the prompt examples
The following examples demonstrate how to use Claude’s vision capabilities using various programming languages and approaches. You can provide images to Claude in three ways:
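1. As a base64-encoded image in image content blocks
2. As a URL reference to an image hosted online
3. Through the Files API, by referencing the file_id of an uploaded image
The base64 example prompts use these variables: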
Shell
# For URL-based images, you can use the URL directly in your JSON request
# For base64-encoded images, you need to first encode the image
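# Example of how to encode an image to base64 in bash
# (tr removes the line breaks that some base64 implementations insert, which would otherwise break the JSON body):
BASE64_IMAGE_DATA=$(curl -s "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg" | base64 | tr -d '\n')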
# The encoded data can now be used in your API calls
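For the Python examples, the same kind of variables can be prepared with the standard library. The sketch below is an assumption about how you might set them up, not the SDK's own helper: `load_image`, `encode_image_bytes`, and the suffix-to-media-type table are hypothetical names, and the media types listed are the four formats this guide names as supported.

```python
import base64
from pathlib import Path

# Media types for the formats this guide lists as supported.
MEDIA_TYPES_BY_SUFFIX = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def encode_image_bytes(raw_bytes):
    """Base64-encode raw image bytes and return the result as a str."""
    return base64.standard_b64encode(raw_bytes).decode("utf-8")


def load_image(path):
    """Return (media_type, base64_data) for a local image file."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in MEDIA_TYPES_BY_SUFFIX:
        raise ValueError(f"unsupported image type: {suffix or path.name}")
    return MEDIA_TYPES_BY_SUFFIX[suffix], encode_image_bytes(path.read_bytes())
```

With a local file (the name `ant.jpg` is only a placeholder), `image1_media_type, image1_data = load_image("ant.jpg")` yields the two values used in the base64 examples.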
Below are examples of how to include images in a Messages API request using base64-encoded images and URL references:
Base64-encoded image example
Shell
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "source": {
              "type": "base64",
              "media_type": "image/jpeg",
              "data": "'"$BASE64_IMAGE_DATA"'"
            }
          },
          {
            "type": "text",
            "text": "Describe this image."
          }
        ]
      }
    ]
  }'
URL-based image example
Shell
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "source": {
              "type": "url",
              "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
            }
          },
          {
            "type": "text",
            "text": "Describe this image."
          }
        ]
      }
    ]
  }'
Files API image example
For images you’ll use repeatedly, or when you want to avoid encoding overhead, use the Files API:
Shell
# First, upload your image to the Files API
curl -X POST https://api.anthropic.com/v1/files \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: files-api-2025-04-14" \
  -F "file=@image.jpg"

# Then use the returned file_id in your message
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: files-api-2025-04-14" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "image",
            "source": {
              "type": "file",
              "file_id": "file_abc123"
            }
          },
          {
            "type": "text",
            "text": "Describe this image."
          }
        ]
      }
    ]
  }'
See Messages API examples for more example code and parameter details.
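The three request bodies above differ only in the `source` object of the image block. As a rough Python sketch, the blocks can be built as plain dictionaries that mirror the JSON in the curl examples. The helper names are hypothetical and are not part of the SDK; note that the Files API example above also sends an `anthropic-beta` header, which these helpers do not handle.

```python
def base64_image_block(media_type, data):
    """Image block carrying base64-encoded image data."""
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


def url_image_block(url):
    """Image block that points at an image hosted at a URL."""
    return {"type": "image", "source": {"type": "url", "url": url}}


def file_image_block(file_id):
    """Image block that references a file uploaded through the Files API."""
    return {"type": "image", "source": {"type": "file", "file_id": file_id}}


def user_message(image_block, prompt):
    """User message with the image placed before the text prompt."""
    return {
        "role": "user",
        "content": [image_block, {"type": "text", "text": prompt}],
    }
```

A call such as `user_message(file_image_block("file_abc123"), "Describe this image.")` produces the same message structure as the Files API request above.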
Example: One image
It’s best to place images earlier in the prompt than questions about them or instructions for tasks that use them.
Ask Claude to describe one image.
| Role | Content |
| --- | --- |
| User | [Image] Describe this image. |
Here is the corresponding API call using the Claude Sonnet 4.5 model.
Python
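import anthropic

# The client reads the API key from the ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

# image1_media_type and image1_data are the base64 example variables:
# the image's media type and its base64-encoded data.
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
)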
Example: Multiple images
In situations where there are multiple images, introduce each image with Image 1: and Image 2: and so on. You don’t need newlines between images or between images and the prompt.
Ask Claude to describe the differences between multiple images.
| Role | Content |
| --- | --- |
| User | Image 1: [Image 1] Image 2: [Image 2] How are these images different? |
Here is the corresponding API call using the Claude Sonnet 4.5 model.
Python
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Image 1:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Image 2:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image2_media_type,
                        "data": image2_data,
                    },
                },
                {
                    "type": "text",
                    "text": "How are these images different?"
                }
            ],
        }
    ],
)
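When the number of images varies, the "Image 1:", "Image 2:" labelling pattern can be generated in a loop instead of written out by hand. The helper below is a hypothetical sketch (`labeled_image_content` is not an SDK function); it takes image blocks that are already built and returns a list suitable for the `content` field of a user message.

```python
def labeled_image_content(image_blocks, question):
    """Interleave "Image N:" text labels with image blocks, then add the question.

    image_blocks: list of image content blocks (dicts with "type": "image").
    question: the text prompt to place after all the images.
    """
    content = []
    for number, block in enumerate(image_blocks, start=1):
        content.append({"type": "text", "text": f"Image {number}:"})
        content.append(block)
    content.append({"type": "text", "text": question})
    return content
```

Passing two image blocks and "How are these images different?" yields the same five-element content list as the call above.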
Example: Multiple images with a system prompt
Ask Claude to describe the differences between multiple images, while giving it a system prompt for how to respond.
| Role | Content |
| --- | --- |
| System | Respond only in Spanish. |
| User | Image 1: [Image 1] Image 2: [Image 2] How are these images different? |
Here is the corresponding API call using the Claude Sonnet 4.5 model.
Python
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system="Respond only in Spanish.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Image 1:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Image 2:"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image2_media_type,
                        "data": image2_data,
                    },
                },
                {
                    "type": "text",
                    "text": "How are these images different?"
                }
            ],
        }
    ],
)
Example: Four images across two conversation turns
Claude’s vision capabilities shine in multimodal conversations that mix images and text. You can have extended back-and-forth exchanges with Claude, adding new images or follow-up questions at any point. This enables powerful workflows for iterative image analysis, comparison, or combining visuals with other knowledge.
Ask Claude to contrast two images, then ask a follow-up question comparing the first images to two new images.
| Role | Content |
| --- | --- |
| User | Image 1: [Image 1] Image 2: [Image 2] How are these images different? |
| Assistant | [Claude’s response] |
| User | Image 1: [Image 3] Image 2: [Image 4] Are these images similar to the first two? |
| Assistant | [Claude’s response] |
When using the API, simply insert new images into the array of Messages in the user role as part of any standard multiturn conversation structure.
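As a sketch of what that multiturn structure might look like in Python, the helpers below assemble the three-message history for the follow-up request in the table above. The function names are hypothetical, the image blocks are assumed to be built already, and the assistant's earlier reply is passed in as plain text.

```python
def user_turn(image_blocks, question):
    """User message that labels each image ("Image 1:", ...) before the question."""
    content = []
    for number, block in enumerate(image_blocks, start=1):
        content.append({"type": "text", "text": f"Image {number}:"})
        content.append(block)
    content.append({"type": "text", "text": question})
    return {"role": "user", "content": content}


def two_turn_messages(first_images, first_reply_text, new_images):
    """Message list for the follow-up request: user, assistant, user."""
    return [
        user_turn(first_images, "How are these images different?"),
        {"role": "assistant", "content": first_reply_text},
        user_turn(new_images, "Are these images similar to the first two?"),
    ]
```

The resulting list is what you would pass as `messages` in the second request; the image labels restart at "Image 1:" in the new user turn, matching the table.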
While Claude’s image understanding capabilities are cutting-edge, there are some limitations to be aware of:
Always carefully review and verify Claude’s image interpretations, especially for high-stakes use cases. Do not use Claude for tasks requiring perfect precision or sensitive image analysis without human oversight.
What image file types does Claude support?
Claude currently supports JPEG, PNG, GIF, and WebP image formats, specifically:
- image/jpeg
- image/png
- image/gif
- image/webp
Can Claude read image URLs?
Yes, Claude can now process images from URLs with our URL image source blocks in the API. Simply use the “url” source type instead of “base64” in your API requests. Example:
{
  "type": "image",
  "source": {
    "type": "url",
    "url": "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
  }
}
Is there a limit to the image file size I can upload?
Yes, there are limits:
Images larger than these limits will be rejected and return an error when using our API.
How many images can I include in one request?
The image limits are:
- claude.ai: up to 20 images
- API: up to 100 images per request
Requests exceeding these limits will be rejected and return an error.
Does Claude read image metadata?
No, Claude does not parse or receive any metadata from images passed to it.
Can I delete images I've uploaded?
No. Image uploads are ephemeral and not stored beyond the duration of the API request. Uploaded images are automatically deleted after they have been processed.
Where can I find details on data privacy for image uploads?
Please refer to our privacy policy page for information on how we handle uploaded images and other data. We do not use uploaded images to train our models.
What if Claude's image interpretation seems wrong?
If Claude’s image interpretation seems incorrect:
Your feedback helps us improve!
Can Claude generate or edit images?
No, Claude is an image understanding model only. It can interpret and analyze images, but it cannot generate, produce, edit, manipulate, or create images.
Ready to start building with images using Claude? Here are a few helpful resources:
If you have any other questions, feel free to reach out to our support team. You can also join our developer community to connect with other creators and get help from Anthropic experts.