File: stream-text-with-image-prompt.md | Updated: 11/15/2025
Menu
Google Gemini Image Generation
Get started with Claude 3.7 Sonnet
Get started with OpenAI o3-mini
Generate Text with Chat Prompt
Generate Image with Chat Prompt
streamText Multi-Step Cookbook
Markdown Chatbot with Memoization
Generate Object with File Prompt through Form Submission
Model Context Protocol (MCP) Tools
Share useChat State Across Components
Human-in-the-Loop Agent with Next.js
Render Visual Interface in Chat
Generate Text with Chat Prompt
Generate Text with Image Prompt
Generate Object with a Reasoning Model
Stream Object with Image Prompt
Record Token Usage After Streaming Object
Record Final Object after Streaming Object
Model Context Protocol (MCP) Tools
Retrieval Augmented Generation
Copy markdown
==============================================================================================================================
Vision-language models can analyze images alongside text prompts to generate responses about visual content. This multimodal approach allows for rich interactions where you can ask questions about images, request descriptions, or analyze visual details. The combination of image and text inputs enables more sophisticated AI applications like visual question answering and image analysis.
import { anthropic } from '@ai-sdk/anthropic';import { streamText } from 'ai';import 'dotenv/config';import fs from 'node:fs';
async function main() { const result = streamText({ model: anthropic('claude-3-5-sonnet-20240620'), messages: [ { role: 'user', content: [ { type: 'text', text: 'Describe the image in detail.' }, { type: 'image', image: fs.readFileSync('./data/comic-cat.png') }, ], }, ], });
for await (const textPart of result.textStream) { process.stdout.write(textPart); }}
main().catch(console.error);
On this page
Deploy and Scale AI Apps with Vercel.
Vercel delivers the infrastructure and developer experience you need to ship reliable AI-powered applications at scale.
Trusted by industry leaders: