📄 ai-sdk/cookbook/node/stream-text-with-image-prompt

File: stream-text-with-image-prompt.md | Updated: 11/15/2025

Source: https://ai-sdk.dev/cookbook/node/stream-text-with-image-prompt

Stream Text with Image Prompt

==============================================================================================================================

Vision-language models accept images alongside text prompts and generate responses about the visual content. You can ask questions about an image, request a description, or have the model analyze visual details, which enables applications such as visual question answering and image analysis. With the AI SDK, the image is passed as an `image` part in the user message content, next to a `text` part, and the response is streamed with `streamText`.
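As a rough illustration of how text and image inputs are combined into one user message, the sketch below builds that message with a small helper. The helper `buildImagePrompt` and the local types are hypothetical and written only for this example; they mirror the part shapes used in this recipe and are not imported from the AI SDK, and the `Uint8Array | string` union for image data is an assumption.

```typescript
// Local, illustrative types (assumption): they mirror the part shapes used in
// this recipe and are not the AI SDK's own type definitions.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image'; image: Uint8Array | string };
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> };

// Hypothetical helper: builds one user message with a text part followed by
// one image part per supplied image.
function buildImagePrompt(
  text: string,
  ...images: Array<Uint8Array | string>
): UserMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text },
      ...images.map((image): ImagePart => ({ type: 'image', image })),
    ],
  };
}

const message = buildImagePrompt(
  'Describe the image in detail.',
  new Uint8Array([0x89, 0x50, 0x4e, 0x47]), // stand-in bytes, not a real image
);

// Print the order of the parts in the message content.
console.log(message.content.map((part) => part.type));
```

A `Buffer` returned by `fs.readFileSync` is a `Uint8Array` subclass, so file contents could be passed to this helper directly.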

    import { anthropic } from '@ai-sdk/anthropic';
    import { streamText } from 'ai';
    import 'dotenv/config';
    import fs from 'node:fs';

    async function main() {
      const result = streamText({
        model: anthropic('claude-3-5-sonnet-20240620'),
        messages: [
          {
            role: 'user',
            content: [
              { type: 'text', text: 'Describe the image in detail.' },
              { type: 'image', image: fs.readFileSync('./data/comic-cat.png') },
            ],
          },
        ],
      });

      for await (const textPart of result.textStream) {
        process.stdout.write(textPart);
      }
    }

    main().catch(console.error);
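The loop above writes each chunk to stdout as it arrives. If the complete text is also needed once streaming ends, one possible approach is to accumulate the chunks while forwarding them. The sketch below is a minimal illustration of that idea: `collectText` and `fakeTextStream` are hypothetical names introduced here, and the fake stream stands in for `result.textStream` so the example runs without an API key.

```typescript
// Hypothetical helper: forwards each chunk to a callback as it arrives and
// resolves with the concatenated text once the stream ends.
async function collectText(
  stream: AsyncIterable<string>,
  onChunk: (chunk: string) => void = () => {},
): Promise<string> {
  let fullText = '';
  for await (const chunk of stream) {
    onChunk(chunk);
    fullText += chunk;
  }
  return fullText;
}

// Stand-in for a model's text stream (assumption, for illustration only).
async function* fakeTextStream(): AsyncGenerator<string> {
  yield 'Hello, ';
  yield 'world';
}

// Log each chunk as it arrives, then log the length of the full text.
collectText(fakeTextStream(), (chunk) => console.log(JSON.stringify(chunk)))
  .then((text) => console.log(text.length));
```

In the recipe, `result.textStream` could be passed in place of `fakeTextStream()`, with `(chunk) => process.stdout.write(chunk)` as the callback, since the original loop already iterates it with `for await` and treats each item as a string.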
