📄 ai-sdk/cookbook/guides/google-gemini-image-generation

File: google-gemini-image-generation.md | Updated: 11/15/2025

Source: https://ai-sdk.dev/cookbook/guides/google-gemini-image-generation

Generate and Edit Images with Google Gemini 2.5 Flash

================================================================================================================================================================================

This guide shows how to generate and edit images with the AI SDK and Google's latest multimodal language model, Gemini 2.5 Flash Image.

Generating Images


As Gemini 2.5 Flash Image is a language model with multimodal capabilities, you can use the generateText or streamText functions (not generateImage) to create images. The model determines which modality to respond in based on your prompt and configuration. Here's how to create your first image:

import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
import fs from 'node:fs';
import 'dotenv/config';

async function generateImage() {
  const result = await generateText({
    model: google('gemini-2.5-flash-image-preview'),
    prompt:
      'Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme',
  });

  // Save generated images
  for (const file of result.files) {
    if (file.mediaType.startsWith('image/')) {
      const timestamp = Date.now();
      const fileName = `generated-${timestamp}.png`;

      fs.mkdirSync('output', { recursive: true });
      await fs.promises.writeFile(`output/${fileName}`, file.uint8Array);

      console.log(`Generated and saved image: output/${fileName}`);
    }
  }
}

generateImage().catch(console.error);

Here are some key points to remember:

  • Generated images are returned in the result.files array
  • Each file exposes its bytes as a Uint8Array (file.uint8Array) and its MIME type as file.mediaType
  • The model leverages Gemini's world knowledge, so detailed prompts yield better results
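
The example above always saves files with a .png extension, whatever file.mediaType reports. One possible refinement, sketched here under assumptions, is to derive the extension from the media type and add an index so that several images saved in the same millisecond do not overwrite each other. The helper names extensionFor and outputFileName and the extension table are hypothetical and not part of the AI SDK; only the mediaType string comes from the guide.

```typescript
// Hypothetical lookup table: common image MIME types and the extension to use.
const EXTENSIONS: Record<string, string> = {
  'image/png': 'png',
  'image/jpeg': 'jpg',
  'image/webp': 'webp',
  'image/gif': 'gif',
};

// Map a MIME type to a file extension, falling back to "bin" for unknown types.
function extensionFor(mediaType: string): string {
  // Drop any parameters (e.g. "; charset=...") and normalize case.
  const base = (mediaType.split(';')[0] ?? '').trim().toLowerCase();
  return EXTENSIONS[base] ?? 'bin';
}

// Build a file name that is unique per image within one result:
// the index separates images that share the same timestamp.
function outputFileName(
  prefix: string,
  index: number,
  mediaType: string,
  timestamp: number = Date.now(),
): string {
  return `${prefix}-${timestamp}-${index}.${extensionFor(mediaType)}`;
}
```

Inside the save loop, a call such as outputFileName('generated', index, file.mediaType) could stand in for the hard-coded template string, with index taken from the loop position.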

Editing Images


Gemini 2.5 Flash Image excels at editing existing images with natural language instructions. You can add elements, modify styles, or transform images while maintaining their core characteristics:

import { google } from '@ai-sdk/google';
import { generateText } from 'ai';
import fs from 'node:fs';
import 'dotenv/config';

async function editImage() {
  const editResult = await generateText({
    model: google('gemini-2.5-flash-image-preview'),
    prompt: [
      {
        role: 'user',
        content: [
          {
            type: 'text',
            text: 'Add a small wizard hat to this cat. Keep everything else the same.',
          },
          {
            type: 'image',
            // image: DataContent (string | Uint8Array | ArrayBuffer | Buffer) or URL
            image: new URL(
              'https://raw.githubusercontent.com/vercel/ai/refs/heads/main/examples/ai-core/data/comic-cat.png',
            ),
            // The source file is a PNG, so declare a matching media type
            mediaType: 'image/png',
          },
        ],
      },
    ],
  });

  // Save the edited image
  const timestamp = Date.now();
  fs.mkdirSync('output', { recursive: true });

  for (const file of editResult.files) {
    if (file.mediaType.startsWith('image/')) {
      await fs.promises.writeFile(
        `output/edited-${timestamp}.png`,
        file.uint8Array,
      );
      console.log(`Saved edited image: output/edited-${timestamp}.png`);
    }
  }
}

editImage().catch(console.error);
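
The code comment in the example notes that the image field accepts raw data as well as a URL. As a rough sketch of that option, the helper below assembles the same user message from in-memory bytes. The function buildEditPrompt and the local types are hypothetical, simplified stand-ins written for illustration; they are not the AI SDK's exported types.

```typescript
// Simplified local stand-ins for the message shape used in the example above.
type TextPart = { type: 'text'; text: string };
type ImagePart = { type: 'image'; image: Uint8Array | URL; mediaType: string };
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> };

// Hypothetical helper: pair an edit instruction with the image to edit.
function buildEditPrompt(
  instruction: string,
  image: Uint8Array | URL,
  mediaType: string,
): UserMessage[] {
  if (instruction.trim() === '') {
    throw new Error('An edit instruction is required.');
  }
  return [
    {
      role: 'user',
      content: [
        { type: 'text', text: instruction },
        { type: 'image', image, mediaType },
      ],
    },
  ];
}

// Placeholder bytes (the start of the PNG signature) standing in for a real file.
const placeholderBytes = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);
const editPrompt = buildEditPrompt(
  'Add a small wizard hat to this cat. Keep everything else the same.',
  placeholderBytes,
  'image/png',
);
console.log(editPrompt[0]?.content.length); // one text part plus one image part
```

In a real script the bytes would more likely come from reading a local file (for example with fs.readFileSync, which returns a Buffer), and the returned array would be passed as the prompt option of generateText in place of the inline message.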

What's Next?


You've learned how to generate new images from text prompts and edit existing images using natural language instructions with Google's Gemini 2.5 Flash Image model.

For more advanced techniques, integration patterns, and practical examples, check out our Cookbook where you'll find comprehensive guides for building sophisticated AI-powered applications.
