📄 ai-sdk/docs/advanced/rate-limiting

File: rate-limiting.md | Updated: 11/15/2025

Source: https://ai-sdk.dev/docs/advanced/rate-limiting


Rate Limiting
==============================================================================

Rate limiting helps you protect your APIs from abuse. It involves setting a maximum threshold on the number of requests a client can make within a specified timeframe. This simple technique acts as a gatekeeper, preventing excessive usage that can degrade service performance and incur unnecessary costs.
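To make the mechanism concrete, here is a minimal in-memory sketch of a fixed-window limiter, the same algorithm used in the example below. The `createRateLimiter` helper and its return shape are illustrative only; a real deployment needs shared storage such as Redis so that limits hold across server instances and restarts.

```ts
type Window = { start: number; count: number };

// Fixed-window limiter: at most `limit` requests per `windowMs` per key.
// In-memory only -- for illustration, not production.
function createRateLimiter(limit: number, windowMs: number) {
  const windows = new Map<string, Window>();
  return {
    limit(key: string, now: number = Date.now()) {
      const w = windows.get(key);
      // No window yet, or the current window has expired: start fresh.
      if (!w || now - w.start >= windowMs) {
        windows.set(key, { start: now, count: 1 });
        return { success: true, remaining: limit - 1 };
      }
      // Window is full: block the request.
      if (w.count >= limit) {
        return { success: false, remaining: 0 };
      }
      w.count += 1;
      return { success: true, remaining: limit - w.count };
    },
  };
}
```

A fixed window is simple and cheap, but requests can cluster at window boundaries; sliding-window and token-bucket variants smooth this out.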

Rate Limiting with Vercel KV and Upstash Ratelimit
--------------------------------------------------
In this example, you will protect an API endpoint using Vercel KV and Upstash Ratelimit.

app/api/generate/route.ts

```ts
import kv from '@vercel/kv';
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { Ratelimit } from '@upstash/ratelimit';
import { NextRequest } from 'next/server';

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

// Create rate limit: at most 5 requests per 30-second fixed window
const ratelimit = new Ratelimit({
  redis: kv,
  limiter: Ratelimit.fixedWindow(5, '30s'),
});

export async function POST(req: NextRequest) {
  // call ratelimit with request ip
  const ip = req.ip ?? 'ip';
  const { success, remaining } = await ratelimit.limit(ip);

  // block the request if unsuccessful
  if (!success) {
    return new Response('Ratelimited!', { status: 429 });
  }

  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-3.5-turbo'),
    messages,
  });

  return result.toUIMessageStreamResponse();
}
```
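On the client, the 429 status returned by this route can be surfaced to the user. The sketch below assumes the `/api/generate` path from the route above; `sendChat` and `classifyStatus` are hypothetical helpers for illustration, not part of the AI SDK.

```ts
type ChatStatus =
  | { ok: true }
  | { ok: false; rateLimited: boolean; status: number };

// Pure helper so the rate-limit handling is easy to test in isolation.
function classifyStatus(status: number): ChatStatus {
  if (status >= 200 && status < 300) return { ok: true };
  return { ok: false, rateLimited: status === 429, status };
}

// Hypothetical client call to the protected route.
async function sendChat(messages: unknown[]): Promise<ChatStatus> {
  const res = await fetch('/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ messages }),
  });
  return classifyStatus(res.status);
}
```

Distinguishing 429 from other failures lets the UI show a "please wait" message (or schedule a retry) instead of a generic error.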

Simplify API Protection
-----------------------
With Vercel KV and Upstash Ratelimit, it is possible to protect your APIs from abuse with ease. To learn more about how Ratelimit works and how it can be configured to your needs, see the Ratelimit Documentation.
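For instance, Upstash Ratelimit ships alternative algorithms that can be swapped into the route above. The option values below are illustrative; see the Ratelimit documentation for the authoritative signatures and trade-offs.

```ts
import kv from '@vercel/kv';
import { Ratelimit } from '@upstash/ratelimit';

// Sliding window: smooths out the bursts a fixed window
// allows at window boundaries.
const sliding = new Ratelimit({
  redis: kv,
  limiter: Ratelimit.slidingWindow(5, '30s'),
});

// Token bucket: permits short bursts up to the bucket size
// while refilling tokens at a steady rate.
const bucket = new Ratelimit({
  redis: kv,
  limiter: Ratelimit.tokenBucket(5, '30s', 10),
});
```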

