File: rate-limiting.md | Updated: 11/15/2025
# Rate Limiting
Rate limiting helps you protect your APIs from abuse. It involves setting a maximum threshold on the number of requests a client can make within a specified timeframe. This simple technique acts as a gatekeeper, preventing excessive usage that can degrade service performance and incur unnecessary costs.
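To make the idea concrete, here is a minimal in-memory sketch of the fixed-window strategy used below. This is an illustration only (the names `FixedWindowLimiter` and `allow` are invented for this sketch): an in-process counter works for a single server, but a serverless API needs shared state such as Redis, which is what the Vercel KV example that follows provides.

```typescript
type Window = { start: number; count: number };

// Minimal fixed-window limiter: allow at most `limit` requests per key
// within each `windowMs`-long window.
class FixedWindowLimiter {
  private windows = new Map<string, Window>();

  constructor(
    private limit: number,
    private windowMs: number,
  ) {}

  // Returns true if the request is allowed, false if the client is over the limit.
  allow(key: string, now: number = Date.now()): boolean {
    const w = this.windows.get(key);
    if (!w || now - w.start >= this.windowMs) {
      // The previous window expired (or never existed): start a fresh one.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    w.count += 1;
    return w.count <= this.limit;
  }
}
```

With a limit of 5 requests per 30 seconds, the first five calls for a given key succeed, the sixth is rejected, and the counter resets once the window elapses.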
## Rate Limiting with Vercel KV and Upstash Ratelimit
In this example, you will protect an API endpoint using Vercel KV and Upstash Ratelimit.
app/api/generate/route.ts

```ts
import kv from '@vercel/kv';
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { Ratelimit } from '@upstash/ratelimit';
import { NextRequest } from 'next/server';

// Allow streaming responses up to 30 seconds
export const maxDuration = 30;

// Create a rate limiter: 5 requests per 30-second fixed window
const ratelimit = new Ratelimit({
  redis: kv,
  limiter: Ratelimit.fixedWindow(5, '30s'),
});

export async function POST(req: NextRequest) {
  // Call ratelimit with the request IP
  const ip = req.ip ?? 'ip';
  const { success, remaining } = await ratelimit.limit(ip);

  // Block the request if unsuccessful
  if (!success) {
    return new Response('Ratelimited!', { status: 429 });
  }

  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-3.5-turbo'),
    messages,
  });

  return result.toUIMessageStreamResponse();
}
```
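On the client, a rate-limited request surfaces as a 429 status. One common way to handle it is to retry with exponential backoff. The sketch below is a hypothetical helper (not part of the AI SDK): `doRequest` stands in for a `fetch` call to the route above, and `sleep` is injectable so the delay logic can be tested.

```typescript
// Hypothetical retry helper: re-issue a request when the server answers 429,
// doubling the wait between attempts (500ms, 1000ms, 2000ms, ...).
async function withRetry(
  doRequest: () => Promise<{ status: number }>,
  maxAttempts = 3,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<{ status: number }> {
  let res = await doRequest();
  for (let attempt = 1; attempt < maxAttempts && res.status === 429; attempt++) {
    // Back off before retrying so the fixed window has time to roll over.
    await sleep(baseDelayMs * 2 ** (attempt - 1));
    res = await doRequest();
  }
  return res;
}
```

In a browser you would pass `() => fetch('/api/generate', { method: 'POST', body })` as `doRequest`; keeping the sleep injectable avoids real delays in tests.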
With Vercel KV and Upstash Ratelimit, you can protect your APIs from abuse with ease. To learn more about how Ratelimit works and how it can be configured to your needs, see the Ratelimit documentation.