File: retrieval-augmented-generation.md | Updated: 11/15/2025
Retrieval Augmented Generation
=================================================================================================================================
Retrieval Augmented Generation (RAG) is a technique that enhances the capabilities of language models by providing them with relevant information from external sources during the generation process. This approach allows the model to access and incorporate up-to-date or specific knowledge that may not be present in its original training data.
This example uses the following essay as input (essay.txt). It stores the essay in a simple in-memory vector database and retrieves relevant information from it. Alternatively, you can check out our Knowledge Base Agent example, which uses Upstash Search to generate embeddings and manage the knowledge base.
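As a rough illustration of what an in-memory vector database amounts to, the sketch below ranks stored entries by cosine similarity to a query vector. It makes no API calls: the two-dimensional vectors are made up, and `cosine`, `topK`, `Entry`, and `toyDb` are hypothetical names introduced only for this illustration. The actual example relies on the SDK's `cosineSimilarity` export and on real embeddings.

```typescript
// Hypothetical stand-in for the SDK's `cosineSimilarity`, written out for illustration.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Same entry shape as the `db` array in the example.
type Entry = { embedding: number[]; value: string };

// Return the `k` stored values whose embeddings are most similar to `query`.
function topK(db: Entry[], query: number[], k: number): string[] {
  return db
    .map(item => ({ value: item.value, similarity: cosine(query, item.embedding) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k)
    .map(r => r.value);
}

// Toy vectors (assumption: invented for this sketch, not real embeddings).
const toyDb: Entry[] = [
  { embedding: [1, 0], value: 'apples' },
  { embedding: [0, 1], value: 'bicycles' },
  { embedding: [0.9, 0.1], value: 'pears' },
];

const nearest = topK(toyDb, [1, 0], 2);
console.log(nearest); // logs the two closest values: apples, then pears
```

A linear scan like this is only reasonable for small collections; larger ones usually call for a dedicated vector store.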
For a more in-depth guide, check out the RAG Chatbot Guide, which shows you how to build a RAG chatbot with Next.js, Drizzle ORM, and Postgres.
```ts
import fs from 'fs';
import path from 'path';
import dotenv from 'dotenv';
import { openai } from '@ai-sdk/openai';
import { cosineSimilarity, embed, embedMany, generateText } from 'ai';

dotenv.config();

async function main() {
  // In-memory vector database: one entry per chunk of the essay.
  const db: { embedding: number[]; value: string }[] = [];

  // Split the essay into chunks at each period.
  const essay = fs.readFileSync(path.join(__dirname, 'essay.txt'), 'utf8');
  const chunks = essay
    .split('.')
    .map(chunk => chunk.trim())
    .filter(chunk => chunk.length > 0 && chunk !== '\n');

  // Embed every chunk and store it alongside its text.
  const { embeddings } = await embedMany({
    model: openai.textEmbeddingModel('text-embedding-3-small'),
    values: chunks,
  });
  embeddings.forEach((e, i) => {
    db.push({
      embedding: e,
      value: chunks[i],
    });
  });

  const input =
    'What were the two main things the author worked on before college?';

  // Embed the question and keep the three most similar chunks as context.
  const { embedding } = await embed({
    model: openai.textEmbeddingModel('text-embedding-3-small'),
    value: input,
  });
  const context = db
    .map(item => ({
      document: item,
      similarity: cosineSimilarity(embedding, item.embedding),
    }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, 3)
    .map(r => r.document.value)
    .join('\n');

  // Ask the model to answer using only the retrieved context.
  const { text } = await generateText({
    model: openai('gpt-4o'),
    prompt: `Answer the following question based only on the provided context:
             ${context}

             Question: ${input}`,
  });
  console.log(text);
}

main().catch(console.error);
```
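The example splits the essay on every period, which is simple but also cuts decimals such as `3.14` in two and drops the terminal punctuation. One possible alternative, shown here only as a sketch, is to end a sentence when a whitespace-separated word ends in `.`, `!`, or `?`, and then pack consecutive sentences into chunks up to a size limit. The helpers `splitSentences` and `chunkText` and the `maxChars` parameter are hypothetical names, not part of the SDK, and the heuristic still mis-splits abbreviations such as "e.g.".

```typescript
// Split text into sentences, keeping terminal punctuation.
// A sentence ends at a word whose last character is '.', '!' or '?',
// so "3.14" stays intact. Runs of whitespace are collapsed to single spaces.
function splitSentences(text: string): string[] {
  const sentences: string[] = [];
  let words: string[] = [];
  for (const word of text.split(/\s+/)) {
    if (word === '') continue;
    words.push(word);
    if (/[.!?]$/.test(word)) {
      sentences.push(words.join(' '));
      words = [];
    }
  }
  // Keep any trailing text that has no terminal punctuation.
  if (words.length > 0) sentences.push(words.join(' '));
  return sentences;
}

// Greedily pack consecutive sentences into chunks of at most `maxChars` characters.
// A single sentence longer than `maxChars` becomes its own oversized chunk.
function chunkText(text: string, maxChars: number): string[] {
  const chunks: string[] = [];
  let current = '';
  for (const sentence of splitSentences(text)) {
    const candidate = current === '' ? sentence : `${current} ${sentence}`;
    if (current === '' || candidate.length <= maxChars) {
      current = candidate;
    } else {
      chunks.push(current);
      current = sentence;
    }
  }
  if (current !== '') chunks.push(current);
  return chunks;
}

const sample = 'Pi is about 3.14. It is irrational! Is that surprising? Not really.';
console.log(chunkText(sample, 40));
```

The resulting array could be passed as `values` to `embedMany` in place of `chunks`. Larger chunks give the model more surrounding context per match, while smaller ones make retrieval more precise, so the limit is worth tuning against your own data.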