📄 ai-sdk/docs/reference/ai-sdk-core/transcribe

File: transcribe.md | Updated: 11/15/2025

Source: https://ai-sdk.dev/docs/reference/ai-sdk-core/transcribe

transcribe()
======================================================================================

transcribe is an experimental feature.

Generates a transcript from an audio file.

import { experimental_transcribe as transcribe } from 'ai';
import { openai } from '@ai-sdk/openai';
import { readFile } from 'fs/promises';

const { text: transcript } = await transcribe({
  model: openai.transcription('whisper-1'),
  audio: await readFile('audio.mp3'),
});

console.log(transcript);

Import

import { experimental_transcribe as transcribe } from "ai"

Parameters

model:

TranscriptionModelV2

The transcription model to use.

audio:

DataContent (string | Uint8Array | ArrayBuffer | Buffer) | URL

The audio file to generate the transcript from.

providerOptions?:

Record<string, Record<string, JSONValue>>

Additional provider-specific options.

maxRetries?:

number

Maximum number of retries. Default: 2.

abortSignal?:

AbortSignal

An optional abort signal to cancel the call.

headers?:

Record<string, string>

Additional HTTP headers for the request.
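The optional parameters above can be grouped into one settings object and spread into the call. The following is a minimal sketch, not the library's own typing: the `OptionalTranscribeSettings` interface and the local `JSONValue` alias are declared here only to mirror the documented shapes, the `openai.language` provider option is an assumption about what a provider might accept, and the `X-Request-Source` header is a made-up name.

```typescript
// Local stand-ins that mirror the documented parameter types (illustrative only).
type JSONValue =
  | null
  | string
  | number
  | boolean
  | JSONValue[]
  | { [key: string]: JSONValue };

interface OptionalTranscribeSettings {
  providerOptions?: Record<string, Record<string, JSONValue>>;
  maxRetries?: number;
  abortSignal?: AbortSignal;
  headers?: Record<string, string>;
}

// Build the optional settings; the caller keeps the controller so it can cancel.
function buildSettings(controller: AbortController): OptionalTranscribeSettings {
  return {
    // Assumed provider-specific option; check the provider's docs for real keys.
    providerOptions: { openai: { language: 'en' } },
    // Override the documented default of 2 retries.
    maxRetries: 0,
    // Lets the caller cancel the request via controller.abort().
    abortSignal: controller.signal,
    // Hypothetical header name, for illustration.
    headers: { 'X-Request-Source': 'docs-example' },
  };
}

const controller = new AbortController();
const settings = buildSettings(controller);

// The settings would be passed alongside the required parameters, e.g.:
//   await transcribe({ model, audio, ...settings });

// Cancelling flips the signal that the call would observe.
controller.abort();
console.log(settings.abortSignal?.aborted);
```

Keeping the `AbortController` outside the settings object lets the same signal be wired to a timeout or a user-facing cancel button without changing how the call is made.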

Returns

text:

string

The complete transcribed text from the audio input.

segments:

Array<{ text: string; startSecond: number; endSecond: number }>

An array of transcript segments, each containing a portion of the transcribed text along with its start and end times in seconds.

language:

string | undefined

The language of the transcript as an ISO 639-1 code, e.g. "en" for English.

durationInSeconds:

number | undefined

The duration of the transcribed audio in seconds.

warnings:

TranscriptionWarning[]

Warnings from the model provider (e.g. unsupported settings).

responses:

Array<TranscriptionModelResponseMetadata>

Response metadata from the provider. There may be multiple entries if the SDK made multiple calls to the model.

TranscriptionModelResponseMetadata

timestamp:

Date

Timestamp for the start of the generated response.

modelId:

string

The ID of the model that generated the response.

headers?:

Record<string, string>

Response headers.
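As an illustration of working with the returned `segments`, the sketch below turns them into SubRip (SRT) caption text. It assumes only the segment shape documented above; the `TranscriptSegment` type and the `toSrtTimestamp` and `segmentsToSrt` helpers are hypothetical names defined here, not part of the SDK.

```typescript
// Shape of one entry in `segments`, copied from the Returns list above.
type TranscriptSegment = {
  text: string;
  startSecond: number;
  endSecond: number;
};

// Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Join segments into numbered SRT cues separated by blank lines.
function segmentsToSrt(segments: TranscriptSegment[]): string {
  return segments
    .map((segment, index) => {
      const start = toSrtTimestamp(segment.startSecond);
      const end = toSrtTimestamp(segment.endSecond);
      return `${index + 1}\n${start} --> ${end}\n${segment.text.trim()}`;
    })
    .join('\n\n');
}

// Sample data standing in for a real `segments` result.
const sampleSegments: TranscriptSegment[] = [
  { text: ' Hello there.', startSecond: 0, endSecond: 1.25 },
  { text: ' How are you?', startSecond: 1.25, endSecond: 3 },
];

console.log(segmentsToSrt(sampleSegments));
```

Because `language` and `durationInSeconds` may be `undefined`, any code that reads them alongside the segments should handle the missing case explicitly.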
