File: embeddings.md | Updated: 11/15/2025
Embeddings
When selecting an embeddings provider, there are several factors you can consider depending on your needs and preferences, such as retrieval quality, latency and cost, context length, and domain specialization.
Anthropic does not offer its own embedding model. One embeddings provider that has a wide variety of options and capabilities encompassing these considerations is Voyage AI. Voyage AI makes state-of-the-art embedding models and offers customized models for specific industry domains such as finance and healthcare, as well as bespoke fine-tuned models for individual customers. The rest of this guide focuses on Voyage AI, but we encourage you to assess a variety of embeddings vendors to find the best fit for your specific use case.
Voyage recommends using the following text embedding models:
| Model | Context Length | Embedding Dimension | Description |
| --- | --- | --- | --- |
| voyage-3-large | 32,000 | 1024 (default), 256, 512, 2048 | The best general-purpose and multilingual retrieval quality. See blog post for details. |
| voyage-3.5 | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for general-purpose and multilingual retrieval quality. See blog post for details. |
| voyage-3.5-lite | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for latency and cost. See blog post for details. |
| voyage-code-3 | 32,000 | 1024 (default), 256, 512, 2048 | Optimized for code retrieval. See blog post for details. |
| voyage-finance-2 | 32,000 | 1024 | Optimized for finance retrieval and RAG. See blog post for details. |
| voyage-law-2 | 16,000 | 1024 | Optimized for legal and long-context retrieval and RAG. Also improved performance across all domains. See blog post for details. |
Additionally, the following multimodal embedding models are recommended:
| Model | Context Length | Embedding Dimension | Description |
| --- | --- | --- | --- |
| voyage-multimodal-3 | 32,000 | 1024 | Rich multimodal embedding model that can vectorize interleaved text and content-rich images, such as screenshots of PDFs, slides, tables, figures, and more. See blog post for details. |
Need help deciding which text embedding model to use? Check out the FAQ.
To access Voyage embeddings, first set your API key as an environment variable:

```bash
export VOYAGE_API_KEY="<your secret key>"
```
You can obtain the embeddings either with the official voyageai Python package or with HTTP requests, as described below.
Voyage Python library
The voyageai package can be installed using the following command:
```bash
pip install -U voyageai
```
Then, you can create a client object and start using it to embed your texts:
```python
import voyageai

vo = voyageai.Client()
# This will automatically use the environment variable VOYAGE_API_KEY.
# Alternatively, you can use vo = voyageai.Client(api_key="<your secret key>")

texts = ["Sample text 1", "Sample text 2"]

result = vo.embed(texts, model="voyage-3.5", input_type="document")
print(result.embeddings[0])
print(result.embeddings[1])
```
`result.embeddings` will be a list of two embedding vectors, each containing 1024 floating-point numbers. After running the above code, the two embeddings are printed to the screen:
```python
[-0.013131560757756233, 0.019828535616397858, ...]  # embedding for "Sample text 1"
[-0.0069352793507277966, 0.020878976210951805, ...] # embedding for "Sample text 2"
```
When creating the embeddings, you can specify a few other arguments to the embed() function. For more information on the Voyage Python package, see the Voyage documentation.
Voyage HTTP API
You can also get embeddings by calling the Voyage HTTP API. For example, you can send a request with the curl command in a terminal:
```bash
curl https://api.voyageai.com/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $VOYAGE_API_KEY" \
  -d '{
    "input": ["Sample text 1", "Sample text 2"],
    "model": "voyage-3.5"
  }'
```
The response is a JSON object containing the embeddings and the token usage:
```json
{
  "object": "list",
  "data": [
    {
      "embedding": [-0.013131560757756233, 0.019828535616397858, ...],
      "index": 0
    },
    {
      "embedding": [-0.0069352793507277966, 0.020878976210951805, ...],
      "index": 1
    }
  ],
  "model": "voyage-3.5",
  "usage": {
    "total_tokens": 10
  }
}
```
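The `index` field in each entry ties an embedding back to its position in the `input` list. A minimal sketch of parsing such a response in Python, using a hard-coded body (with shortened embedding values) in place of a live API call:

```python
import json

# A truncated example response, shaped like the JSON above
# (embedding values shortened for illustration).
response_body = """
{
    "object": "list",
    "data": [
        {"embedding": [-0.0131, 0.0198], "index": 0},
        {"embedding": [-0.0069, 0.0208], "index": 1}
    ],
    "model": "voyage-3.5",
    "usage": {"total_tokens": 10}
}
"""

response = json.loads(response_body)

# Sort by the "index" field so embeddings line up with the input order.
embeddings = [
    item["embedding"]
    for item in sorted(response["data"], key=lambda d: d["index"])
]

print(len(embeddings))                        # one embedding per input text
print(response["usage"]["total_tokens"])      # tokens consumed by the request
```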
For more information on the Voyage HTTP API, see the Voyage documentation.
AWS Marketplace
Voyage embeddings are available on AWS Marketplace. Instructions for accessing Voyage on AWS are available here.
Now that we know how to get embeddings, let's walk through a brief example. Suppose we have a small corpus of six documents to retrieve from:
```python
documents = [
    "The Mediterranean diet emphasizes fish, olive oil, and vegetables, believed to reduce chronic diseases.",
    "Photosynthesis in plants converts light energy into glucose and produces essential oxygen.",
    "20th-century innovations, from radios to smartphones, centered on electronic advancements.",
    "Rivers provide water, irrigation, and habitat for aquatic species, vital for ecosystems.",
    "Apple's conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET.",
    "Shakespeare's works, like 'Hamlet' and 'A Midsummer Night's Dream,' endure in literature."
]
```
We will first use Voyage to convert each of them into an embedding vector:
```python
import voyageai

vo = voyageai.Client()

# Embed the documents
doc_embds = vo.embed(
    documents, model="voyage-3.5", input_type="document"
).embeddings
```
The embeddings will allow us to do semantic search/retrieval in the vector space. Given an example query,
```python
query = "When is Apple's conference call scheduled?"
```
we convert it into an embedding, and conduct a nearest neighbor search to find the most relevant document based on the distance in the embedding space.
```python
import numpy as np

# Embed the query
query_embd = vo.embed(
    [query], model="voyage-3.5", input_type="query"
).embeddings[0]

# Compute the similarity
# Voyage embeddings are normalized to length 1, therefore dot-product
# and cosine similarity are the same.
similarities = np.dot(doc_embds, query_embd)

retrieved_id = np.argmax(similarities)
print(documents[retrieved_id])
```
Note that we use input_type="document" and input_type="query" for embedding the document and query, respectively. More details can be found here. The output is the fifth document, which is indeed the most relevant to the query:
```
Apple's conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET.
```
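The same approach extends to returning the top-k most relevant documents rather than just the single best match: rank all similarities instead of taking the argmax. A minimal sketch, using toy unit vectors in place of real Voyage embeddings:

```python
import numpy as np

# Toy 4-dimensional unit vectors standing in for real document embeddings.
doc_embds = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0, 0.0],
])
query_embd = np.array([0.8, 0.6, 0.0, 0.0])

# Dot product against every document at once.
similarities = np.dot(doc_embds, query_embd)

# argsort is ascending; reverse it and take the first k for top-k retrieval.
k = 2
top_k_ids = np.argsort(similarities)[::-1][:k]
print(top_k_ids.tolist())  # [2, 0]
```

In a real pipeline you would index `documents` with these ids, or hand the ranked candidates to a reranker.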
If you are looking for a detailed set of cookbooks on how to do RAG with embeddings, including vector databases, check out our RAG cookbook.
Why do Voyage embeddings have superior quality?
Embedding models rely on powerful neural networks to capture and compress semantic context, similar to generative models. Voyage's team of experienced AI researchers optimizes every component of the embedding process. Learn more about Voyage's technical approach on their blog.
What embedding models are available and which should I use?
For general-purpose embedding, we recommend:

- voyage-3-large: Best quality
- voyage-3.5-lite: Lowest latency and cost
- voyage-3.5: Balanced performance with superior retrieval quality at a competitive price point

For retrieval, use the input_type parameter to specify whether the text is a query or document type.

Domain-specific models:

- voyage-law-2
- voyage-code-3
- voyage-finance-2

Which similarity function should I use?
You can use Voyage embeddings with either dot-product similarity, cosine similarity, or Euclidean distance. An explanation of embedding similarity can be found here. Voyage AI embeddings are normalized to length 1, which means that:

- Cosine similarity is equivalent to dot-product similarity, while the latter can be computed more quickly.
- Cosine similarity and Euclidean distance will result in identical rankings.
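These equivalences follow directly from the vectors having unit length, and can be checked numerically. A sketch with two toy unit vectors standing in for Voyage embeddings:

```python
import numpy as np

# Two toy vectors normalized to unit length, standing in for Voyage embeddings.
a = np.array([3.0, 4.0])
a /= np.linalg.norm(a)
b = np.array([1.0, 1.0])
b /= np.linalg.norm(b)

dot = np.dot(a, b)
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# For unit vectors, dot product and cosine similarity are identical.
print(np.isclose(dot, cosine))  # True

# Euclidean distance is a monotone function of the dot product:
# ||a - b||^2 = 2 - 2 * (a . b), so nearest-neighbor rankings agree.
dist_sq = np.sum((a - b) ** 2)
print(np.isclose(dist_sq, 2 - 2 * dot))  # True
```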
What is the relationship between characters, words, and tokens?
Please see this page.
When and how should I use the input_type parameter?
For all retrieval tasks and use cases (e.g., RAG), we recommend that the input_type parameter be used to specify whether the input text is a query or document. Do not omit input_type or set input_type=None. Specifying whether input text is a query or document can create better dense vector representations for retrieval, which can lead to better retrieval quality.

When using the input_type parameter, special prompts are prepended to the input text prior to embedding. Specifically:
📘 Prompts associated with input_type

- For a query, the prompt is "Represent the query for retrieving supporting documents: ".
- For a document, the prompt is "Represent the document for retrieval: ".

Example:

- When input_type="query", a query like "When is Apple's conference call scheduled?" becomes "Represent the query for retrieving supporting documents: When is Apple's conference call scheduled?"
- When input_type="document", a document like "Apple's conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET." becomes "Represent the document for retrieval: Apple's conference call to discuss fourth fiscal quarter results and business updates is scheduled for Thursday, November 2, 2023 at 2:00 p.m. PT / 5:00 p.m. ET."
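The transformation above can be sketched in a few lines of Python. Note that `with_prompt` is a hypothetical helper for illustration only; the actual prepending happens server-side in the Voyage API and you never need to do this yourself:

```python
# Hypothetical helper illustrating how input_type maps to a prepended prompt.
# The real prepending is performed server-side by the Voyage API.
PROMPTS = {
    "query": "Represent the query for retrieving supporting documents: ",
    "document": "Represent the document for retrieval: ",
}

def with_prompt(text: str, input_type: str) -> str:
    """Return the text as it would look after the prompt is prepended."""
    return PROMPTS[input_type] + text

print(with_prompt("When is Apple's conference call scheduled?", "query"))
```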
voyage-large-2-instruct, as the name suggests, is trained to be responsive to additional instructions that are prepended to the input text. For classification, clustering, or other MTEB subtasks, please use the instructions here.
What quantization options are available?
Quantization in embeddings converts high-precision values, like 32-bit single-precision floating-point numbers, to lower-precision formats such as 8-bit integers or 1-bit binary values, reducing storage, memory, and costs by 4x and 32x, respectively. Supported Voyage models enable quantization by specifying the output data type with the output_dtype parameter:

- float: Each returned embedding is a list of 32-bit (4-byte) single-precision floating-point numbers. This is the default and provides the highest precision / retrieval accuracy.
- int8 and uint8: Each returned embedding is a list of 8-bit (1-byte) integers ranging from -128 to 127 and 0 to 255, respectively.
- binary and ubinary: Each returned embedding is a list of 8-bit integers that represent bit-packed, quantized single-bit embedding values: int8 for binary and uint8 for ubinary. The length of the returned list of integers is 1/8 of the actual dimension of the embedding. The binary type uses the offset binary method, which you can learn more about in the FAQ below.

Binary quantization example

Consider the following eight embedding values: -0.03955078, 0.006214142, -0.07446289, -0.039001465, 0.0046463013, 0.00030612946, -0.08496094, and 0.03994751. With binary quantization, values less than or equal to zero will be quantized to a binary zero, and positive values to a binary one, resulting in the following binary sequence: 0, 1, 0, 0, 1, 1, 0, 1. These eight bits are then packed into a single 8-bit integer, 01001101 (with the leftmost bit as the most significant bit).

- ubinary: The binary sequence is directly converted and represented as the unsigned integer (uint8) 77.
- binary: The binary sequence is represented as the signed integer (int8) -51, calculated using the offset binary method (77 - 128 = -51).
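The worked example above can be reproduced with NumPy. A sketch using the same eight embedding values:

```python
import numpy as np

values = np.array([-0.03955078, 0.006214142, -0.07446289, -0.039001465,
                   0.0046463013, 0.00030612946, -0.08496094, 0.03994751])

# Positive values quantize to a binary one; values <= 0 quantize to zero.
bits = (values > 0).astype(np.uint8)
print(bits.tolist())  # [0, 1, 0, 0, 1, 1, 0, 1]

# Pack eight bits into one byte, most significant bit first (01001101).
ubinary = int(np.packbits(bits)[0])
print(ubinary)        # 77

# Offset binary: subtract 128 to get the signed int8 representation.
binary = ubinary - 128
print(binary)         # -51
```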
How can I truncate Matryoshka embeddings?
Matryoshka learning creates embeddings with coarse-to-fine representations within a single vector. Voyage models, such as voyage-code-3, that support multiple output dimensions generate such Matryoshka embeddings. You can truncate these vectors by keeping the leading subset of dimensions. For example, the following Python code demonstrates how to truncate 1024-dimensional vectors to 256 dimensions:
```python
import voyageai
import numpy as np

def embd_normalize(v: np.ndarray) -> np.ndarray:
    """
    Normalize the rows of a 2D numpy array to unit vectors by dividing each row
    by its Euclidean norm. Raises a ValueError if any row has a norm of zero to
    prevent division by zero.
    """
    row_norms = np.linalg.norm(v, axis=1, keepdims=True)
    if np.any(row_norms == 0):
        raise ValueError("Cannot normalize rows with a norm of zero.")
    return v / row_norms

vo = voyageai.Client()

# Generate voyage-code-3 vectors, which by default are 1024-dimensional
embd = vo.embed(['Sample text 1', 'Sample text 2'], model='voyage-code-3').embeddings

# Set shorter dimension
short_dim = 256

# Resize and normalize vectors to shorter dimension
resized_embd = embd_normalize(np.array(embd)[:, :short_dim]).tolist()
```
Visit Voyage's pricing page for the most up-to-date pricing details.