This page describes features extending the AI Gateway, which provides a unified API for accessing multiple AI providers. To learn more, see AI Gateway.
List of supported models
Responses API
Supported Models
| Provider | Model |
|---|---|
| OpenAI | openai/gpt-3.5-turbo |
| OpenAI | openai/gpt-3.5-turbo-0125 |
| OpenAI | openai/gpt-3.5-turbo-16k |
| OpenAI | openai/gpt-4-0125-preview |
| OpenAI | openai/gpt-4-turbo |
| OpenAI | openai/gpt-4-turbo-2024-04-09 |
| OpenAI | openai/gpt-4.1 |
| OpenAI | openai/gpt-4.1-2025-04-14 |
| OpenAI | openai/gpt-4.1-mini |
| OpenAI | openai/gpt-4.1-mini-2025-04-14 |
| OpenAI | openai/gpt-4.1-nano |
| OpenAI | openai/gpt-4.1-nano-2025-04-14 |
| OpenAI | openai/gpt-4o |
| OpenAI | openai/gpt-4o-2024-05-13 |
| OpenAI | openai/gpt-4o-2024-08-06 |
| OpenAI | openai/gpt-4o-mini |
| OpenAI | openai/gpt-4o-mini-2024-07-18 |
| OpenAI | openai/gpt-5 |
| OpenAI | openai/gpt-5-chat-latest |
| OpenAI | openai/gpt-5-mini |
| OpenAI | openai/gpt-5-nano |
| OpenAI | openai/gpt-5-pro |
| OpenAI | openai/gpt-5.1 |
| OpenAI | openai/gpt-5.1-chat-latest |
| OpenAI | openai/o1 |
| OpenAI | openai/o1-2024-12-17 |
| OpenAI | openai/o3 |
| OpenAI | openai/o3-2025-04-16 |
| OpenAI | openai/o3-mini |
| OpenAI | openai/o3-mini-2025-01-31 |
| OpenAI | openai/o4-mini |
| OpenAI | openai/o4-mini-2025-04-16 |
Chat models
| Provider | Model |
|---|---|
| Anthropic | anthropic/claude-3-5-haiku-20241022 |
| Anthropic | anthropic/claude-3-5-sonnet-20241022 |
| Anthropic | anthropic/claude-3-7-sonnet-20250219 |
| Anthropic | anthropic/claude-3-7-sonnet-latest |
| Anthropic | anthropic/claude-3-haiku-20240307 |
| Anthropic | anthropic/claude-haiku-4-5-20251001 |
| Anthropic | anthropic/claude-opus-4-1-20250805 |
| Anthropic | anthropic/claude-opus-4-20250514 |
| Anthropic | anthropic/claude-sonnet-4-20250514 |
| Anthropic | anthropic/claude-sonnet-4-5-20250929 |
| AWS Bedrock | aws/anthropic.claude-3-5-sonnet-20241022-v2:0 |
| AWS Bedrock | aws/anthropic.claude-3-haiku-20240307-v1:0 |
| AWS Bedrock | aws/anthropic.claude-3-opus-20240229-v1:0 |
| AWS Bedrock | aws/anthropic.claude-3-sonnet-20240229-v1:0 |
| AWS Bedrock | aws/eu.anthropic.claude-3-5-sonnet-20240620-v1:0 |
| AWS Bedrock | aws/eu.anthropic.claude-3-7-sonnet-20250219-v1:0 |
| AWS Bedrock | aws/eu.anthropic.claude-sonnet-4-20250514-v1:0 |
| Azure | azure/gpt-4.1 |
| Azure | azure/gpt-4.1-mini |
| Azure | azure/gpt-4.1-nano |
| Azure | azure/gpt-4o |
| Azure | azure/gpt-4o-mini |
| Azure | azure/gpt-5-chat |
| Azure | azure/gpt-5-mini |
| Azure | azure/gpt-5-nano |
| Azure | azure/llama-3.1-405B-instruct |
| Azure | azure/llama-3.1-8B |
| Azure | azure/o1 |
| Azure | azure/o1-mini |
| Azure | azure/o3-mini |
| Cerebras | cerebras/gpt-oss-120b |
| Cerebras | cerebras/llama-3.3-70b |
| Cerebras | cerebras/llama-4-scout-17b-16e-instruct |
| Cerebras | cerebras/llama3.1-8b |
| Cerebras | cerebras/qwen-3-235b-a22b-instruct-2507 |
| Cerebras | cerebras/qwen-3-32b |
| Cerebras | cerebras/qwen-3-coder-480b |
| Cohere | cohere/command-a-03-2025 |
| Cohere | cohere/command-a-reasoning-08-2025 |
| Cohere | cohere/command-a-translate-08-2025 |
| Cohere | cohere/command-a-vision-07-2025 |
| Cohere | cohere/command-r-08-2024 |
| Cohere | cohere/command-r-plus-08-2024 |
| Cohere | cohere/command-r7b-12-2024 |
| Vertex AI | google/claude-3-5-haiku@20241022 |
| Vertex AI | google/claude-3-5-sonnet-v2@20241022 |
| Vertex AI | google/claude-3-7-sonnet@20250219 |
| Vertex AI | google/claude-3-opus@20240229 |
| Vertex AI | google/claude-haiku-4-5@20251001 |
| Vertex AI | google/claude-opus-4-1@20250805 |
| Vertex AI | google/claude-opus-4@20250514 |
| Vertex AI | google/claude-sonnet-4-5@20250929 |
| Vertex AI | google/claude-sonnet-4@20250514 |
| Vertex AI | google/gemini-2.0-flash |
| Vertex AI | google/gemini-2.0-flash-001 |
| Vertex AI | google/gemini-2.0-flash-lite-001 |
| Vertex AI | google/gemini-2.5-flash |
| Vertex AI | google/gemini-2.5-flash-lite |
| Vertex AI | google/gemini-2.5-flash-lite-preview-09-2025 |
| Vertex AI | google/gemini-2.5-flash-preview-09-2025 |
| Vertex AI | google/gemini-2.5-pro |
| Vertex AI | google/gemini-3-pro-preview |
| Vertex AI | google/meta/llama-3.3-70b-instruct-maas |
| Vertex AI | google/meta/llama-4-maverick-17b-128e-instruct-maas |
| Vertex AI | google/meta/llama-4-scout-17b-16e-instruct-maas |
| Vertex AI | google/mistral-small-2503 |
| Google AI | google-ai/gemini-2.0-flash |
| Google AI | google-ai/gemini-2.0-flash-001 |
| Google AI | google-ai/gemini-2.0-flash-lite-001 |
| Google AI | google-ai/gemini-2.0-flash-lite-preview-02-05 |
| Google AI | google-ai/gemini-2.0-flash-thinking-exp-01-21 |
| Google AI | google-ai/gemini-2.0-pro-exp-02-05 |
| Google AI | google-ai/gemini-2.5-flash |
| Google AI | google-ai/gemini-2.5-flash-lite |
| Google AI | google-ai/gemini-2.5-pro |
| Google AI | google-ai/gemini-3-pro-preview |
| Groq | groq/llama-3.3-70b-versatile |
| Groq | groq/meta-llama/llama-4-maverick-17b-128e-instruct |
| Groq | groq/meta-llama/llama-4-scout-17b-16e-instruct |
| Groq | groq/meta-llama/llama-guard-4-12b |
| Groq | groq/meta-llama/llama-prompt-guard-2-86m |
| Groq | groq/moonshotai/kimi-k2-instruct-0905 |
| Groq | groq/openai/gpt-oss-120b |
| Groq | groq/openai/gpt-oss-20b |
| Mistral | mistral/magistral-medium-2509 |
| Mistral | mistral/ministral-3b-2410 |
| Mistral | mistral/ministral-8b-2410 |
| Mistral | mistral/mistral-large-2411 |
| Mistral | mistral/mistral-medium-2508 |
| Mistral | mistral/mistral-medium-latest |
| Mistral | mistral/mistral-small-2409 |
| Mistral | mistral/mistral-small-latest |
| Mistral | mistral/pixtral-large-2411 |
| OpenAI | openai/gpt-3.5-turbo |
| OpenAI | openai/gpt-3.5-turbo-0125 |
| OpenAI | openai/gpt-3.5-turbo-16k |
| OpenAI | openai/gpt-4-0125-preview |
| OpenAI | openai/gpt-4-turbo |
| OpenAI | openai/gpt-4-turbo-2024-04-09 |
| OpenAI | openai/gpt-4.1 |
| OpenAI | openai/gpt-4.1-2025-04-14 |
| OpenAI | openai/gpt-4.1-mini |
| OpenAI | openai/gpt-4.1-mini-2025-04-14 |
| OpenAI | openai/gpt-4.1-nano |
| OpenAI | openai/gpt-4.1-nano-2025-04-14 |
| OpenAI | openai/gpt-4o |
| OpenAI | openai/gpt-4o-2024-05-13 |
| OpenAI | openai/gpt-4o-2024-08-06 |
| OpenAI | openai/gpt-4o-mini |
| OpenAI | openai/gpt-4o-mini-2024-07-18 |
| OpenAI | openai/gpt-5 |
| OpenAI | openai/gpt-5-chat-latest |
| OpenAI | openai/gpt-5-mini |
| OpenAI | openai/gpt-5-nano |
| OpenAI | openai/gpt-5-pro |
| OpenAI | openai/gpt-5.1 |
| OpenAI | openai/gpt-5.1-chat-latest |
| OpenAI | openai/o1 |
| OpenAI | openai/o1-2024-12-17 |
| OpenAI | openai/o3 |
| OpenAI | openai/o3-2025-04-16 |
| OpenAI | openai/o3-mini |
| OpenAI | openai/o3-mini-2025-01-31 |
| OpenAI | openai/o4-mini |
| OpenAI | openai/o4-mini-2025-04-16 |
| Perplexity | perplexity/sonar |
| Perplexity | perplexity/sonar-deep-research |
| Perplexity | perplexity/sonar-pro |
| Perplexity | perplexity/sonar-reasoning |
| Perplexity | perplexity/sonar-reasoning-pro |
| Together AI | togetherai/deepseek-ai/DeepSeek-R1 |
| Together AI | togetherai/deepseek-ai/DeepSeek-V3 |
| Together AI | togetherai/deepseek-ai/DeepSeek-V3.1 |
| Together AI | togetherai/meta-llama/Llama-3.3-70B-Instruct-Turbo |
| Together AI | togetherai/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 |
| Together AI | togetherai/meta-llama/Llama-4-Scout-17B-16E-Instruct |
| Together AI | togetherai/meta-llama/Llama-Guard-4-12B |
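
Every identifier in the tables above appears to follow a `provider/model` pattern, where the text before the first slash is the provider prefix (for example, `google` for Vertex AI and `google-ai` for Google AI) and the rest is the provider's own model name, which may itself contain slashes. The Python sketch below splits an identifier on that assumption. `split_model_id` is a hypothetical helper for illustration, not part of the AI Gateway API.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'provider/model' identifier on the FIRST slash only.

    Assumption: the prefix before the first slash is the provider key,
    and everything after it is the provider-specific model name.
    """
    prefix, sep, name = model_id.partition("/")
    if not sep or not prefix or not name:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return prefix, name


# Model names may contain further slashes, so only the first one separates.
provider, model = split_model_id("google/meta/llama-3.3-70b-instruct-maas")
```

Splitting on the first slash rather than the last keeps nested names such as `togetherai/deepseek-ai/DeepSeek-V3` intact as `deepseek-ai/DeepSeek-V3`.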
Completion models
| Provider | Model |
|---|---|
| OpenAI | openai/gpt-3.5-turbo-instruct |
Embedding models
| Provider | Model |
|---|---|
| Azure | azure/text-embedding-3-small |
| Azure | azure/text-embedding-ada-002 |
| Cohere | cohere/embed-english-light-v3.0 |
| Cohere | cohere/embed-english-v3.0 |
| Cohere | cohere/embed-multilingual-light-v3.0 |
| Cohere | cohere/embed-multilingual-v3.0 |
| Cohere | cohere/embed-v4.0 |
| Vertex AI | google/gemini-embedding-001 |
| Vertex AI | google/multimodalembedding@001 |
| Vertex AI | google/text-multilingual-embedding-002 |
| Google AI | google-ai/text-embedding-004 |
| Jina AI | jina/jina-clip-v1 |
| Jina AI | jina/jina-clip-v2 |
| Jina AI | jina/jina-embeddings-v2-base-code |
| Jina AI | jina/jina-embeddings-v2-base-de |
| Jina AI | jina/jina-embeddings-v2-base-en |
| Jina AI | jina/jina-embeddings-v2-base-es |
| Jina AI | jina/jina-embeddings-v2-base-zh |
| Jina AI | jina/jina-embeddings-v3 |
| Mistral | mistral/mistral-embed |
| OpenAI | openai/text-embedding-3-large |
| OpenAI | openai/text-embedding-3-small |
| OpenAI | openai/text-embedding-ada-002 |
Image models
Image Generation
Image Edit
Image Variations
Supported Image Models
| Provider | Model | Capabilities |
|---|---|---|
| Azure | azure/dall-e-3 | Generation, Edit |
| ByteDance | bytedance/seededit-3-0-i2i-250628 | Generation, Edit |
| ByteDance | bytedance/seedream-3-0-t2i-250415 | Generation |
| ByteDance | bytedance/seedream-4-0-250828 | Generation, Edit |
| FAL | fal/flux-pro/new | Generation |
| FAL | fal/flux/dev | Generation |
| FAL | fal/flux/schnell | Generation |
| FAL | fal/gemini-25-flash-image | Generation |
| Vertex AI | google/imagen-3.0-fast-generate-001 | Generation |
| Vertex AI | google/imagen-3.0-generate-001 | Generation |
| Vertex AI | google/imagen-4.0-fast-generate-001 | Generation |
| Vertex AI | google/imagen-4.0-generate-001 | Generation |
| Vertex AI | google/imagen-4.0-ultra-generate-001 | Generation |
| Leonardo AI | leonardoai/leonard-diffusion-xl | Generation, Edit |
| Leonardo AI | leonardoai/leonard-kino-xl | Generation, Edit |
| Leonardo AI | leonardoai/leonard-lightning-xl | Generation, Edit |
| Leonardo AI | leonardoai/leonard-vision-xl | Generation, Edit |
| OpenAI | openai/dall-e-2 | Generation, Edit |
| OpenAI | openai/dall-e-3 | Generation |
| OpenAI | openai/gpt-image-1 | Generation, Edit |
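
The Capabilities column can be treated as a lookup when choosing a model for a given image operation. The Python sketch below copies a handful of rows from the table into a dictionary and filters by capability. The `IMAGE_MODELS` mapping and the `models_supporting` helper are hypothetical names for illustration, and the mapping is only a subset of the table, not data returned by the gateway.

```python
# A few rows copied by hand from the image model table (not exhaustive).
IMAGE_MODELS = {
    "azure/dall-e-3": {"generation", "edit"},
    "bytedance/seedream-3-0-t2i-250415": {"generation"},
    "fal/flux/dev": {"generation"},
    "openai/dall-e-3": {"generation"},
    "openai/gpt-image-1": {"generation", "edit"},
}


def models_supporting(capability: str) -> list[str]:
    """Return the listed model IDs whose capability set includes `capability`."""
    wanted = capability.lower()
    return sorted(m for m, caps in IMAGE_MODELS.items() if wanted in caps)


edit_capable = models_supporting("Edit")
```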
Moderations models
| Provider | Model |
|---|---|
| Mistral | mistral/mistral-moderation-2411 |
Rerank models
| Provider | Model |
|---|---|
| Cohere | cohere/rerank-english-v3.0 |
| Cohere | cohere/rerank-multilingual-v3.0 |
| Cohere | cohere/rerank-v3.5 |
| Jina AI | jina/jina-colbert-v2 |
| Jina AI | jina/jina-reranker-v1-base-en |
| Jina AI | jina/jina-reranker-v1-tiny-en |
| Jina AI | jina/jina-reranker-v1-turbo-en |
| Jina AI | jina/jina-reranker-v2-base-multilingual |
Speech-to-Text models
| Provider | Model |
|---|---|
| Azure | azure/whisper |
| Eleven Labs | elevenlabs/scribe_v1 |
| Groq | groq/whisper-large-v3 |
| Groq | groq/whisper-large-v3-turbo |
| Mistral | mistral/voxtral-mini-2507 |
| OpenAI | openai/gpt-4o-mini-transcribe |
| OpenAI | openai/gpt-4o-transcribe |
| OpenAI | openai/whisper-1 |
Text-to-Speech models
| Provider | Model |
|---|---|
| Eleven Labs | elevenlabs/eleven_flash_v2 |
| Eleven Labs | elevenlabs/eleven_flash_v2_5 |
| Eleven Labs | elevenlabs/eleven_multilingual_v2 |
| Eleven Labs | elevenlabs/eleven_turbo_v2_5 |
| Vertex AI | google/gemini-2.5-flash-preview-tts |
| Vertex AI | google/gemini-2.5-pro-preview-tts |
| OpenAI | openai/gpt-4o-mini-tts |
| OpenAI | openai/tts-1 |
| OpenAI | openai/tts-1-hd |
Text-to-Speech Voices
The following voices are available for Text-to-Speech models:

OpenAI
- alloy: Neutral, versatile voice
- echo: Neutral, soft-spoken voice
- fable: Expressive, narrative-focused voice
- onyx: Deep, authoritative voice
- nova: Warm, natural voice
- shimmer: Clear, optimistic voice

ElevenLabs
- aria: Neutral, versatile voice
- roger: Deep, authoritative voice
- sarah: Warm, friendly voice
- laura: Soft, gentle voice
- charlie: Casual, conversational voice
- george: Professional, articulate voice
- callum: Youthful, energetic voice
- river: Calm, soothing voice
- liam: Clear, confident voice
- charlotte: Elegant, refined voice
- alice: Bright, cheerful voice
- matilda: Thoughtful, measured voice
- will: Reliable, trustworthy voice
- jessica: Engaging, expressive voice
- eric: Authoritative, commanding voice
- chris: Friendly, approachable voice
- brian: Mature, distinguished voice
- daniel: Versatile, balanced voice
- lily: Sweet, melodious voice
- bill: Grounded, authentic voice
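
Because the voice lists are grouped by provider, a client can check a voice name before sending a Text-to-Speech request. The Python sketch below assumes the provider prefix of the model ID (`openai`, `elevenlabs`) selects the voice list. `TTS_VOICES` and `is_listed_voice` are hypothetical names for illustration, and the helper returns `None` for providers with no voices listed on this page, such as the Vertex AI TTS models.

```python
from typing import Optional

# Voice names copied from the lists above, keyed by model ID prefix.
TTS_VOICES = {
    "openai": {"alloy", "echo", "fable", "onyx", "nova", "shimmer"},
    "elevenlabs": {
        "aria", "roger", "sarah", "laura", "charlie", "george", "callum",
        "river", "liam", "charlotte", "alice", "matilda", "will", "jessica",
        "eric", "chris", "brian", "daniel", "lily", "bill",
    },
}


def is_listed_voice(model_id: str, voice: str) -> Optional[bool]:
    """Check `voice` against the list for the model's provider prefix.

    Returns None when this page lists no voices for that provider,
    so callers can tell "unknown" apart from "not listed".
    """
    prefix = model_id.partition("/")[0]
    voices = TTS_VOICES.get(prefix)
    if voices is None:
        return None
    return voice in voices
```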