The AI Router supports all input and output modalities through a single OpenAI-compatible API. All endpoints share the same base URL, authentication, and orq.ai router features: fallbacks, caching, load balancing, and retries.

| Modality | Endpoint |
| --- | --- |
| Image input | POST /v3/router/chat/completions |
| PDF input | POST /v3/router/chat/completions |
| Image generation | POST /v3/router/images/generations |
| Image editing | POST /v3/router/images/edits |
| Image variations | POST /v3/router/images/variations |
| Text to speech | POST /v3/router/audio/speech |
| Transcription | POST /v3/router/audio/transcriptions |
| Translation | POST /v3/router/audio/translations |
All endpoints use the same base URL and authentication:
BASE_URL = https://api.orq.ai/v3/router
Authorization: Bearer $ORQ_API_KEY
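With the OpenAI SDK, this means configuring the client once and reusing it for every endpoint on this page. A minimal TypeScript setup (later examples assume an equivalent client):

import OpenAI from "openai";

// One client serves every modality: the router is OpenAI-compatible,
// so only the base URL and API key differ from a stock OpenAI setup.
const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});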
To see which models support a specific modality, filter the Supported Models page or check the Providers page in orq.ai.
Image input: Analyze images alongside text. Pass image URLs or base64-encoded files in chat/completions messages.
PDF input: Send PDF documents for extraction and analysis. Supported natively by compatible models.
Image generation: Generate, edit, and vary images using DALL-E 2, DALL-E 3, and GPT Image 1.
Audio: Convert text to speech, transcribe audio files, and translate audio to English.
Image input
Analyze images alongside text using POST /v3/router/chat/completions. Pass images as public URLs or base64-encoded data in the message content array.
curl -X POST https://api.orq.ai/v3/router/chat/completions \
-H "Authorization: Bearer $ORQ_API_KEY " \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is in this image? Describe in detail."
},
{
"type": "image_url",
"image_url": {
"url": "https://example.com/image.jpg"
}
}
]
}
]
}'
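To send a local file instead of a public URL, embed it as a base64 data URI. A sketch with the SDK (the file path is illustrative):

import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

// Encode the local image and pass it in place of a URL.
const base64 = fs.readFileSync("photo.jpg").toString("base64");

const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What is in this image? Describe in detail." },
        { type: "image_url", image_url: { url: `data:image/jpeg;base64,${base64}` } },
      ],
    },
  ],
});

console.log(response.choices[0].message.content);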
| Format | Use case | Max size |
| --- | --- | --- |
| JPEG/JPG | Photos, general images | 20MB |
| PNG | Screenshots, diagrams | 20MB |
| GIF | Static images only | 20MB |
| WebP | Modern web images | 20MB |
| Base64 | Embedded image data | Model context limit |
| URLs | Public image links | Model context limit |
Detail levels
| Level | Resolution | Speed | Cost | Use case |
| --- | --- | --- | --- | --- |
| "low" | 512x512 | Fast | Low | Quick overview |
| "high" | Full resolution | Slow | High | Detailed analysis |
| "auto" | Model decides | Medium | Medium | Balanced (default) |
Set detail in the image_url object:
{
  "type": "image_url",
  "image_url": {
    "url": "https://example.com/image.jpg",
    "detail": "high"
  }
}
Patterns
Multiple images:
const content = [
  { type: "text", text: "Compare these before and after photos. What changes do you notice?" },
  { type: "image_url", image_url: { url: "https://example.com/before.jpg", detail: "high" } },
  { type: "image_url", image_url: { url: "https://example.com/after.jpg", detail: "high" } },
];

const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [{ role: "user", content }],
});
OCR and text extraction:
const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "Extract all text from this image. Return as plain text, preserving formatting where possible.",
        },
        {
          type: "image_url",
          image_url: { url: imageUrl, detail: "high" },
        },
      ],
    },
  ],
});
Structured output:
from pydantic import BaseModel
from typing import List

class ImageAnalysis(BaseModel):
    objects: List[str]
    text_content: str
    dominant_colors: List[str]
    confidence: float

response = client.beta.chat.completions.parse(
    model="openai/gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this image systematically"},
            {"type": "image_url", "image_url": {"url": image_url}}
        ]
    }],
    response_format=ImageAnalysis
)
Limitations
| Limitation | Details | Workaround |
| --- | --- | --- |
| File size | 20MB max per image | Compress before upload |
| Image count | Varies by model (5-16) | Process in batches (see the sketch below) |
| Video | Static images only | Extract frames for analysis |
| Privacy | Images sent to provider | Use on-premise models if needed |
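When a set of images exceeds the per-request limit, one approach is to chunk the set and send one request per chunk. A sketch, reusing the client configured earlier; describeInBatches and BATCH_SIZE are illustrative names, and the right batch size depends on your model:

const BATCH_SIZE = 5; // illustrative; check your model's documented image limit

async function describeInBatches(urls: string[]): Promise<string[]> {
  const answers: string[] = [];
  for (let i = 0; i < urls.length; i += BATCH_SIZE) {
    const batch = urls.slice(i, i + BATCH_SIZE);
    const response = await client.chat.completions.create({
      model: "openai/gpt-4o",
      messages: [
        {
          role: "user",
          content: [
            { type: "text" as const, text: "Describe each image briefly." },
            // One image_url part per image in this batch.
            ...batch.map((url) => ({ type: "image_url" as const, image_url: { url } })),
          ],
        },
      ],
    });
    answers.push(response.choices[0].message.content ?? "");
  }
  return answers;
}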
PDF input
Send PDF documents directly in chat messages for analysis and content extraction using POST /v3/router/chat/completions.
PDF input support varies by model. See the Supported Models page and check your provider’s documentation for PDF capability.
curl -X POST https://api.orq.ai/v3/router/chat/completions \
-H "Authorization: Bearer $ORQ_API_KEY " \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Please analyze this PDF document and provide a summary"
},
{
"type": "file",
"file": {
"file_data": "data:application/pdf;base64,YOUR_BASE64_ENCODED_PDF",
"filename": "document.pdf"
}
}
]
}
]
}'
Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | "file" | Yes | Content type for file input |
| file.file_data | string | Yes | Data URI with base64 PDF content |
| file.filename | string | Yes | Name of the file for model context |
Format: data:application/pdf;base64,{base64_content}
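A sketch of building that data URI from a local file and sending it with the SDK (file name and prompt are illustrative):

import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

// Encode the PDF and wrap it in the data URI format shown above.
const pdfBase64 = fs.readFileSync("contract.pdf").toString("base64");

const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Extract key terms and obligations" },
        {
          type: "file",
          file: {
            file_data: `data:application/pdf;base64,${pdfBase64}`,
            filename: "contract.pdf",
          },
        },
      ],
    },
  ],
});

console.log(response.choices[0].message.content);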
Use cases
| Scenario | Example prompt |
| --- | --- |
| Contract analysis | "Extract key terms and obligations" |
| Invoice processing | "Extract amounts, dates, vendor info" |
| Research papers | "Summarize methodology and findings" |
| Form extraction | "Convert form data to JSON" |
Limitations
| Limitation | Details | Workaround |
| --- | --- | --- |
| File size | Model context limits | Split large PDFs |
| Scanned documents | Quality varies by model | Use OCR preprocessing |
| Complex layouts | Tables and charts may not extract well | Use structured prompts |
| Security | Sensitive documents sent to provider | Use on-premise models |
Image generation
Generate images from a text prompt using POST /v3/router/images/generations.
For the full and up-to-date list of supported image models, see Image models on the Supported Models page.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const response = await client.images.generate({
  model: "openai/gpt-image-1",
  prompt: "A futuristic city skyline at sunset, photorealistic",
  n: 1,
  size: "1024x1024",
});

console.log(response.data[0].b64_json?.slice(0, 40));
Parameters
| Parameter | Description |
| --- | --- |
| model | Model ID |
| prompt | Text description of the desired image |
| n | Number of images to generate |
| size | Image dimensions (see Supported Models for per-model sizes) |
| response_format | url or b64_json. DALL-E 2/3 only; gpt-image-1 always returns b64_json |
| quality | Image quality level. Values vary by model |
| style | vivid or natural. DALL-E 3 only |
| background | transparent, opaque, or auto. gpt-image-1 only |
| output_format | png, jpeg, or webp. gpt-image-1 only |
| output_compression | Compression level 0-100%. gpt-image-1 only |
| moderation | auto or low. gpt-image-1 only |
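The gpt-image-1-only options can be combined. A sketch requesting a transparent WebP, reusing the client from above (prompt and values are illustrative):

const icon = await client.images.generate({
  model: "openai/gpt-image-1",
  prompt: "A flat vector icon of a paper plane",
  size: "1024x1024",
  background: "transparent", // gpt-image-1 only
  output_format: "webp",     // gpt-image-1 only
  output_compression: 80,    // 0-100, gpt-image-1 only
});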
Set response_format to url to receive a hosted image link, or b64_json to receive the image inline as a base64-encoded string.
{
  "created": 1234567890,
  "data": [
    {
      "b64_json": "iVBORw0KGgo..."
    }
  ]
}
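A minimal sketch that decodes a b64_json payload like the one above and writes it to disk (the output file name is illustrative):

import fs from "fs";

// `response` is the images.generate() result from the example above.
const b64 = response.data[0].b64_json;
if (b64) {
  fs.writeFileSync("skyline.png", Buffer.from(b64, "base64"));
}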
Image editing
Modify an existing image using a prompt and an optional mask with POST /v3/router/images/edits.
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const response = await client.images.edit({
  model: "openai/gpt-image-1",
  image: fs.createReadStream("original.png"),
  prompt: "Add a sunset sky behind the buildings",
  size: "1024x1024",
});

console.log(response.data[0].b64_json?.slice(0, 40));
| Parameter | Description |
| --- | --- |
| model | Model ID |
| image | PNG, WEBP, or JPEG file to edit. Some models accept an array of images |
| prompt | Text description of the desired edit |
| mask | Optional PNG mask where transparent areas indicate where to edit |
| size | Output image dimensions |
| response_format | url or b64_json. gpt-image-1 always returns b64_json |
| quality | Image quality level. Values vary by model |
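A sketch of a masked edit, where only the transparent region of the mask is repainted (file names are illustrative; `client` is configured as above):

import fs from "fs";

// Transparent pixels in mask.png mark the region the model may change.
const masked = await client.images.edit({
  model: "openai/gpt-image-1",
  image: fs.createReadStream("original.png"),
  mask: fs.createReadStream("mask.png"),
  prompt: "Replace the masked area with a glass skyscraper",
  size: "1024x1024",
});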
Image variations
Generate variations of an existing image with POST /v3/router/images/variations. See Image models for which models support variations.
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const response = await client.images.createVariation({
  model: "openai/dall-e-2",
  image: fs.createReadStream("original.png"),
  size: "1024x1024",
  response_format: "url",
});

response.data.forEach((img) => console.log(img.url));
| Parameter | Description |
| --- | --- |
| model | Model ID |
| image | PNG image to create a variation of |
| n | Number of variations to generate (1-10) |
| size | Output image dimensions |
| response_format | url or b64_json |
Fallbacks and reliability
Image endpoints support the same fallbacks and retry parameters as chat completions:
const response = await client.images.generate({
  model: "openai/gpt-image-1",
  prompt: "A mountain lake at dawn",
  size: "1024x1024",
  // @ts-ignore - orq.ai extension
  fallbacks: ["openai/dall-e-3", "openai/dall-e-2"],
});
Audio
The AI Router exposes three OpenAI-compatible audio endpoints. All support fallbacks, load balancing, and retries.
Text to speech
Convert text to audio using POST /v3/router/audio/speech.
| Provider | Model |
| --- | --- |
| OpenAI | openai/tts-1 |
| OpenAI | openai/tts-1-hd |
| OpenAI | openai/gpt-4o-mini-tts |
| ElevenLabs | elevenlabs/eleven_multilingual_v2 |
| ElevenLabs | elevenlabs/eleven_turbo_v2_5 |
| ElevenLabs | elevenlabs/eleven_flash_v2_5 |
| ElevenLabs | elevenlabs/eleven_flash_v2 |
| Google AI | google-ai/gemini-2.5-flash-preview-tts |
| Google AI | google-ai/gemini-2.5-pro-preview-tts |
| Vertex AI | google/gemini-2.5-flash-preview-tts |
| Vertex AI | google/gemini-2.5-pro-preview-tts |
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "alloy",
  input: "Hello, welcome to Acme Corp. How can I help you today?",
  response_format: "mp3",
});

const buffer = Buffer.from(await response.arrayBuffer());
fs.writeFileSync("output.mp3", buffer);
Streaming: Process audio chunks in real time as they arrive, useful for low-latency playback pipelines.
const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "alloy",
  input: "Hello, welcome to Acme Corp. How can I help you today?",
  response_format: "pcm",
});

const reader = response.body!.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  processAudioChunk(value); // user-supplied handler for raw PCM chunks
}
Parameters:
| Parameter | Description |
| --- | --- |
| model | Model ID |
| input | Text to synthesize. Maximum length varies by provider |
| voice | Voice ID. See voices table below |
| response_format | Output format: mp3, opus, aac, flac, wav, pcm. Supported values vary by provider |
| speed | Playback speed of the generated audio |
Voices:
| Provider | Voices |
| --- | --- |
| OpenAI | alloy, echo, fable, onyx, nova, shimmer |
| ElevenLabs | aria, roger, sarah, laura, charlie, george, callum, river, liam, charlotte, alice, matilda, will, jessica, eric, chris |
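The same endpoint serves other providers. A hedged sketch using an ElevenLabs model and a voice from the table above, reusing the client and imports from the first example:

const response = await client.audio.speech.create({
  model: "elevenlabs/eleven_multilingual_v2",
  voice: "aria",          // ElevenLabs voice from the table above
  input: "Welcome to Acme Corp.",
  response_format: "mp3", // supported formats vary by provider
});

fs.writeFileSync("welcome.mp3", Buffer.from(await response.arrayBuffer()));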
Transcription
Transcribe an audio file to text using POST /v3/router/audio/transcriptions.
| Provider | Model |
| --- | --- |
| OpenAI | openai/whisper-1 |
| OpenAI | openai/gpt-4o-transcribe |
| OpenAI | openai/gpt-4o-mini-transcribe |
| ElevenLabs | elevenlabs/scribe_v1 |
| Groq | groq/whisper-large-v3 |
| Groq | groq/whisper-large-v3-turbo |
| Mistral | mistral/voxtral-mini-2507 |
| Azure | azure/whisper |
For the full and up-to-date list of transcription models, see Speech-to-Text models on the Supported Models page.
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const transcription = await client.audio.transcriptions.create({
  model: "openai/gpt-4o-transcribe",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "json",
  language: "en",
});

console.log(transcription.text);
Parameters:
| Parameter | Description |
| --- | --- |
| model | Model ID |
| file | Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm |
| language | ISO-639-1 language code of the input audio (e.g. en, fr, de) |
| prompt | Optional text to guide the model's style or continue a previous segment |
| response_format | json, text, srt, verbose_json, or vtt |
| temperature | Sampling temperature between 0 and 1 |
| timestamp_granularities | Array of granularities: ["word"], ["segment"], or ["word", "segment"]. Requires verbose_json. Not supported by all models (see the sketch below) |
| diarize | Annotate which speaker is talking in the file. ElevenLabs only |
| num_speakers | Maximum number of speakers to identify. ElevenLabs only |
| tag_audio_events | Tag non-speech events such as (laughter) or (applause). ElevenLabs only |
| enable_logging | Set to false to disable logging and enable zero data retention |
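For example, word-level timestamps require verbose_json. A sketch with openai/whisper-1, reusing the client and imports from above:

const transcription = await client.audio.transcriptions.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "verbose_json", // timestamps require verbose_json
  timestamp_granularities: ["word"],
});

// Each entry carries the word plus its start/end offsets in seconds.
console.log(transcription.words?.[0]);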
Translation
The OpenAI translation endpoint only supports openai/whisper-1. gpt-4o-transcribe and gpt-4o-mini-transcribe do not support translation.
Transcribe and translate audio to English using POST /v3/router/audio/translations. The output is always in English regardless of the source language.
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const translation = await client.audio.translations.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("interview_french.mp3"),
  response_format: "json",
});

console.log(translation.text);
Translation supports the same response_format and temperature parameters as transcription.
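For example, a sketch requesting plain-text output with a low temperature, reusing the client from above:

const translation = await client.audio.translations.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("interview_french.mp3"),
  response_format: "text", // same options as transcription
  temperature: 0.2,        // lower values make output more deterministic
});

console.log(translation);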