The AI Router supports all input and output modalities through a single OpenAI-compatible API. All endpoints share the same base URL, authentication, and orq.ai router features: fallbacks, caching, load balancing, and retries.
| Modality | Endpoint |
| --- | --- |
| Image input | POST /v3/router/chat/completions |
| PDF input | POST /v3/router/chat/completions |
| Image generation | POST /v3/router/images/generations |
| Image editing | POST /v3/router/images/edits |
| Image variations | POST /v3/router/images/variations |
| Text to speech | POST /v3/router/audio/speech |
| Transcription | POST /v3/router/audio/transcriptions |
| Translation | POST /v3/router/audio/translations |
All endpoints use the same base URL and authentication:
BASE_URL=https://api.orq.ai/v3/router
Authorization: Bearer $ORQ_API_KEY
To see which models support a specific modality, filter the Supported Models page or check the Providers page in orq.ai.
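Every SDK example on this page shares the same client setup: a standard OpenAI client pointed at the router base URL. In TypeScript:

import OpenAI from "openai";

// Any OpenAI-compatible SDK works; only the base URL and API key change.
const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});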

Image input

Analyze images alongside text. Pass image URLs or base64-encoded files in chat/completions messages.

PDF input

Send PDF documents for extraction and analysis. Native PDF support varies by model.

Image generation

Generate, edit, and vary images using DALL-E 2, DALL-E 3, and GPT Image 1.

Audio

Convert text to speech, transcribe audio files, and translate audio to English.

Image input

Analyze images alongside text using POST /v3/router/chat/completions. Pass images as public URLs or base64-encoded data in the message content array.
curl -X POST https://api.orq.ai/v3/router/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image? Describe in detail."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/image.jpg"
            }
          }
        ]
      }
    ]
  }'

Supported formats

| Format | Use case | Max size |
| --- | --- | --- |
| JPEG/JPG | Photos, general images | 20MB |
| PNG | Screenshots, diagrams | 20MB |
| GIF | Static images only | 20MB |
| WebP | Modern web images | 20MB |
| Base64 | Embedded image data | Model context limit |
| URLs | Public image links | Model context limit |
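
To send a local file instead of a public URL, embed it as a data URI. A minimal sketch, assuming the client from the setup above (the local path and prompt are placeholders):

import fs from "fs";

// Encode a local PNG as a base64 data URI (hypothetical file path).
const base64 = fs.readFileSync("screenshot.png").toString("base64");

const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this screenshot." },
        {
          type: "image_url",
          image_url: { url: `data:image/png;base64,${base64}` },
        },
      ],
    },
  ],
});

console.log(response.choices[0].message.content);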

Detail levels

| Level | Resolution | Speed | Cost | Use case |
| --- | --- | --- | --- | --- |
| "low" | 512x512 | Fast | Low | Quick overview |
| "high" | Full resolution | Slow | High | Detailed analysis |
| "auto" | Model decides | Medium | Medium | Balanced (default) |
Set detail in the image_url object:
{
  "type": "image_url",
  "image_url": {
    "url": "https://example.com/image.jpg",
    "detail": "high"
  }
}

Patterns

Multiple images:
const content = [
  { type: "text", text: "Compare these before and after photos. What changes do you notice?" },
  { type: "image_url", image_url: { url: "https://example.com/before.jpg", detail: "high" } },
  { type: "image_url", image_url: { url: "https://example.com/after.jpg", detail: "high" } },
];

const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [{ role: "user", content }],
});
OCR and text extraction:
const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        {
          type: "text",
          text: "Extract all text from this image. Return as plain text, preserving formatting where possible.",
        },
        {
          type: "image_url",
          image_url: { url: imageUrl, detail: "high" },
        },
      ],
    },
  ],
});
Structured output:
from pydantic import BaseModel
from typing import List

class ImageAnalysis(BaseModel):
    objects: List[str]
    text_content: str
    dominant_colors: List[str]
    confidence: float

response = client.beta.chat.completions.parse(
    model="openai/gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this image systematically"},
            {"type": "image_url", "image_url": {"url": image_url}}
        ]
    }],
    response_format=ImageAnalysis
)

Limitations

| Limitation | Details | Workaround |
| --- | --- | --- |
| File size | 20MB max per image | Compress before upload |
| Image count | Varies by model (5-16) | Process in batches (see the sketch below) |
| Video | Static images only | Extract frames for analysis |
| Privacy | Images sent to provider | Use on-premise models if needed |
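
The image-count workaround amounts to chunking requests. A minimal batching sketch, assuming a per-request limit of 8 images (check your model's actual limit) and the client from the setup above:

// Hypothetical per-request image limit; adjust for your model.
const BATCH_SIZE = 8;

async function describeInBatches(imageUrls: string[]): Promise<string[]> {
  const results: string[] = [];
  for (let i = 0; i < imageUrls.length; i += BATCH_SIZE) {
    const batch = imageUrls.slice(i, i + BATCH_SIZE);
    const response = await client.chat.completions.create({
      model: "openai/gpt-4o",
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Describe each image briefly." },
            // One image_url content part per image in the batch.
            ...batch.map((url) => ({
              type: "image_url" as const,
              image_url: { url },
            })),
          ],
        },
      ],
    });
    results.push(response.choices[0].message.content ?? "");
  }
  return results;
}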

PDF input

Send PDF documents directly in chat messages for analysis and content extraction using POST /v3/router/chat/completions.
PDF input support varies by model. See the Supported Models page and check your provider’s documentation for PDF capability.
curl -X POST https://api.orq.ai/v3/router/chat/completions \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Please analyze this PDF document and provide a summary"
          },
          {
            "type": "file",
            "file": {
              "file_data": "data:application/pdf;base64,YOUR_BASE64_ENCODED_PDF",
              "filename": "document.pdf"
            }
          }
        ]
      }
    ]
  }'

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | "file" | Yes | Content type for file input |
| file.file_data | string | Yes | Data URI with base64 PDF content |
| file.filename | string | Yes | Name of the file for model context |

Format: data:application/pdf;base64,{base64_content}
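
Building the data URI in code is straightforward. A sketch in TypeScript, assuming the client from the setup above and a local document.pdf (recent OpenAI SDK versions type the file content part; cast to any if yours does not):

import fs from "fs";

// Read the PDF and wrap it in the data:application/pdf;base64,... format.
const pdfBase64 = fs.readFileSync("document.pdf").toString("base64");

const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Summarize this document." },
        {
          type: "file",
          file: {
            file_data: `data:application/pdf;base64,${pdfBase64}`,
            filename: "document.pdf",
          },
        },
      ],
    },
  ],
});

console.log(response.choices[0].message.content);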

Use cases

| Scenario | Example prompt |
| --- | --- |
| Contract analysis | "Extract key terms and obligations" |
| Invoice processing | "Extract amounts, dates, vendor info" |
| Research papers | "Summarize methodology and findings" |
| Form extraction | "Convert form data to JSON" |

Limitations

| Limitation | Details | Workaround |
| --- | --- | --- |
| File size | Model context limits | Split large PDFs |
| Scanned documents | Quality varies by model | Use OCR preprocessing |
| Complex layouts | Tables and charts may not extract well | Use structured prompts |
| Security | Sensitive documents sent to provider | Use on-premise models |

Image generation

Generate images from a text prompt using POST /v3/router/images/generations.
For the full and up-to-date list of supported image models, see Image models on the Supported Models page.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const response = await client.images.generate({
  model: "openai/gpt-image-1",
  prompt: "A futuristic city skyline at sunset, photorealistic",
  n: 1,
  size: "1024x1024",
});

console.log(response.data[0].b64_json?.slice(0, 40));

Parameters

| Parameter | Description |
| --- | --- |
| model | Model ID |
| prompt | Text description of the desired image |
| n | Number of images to generate |
| size | Image dimensions (see Supported Models for per-model sizes) |
| response_format | url or b64_json. DALL-E 2/3 only; gpt-image-1 always returns b64_json |
| quality | Image quality level. Values vary by model |
| style | vivid or natural. DALL-E 3 only |
| background | transparent, opaque, or auto. gpt-image-1 only |
| output_format | png, jpeg, or webp. gpt-image-1 only |
| output_compression | Compression level 0-100%. gpt-image-1 only |
| moderation | auto or low. gpt-image-1 only |
Set response_format to url to receive a hosted image link, or b64_json to receive the image inline as a base64-encoded string.
{
  "created": 1234567890,
  "data": [
    {
      "b64_json": "iVBORw0KGgo..."
    }
  ]
}
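
To persist a b64_json result, decode it before writing. Continuing the generation example above:

import fs from "fs";

// b64_json holds raw image bytes encoded as base64; decode and save.
const b64 = response.data[0].b64_json!;
fs.writeFileSync("skyline.png", Buffer.from(b64, "base64"));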

Image editing

Modify an existing image using a prompt and an optional mask with POST /v3/router/images/edits.
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const response = await client.images.edit({
  model: "openai/gpt-image-1",
  image: fs.createReadStream("original.png"),
  prompt: "Add a sunset sky behind the buildings",
  size: "1024x1024",
});

console.log(response.data[0].b64_json?.slice(0, 40));
| Parameter | Description |
| --- | --- |
| model | Model ID |
| image | PNG, WEBP, or JPEG file to edit. Some models accept an array of images |
| prompt | Text description of the desired edit |
| mask | Optional PNG mask where transparent areas indicate where to edit |
| size | Output image dimensions |
| response_format | url or b64_json. gpt-image-1 always returns b64_json |
| quality | Image quality level. Values vary by model |
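
A sketch of a masked edit, assuming a mask.png whose transparent pixels mark the region to repaint:

const response = await client.images.edit({
  model: "openai/gpt-image-1",
  image: fs.createReadStream("original.png"),
  mask: fs.createReadStream("mask.png"), // transparent areas get repainted
  prompt: "Replace the masked area with a glass skylight",
  size: "1024x1024",
});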

Image variations

Generate variations of an existing image with POST /v3/router/images/variations. See Image models for which models support variations.
import fs from "fs";
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const response = await client.images.createVariation({
  model: "openai/dall-e-2",
  image: fs.createReadStream("original.png"),
  size: "1024x1024",
  response_format: "url",
});

response.data.forEach((img) => console.log(img.url));
| Parameter | Description |
| --- | --- |
| model | Model ID |
| image | PNG image to create a variation of |
| n | Number of variations to generate (1-10) |
| size | Output image dimensions |
| response_format | url or b64_json |

Fallbacks and reliability

Image endpoints support the same fallbacks and retry parameters as chat completions:
const response = await client.images.generate({
  model: "openai/gpt-image-1",
  prompt: "A mountain lake at dawn",
  size: "1024x1024",
  // @ts-ignore - orq.ai extension
  fallbacks: ["openai/dall-e-3", "openai/dall-e-2"],
});

Audio

The AI Router exposes three OpenAI-compatible audio endpoints. All support fallbacks, load balancing, and retries.

Text to speech

Convert text to audio using POST /v3/router/audio/speech.
| Provider | Model |
| --- | --- |
| OpenAI | openai/tts-1 |
| OpenAI | openai/tts-1-hd |
| OpenAI | openai/gpt-4o-mini-tts |
| ElevenLabs | elevenlabs/eleven_multilingual_v2 |
| ElevenLabs | elevenlabs/eleven_turbo_v2_5 |
| ElevenLabs | elevenlabs/eleven_flash_v2_5 |
| ElevenLabs | elevenlabs/eleven_flash_v2 |
| Google AI | google-ai/gemini-2.5-flash-preview-tts |
| Google AI | google-ai/gemini-2.5-pro-preview-tts |
| Vertex AI | google/gemini-2.5-flash-preview-tts |
| Vertex AI | google/gemini-2.5-pro-preview-tts |
For the full and up-to-date list of TTS models, see Text-to-Speech models on the Supported Models page.
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "alloy",
  input: "Hello, welcome to Acme Corp. How can I help you today?",
  response_format: "mp3",
});

const buffer = Buffer.from(await response.arrayBuffer());
fs.writeFileSync("output.mp3", buffer);
Streaming: Process audio chunks in real time as they arrive, useful for low-latency playback pipelines.
const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "alloy",
  input: "Hello, welcome to Acme Corp. How can I help you today?",
  response_format: "pcm",
});

const reader = response.body!.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  processAudioChunk(value);
}
Parameters:
| Parameter | Description |
| --- | --- |
| model | Model ID |
| input | Text to synthesize. Maximum length varies by provider |
| voice | Voice ID. See voices table below |
| response_format | Output format: mp3, opus, aac, flac, wav, pcm. Supported values vary by provider |
| speed | Playback speed of the generated audio |
Voices:
| Provider | Voices |
| --- | --- |
| OpenAI | alloy, echo, fable, onyx, nova, shimmer |
| ElevenLabs | aria, roger, sarah, laura, charlie, george, callum, river, liam, charlotte, alice, matilda, will, jessica, eric, chris |
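
For example, the speed parameter adjusts playback rate without regenerating the text (OpenAI documents a 0.25 to 4.0 range; support varies by provider). A sketch using the client from the setup above:

const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "nova",
  input: "This announcement plays slightly faster than normal.",
  speed: 1.25, // 1.0 is normal speed
});

fs.writeFileSync("announcement.mp3", Buffer.from(await response.arrayBuffer()));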

Transcription

Transcribe an audio file to text using POST /v3/router/audio/transcriptions.
| Provider | Model |
| --- | --- |
| OpenAI | openai/whisper-1 |
| OpenAI | openai/gpt-4o-transcribe |
| OpenAI | openai/gpt-4o-mini-transcribe |
| ElevenLabs | elevenlabs/scribe_v1 |
| Groq | groq/whisper-large-v3 |
| Groq | groq/whisper-large-v3-turbo |
| Mistral | mistral/voxtral-mini-2507 |
| Azure | azure/whisper |
For the full and up-to-date list of transcription models, see Speech-to-Text models on the Supported Models page.
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const transcription = await client.audio.transcriptions.create({
  model: "openai/gpt-4o-transcribe",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "json",
  language: "en",
});

console.log(transcription.text);
Parameters:
| Parameter | Description |
| --- | --- |
| model | Model ID |
| file | Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm |
| language | ISO-639-1 language code of the input audio (e.g. en, fr, de) |
| prompt | Optional text to guide the model’s style or continue a previous segment |
| response_format | json, text, srt, verbose_json, or vtt |
| temperature | Sampling temperature between 0 and 1 |
| timestamp_granularities | Array of granularities: ["word"], ["segment"], or ["word", "segment"]. Requires verbose_json. Not supported by all models (see the sketch below) |
| diarize | Annotate which speaker is talking in the file. ElevenLabs only |
| num_speakers | Maximum number of speakers to identify. ElevenLabs only |
| tag_audio_events | Tag non-speech events such as (laughter) or (applause). ElevenLabs only |
| enable_logging | Set to false to disable logging and enable zero data retention |
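
A sketch of word-level timestamps, which require response_format: "verbose_json" and a model that supports timestamp_granularities (openai/whisper-1, for example). Depending on your SDK version, the words field may need a type cast:

const transcription = await client.audio.transcriptions.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "verbose_json",
  timestamp_granularities: ["word"],
});

// Each entry carries the word plus start/end offsets in seconds.
for (const word of transcription.words ?? []) {
  console.log(`${word.start.toFixed(2)}s -> ${word.end.toFixed(2)}s  ${word.word}`);
}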

Translation

The OpenAI translation endpoint only supports openai/whisper-1. gpt-4o-transcribe and gpt-4o-mini-transcribe do not support translation.
Transcribe and translate audio to English using POST /v3/router/audio/translations. The output is always in English regardless of the source language.
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://api.orq.ai/v3/router",
});

const translation = await client.audio.translations.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("interview_french.mp3"),
  response_format: "json",
});

console.log(translation.text);
Translation supports the same response_format and temperature parameters as transcription.