The AI Router exposes three OpenAI-compatible audio endpoints: text-to-speech, transcription, and translation. All endpoints support fallbacks, load balancing, and retries.
## Text to speech

| Provider | Model |
|---|---|
| OpenAI | openai/tts-1 |
| OpenAI | openai/tts-1-hd |
| OpenAI | openai/gpt-4o-mini-tts |
| ElevenLabs | elevenlabs/eleven_multilingual_v2 |
| ElevenLabs | elevenlabs/eleven_turbo_v2_5 |
| ElevenLabs | elevenlabs/eleven_flash_v2_5 |
| ElevenLabs | elevenlabs/eleven_flash_v2 |
| Google AI | google-ai/gemini-2.5-flash-preview-tts |
| Google AI | google-ai/gemini-2.5-pro-preview-tts |
| Vertex AI | google/gemini-2.5-flash-preview-tts |
| Vertex AI | google/gemini-2.5-pro-preview-tts |

Convert text to audio using `POST /v2/router/audio/speech`.
```typescript
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://my.orq.ai/v2/router",
});

const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "alloy",
  input: "Hello, welcome to Acme Corp. How can I help you today?",
  response_format: "mp3",
});

// The response is a fetch Response; read the audio bytes and write them to disk.
const buffer = Buffer.from(await response.arrayBuffer());
fs.writeFileSync("output.mp3", buffer);
```
### Streaming

Process audio chunks in real time as they arrive, which is useful for low-latency playback pipelines.
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://my.orq.ai/v2/router",
});

const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "alloy",
  input: "Hello, welcome to Acme Corp. How can I help you today?",
  response_format: "pcm", // raw PCM suits incremental playback
});

// Read the response body as a stream and handle each chunk as it arrives.
const reader = response.body!.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  processAudioChunk(value); // your handler: pipe to a speaker, buffer, etc.
}
```
### Parameters

| Parameter | Description |
|---|---|
| model | Model ID |
| input | Text to synthesize. Maximum length varies by provider |
| voice | Voice ID. See Voices below |
| response_format | Output format. Supported values vary by provider (mp3, opus, aac, flac, wav, pcm) |
| speed | Playback speed of the generated audio |
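
`speed` and `response_format` combine with the other request fields like any parameter. A minimal sketch reusing the `client` and `fs` from the example above (the 0.85 speed and wav output are arbitrary illustrative choices, not defaults):

```typescript
// Illustrative values: a slightly slower voice written out as WAV.
const slower = await client.audio.speech.create({
  model: "openai/tts-1-hd",
  voice: "nova",
  input: "Please listen carefully, as our menu options have changed.",
  response_format: "wav", // supported values vary by provider
  speed: 0.85, // 1.0 is normal speed
});
fs.writeFileSync("slow.wav", Buffer.from(await slower.arrayBuffer()));
```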
### Voices

| Provider | Voice |
|---|---|
| OpenAI | alloy, echo, fable, onyx, nova, shimmer |
| ElevenLabs | aria, roger, sarah, laura, charlie, george, callum, river, liam, charlotte, alice, matilda, will, jessica, eric, chris |
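
Voice IDs are passed through the same `voice` field and must match the model's provider. A minimal sketch pairing an ElevenLabs voice with an ElevenLabs model (reusing the `client` from the earlier examples):

```typescript
// Sketch: an ElevenLabs voice with a matching ElevenLabs model.
const speech = await client.audio.speech.create({
  model: "elevenlabs/eleven_multilingual_v2",
  voice: "sarah",
  input: "Bienvenue chez Acme Corp. Comment puis-je vous aider ?",
  response_format: "mp3",
});
```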
## Transcription

| Provider | Model |
|---|---|
| OpenAI | openai/whisper-1 |
| OpenAI | openai/gpt-4o-transcribe |
| OpenAI | openai/gpt-4o-mini-transcribe |
| ElevenLabs | elevenlabs/scribe_v1 |
| Groq | groq/whisper-large-v3 |
| Groq | groq/whisper-large-v3-turbo |
| Mistral | mistral/voxtral-mini-2507 |
| Azure | azure/whisper |
For the full and up-to-date list of transcription models, see Speech-to-Text models on the Supported Models page.
Transcribe an audio file to text using `POST /v2/router/audio/transcriptions`.
```typescript
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://my.orq.ai/v2/router",
});

const transcription = await client.audio.transcriptions.create({
  model: "openai/gpt-4o-transcribe",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "json",
  language: "en", // ISO-639-1 code of the spoken language
});

console.log(transcription.text);
```
### Parameters

| Parameter | Description |
|---|---|
| model | Model ID |
| file | Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm |
| language | ISO-639-1 language code of the input audio (e.g. en, fr, de) |
| prompt | Optional text to guide the model's style or continue a previous segment |
| response_format | json, text, srt, verbose_json, or vtt |
| temperature | Sampling temperature between 0 and 1 |
| timestamp_granularities | Array of granularities: ["word"], ["segment"], or ["word", "segment"]. Requires verbose_json (see the sketch below). Not supported by all models |
| diarize | Annotate which speaker is talking in the file. ElevenLabs only |
| num_speakers | Maximum number of speakers to identify. ElevenLabs only |
| tag_audio_events | Tag non-speech events such as (laughter) or (applause). ElevenLabs only |
| enable_logging | Set to false to disable logging and enable zero data retention |
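
For example, word- and segment-level timestamps only take effect when `timestamp_granularities` is paired with `verbose_json`. A minimal sketch reusing the `client` and `fs` from above (the file name is illustrative):

```typescript
// verbose_json is required for timestamp_granularities to take effect.
const verbose = await client.audio.transcriptions.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "verbose_json",
  timestamp_granularities: ["word", "segment"],
});

// The verbose payload carries per-word and per-segment timing arrays
// alongside the transcript text.
console.log(verbose.text);
```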
## Translation

The OpenAI translation endpoint supports only openai/whisper-1; gpt-4o-transcribe and gpt-4o-mini-transcribe do not support translation.

Transcribe and translate audio to English using `POST /v2/router/audio/translations`. The output is always in English, regardless of the source language.
```typescript
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://my.orq.ai/v2/router",
});

const translation = await client.audio.translations.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("interview_french.mp3"),
  response_format: "json",
});

// The translated text is always English, whatever the source language.
console.log(translation.text);
```
Translation supports the same response_format and temperature parameters as transcription.
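
A minimal sketch passing those shared parameters, reusing the `client` and `fs` from above (the values are illustrative, and this assumes the `text` format returns the plain translated string rather than a JSON object, as with transcription):

```typescript
// Assumption: with response_format "text" the plain translated string comes back.
const text = await client.audio.translations.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("interview_french.mp3"),
  response_format: "text",
  temperature: 0.2, // lower values give more deterministic output
});
console.log(text);
```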