The AI Router exposes three OpenAI-compatible audio endpoints: text-to-speech, transcription, and translation. All endpoints support fallbacks, load balancing, and retries.

Text to speech

Provider    Model
OpenAI      openai/tts-1
OpenAI      openai/tts-1-hd
OpenAI      openai/gpt-4o-mini-tts
ElevenLabs  elevenlabs/eleven_multilingual_v2
ElevenLabs  elevenlabs/eleven_turbo_v2_5
ElevenLabs  elevenlabs/eleven_flash_v2_5
ElevenLabs  elevenlabs/eleven_flash_v2
Google AI   google-ai/gemini-2.5-flash-preview-tts
Google AI   google-ai/gemini-2.5-pro-preview-tts
Vertex AI   google/gemini-2.5-flash-preview-tts
Vertex AI   google/gemini-2.5-pro-preview-tts
For the full and up-to-date list of TTS models, see Text-to-Speech models on the Supported Models page.
Convert text to audio using POST /v2/router/audio/speech.
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://my.orq.ai/v2/router",
});

const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "alloy",
  input: "Hello, welcome to Acme Corp. How can I help you today?",
  response_format: "mp3",
});

const buffer = Buffer.from(await response.arrayBuffer());
fs.writeFileSync("output.mp3", buffer);

Streaming

Process audio chunks as they arrive, which is useful for low-latency playback pipelines.
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://my.orq.ai/v2/router",
});

const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "alloy",
  input: "Hello, welcome to Acme Corp. How can I help you today?",
  response_format: "pcm",
});

// Stub for your own sink: pipe to a speaker, a socket, a buffer, etc.
function processAudioChunk(chunk: Uint8Array) { /* ... */ }

const reader = response.body!.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  processAudioChunk(value);
}
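
With OpenAI models, pcm output is raw audio with no container header (24 kHz, 16-bit signed little-endian, mono); other providers may differ. Below is a minimal sketch that collects the streamed chunks and wraps them in a WAV header so the result plays in standard players. The pcmToWav helper is illustrative, not part of the SDK.
import fs from "fs";

// Illustrative helper, assuming OpenAI's pcm layout: 24 kHz, 16-bit LE, mono.
function pcmToWav(pcm: Buffer, sampleRate = 24000, channels = 1): Buffer {
  const byteRate = sampleRate * channels * 2; // 16-bit = 2 bytes per sample
  const header = Buffer.alloc(44);
  header.write("RIFF", 0);
  header.writeUInt32LE(36 + pcm.length, 4); // total file size minus 8 bytes
  header.write("WAVE", 8);
  header.write("fmt ", 12);
  header.writeUInt32LE(16, 16);             // fmt chunk size
  header.writeUInt16LE(1, 20);              // audio format 1 = PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(channels * 2, 32);   // block align
  header.writeUInt16LE(16, 34);             // bits per sample
  header.write("data", 36);
  header.writeUInt32LE(pcm.length, 40);
  return Buffer.concat([header, pcm]);
}

// Collect chunks in the reader loop above (push each `value` instead of
// calling processAudioChunk), then write a playable file.
const chunks: Uint8Array[] = [];
fs.writeFileSync("output.wav", pcmToWav(Buffer.concat(chunks)));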

Parameters

Parameter        Description
model            Model ID
input            Text to synthesize. Maximum length varies by provider
voice            Voice ID. See Voices below
response_format  Output format. Supported values vary by provider (mp3, opus, aac, flac, wav, pcm)
speed            Playback speed of the generated audio
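
For example, a slightly slower read with openai/tts-1; in the OpenAI API, speed accepts values from 0.25 to 4.0 with 1.0 as the default, and support varies by provider. This sketch reuses the client from the first example.
const response = await client.audio.speech.create({
  model: "openai/tts-1",
  voice: "nova",
  input: "Please listen carefully, as our menu options have changed.",
  response_format: "mp3",
  speed: 0.9, // 0.25 to 4.0 in the OpenAI API; 1.0 is the default
});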

Voices

Provider    Voices
OpenAI      alloy, echo, fable, onyx, nova, shimmer
ElevenLabs  aria, roger, sarah, laura, charlie, george, callum, river, liam, charlotte, alice, matilda, will, jessica, eric, chris
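
The same call shape works across providers. As a sketch, here is a request using an ElevenLabs voice from the table above, again reusing the client from the first example:
const response = await client.audio.speech.create({
  model: "elevenlabs/eleven_multilingual_v2",
  voice: "sarah", // ElevenLabs voice name from the table above
  input: "Bonjour et bienvenue chez Acme Corp.",
  response_format: "mp3",
});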

Transcription

Provider    Model
OpenAI      openai/whisper-1
OpenAI      openai/gpt-4o-transcribe
OpenAI      openai/gpt-4o-mini-transcribe
ElevenLabs  elevenlabs/scribe_v1
Groq        groq/whisper-large-v3
Groq        groq/whisper-large-v3-turbo
Mistral     mistral/voxtral-mini-2507
Azure       azure/whisper
For the full and up-to-date list of transcription models, see Speech-to-Text models on the Supported Models page.
Transcribe an audio file to text using POST /v2/router/audio/transcriptions.
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://my.orq.ai/v2/router",
});

const transcription = await client.audio.transcriptions.create({
  model: "openai/gpt-4o-transcribe",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "json",
  language: "en",
});

console.log(transcription.text);

Parameters

Parameter                Description
model                    Model ID
file                     Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm
language                 ISO-639-1 language code of the input audio (e.g. en, fr, de)
prompt                   Optional text to guide the model's style or continue a previous segment
response_format          json, text, srt, verbose_json, or vtt
temperature              Sampling temperature between 0 and 1
timestamp_granularities  Array of granularities: ["word"], ["segment"], or ["word", "segment"]. Requires verbose_json. Not supported by all models
diarize                  Annotate which speaker is talking in the file. ElevenLabs only
num_speakers             Maximum number of speakers to identify. ElevenLabs only
tag_audio_events         Tag non-speech events such as (laughter) or (applause). ElevenLabs only
enable_logging           Set to false to disable logging and enable zero data retention
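
As a sketch of word-level timestamps, here is a request against openai/whisper-1, which supports verbose_json in the OpenAI API. It reuses the client and fs import from the transcription example; the shape of the words array follows the OpenAI response format.
const transcription = await client.audio.transcriptions.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("meeting.mp3"),
  response_format: "verbose_json",    // required for timestamp_granularities
  timestamp_granularities: ["word"],
});

// Each word entry carries start/end offsets in seconds.
for (const w of transcription.words ?? []) {
  console.log(`${w.word}: ${w.start}s to ${w.end}s`);
}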

Translation

The translation endpoint supports only openai/whisper-1; gpt-4o-transcribe and gpt-4o-mini-transcribe do not support translation.
Transcribe and translate audio to English using POST /v2/router/audio/translations. The output is always in English regardless of the source language.
import OpenAI from "openai";
import fs from "fs";

const client = new OpenAI({
  apiKey: process.env.ORQ_API_KEY,
  baseURL: "https://my.orq.ai/v2/router",
});

const translation = await client.audio.translations.create({
  model: "openai/whisper-1",
  file: fs.createReadStream("interview_french.mp3"),
  response_format: "json",
});

console.log(translation.text);
Translation supports the same response_format and temperature parameters as transcription.