
# AI Router | Multimodal

> Send images, PDFs, and audio to LLMs, and generate images and speech through the AI Router. One unified OpenAI-compatible API for all modalities.

The **AI Router** supports all input and output modalities through a single OpenAI-compatible API. All endpoints share the same base URL, authentication, and **orq.ai** router features: [fallbacks](/docs/proxy/retries#fallbacks), [caching](/docs/proxy/cache), [load balancing](/docs/proxy/load-balancing), and [retries](/docs/proxy/retries).

| Modality                              | Endpoint                               |
| ------------------------------------- | -------------------------------------- |
| [Image input](#image-input)           | `POST /v3/router/chat/completions`     |
| [PDF input](#pdf-input)               | `POST /v3/router/chat/completions`     |
| [Image generation](#image-generation) | `POST /v3/router/images/generations`   |
| [Image editing](#image-editing)       | `POST /v3/router/images/edits`         |
| [Image variations](#image-variations) | `POST /v3/router/images/variations`    |
| [Text to speech](#text-to-speech)     | `POST /v3/router/audio/speech`         |
| [Transcription](#transcription)       | `POST /v3/router/audio/transcriptions` |
| [Translation](#translation)           | `POST /v3/router/audio/translations`   |

Set the base URL once and send your API key as a bearer token on every request:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
BASE_URL=https://api.orq.ai/v3/router
Authorization: Bearer $ORQ_API_KEY
```

To see which models support a specific modality, filter the [Supported Models](/docs/proxy/supported-models) page or check the [Providers](/docs/router/providers-overview) page in **orq.ai**.

<CardGroup cols={2}>
  <Card title="Image input" icon="image" href="#image-input">
    Analyze images alongside text. Pass image URLs or base64-encoded files in `chat/completions` messages.
  </Card>

  <Card title="PDF input" icon="file-pdf" href="#pdf-input">
    Send PDF documents for extraction and analysis. Supported natively by compatible models.
  </Card>

  <Card title="Image generation" icon="wand-magic-sparkles" href="#image-generation">
    Generate, edit, and vary images using DALL-E 2, DALL-E 3, and GPT Image 1.
  </Card>

  <Card title="Audio" icon="waveform-lines" href="#audio">
    Convert text to speech, transcribe audio files, and translate audio to English.
  </Card>
</CardGroup>

## Image input

Analyze images alongside text using `POST /v3/router/chat/completions`. Pass images as public URLs or base64-encoded data in the message content array.

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-4o",
      "messages": [
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "What is in this image? Describe in detail."
            },
            {
              "type": "image_url",
              "image_url": {
                "url": "https://example.com/image.jpg"
              }
            }
          ]
        }
      ]
    }'
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import base64
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router"
  )

  def encode_image(image_path):
      with open(image_path, "rb") as image_file:
          return base64.b64encode(image_file.read()).decode('utf-8')

  base64_image = encode_image("chart.png")

  response = client.chat.completions.create(
      model="openai/gpt-4o",
      messages=[
          {
              "role": "user",
              "content": [
                  {
                      "type": "text",
                      "text": "Analyze this chart and extract the key data points"
                  },
                  {
                      "type": "image_url",
                      "image_url": {
                          "url": f"data:image/png;base64,{base64_image}"
                      }
                  }
              ]
          }
      ]
  )

  print(response.choices[0].message.content)
  ```

  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";
  import fs from "fs";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const base64Image = fs.readFileSync("chart.png").toString("base64");

  const response = await client.chat.completions.create({
    model: "openai/gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Analyze this chart and extract the key data points",
          },
          {
            type: "image_url",
            image_url: {
              url: `data:image/png;base64,${base64Image}`,
            },
          },
        ],
      },
    ],
  });

  console.log(response.choices[0].message.content);
  ```
</CodeGroup>

### Supported formats

| Format       | Use case               | Max size            |
| ------------ | ---------------------- | ------------------- |
| **JPEG/JPG** | Photos, general images | 20MB                |
| **PNG**      | Screenshots, diagrams  | 20MB                |
| **GIF**      | Static images only     | 20MB                |
| **WebP**     | Modern web images      | 20MB                |
| **Base64**   | Embedded image data    | Model context limit |
| **URLs**     | Public image links     | Model context limit |

### Detail levels

| Level    | Resolution      | Speed  | Cost   | Use case           |
| -------- | --------------- | ------ | ------ | ------------------ |
| `"low"`  | 512x512         | Fast   | Low    | Quick overview     |
| `"high"` | Full resolution | Slow   | High   | Detailed analysis  |
| `"auto"` | Model decides   | Medium | Medium | Balanced (default) |

Set `detail` in the `image_url` object:

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
  "type": "image_url",
  "image_url": {
    "url": "https://example.com/image.jpg",
    "detail": "high"
  }
}
```
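
The content parts above can also be built programmatically. The following is a minimal sketch, not part of the SDK: `image_part` and `file_to_data_url` are hypothetical helper names. It validates the `detail` value against the three levels in the table and infers the MIME type from the file extension.

```python
import base64
import mimetypes
from pathlib import Path

# Detail levels listed in the table above.
ALLOWED_DETAIL = {"low", "high", "auto"}


def image_part(url: str, detail: str = "auto") -> dict:
    """Build an `image_url` content part with an explicit detail level."""
    if detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {sorted(ALLOWED_DETAIL)}, got {detail!r}")
    return {"type": "image_url", "image_url": {"url": url, "detail": detail}}


def file_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

A local file can then be sent at low detail with `image_part(file_to_data_url("chart.png"), detail="low")` inside the `content` array.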

### Patterns

**Multiple images:**

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const content = [
    { type: "text", text: "Compare these before and after photos. What changes do you notice?" },
    { type: "image_url", image_url: { url: "https://example.com/before.jpg", detail: "high" } },
    { type: "image_url", image_url: { url: "https://example.com/after.jpg", detail: "high" } },
  ];

  const response = await client.chat.completions.create({
    model: "openai/gpt-4o",
    messages: [{ role: "user", content }],
  });
  ```
</CodeGroup>

**OCR and text extraction:**

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await client.chat.completions.create({
    model: "openai/gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Extract all text from this image. Return as plain text, preserving formatting where possible.",
          },
          {
            type: "image_url",
            image_url: { url: imageUrl, detail: "high" },
          },
        ],
      },
    ],
  });
  ```
</CodeGroup>

**Structured output:**

<CodeGroup>
  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from pydantic import BaseModel
  from typing import List

  class ImageAnalysis(BaseModel):
      objects: List[str]
      text_content: str
      dominant_colors: List[str]
      confidence: float

  response = client.beta.chat.completions.parse(
      model="openai/gpt-4o",
      messages=[{
          "role": "user",
          "content": [
              {"type": "text", "text": "Analyze this image systematically"},
              {"type": "image_url", "image_url": {"url": image_url}}
          ]
      }],
      response_format=ImageAnalysis
  )
  ```
</CodeGroup>
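
The structured output example stops at the request. As a sketch of reading the result, assuming the router returns the same `parsed` and `refusal` message fields that the OpenAI Python SDK exposes on `parse()` responses, a small hypothetical helper (`read_parsed`) could look like this:

```python
def read_parsed(response):
    """Return the parsed object from a `parse()` response.

    Raises RuntimeError if the model refused or returned no structured output.
    """
    message = response.choices[0].message
    # `refusal` is set when the model declines to answer.
    if getattr(message, "refusal", None):
        raise RuntimeError(f"model refused: {message.refusal}")
    if message.parsed is None:
        raise RuntimeError("no structured output was returned")
    return message.parsed
```

With the `ImageAnalysis` model above, `read_parsed(response).objects` would give the list of detected objects.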

### Limitations

| Limitation      | Details                 | Workaround                      |
| --------------- | ----------------------- | ------------------------------- |
| **File size**   | 20MB max per image      | Compress before upload          |
| **Image count** | Varies by model (5-16)  | Process in batches              |
| **Video**       | Static images only      | Extract frames for analysis     |
| **Privacy**     | Images sent to provider | Use on-premise models if needed |
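
The "process in batches" workaround can be sketched as follows. `batch_image_messages` is a hypothetical helper, and the default batch size of 5 is an assumption taken from the lower end of the range in the table. Check the actual limit for your model.

```python
from typing import Iterator, List, Sequence


def batch_image_messages(
    prompt: str, image_urls: Sequence[str], batch_size: int = 5
) -> Iterator[List[dict]]:
    """Yield one `messages` list per batch of at most `batch_size` images."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(image_urls), batch_size):
        batch = image_urls[start:start + batch_size]
        # Each request repeats the prompt, followed by that batch's images.
        content = [{"type": "text", "text": prompt}]
        content += [{"type": "image_url", "image_url": {"url": url}} for url in batch]
        yield [{"role": "user", "content": content}]
```

Each yielded list can be passed as `messages` to `client.chat.completions.create`. The per-batch answers then need to be combined by the caller.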

## PDF input

Send PDF documents directly in chat messages for analysis and content extraction using `POST /v3/router/chat/completions`.

<Note>
  PDF input support varies by model. See the [Supported Models](/docs/proxy/supported-models) page and check your provider's documentation for PDF capability.
</Note>

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "openai/gpt-4o",
      "messages": [
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "Please analyze this PDF document and provide a summary"
            },
            {
              "type": "file",
              "file": {
                "file_data": "data:application/pdf;base64,YOUR_BASE64_ENCODED_PDF",
                "filename": "document.pdf"
              }
            }
          ]
        }
      ]
    }'
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os
  import base64

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router"
  )

  with open("document.pdf", "rb") as pdf_file:
      pdf_base64 = base64.b64encode(pdf_file.read()).decode('utf-8')

  response = client.chat.completions.create(
      model="openai/gpt-4o",
      messages=[
          {
              "role": "user",
              "content": [
                  {
                      "type": "text",
                      "text": "Please analyze this PDF document and provide a summary"
                  },
                  {
                      "type": "file",
                      "file": {
                          "file_data": f"data:application/pdf;base64,{pdf_base64}",
                          "filename": "document.pdf"
                      }
                  }
              ]
          }
      ]
  )
  ```

  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";
  import fs from "fs";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const pdfBase64 = fs.readFileSync("document.pdf").toString("base64");

  const response = await client.chat.completions.create({
    model: "openai/gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Please analyze this PDF document and provide a summary",
          },
          {
            type: "file",
            file: {
              file_data: `data:application/pdf;base64,${pdfBase64}`,
              filename: "document.pdf",
            },
          },
        ],
      },
    ],
  });
  ```
</CodeGroup>

### Parameters

**Chat Completions** (`/v3/router/chat/completions`):

| Parameter        | Type     | Required | Description                        |
| ---------------- | -------- | -------- | ---------------------------------- |
| `type`           | `"file"` | Yes      | Content type for file input        |
| `file.file_data` | string   | Yes      | Data URI with base64 PDF content   |
| `file.filename`  | string   | Yes      | Name of the file for model context |

**Responses API** (`/v3/router/responses`):

| Parameter   | Type           | Required | Description                        |
| ----------- | -------------- | -------- | ---------------------------------- |
| `type`      | `"input_file"` | Yes      | Content type for file input        |
| `file_data` | string         | Yes      | Data URI with base64 PDF content   |
| `filename`  | string         | Yes      | Name of the file for model context |

**Format:** `data:application/pdf;base64,{base64_content}`
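
The two parameter tables differ only in how the file fields are nested. A minimal sketch of both shapes, using hypothetical helper names (`pdf_data_uri`, `chat_file_part`, `responses_file_part`):

```python
import base64


def pdf_data_uri(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes in the data URI format shown above."""
    return "data:application/pdf;base64," + base64.b64encode(pdf_bytes).decode("ascii")


def chat_file_part(pdf_bytes: bytes, filename: str) -> dict:
    """Content part for /v3/router/chat/completions (fields nested under `file`)."""
    return {
        "type": "file",
        "file": {"file_data": pdf_data_uri(pdf_bytes), "filename": filename},
    }


def responses_file_part(pdf_bytes: bytes, filename: str) -> dict:
    """Content part for /v3/router/responses (fields at the top level)."""
    return {
        "type": "input_file",
        "file_data": pdf_data_uri(pdf_bytes),
        "filename": filename,
    }
```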

### Use cases

| Scenario               | Example prompt                        |
| ---------------------- | ------------------------------------- |
| **Contract analysis**  | "Extract key terms and obligations"   |
| **Invoice processing** | "Extract amounts, dates, vendor info" |
| **Research papers**    | "Summarize methodology and findings"  |
| **Form extraction**    | "Convert form data to JSON"           |

### Limitations

| Limitation            | Details                                | Workaround             |
| --------------------- | -------------------------------------- | ---------------------- |
| **File size**         | Model context limits                   | Split large PDFs       |
| **Scanned documents** | Quality varies by model                | Use OCR preprocessing  |
| **Complex layouts**   | Tables and charts may not extract well | Use structured prompts |
| **Security**          | Sensitive documents sent to provider   | Use on-premise models  |

## Image generation

Generate images from a text prompt using `POST /v3/router/images/generations`.

<Note>
  For the full and up-to-date list of supported image models, see [Image models](/docs/proxy/supported-models#image-models) on the Supported Models page.
</Note>

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.images.generate({
    model: "openai/gpt-image-1",
    prompt: "A futuristic city skyline at sunset, photorealistic",
    n: 1,
    size: "1024x1024",
  });

  console.log(response.data[0].b64_json?.slice(0, 40));
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import os
  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["ORQ_API_KEY"],
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.images.generate(
      model="openai/gpt-image-1",
      prompt="A futuristic city skyline at sunset, photorealistic",
      n=1,
      size="1024x1024",
  )

  print(response.data[0].b64_json[:40])
  ```
</CodeGroup>

### Parameters

| Parameter            | Description                                                                                              |
| -------------------- | -------------------------------------------------------------------------------------------------------- |
| `model`              | Model ID                                                                                                 |
| `prompt`             | Text description of the desired image                                                                    |
| `n`                  | Number of images to generate                                                                             |
| `size`               | Image dimensions (see [Supported Models](/docs/proxy/supported-models#image-models) for per-model sizes) |
| `response_format`    | `url` or `b64_json`. DALL-E 2/3 only; `gpt-image-1` always returns `b64_json`                            |
| `quality`            | Image quality level. Values vary by model                                                                |
| `style`              | `vivid` or `natural`. DALL-E 3 only                                                                      |
| `background`         | `transparent`, `opaque`, or `auto`. `gpt-image-1` only                                                   |
| `output_format`      | `png`, `jpeg`, or `webp`. `gpt-image-1` only                                                             |
| `output_compression` | Compression level 0-100%. `gpt-image-1` only                                                             |
| `moderation`         | `auto` or `low`. `gpt-image-1` only                                                                      |

With DALL-E 2 and DALL-E 3, set `response_format` to `url` to receive a hosted image link, or to `b64_json` to receive the image inline as a base64-encoded string. `gpt-image-1` always returns `b64_json`, as in this response:

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
  "created": 1234567890,
  "data": [
    {
      "b64_json": "iVBORw0KGgo..."
    }
  ]
}
```
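
To turn a `b64_json` value into a file, decode it and write the bytes to disk. The helper below is an illustrative sketch (`save_b64_image` is a hypothetical name). The file extension should match the `output_format` you requested.

```python
import base64
from pathlib import Path


def save_b64_image(b64_json: str, path: str) -> int:
    """Decode a `b64_json` string, write it to `path`, and return the byte count."""
    image_bytes = base64.b64decode(b64_json)
    Path(path).write_bytes(image_bytes)
    return len(image_bytes)
```

With the SDK response from the example above, this would be called as `save_b64_image(response.data[0].b64_json, "city.png")`.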

### Image editing

Modify an existing image using a prompt and an optional mask with `POST /v3/router/images/edits`.

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import fs from "fs";
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.images.edit({
    model: "openai/gpt-image-1",
    image: fs.createReadStream("original.png"),
    prompt: "Add a sunset sky behind the buildings",
    size: "1024x1024",
  });

  console.log(response.data[0].b64_json?.slice(0, 40));
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import os
  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["ORQ_API_KEY"],
      base_url="https://api.orq.ai/v3/router",
  )

  with open("original.png", "rb") as image_file:
      response = client.images.edit(
          model="openai/gpt-image-1",
          image=image_file,
          prompt="Add a sunset sky behind the buildings",
          size="1024x1024",
      )

  print(response.data[0].b64_json[:40])
  ```
</CodeGroup>

| Parameter         | Description                                                            |
| ----------------- | ---------------------------------------------------------------------- |
| `model`           | Model ID                                                               |
| `image`           | PNG, WEBP, or JPEG file to edit. Some models accept an array of images |
| `prompt`          | Text description of the desired edit                                   |
| `mask`            | Optional PNG mask where transparent areas indicate where to edit       |
| `size`            | Output image dimensions                                                |
| `response_format` | `url` or `b64_json`. `gpt-image-1` always returns `b64_json`           |
| `quality`         | Image quality level. Values vary by model                              |

### Image variations

Generate variations of an existing image with `POST /v3/router/images/variations`. See [Image models](/docs/proxy/supported-models#image-models) for which models support variations.

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import fs from "fs";
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.images.createVariation({
    model: "openai/dall-e-2",
    image: fs.createReadStream("original.png"),
    size: "1024x1024",
    response_format: "url",
  });

  response.data.forEach((img) => console.log(img.url));
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import os
  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["ORQ_API_KEY"],
      base_url="https://api.orq.ai/v3/router",
  )

  with open("original.png", "rb") as image_file:
      response = client.images.create_variation(
          model="openai/dall-e-2",
          image=image_file,
          size="1024x1024",
          response_format="url",
      )

  for img in response.data:
      print(img.url)
  ```
</CodeGroup>

| Parameter         | Description                             |
| ----------------- | --------------------------------------- |
| `model`           | Model ID                                |
| `image`           | PNG image to create a variation of      |
| `n`               | Number of variations to generate (1-10) |
| `size`            | Output image dimensions                 |
| `response_format` | `url` or `b64_json`                     |

### Fallbacks and reliability

Image endpoints support the same `fallbacks` and `retry` parameters as chat completions:

```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
const response = await client.images.generate({
  model: "openai/gpt-image-1",
  prompt: "A mountain lake at dawn",
  size: "1024x1024",
  // @ts-ignore - orq.ai extension
  fallbacks: ["openai/dall-e-3", "openai/dall-e-2"],
});
```
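
In Python, fields that are not part of the OpenAI SDK signature can be sent through the SDK's `extra_body` request option. The sketch below assumes the router reads `fallbacks` from the top level of the request body, as in the TypeScript example. `generate_with_fallbacks` is a hypothetical wrapper, and `client` is an `OpenAI` instance configured as in the earlier examples.

```python
from typing import Sequence


def generate_with_fallbacks(
    client,
    prompt: str,
    model: str = "openai/gpt-image-1",
    fallbacks: Sequence[str] = ("openai/dall-e-3", "openai/dall-e-2"),
    size: str = "1024x1024",
):
    """Call images.generate and pass the router's `fallbacks` field via extra_body."""
    return client.images.generate(
        model=model,
        prompt=prompt,
        size=size,
        # extra_body is merged into the JSON request body by the OpenAI SDK.
        extra_body={"fallbacks": list(fallbacks)},
    )
```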

## Audio

The **AI Router** exposes three OpenAI-compatible audio endpoints. All support [fallbacks](/docs/proxy/retries#fallbacks), [load balancing](/docs/proxy/load-balancing), and [retries](/docs/proxy/retries).

### Text to speech

Convert text to audio using `POST /v3/router/audio/speech`.

| Provider   | Model                                    |
| ---------- | ---------------------------------------- |
| OpenAI     | `openai/tts-1`                           |
| OpenAI     | `openai/tts-1-hd`                        |
| OpenAI     | `openai/gpt-4o-mini-tts`                 |
| ElevenLabs | `elevenlabs/eleven_multilingual_v2`      |
| ElevenLabs | `elevenlabs/eleven_turbo_v2_5`           |
| ElevenLabs | `elevenlabs/eleven_flash_v2_5`           |
| ElevenLabs | `elevenlabs/eleven_flash_v2`             |
| Google AI  | `google-ai/gemini-2.5-flash-preview-tts` |
| Google AI  | `google-ai/gemini-2.5-pro-preview-tts`   |
| Vertex AI  | `google/gemini-2.5-flash-preview-tts`    |
| Vertex AI  | `google/gemini-2.5-pro-preview-tts`      |

<Note>
  For the full and up-to-date list of TTS models, see [Text-to-Speech models](/docs/proxy/supported-models#text-to-speech-models) on the Supported Models page.
</Note>

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";
  import fs from "fs";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.audio.speech.create({
    model: "openai/tts-1",
    voice: "alloy",
    input: "Hello, welcome to Acme Corp. How can I help you today?",
    response_format: "mp3",
  });

  const buffer = Buffer.from(await response.arrayBuffer());
  fs.writeFileSync("output.mp3", buffer);
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import os
  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["ORQ_API_KEY"],
      base_url="https://api.orq.ai/v3/router",
  )

  with client.audio.speech.with_streaming_response.create(
      model="openai/tts-1",
      voice="alloy",
      input="Hello, welcome to Acme Corp. How can I help you today?",
      response_format="mp3",
  ) as response:
      response.stream_to_file("output.mp3")
  ```
</CodeGroup>

**Streaming:** Process audio chunks in real time as they arrive, useful for low-latency playback pipelines.

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  const response = await client.audio.speech.create({
    model: "openai/tts-1",
    voice: "alloy",
    input: "Hello, welcome to Acme Corp. How can I help you today?",
    response_format: "pcm",
  });

  const reader = response.body!.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    processAudioChunk(value);
  }
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  with client.audio.speech.with_streaming_response.create(
      model="openai/tts-1",
      voice="alloy",
      input="Hello, welcome to Acme Corp. How can I help you today?",
      response_format="pcm",
  ) as response:
      for chunk in response.iter_bytes(chunk_size=1024):
          process_audio_chunk(chunk)
  ```
</CodeGroup>
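
The `process_audio_chunk` callback in the streaming examples is left undefined. One possible sink, sketched here, wraps the raw PCM chunks in a WAV container using the standard library. The defaults assume 24 kHz, 16-bit, mono samples, which is what OpenAI documents for its `pcm` format. Other providers may use different parameters, so treat these values as assumptions.

```python
import wave
from typing import Iterable


def write_pcm_to_wav(
    chunks: Iterable[bytes],
    path: str,
    sample_rate: int = 24000,  # assumed: OpenAI's documented PCM sample rate
    channels: int = 1,         # assumed: mono
    sample_width: int = 2,     # assumed: 16-bit samples
) -> int:
    """Write raw PCM chunks to a WAV file and return the number of PCM bytes written."""
    total = 0
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        for chunk in chunks:
            wav_file.writeframes(chunk)
            total += len(chunk)
    return total
```

Inside the Python streaming block above, this could be called as `write_pcm_to_wav(response.iter_bytes(chunk_size=1024), "output.wav")`.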

**Parameters:**

| Parameter         | Description                                                                                  |
| ----------------- | -------------------------------------------------------------------------------------------- |
| `model`           | Model ID                                                                                     |
| `input`           | Text to synthesize. Maximum length varies by provider                                        |
| `voice`           | Voice ID. See voices table below                                                             |
| `response_format` | Output format: `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm`. Supported values vary by provider |
| `speed`           | Playback speed of the generated audio                                                        |

**Voices:**

| Provider   | Voices                                                                                                                                                 |
| ---------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |
| OpenAI     | `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`                                                                                                    |
| ElevenLabs | `aria`, `roger`, `sarah`, `laura`, `charlie`, `george`, `callum`, `river`, `liam`, `charlotte`, `alice`, `matilda`, `will`, `jessica`, `eric`, `chris` |

### Transcription

Transcribe an audio file to text using `POST /v3/router/audio/transcriptions`.

| Provider   | Model                           |
| ---------- | ------------------------------- |
| OpenAI     | `openai/whisper-1`              |
| OpenAI     | `openai/gpt-4o-transcribe`      |
| OpenAI     | `openai/gpt-4o-mini-transcribe` |
| ElevenLabs | `elevenlabs/scribe_v1`          |
| Groq       | `groq/whisper-large-v3`         |
| Groq       | `groq/whisper-large-v3-turbo`   |
| Mistral    | `mistral/voxtral-mini-2507`     |
| Azure      | `azure/whisper`                 |

<Note>
  For the full and up-to-date list of transcription models, see [Speech-to-Text models](/docs/proxy/supported-models#speech-to-text-models) on the Supported Models page.
</Note>

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";
  import fs from "fs";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const transcription = await client.audio.transcriptions.create({
    model: "openai/gpt-4o-transcribe",
    file: fs.createReadStream("meeting.mp3"),
    response_format: "json",
    language: "en",
  });

  console.log(transcription.text);
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import os
  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["ORQ_API_KEY"],
      base_url="https://api.orq.ai/v3/router",
  )

  with open("meeting.mp3", "rb") as audio_file:
      transcription = client.audio.transcriptions.create(
          model="openai/gpt-4o-transcribe",
          file=audio_file,
          response_format="json",
          language="en",
      )

  print(transcription.text)
  ```
</CodeGroup>

**Parameters:**

| Parameter                 | Description                                                                                                                       |
| ------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `model`                   | Model ID                                                                                                                          |
| `file`                    | Audio file to transcribe. Supported formats: `flac`, `mp3`, `mp4`, `mpeg`, `mpga`, `m4a`, `ogg`, `wav`, `webm`                    |
| `language`                | ISO-639-1 language code of the input audio (e.g. `en`, `fr`, `de`)                                                                |
| `prompt`                  | Optional text to guide the model's style or continue a previous segment                                                           |
| `response_format`         | `json`, `text`, `srt`, `verbose_json`, or `vtt`                                                                                   |
| `temperature`             | Sampling temperature between 0 and 1                                                                                              |
| `timestamp_granularities` | Array of granularities: `["word"]`, `["segment"]`, or `["word", "segment"]`. Requires `verbose_json`. Not supported by all models |
| `diarize`                 | Annotate which speaker is talking in the file. ElevenLabs only                                                                    |
| `num_speakers`            | Maximum number of speakers to identify. ElevenLabs only                                                                           |
| `tag_audio_events`        | Tag non-speech events such as `(laughter)` or `(applause)`. ElevenLabs only                                                       |
| `enable_logging`          | Set to `false` to disable logging and enable zero data retention                                                                  |
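
Because `timestamp_granularities` requires `verbose_json`, it can help to set the two together. A minimal sketch with a hypothetical helper (`timestamp_options`) that returns keyword arguments for `client.audio.transcriptions.create`:

```python
from typing import Sequence

# Granularities listed in the parameters table above.
VALID_GRANULARITIES = {"word", "segment"}


def timestamp_options(granularities: Sequence[str]) -> dict:
    """Return request kwargs that pair timestamp granularities with verbose_json."""
    if not granularities:
        raise ValueError("at least one granularity is required")
    unknown = set(granularities) - VALID_GRANULARITIES
    if unknown:
        raise ValueError(f"unsupported granularities: {sorted(unknown)}")
    return {
        "response_format": "verbose_json",
        "timestamp_granularities": list(granularities),
    }
```

For example, `client.audio.transcriptions.create(model="openai/whisper-1", file=audio_file, **timestamp_options(["word"]))`. Timestamps are not supported by all models, so check the model you use.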

### Translation

<Note>
  The OpenAI translation endpoint only supports `openai/whisper-1`. `gpt-4o-transcribe` and `gpt-4o-mini-transcribe` do not support translation.
</Note>

Transcribe and translate audio to English using `POST /v3/router/audio/translations`. The output is always in English regardless of the source language.

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";
  import fs from "fs";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const translation = await client.audio.translations.create({
    model: "openai/whisper-1",
    file: fs.createReadStream("interview_french.mp3"),
    response_format: "json",
  });

  console.log(translation.text);
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import os
  from openai import OpenAI

  client = OpenAI(
      api_key=os.environ["ORQ_API_KEY"],
      base_url="https://api.orq.ai/v3/router",
  )

  with open("interview_french.mp3", "rb") as audio_file:
      translation = client.audio.translations.create(
          model="openai/whisper-1",
          file=audio_file,
          response_format="json",
      )

  print(translation.text)
  ```
</CodeGroup>

Translation supports the same `response_format` and `temperature` parameters as transcription.
