POST /v2/gateway/audio/transcriptions
Create transcription
curl --request POST \
  --url https://api.orq.ai/v2/gateway/audio/transcriptions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'model=<string>' \
  --form 'prompt=<string>' \
  --form enable_logging=true \
  --form diarize=false \
  --form response_format=json \
  --form tag_audio_events=true \
  --form num_speakers=2 \
  --form timestamps_granularity=word \
  --form temperature=0.5 \
  --form 'language=<string>' \
  --form 'timestamp_granularities[0]=word' \
  --form 'timestamp_granularities[1]=segment' \
  --form 'orq={
  "fallbacks": [
    {
      "model": "openai/gpt-4o-mini"
    }
  ],
  "retry": {
    "count": 3,
    "on_codes": [
      429,
      500,
      502,
      503,
      504
    ]
  },
  "contact": {
    "id": "contact_01ARZ3NDEKTSV4RRFFQ69G5FAV",
    "display_name": "Jane Doe",
    "email": "[email protected]",
    "metadata": [
      {
        "department": "Engineering",
        "role": "Senior Developer"
      }
    ],
    "logo_url": "https://example.com/avatars/jane-doe.jpg",
    "tags": [
      "hr",
      "engineering"
    ]
  },
  "load_balancer": [
    {
      "model": "openai/gpt-4o",
      "weight": 0.7
    },
    {
      "model": "anthropic/claude-3-5-sonnet",
      "weight": 0.3
    }
  ],
  "timeout": {
    "call_timeout": 30000
  }
}' \
  --form file='@example-file'
{
  "text": "<string>"
}
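
The same request can be made from any HTTP client. A minimal Python sketch using the requests library (the token, model ID, and file path are placeholders):

import requests

API_TOKEN = "<token>"  # placeholder: your orq.ai auth token

# requests builds the multipart/form-data body and boundary automatically,
# so no explicit Content-Type header is needed.
with open("meeting.mp3", "rb") as audio:
    response = requests.post(
        "https://api.orq.ai/v2/gateway/audio/transcriptions",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        data={"model": "<string>", "response_format": "json"},
        files={"file": audio},
    )

response.raise_for_status()
print(response.json()["text"])  # the json format returns {"text": "..."}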

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

multipart/form-data

Transcribes audio into the input language.

model
string
required

ID of the model to use

prompt
string

An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.

enable_logging
boolean
default:true

When enable_logging is set to false, zero retention mode is used. This disables history features like request stitching and is only available to enterprise customers.

diarize
boolean
default:false

Whether to annotate which speaker is currently talking in the uploaded file.

response_format
enum<string>

The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt.

Available options:
json,
text,
srt,
verbose_json,
vtt
tag_audio_events
boolean
default:true

Whether to tag audio events like (laughter), (footsteps), etc. in the transcription.

num_speakers
number

The maximum number of speakers talking in the uploaded file. Helps the model predict who speaks when; the maximum is 32.

timestamps_granularity
enum<string>
default:word

The granularity of the timestamps in the transcription. word provides word-level timestamps and character provides character-level timestamps within each word.

Available options:
none,
word,
character
temperature
number

The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.

Example:

0.5

language
string

The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.

timestamp_granularities
enum<string>[]

The timestamp granularities to populate for this transcription. response_format must be set to verbose_json to use timestamp granularities. Either or both of these options are supported: "word" or "segment". Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.

Available options:
word,
segment
Example:
["word", "segment"]
orq
object

Gateway options for this request, such as fallback models, retry policy, contact attribution, load balancing, and timeouts, passed as a JSON-encoded form field (see the example request above).
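
Since orq is a JSON object carried inside a multipart form field, it must be serialized before sending. A Python sketch reusing the gateway options from the example request (token, model, and file path are placeholders):

import json

import requests

API_TOKEN = "<token>"  # placeholder

# Gateway options from the example request: a fallback model, retries on
# transient errors, weighted load balancing, and a 30-second call timeout.
orq_options = {
    "fallbacks": [{"model": "openai/gpt-4o-mini"}],
    "retry": {"count": 3, "on_codes": [429, 500, 502, 503, 504]},
    "load_balancer": [
        {"model": "openai/gpt-4o", "weight": 0.7},
        {"model": "anthropic/claude-3-5-sonnet", "weight": 0.3},
    ],
    "timeout": {"call_timeout": 30000},
}

with open("meeting.mp3", "rb") as audio:
    response = requests.post(
        "https://api.orq.ai/v2/gateway/audio/transcriptions",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        data={"model": "<string>", "orq": json.dumps(orq_options)},
        files={"file": audio},
    )
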
file
file

The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

Response

Returns the transcription or verbose transcription

text
string
required