Creates a model response for the given input. Returns a response object or a stream of server-sent events.
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "input": "What is the capital of France?"
  }'
{
  "id": "resp_01KP6DFC5FB7K7K10TVP60PF81",
  "object": "response",
  "created_at": 1776184439,
  "completed_at": 1776184439,
  "status": "completed",
  "model": "openai/gpt-4o",
  "output": [
    {
      "type": "message",
      "id": "msg_01KP6DFCG3RF80BEBXP06XX258",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "The capital of France is Paris.",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 83,
    "output_tokens": 54,
    "total_tokens": 137,
    "input_tokens_details": {
      "cached_tokens": 0,
      "cache_creation_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 38
    }
  }
}
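The response body above can be consumed from any HTTP client. A minimal Python sketch (standard library only, with the sample response inlined) that pulls out the assistant text and token usage:

```python
import json

# The JSON body returned by the call above (abridged to the fields used here).
raw = '''{
  "id": "resp_01KP6DFC5FB7K7K10TVP60PF81",
  "object": "response",
  "status": "completed",
  "model": "openai/gpt-4o",
  "output": [
    {"type": "message", "role": "assistant", "status": "completed",
     "content": [{"type": "output_text",
                  "text": "The capital of France is Paris.",
                  "annotations": []}]}
  ],
  "usage": {"input_tokens": 83, "output_tokens": 54, "total_tokens": 137}
}'''

resp = json.loads(raw)

# Collect every output_text part from message items in the output array;
# output can also contain function calls, reasoning items, etc.
texts = [
    part["text"]
    for item in resp["output"] if item["type"] == "message"
    for part in item["content"] if part["type"] == "output_text"
]

print(texts[0])                       # The capital of France is Paris.
print(resp["usage"]["total_tokens"])  # 137
```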
This endpoint implements the OpenResponses specification: a multi-provider, interoperable LLM interface. orq.ai extends the spec with platform features such as variables, memory, identity, and orq.ai tools. For a comprehensive guide with examples, see the Responses API documentation. For function tool continuation, see the step-by-step guide.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Conversation context for multi-turn interactions.
Fallback models to try if the primary model fails. Each entry specifies a model in provider/model format.
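A sketch of a request with fallbacks. The field name (`fallback_models`) and the entry shape (plain provider/model strings) are assumptions based on the description above, and the model names are illustrative:

```python
# Fallbacks sketch; "fallback_models", the string entry shape, and the
# model names are assumptions, not confirmed by this page.
payload = {
    "model": "openai/gpt-4o",
    "fallback_models": ["anthropic/claude-sonnet-4", "google/gemini-2.5-flash"],
    "input": "What is the capital of France?",
}

# Every entry follows the provider/model format described above.
assert all(m.count("/") == 1 for m in payload["fallback_models"])
```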
Penalize new tokens based on their frequency in the text so far. Between -2.0 and 2.0.
Identity/contact information for the end-user.
Input to the model: a string or an array of input items (messages, files, etc.).
System prompt / instructions for the model.
Bound agent-loop execution. Fields: max_iterations (LLM turns), max_execution_time (seconds), max_cost (USD; send 0 to disable a manifest-configured cap), max_depth (sub-agent nesting), tool_timeout (seconds). Body values override agent-manifest defaults.
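The five limit fields above can be sketched as a request body. The field names come from this page; the wrapper key (`execution` here) and the agent name are assumptions, since the parameter name itself is not shown:

```python
# Agent-loop limits sketch; the "execution" wrapper key and the agent name
# are assumptions. Body values override agent-manifest defaults.
payload = {
    "model": "agent/support-agent",   # hypothetical pre-configured agent
    "input": "Summarize the open tickets.",
    "execution": {
        "max_iterations": 10,         # LLM turns
        "max_execution_time": 120,    # seconds
        "max_cost": 0,                # USD; 0 disables a manifest-configured cap
        "max_depth": 2,               # sub-agent nesting
        "tool_timeout": 30,           # seconds
    },
}
```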
Maximum number of tokens in the response output.
Maximum number of tool call rounds in the agentic loop.
Attach a memory store entity to enable persistent memory across requests. See Memory Stores documentation for setup.
Developer-defined key-value pairs attached to the response (OpenAI spec: Map<string, string>). Non-string values are rejected with a 400.
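A local sketch of the Map&lt;string, string&gt; constraint described above; the server performs the authoritative check and answers 400 on violation:

```python
# Client-side mirror of the metadata rule: keys and values must be strings.
def validate_metadata(metadata: dict) -> None:
    for key, value in metadata.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise ValueError(f"metadata entries must be strings: {key!r}={value!r}")

validate_metadata({"team": "billing", "ticket": "T-1042"})  # passes

rejected = False
try:
    validate_metadata({"retry_count": 3})  # non-string value
except ValueError:
    rejected = True
```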
The model to use in provider/model format (e.g. openai/gpt-4o). Use agent/ to invoke a pre-configured agent from the orq.ai platform.
Whether to allow parallel tool calls.
Penalize new tokens based on their presence in the text so far. Between -2.0 and 2.0.
The ID of a previous response to continue from. Requires store to be true (default) on the original response.
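A multi-turn sketch: the follow-up request points at the stored first response by id. This only works when the first response kept the default store=true:

```python
# Continuation sketch; the id is taken from the example response on this page.
first_response_id = "resp_01KP6DFC5FB7K7K10TVP60PF81"

follow_up = {
    "model": "openai/gpt-4o",
    "previous_response_id": first_response_id,
    "input": "And what is its population?",
}
```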
Key for prompt caching across requests.
Configure reasoning behavior. Set effort (none, minimal, low, medium, high, xhigh) to control how much the model thinks before answering. Higher effort means more reasoning tokens and better answers for complex tasks, at higher cost.
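A reasoning sketch. The effort levels are listed on this page; the wrapper key (`reasoning`) is an assumption, since the parameter name is not shown:

```python
# Effort levels from the description above, lowest to highest; the
# "reasoning" wrapper key is an assumption.
EFFORT_LEVELS = ("none", "minimal", "low", "medium", "high", "xhigh")

payload = {
    "model": "openai/gpt-4o",
    "input": "Prove that the sum of two even numbers is even.",
    "reasoning": {"effort": "high"},  # more reasoning tokens, higher cost
}

assert payload["reasoning"]["effort"] in EFFORT_LEVELS
```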
Retry configuration. Specify the number of retries and which HTTP status codes should trigger a retry.
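A client-side view of the retry rule described above: a bounded number of retries, triggered only by configured HTTP status codes. This is illustrative; the router applies the configuration server-side, and the code list here is a hypothetical choice:

```python
# Hypothetical set of status codes configured to trigger a retry.
RETRY_CODES = {429, 500, 502, 503}

def should_retry(status: int, attempt: int, max_retries: int = 2) -> bool:
    return status in RETRY_CODES and attempt < max_retries

assert should_retry(429, attempt=0)        # rate limited: retry
assert not should_retry(404, attempt=0)    # not a configured code: give up
assert not should_retry(500, attempt=2)    # retry budget exhausted: give up
```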
Safety identifier for content filtering.
Whether to persist the response (default: true). When false, the response cannot be retrieved later and previous_response_id will not work for follow-up requests.
If true, returns a stream of server-sent events.
Sampling temperature between 0 and 2.
Template engine for variable substitution in instructions. Defaults to the agent manifest's engine when invoking an agent, otherwise text. One of: text, jinja, mustache.
Configuration for text output.
Thread for grouping related requests.
How the model should use the provided tools. Can be a string shorthand (auto, none, required) or a specific function selector.
Tools available to the model.
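A function tool sketch following the OpenAI Responses-style shape; verify the exact field layout against the OpenResponses spec linked above. The function name and schema are hypothetical:

```python
# Tools + tool_choice sketch; "get_weather" and its schema are illustrative.
payload = {
    "model": "openai/gpt-4o",
    "input": "What's the weather in Paris?",
    "tool_choice": "auto",  # or "none" / "required" / a function selector
    "tools": [
        {
            "type": "function",
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
}

assert payload["tool_choice"] in ("auto", "none", "required")
```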
A tool definition. The "type" field determines the tool kind.
Number of most likely tokens to return at each position.
Nucleus sampling parameter.
Template variables for prompt substitution. Plain values fill {{variable}} placeholders in instructions. For secrets, use {"secret": true, "value": "sensitive-data"} — secrets are automatically passed to platform tools (Python, HTTP, MCP) and redacted from traces.
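The two variable shapes above can be sketched locally. The substitution function is a client-side illustration of the plain-value behavior (the platform does this server-side); secrets keep the {"secret": true, "value": ...} shape and are not interpolated:

```python
import re

# One plain value and one secret; the secret value here is illustrative.
variables = {
    "customer": "Ada",
    "api_token": {"secret": True, "value": "sk-sensitive"},
}

def fill(template: str, variables: dict) -> str:
    """Sketch of plain-value {{variable}} substitution. Secrets are left
    untouched: they flow to platform tools and are redacted from traces."""
    def sub(match):
        value = variables.get(match.group(1), match.group(0))
        return value if isinstance(value, str) else match.group(0)
    return re.sub(r"\{\{(\w+)\}\}", sub, template)

instructions = "Greet {{customer}} politely."
print(fill(instructions, variables))  # Greet Ada politely.
```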
Returns a response object or a stream of events.
Array of input items (messages, function call outputs, etc.)
Developer-defined key-value pairs attached to the response (OpenAI spec: Map<string, string>).
Always "response"
Array of output items (messages, function calls, reasoning, etc.)
Service tier: auto, default, flex, priority.
Status: queued, in_progress, completed, failed, incomplete, requires_action.
Text output configuration including format and verbosity.
Tool choice setting: "auto", "none", "required", or a specific function
Array of tool configurations used in this response
Truncation strategy: disabled, auto.