Create response

POST /v2/gateway/responses

Example request:

curl --request POST \
  --url https://api.orq.ai/v2/gateway/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "input": "<string>",
  "metadata": {},
  "temperature": 1,
  "top_p": 0.5,
  "previous_response_id": "<string>",
  "instructions": "<string>",
  "reasoning": {
    "effort": "low"
  },
  "max_output_tokens": 123,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "include": [
    "code_interpreter_call.outputs"
  ],
  "parallel_tool_calls": true,
  "store": true,
  "tools": [
    {
      "type": "function",
      "name": "<string>",
      "parameters": {
        "type": "object",
        "properties": {},
        "required": [
          "<string>"
        ],
        "additionalProperties": true
      },
      "description": "<string>",
      "strict": true
    }
  ],
  "tool_choice": "none",
  "stream": false
}
'

Example response:

{
  "id": "<string>",
  "object": "response",
  "created_at": 123,
  "status": "completed",
  "error": {
    "code": "<string>",
    "message": "<string>"
  },
  "incomplete_details": {
    "reason": "max_output_tokens"
  },
  "model": "<string>",
  "output": [
    {
      "id": "<string>",
      "type": "message",
      "role": "assistant",
      "status": "in_progress",
      "content": []
    }
  ],
  "parallel_tool_calls": true,
  "instructions": "<string>",
  "output_text": "<string>",
  "usage": {
    "input_tokens": 123,
    "output_tokens": 123,
    "total_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens_details": {
      "reasoning_tokens": 123,
      "accepted_prediction_tokens": 123,
      "rejected_prediction_tokens": 123
    }
  },
  "temperature": 123,
  "top_p": 123,
  "max_output_tokens": 123,
  "previous_response_id": "<string>",
  "metadata": {},
  "tool_choice": "none",
  "tools": [
    {
      "type": "function",
      "name": "<string>",
      "parameters": {
        "type": "object",
        "properties": {},
        "required": [
          "<string>"
        ],
        "additionalProperties": true
      },
      "description": "<string>",
      "strict": true
    }
  ],
  "reasoning": {
    "effort": "<string>",
    "summary": "<string>"
  },
  "store": true,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "truncation": "disabled",
  "user": "<string>",
  "service_tier": "auto",
  "background": true,
  "top_logprobs": 10,
  "logprobs": true
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
model
string
required

ID of the model to use. You can use the List models API to see all of your available models.

input
string | object[]
required

The user input(s) for the model. Can be a simple string or an array of structured input items (messages, tool outputs) representing a conversation history or complex input.
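
As a sketch of the two accepted shapes (the array form assumes the common role/content message layout; confirm against the full schema):

{
  "model": "<string>",
  "input": "What is the capital of France?"
}

Or, as a structured conversation history:

{
  "model": "<string>",
  "input": [
    { "role": "user", "content": "What is the capital of France?" },
    { "role": "assistant", "content": "Paris." },
    { "role": "user", "content": "And of Italy?" }
  ]
}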

metadata
object

Developer-defined key-value pairs that will be included in response objects.

temperature
number | null

What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

Required range: 0 <= x <= 2
top_p
number | null

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

Required range: 0 <= x <= 1
previous_response_id
string | null

The ID of a previous response to continue the conversation from. The model will have access to the previous response context.
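
As a sketch, a follow-up request that continues from an earlier response (the response ID value here is hypothetical):

curl --request POST \
  --url https://api.orq.ai/v2/gateway/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "<string>",
    "previous_response_id": "resp_abc123",
    "input": "Shorten that answer to one sentence."
  }'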

instructions
string | null

Developer-provided instructions that the model should follow. Overwrites the default system message.

reasoning
object

Configuration options for reasoning models, such as the reasoning effort level.

max_output_tokens
integer | null

The maximum number of tokens that can be generated in the response.

text
object

Configuration options for the model's text output format.

include
enum<string>[] | null

Specifies which (potentially large) fields to include in the response. By default, the results of Code Interpreter and file searches are excluded; a request sketch follows the list. Available options:

  • code_interpreter_call.outputs: Include the outputs of Code Interpreter tool calls
  • computer_call_output.output.image_url: Include the image URLs from computer use tool calls
  • file_search_call.results: Include the results of file search tool calls
  • message.input_image.image_url: Include URLs of input images
  • message.output_text.logprobs: Include log probabilities for output text (when logprobs is enabled)
  • reasoning.encrypted_content: Include encrypted reasoning content for reasoning models
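
For instance, opting in to Code Interpreter outputs on a request (a sketch; the tool configuration that would produce those outputs is omitted):

{
  "model": "<string>",
  "input": "<string>",
  "include": ["code_interpreter_call.outputs"]
}
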
parallel_tool_calls
boolean | null

Whether to enable parallel function calling during tool use.

store
boolean | null
default:true

Whether to store this response for use in distillations or evals.

tools
object[]

A list of tools the model may call. Use this to provide a list of functions the model may generate JSON inputs for.

The example schema shows a function tool definition (a concrete sketch follows); other tool variants exist.
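
As a concrete sketch, a hypothetical get_weather function tool following the schema above:

{
  "type": "function",
  "name": "get_weather",
  "description": "Get the current weather for a city.",
  "parameters": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "City name, e.g. Paris"
      }
    },
    "required": ["city"],
    "additionalProperties": false
  },
  "strict": true
}
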
tool_choice
enum<string> | object

How the model should select which tool (or tools) to use when generating a response. Can be a string (none, auto, required) or an object that forces a specific tool; an object-form sketch follows the list.

Available options:
none,
auto,
required
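
To force the hypothetical tool defined above, the object form of tool_choice would look like this, assuming the OpenAI-compatible shape (confirm against the full schema):

{
  "type": "function",
  "name": "get_weather"
}
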
stream
boolean
default:false

Whether to stream the response. When false (the default), the completed response object is returned in a single payload; when true, the endpoint returns a stream of events (a request sketch follows).

Response

Returns a response object or a stream of events.

Represents the completed model response returned when stream is false.

id
string
required

The unique identifier for the response.

object
enum<string>
required

The object type, which is always "response".

Available options:
response
created_at
number
required

The Unix timestamp (in seconds) of when the response was created.

status
enum<string>
required

The status of the response.

Available options:
completed,
failed,
in_progress,
incomplete
error
object
required

The error that occurred, if any.

incomplete_details
object
required

Details about why the response is incomplete.

model
string
required

The model used to generate the response.

output
object[]
required

The list of output items generated by the model. The example schema shows an assistant message output; other output item variants exist.

parallel_tool_calls
boolean
required

Whether parallel function calling was enabled during tool use.

instructions
string | null

The instructions provided for the response.

output_text
string | null

A convenience field containing the concatenated text from all text content parts.
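
For instance, a non-streaming call's text can be pulled straight out with jq (a convenience sketch):

curl --request POST \
  --url https://api.orq.ai/v2/gateway/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{"model": "<string>", "input": "<string>"}' \
  | jq -r '.output_text'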

usage
object

Usage statistics for the response.

temperature
number | null

The sampling temperature used for the response.

top_p
number | null

The nucleus sampling value used for the response.

max_output_tokens
integer | null

The maximum number of output tokens requested for the response.

previous_response_id
string | null

The ID of the previous response in the conversation, if any.

metadata
object

Developer-defined key-value pairs attached to the response.

tool_choice
enum<string> | object

Controls which (if any) tool is called by the model.

Available options:
none,
auto,
required
tools
object[]

The list of tools that were available to the model. The example schema shows a function tool definition; other tool variants exist.

reasoning
object

The reasoning configuration used for the response.

store
boolean

Whether the response was stored.

text
object

The text output format configuration used for the response.

truncation
enum<string> | null
default:disabled

Controls how the model handles inputs longer than the maximum token length.

Available options:
auto,
disabled
user
string | null

A unique identifier representing your end-user.

service_tier
enum<string> | null

The service tier used for processing the request.

Available options:
auto,
default
background
boolean | null

Whether the response was processed in the background.

top_logprobs
integer | null

The number of top log probabilities to return for each output token.

Required range: 0 <= x <= 20
logprobs
boolean | null

Whether to return log probabilities of the output tokens.