The Responses API (/v3/router/responses) is orq.ai’s primary endpoint for interacting with LLMs. It implements the OpenResponses specification — an open, multi-provider, interoperable interface for language models — and extends it with orq.ai platform features.

Why Responses API?

The Responses API replaces the traditional chat completions pattern with a stateful, item-based model:
  • Stateful by default — responses are stored and can be continued with previous_response_id, eliminating the need to resend the full conversation history.
  • Semantic streaming — events like response.output_text.delta and response.function_call_arguments.delta instead of raw token chunks.
  • Native tool calling — built-in agentic loop where the model calls tools, receives results, and continues automatically.
  • Multi-provider — same API for OpenAI, Anthropic, Google, AWS Bedrock, Azure, Groq, and more.

Quick start

Send a simple text prompt:
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "input": "What is the capital of France?"
  }'

Model selection

The model field accepts two formats:
Format            Example                   Description
provider/model    openai/gpt-4o             Direct model invocation
agent/<key>       agent/customer_support    Invoke a pre-configured agent from the orq.ai platform
When using agent/<key>, the agent’s instructions, tools, model, and settings are applied automatically. You can still override parameters like input, variables, and identity per request. Create agents in the Agent Studio or via the Agents API.
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "agent/customer_support",
    "input": "I need help with my order #12345",
    "variables": {
      "customer_tier": "premium"
    }
  }'

Multi-turn conversations

Responses are stored by default (store: true). To continue a conversation, pass the previous_response_id:
# 1. Initial request
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o", "input": "My name is Alice"}'

# 2. Follow-up — the model remembers the previous context
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "previous_response_id": "resp_01KP6DFC5FB7K7K10TVP60PF81",
    "input": "What is my name?"
  }'
Setting store to false disables persistence. The response cannot be retrieved later and previous_response_id will not work for follow-up requests.
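The chaining above can be sketched as a small client helper that threads previous_response_id across turns. This is an illustrative sketch, not an official SDK: the HTTP call is injected as a plain callable so the chaining logic stands on its own, and it assumes the response JSON exposes its identifier in an id field (as the resp_... value in the example suggests).

```python
from typing import Callable

API_URL = "https://api.orq.ai/v3/router/responses"

def make_turn_sender(post: Callable[[dict], dict]):
    """Return a send(model, text) callable that threads previous_response_id
    across turns. `post` performs the actual POST to API_URL (injected here
    so the chaining logic is visible without network access)."""
    state = {"prev": None}

    def send(model: str, text: str) -> dict:
        body = {"model": model, "input": text}
        if state["prev"] is not None:
            # Continue the stored conversation instead of resending history.
            body["previous_response_id"] = state["prev"]
        resp = post(body)
        state["prev"] = resp["id"]  # assumes the response carries its id in `id`
        return resp

    return send
```

Because the server stores the conversation (store: true), each follow-up body stays small: only the new input plus the previous response id, never the full transcript.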

Tool calling

Function tools

Define custom functions the model can call. When the model decides to use a tool, the response status is requires_action with a function_call output item. Provide the result in a follow-up request. See the Function Tool Continuation guide for a complete walkthrough.
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "input": "What is the weather in Paris?",
    "tools": [{
      "type": "function",
      "name": "get_weather",
      "description": "Get current weather for a location",
      "parameters": {
        "type": "object",
        "properties": { "location": { "type": "string" } },
        "required": ["location"]
      }
    }]
  }'
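The requires_action cycle described above can be sketched as a loop: send a request, execute any function_call output items locally, and post the results back. This is a hedged sketch, not the platform's client: the continuation item shape (a function_call_output item carrying call_id and output, sent as input alongside previous_response_id) is an assumption based on the OpenResponses item model — the Function Tool Continuation guide is the authoritative reference.

```python
import json

def run_tool_loop(post, body, handlers):
    """Drive the requires_action cycle for function tools. `post` sends a
    request body and returns the parsed response; `handlers` maps a tool
    name to a callable that takes the parsed arguments dict."""
    resp = post(body)
    while resp.get("status") == "requires_action":
        results = []
        for item in resp.get("output", []):
            if item.get("type") != "function_call":
                continue
            args = json.loads(item["arguments"])  # arguments arrive as a JSON string
            out = handlers[item["name"]](args)    # run your local implementation
            results.append({
                "type": "function_call_output",
                "call_id": item["call_id"],
                "output": json.dumps(out),
            })
        resp = post({  # feed the tool results back as the next turn's input
            "model": body["model"],
            "previous_response_id": resp["id"],
            "input": results,
        })
    return resp
```

The loop keeps running until the model stops requesting tools, which matches the agentic behavior the API performs automatically for platform tools.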

orq.ai platform tools

Use built-in tools without writing any server-side code:
Tool                Description
orq:current_date    Returns current date/time.
orq:google_search   Performs a Google search and returns results.
orq:web_scraper     Scrapes web page content from a URL.
orq:mcp             Invokes an MCP tool by tool_id.
orq:http            Invokes an HTTP tool by tool_id.
orq:function        Invokes a platform function tool by tool_id.
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "input": "What time is it?",
    "tools": [{ "type": "orq:current_date" }]
  }'

Reasoning

The reasoning parameter controls how much the model thinks before answering:
Effort    Description
none      No reasoning
minimal   Very brief reasoning
low       Light reasoning
medium    Balanced (default)
high      Thorough reasoning
xhigh     Maximum reasoning depth
{
  "reasoning": { "effort": "high" }
}

Variables and secrets

Use variables to substitute {{placeholder}} values in instructions or agent prompts. For sensitive data (API keys, tokens), wrap the value in a secret object — {"secret": true, "value": "..."}. Secrets don’t need to be referenced in the input or instructions — they are automatically passed to platform tools (Python code, HTTP, MCP) when executed. The secret value is:
  • Passed to platform tools — Python code tools, HTTP tools, and MCP tools receive the decrypted value at execution time
  • Stripped from the response — secret variable keys do not appear in response.variables
  • Redacted in OTEL traces — replaced with *** in all trace spans
  • Not persisted across continuations — you must re-send secrets with each request
{
  "input": "Fetch the latest data from the API",
  "variables": {
    "name": "Alice",
    "api_token": { "secret": true, "value": "sk-secret-123" }
  }
}

Secrets in platform tools

When an agent uses Python code tools, HTTP tools, or MCP tools, secret variables are available automatically:
  • Python tools: Variables are merged into the tool’s params and injected as local Python variables. A tool with result = f"Token: {api_token}" receives the decrypted value as the api_token variable at runtime. The tool output is returned to the model as-is, but the secret is redacted in traces.
  • HTTP tools: {{api_token}} in URLs, headers, or request bodies is replaced with the real value before the HTTP call is made.
  • Subagents: Secret variables propagate to subagents in multi-agent systems via the variables context — each subagent receives the same variables (including secrets) and can use them in its own tools. Secrets are redacted from all traces across the entire execution chain.

Multimodal input

Send images and files alongside text using the array input format:
{
  "model": "openai/gpt-4o",
  "input": [{
    "type": "message",
    "role": "user",
    "content": [
      { "type": "input_text", "text": "What is in this image?" },
      { "type": "input_image", "image_url": "https://picsum.photos/id/237/200/300", "detail": "low" }
    ]
  }]
}
Supported content part types:
Type          Fields                                              Description
input_text    text                                                Plain text
input_image   image_url, file_id, detail                          Image from URL or uploaded file
input_file    file_id, file_data, file_url, filename, mime_type   PDF or document

Streaming

Set stream: true to receive server-sent events. Key event types:
Event                                    Description
response.created                         Response object created
response.in_progress                     Processing started
response.output_text.delta               Text token streamed
response.output_text.done                Text output complete
response.reasoning.delta                 Reasoning token streamed
response.reasoning.done                  Reasoning output complete
response.function_call_arguments.delta   Function call argument streamed
response.function_call_arguments.done    Function call arguments complete
response.output_item.added               New output item (message, function_call, reasoning)
response.output_item.done                Output item complete
response.completed                       Full response complete with usage
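A consumer of this stream typically filters on the event name and accumulates the text deltas. The sketch below assumes standard SSE framing (event: / data: lines separated by blank lines) and that delta events carry their text chunk in a delta field of the JSON payload — verify the payload shape against the actual stream for your provider.

```python
import json

def collect_text(sse_lines) -> str:
    """Accumulate assistant text from an iterator of raw SSE lines by
    concatenating the payloads of response.output_text.delta events."""
    chunks, event = [], None
    for line in sse_lines:
        line = line.strip()
        if line.startswith("event:"):
            event = line.split(":", 1)[1].strip()
        elif line.startswith("data:") and event == "response.output_text.delta":
            payload = json.loads(line.split(":", 1)[1])
            chunks.append(payload["delta"])  # assumed field name for the text chunk
    return "".join(chunks)
```

The same pattern extends to the other semantic events, e.g. buffering response.function_call_arguments.delta payloads until the matching .done event arrives.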

Memory

Attach a memory store entity to enable persistent memory across conversations:
{
  "model": "openai/gpt-4o",
  "input": "What do you remember about me?",
  "memory": { "entity_id": "mem_entity_123" }
}

API reference

Create Response

POST /v3/router/responses

Retrieve Response

GET /v3/router/responses/