The Responses API can invoke tools hosted on any MCP-compatible server. Point a request at an MCP server and the model can discover and call its tools inside the same agentic loop as built-in and custom tools — no extra round-trips through your application.
orq.ai supports two ways to reference an MCP server in a Responses API call:
| Mode | When to use |
|---|---|
| Pre-saved by key | Most cases. Save the server once under Tools, reference it by key from any Responses call in the workspace. Credentials stay encrypted at rest. |
| Inline | One-off calls, or servers you haven’t saved under Tools. Credentials are supplied in the request body (templated secrets still work). |
Both modes accept an allowed_tools filter so you can expose only the subset of tools relevant to a given call.
Quick start
Pre-saved server
First, save the MCP server as a tool — use the Create Tool endpoint or Tool Studio:
curl -X POST https://api.orq.ai/v2/tools \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"path": "Default",
"key": "linear_mcp",
"display_name": "Linear MCP",
"description": "Linear issue tracker via MCP",
"status": "live",
"type": "mcp",
"mcp": {
"server_url": "https://mcp.linear.app/mcp",
"connection_type": "http",
"headers": {
"Authorization": { "value": "Bearer lin_api_...", "encrypted": true }
}
}
}'
Then reference it in a Responses call by key:
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "List the teams in Linear",
"tools": [
{ "type": "mcp", "key": "linear_mcp" }
]
}'
The model sees every tool the MCP server exposes. Credentials are decrypted at execution time and never leave orq.ai.
Inline server
Supply the server details directly in the request body. Use this for one-off calls or when you haven’t saved the server under Tools yet:
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "List the teams in Linear",
"tools": [{
"type": "mcp",
"server_url": "https://mcp.linear.app/mcp",
"server_description": "Linear issue tracker",
"headers": {
"Authorization": "Bearer lin_api_..."
}
}]
}'
Inline servers accept the same fields as pre-saved ones. There’s no at-rest encryption — the secret rides in the request and is kept in memory only for the life of the call.
Large MCP servers can ship dozens of tools. Passing every one of them to the model inflates the prompt, slows tool selection, and sometimes exceeds provider limits. Use allowed_tools to narrow the list:
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "List the teams and also list the issues in Linear",
"tools": [{
"type": "mcp",
"key": "linear_mcp",
"allowed_tools": ["list_teams", "list_issues"]
}]
}'
Three accepted shapes:
| Value | Meaning |
|---|---|
| ["tool_a", "tool_b"] | Only expose these tool names. |
| {"tool_names": ["tool_a"]} | Same as above, object form. |
| {"read_only": true} | Only expose tools the server marks readOnlyHint: true. Combine with tool_names to intersect. |
Tools not in the filter are invisible to the model and cannot be invoked, even if the model guesses their names.
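The filter semantics can be sketched client-side. The helper below is illustrative, not part of the orq.ai API; the catalog entries mimic the tool list an MCP server returns, with the readOnlyHint annotation from the MCP spec:

```python
def apply_allowed_tools(tools, allowed):
    """Illustrative sketch of allowed_tools semantics.

    tools: catalog entries like {"name": ..., "annotations": {"readOnlyHint": bool}}
    allowed: None, a list of names, or the object form
    ({"tool_names": [...]} and/or {"read_only": True}).
    """
    if allowed is None:
        return tools  # no filter: every tool is exposed
    if isinstance(allowed, list):
        names = set(allowed)
        return [t for t in tools if t["name"] in names]
    result = tools
    if "tool_names" in allowed:
        names = set(allowed["tool_names"])
        result = [t for t in result if t["name"] in names]
    if allowed.get("read_only"):
        # read_only intersects with tool_names when both are present
        result = [t for t in result if t.get("annotations", {}).get("readOnlyHint")]
    return result

catalog = [
    {"name": "list_teams", "annotations": {"readOnlyHint": True}},
    {"name": "create_issue", "annotations": {"readOnlyHint": False}},
    {"name": "list_issues", "annotations": {"readOnlyHint": True}},
]
# List form: only the named tools survive.
print([t["name"] for t in apply_allowed_tools(catalog, ["list_teams", "list_issues"])])
# → ['list_teams', 'list_issues']
# tool_names combined with read_only: the intersection.
print([t["name"] for t in apply_allowed_tools(
    catalog, {"tool_names": ["list_teams", "create_issue"], "read_only": True})])
# → ['list_teams']
```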
Both inline and pre-saved modes use the same shape:
| Field | Type | Required | Description |
|---|---|---|---|
| type | string | yes | "mcp". |
| key | string | one of key/server_url | Key of a pre-saved MCP tool. |
| server_url | string | one of key/server_url | HTTPS endpoint of the MCP server (inline mode). |
| server_description | string | no | Short description; helps the model understand the server’s purpose. |
| headers | object | no | HTTP headers sent with every MCP request. Values may reference {{variables}}. |
| allowed_tools | array or object | no | Filter described above. |
server_url must use http or https and be reachable from orq.ai. URLs whose host is an IP literal in a loopback, link-local, private (RFC 1918), unspecified, or cloud-metadata range are rejected as invalid.
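The URL checks described above can be approximated with the standard library. This is a sketch of the rules as documented, not the exact validation orq.ai runs (which may also consider DNS resolution, for example):

```python
import ipaddress
from urllib.parse import urlsplit

# The cloud-metadata endpoint; 169.254.169.254 also falls in the
# link-local range, so the explicit check is belt-and-braces.
METADATA_NET = ipaddress.ip_network("169.254.169.254/32")

def is_acceptable_server_url(url: str) -> bool:
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        ip = ipaddress.ip_address(parts.hostname)
    except ValueError:
        return True  # a hostname, not an IP literal: passes this check
    if ip.is_loopback or ip.is_link_local or ip.is_private or ip.is_unspecified:
        return False
    if ip.version == 4 and ip in METADATA_NET:
        return False
    return True

print(is_acceptable_server_url("https://mcp.linear.app/mcp"))   # True
print(is_acceptable_server_url("http://127.0.0.1/mcp"))         # False (loopback)
print(is_acceptable_server_url("http://169.254.169.254/meta"))  # False (metadata)
```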
Authentication and secrets
Most production MCP servers require an auth header. There are two options: encrypted headers stored on the saved tool, or request-time variables.
Encrypted headers
When creating the tool, mark each sensitive header as encrypted:
"headers": {
"Authorization": { "value": "Bearer sk-live-...", "encrypted": true }
}
Encrypted values are stored with workspace-scoped encryption, decrypted on each call, and redacted from traces.
Request-time variables
Any MCP header — pre-saved or inline — can include {{variable}} placeholders, resolved at call time from the request’s variables field. Combine with the Responses API’s secret wrapper to keep tokens out of logs.
Pre-saved: store the template in the tool’s headers:
"headers": {
"Authorization": "Bearer {{linear_token}}"
}
Then fill it in per call:
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "List the teams in Linear",
"tools": [{ "type": "mcp", "key": "linear_mcp" }],
"variables": {
"linear_token": { "secret": true, "value": "lin_api_..." }
}
}'
Inline: put the template directly in the tool entry — same {{...}} syntax, same variables block:
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "List the teams in Linear",
"tools": [{
"type": "mcp",
"server_url": "https://mcp.linear.app/mcp",
"headers": {
"Authorization": "Bearer {{linear_token}}"
}
}],
"variables": {
"linear_token": { "secret": true, "value": "lin_api_..." }
}
}'
Secrets are stripped from the stored response and redacted from traces. See Variables and secrets for the full semantics.
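The resolution and redaction behavior described above can be sketched in a few lines. The helper names here are hypothetical; they illustrate the semantics, not orq.ai internals:

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def resolve_headers(headers, variables):
    """Fill {{name}} placeholders from the request's variables block."""
    def value_of(name):
        var = variables[name]
        # Variables may be plain strings or {"secret": ..., "value": ...}.
        return var["value"] if isinstance(var, dict) else var
    return {
        k: PLACEHOLDER.sub(lambda m: value_of(m.group(1)), v)
        for k, v in headers.items()
    }

def redact_for_trace(headers, variables):
    """Mask any header whose value contains a secret variable's value."""
    secrets = {
        v["value"] for v in variables.values()
        if isinstance(v, dict) and v.get("secret")
    }
    return {
        k: "[redacted]" if any(s in v for s in secrets) else v
        for k, v in headers.items()
    }

headers = {"Authorization": "Bearer {{linear_token}}"}
variables = {"linear_token": {"secret": True, "value": "lin_api_abc123"}}

resolved = resolve_headers(headers, variables)
print(resolved["Authorization"])              # Bearer lin_api_abc123
print(redact_for_trace(resolved, variables))  # {'Authorization': '[redacted]'}
```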
Each call connects to the MCP server and lists its tools before the model runs. For inline servers this happens on every request. For pre-saved servers orq.ai uses the tool catalog captured at save time, avoiding the extra round-trip — if the remote server adds new tools, re-save the orq.ai tool to refresh the catalog.
Streaming
MCP tool calls emit the same streaming events as function tools, plus three MCP-specific events you’ll see when stream: true:
| Event | Emitted when |
|---|---|
| response.mcp_call.in_progress | The MCP tool starts executing. |
| response.mcp_call.completed | Tool returned a successful result. |
| response.mcp_call.failed | Tool raised an error or the connection failed. |
The item type on the response is mcp_call (not function_call) — match on that if you’re assembling output items on the client.
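A client-side dispatch over these events might look like the sketch below. The event objects are simplified dicts; a real SSE client would parse each event off the wire before handing it to the handler:

```python
def handle_event(event, state):
    """Accumulate MCP streaming events into a simple state dict."""
    etype = event["type"]
    if etype == "response.mcp_call.in_progress":
        state["in_progress"] += 1
    elif etype == "response.mcp_call.completed":
        state["completed"] += 1
    elif etype == "response.mcp_call.failed":
        state["failed"] += 1
    elif etype == "response.output_item.done":
        # MCP tool results arrive as mcp_call items, not function_call.
        item = event["item"]
        if item["type"] == "mcp_call":
            state["items"].append(item)
    return state

state = {"in_progress": 0, "completed": 0, "failed": 0, "items": []}
events = [
    {"type": "response.mcp_call.in_progress"},
    {"type": "response.mcp_call.completed"},
    {"type": "response.output_item.done",
     "item": {"type": "mcp_call", "name": "list_teams", "output": "{}"}},
]
for e in events:
    handle_event(e, state)
print(state["completed"], [i["name"] for i in state["items"]])  # 1 ['list_teams']
```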
Observability
Every MCP tool invocation appears in traces as a child span of the agent loop, with attributes (OTel GenAI + MCP semantic conventions):
server.address — the MCP server URL.
mcp.session.id — the pre-saved tool’s key, or the inline server URL for ad-hoc calls.
mcp.method.name — always tools/call for tool invocations.
gen_ai.tool.name — the tool name the model called (e.g. list_teams).
gen_ai.tool.type — mcp (distinguishes MCP calls from function/HTTP/code tool calls on the same loop).
gen_ai.tool.call.id — the call ID matching the mcp_call item in the stored response.
gen_ai.tool.call.arguments — JSON-encoded arguments passed to the tool (secrets redacted).
Tokens, cost, and duration roll up into the parent agent/response span the same way function tool calls do.
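Assembled for one invocation, the attributes listed above might look like this. The call object's shape here is illustrative; the attribute keys are the ones the traces use:

```python
import json

def mcp_span_attributes(server_url, session_id, call):
    """Build the span attribute dict for one MCP tool invocation."""
    return {
        "server.address": server_url,
        "mcp.session.id": session_id,
        "mcp.method.name": "tools/call",          # always tools/call for invocations
        "gen_ai.tool.name": call["name"],
        "gen_ai.tool.type": "mcp",
        "gen_ai.tool.call.id": call["id"],
        "gen_ai.tool.call.arguments": json.dumps(call["arguments"]),
    }

attrs = mcp_span_attributes(
    "https://mcp.linear.app/mcp",
    "linear_mcp",  # pre-saved tool key; inline calls would carry the URL
    {"name": "list_teams", "id": "call_1", "arguments": {"limit": 10}},
)
print(attrs["gen_ai.tool.name"], attrs["mcp.method.name"])  # list_teams tools/call
```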
Worked examples
Single pre-saved server
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "Find the most recent open issue in the Engineering team.",
"tools": [{ "type": "mcp", "key": "linear_mcp" }]
}'
Two pre-saved servers in the same call
Stack multiple MCP tools — each tools[] entry is its own server, each with its own filter:
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "Find tickets from yesterday in Linear and the related Slack threads.",
"tools": [
{ "type": "mcp", "key": "linear_mcp", "allowed_tools": ["list_issues"] },
{ "type": "mcp", "key": "slack_mcp", "allowed_tools": ["search_messages"] }
]
}'
Inline server with per-user credentials
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "Draft a status update based on my recent issues.",
"tools": [{
"type": "mcp",
"server_url": "https://mcp.linear.app/mcp",
"server_description": "Linear for the requesting user",
"headers": {
"Authorization": "Bearer {{user_linear_token}}"
},
"allowed_tools": ["list_issues", "list_comments"]
}],
"variables": {
"user_linear_token": { "secret": true, "value": "lin_api_..." }
}
}'
Read-only filter
Lock the model to tools the MCP server advertises as non-mutating:
curl -X POST https://api.orq.ai/v3/router/responses \
-H "Authorization: Bearer $ORQ_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"input": "Summarize recent activity.",
"tools": [{
"type": "mcp",
"key": "linear_mcp",
"allowed_tools": { "read_only": true }
}]
}'
Error reference
User-correctable MCP failures surface as HTTP 400 with type: "invalid_request" on the Responses API call itself. The message field names the server (by key or URL) and the nature of the failure. Typical causes:
| Cause | Example message |
|---|---|
| Rejected server URL (bad scheme, IP literal in a disallowed range, or not a URL) | MCP server URL must not point to loopback, link-local, private, or unspecified addresses |
| Pre-saved key does not resolve | failed to resolve MCP server "foo": tool not found |
| Remote server refused the handshake (401/403 or similar) | mcp connect to "foo" failed: ... |
| Remote server unreachable or malformed response | mcp list tools from "foo" failed: ... |
Server-side failures (unexpected panics, downstream outages) surface as 500 with type: "internal_error".
Tool-call-time failures — e.g. an individual tool raising an error — are reported as an mcp_call output item with status: "failed" and the error content in output. The overall HTTP response is still 200 because the request itself succeeded; the failure is per-tool.
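Client code therefore has to check two places: the HTTP status for request-level failures, and the output items for per-tool failures inside a 200. A sketch of that triage (the helper name and simplified body shapes are illustrative):

```python
def classify(status_code, body):
    """Triage a Responses API result into request-, server-, or tool-level errors."""
    if status_code == 400 and body.get("type") == "invalid_request":
        return ("request_error", body.get("message"))   # user-correctable
    if status_code == 500:
        return ("internal_error", body.get("message"))  # server-side failure
    # A 200 can still carry per-tool failures as mcp_call items.
    failed = [
        item for item in body.get("output", [])
        if item.get("type") == "mcp_call" and item.get("status") == "failed"
    ]
    if failed:
        return ("tool_error", [f.get("output") for f in failed])
    return ("ok", None)

print(classify(400, {"type": "invalid_request",
                     "message": "failed to resolve MCP server"}))
# → ('request_error', 'failed to resolve MCP server')
print(classify(200, {"output": [
    {"type": "mcp_call", "status": "failed", "output": "rate limited"},
]}))
# → ('tool_error', ['rate limited'])
```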
Limits
- Transport: Streamable HTTP (connection_type: "http") and SSE (connection_type: "sse"). Streamable HTTP is the default and preferred.
- Tool discovery per call: up to 250 tools across all MCP servers in a single request.
- Per-tool call timeout: 10 minutes (inherits the platform’s tool execution timeout).
- Encrypted header size: 16 KB per header value.
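For a server that only speaks SSE, the mcp block in the Create Tool payload switches transport via connection_type; a minimal sketch (the URL is a placeholder):

```json
"mcp": {
  "server_url": "https://example.com/sse",
  "connection_type": "sse"
}
```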