curl --request PATCH \
--url https://api.orq.ai/v2/agents/{agent_key} \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '
{
"key": "<string>",
"display_name": "<string>",
"project_id": "<string>",
"role": "<string>",
"description": "<string>",
"instructions": "<string>",
"system_prompt": "<string>",
"model": "<string>",
"fallback_models": [
"<string>"
],
"settings": {
"max_iterations": 100,
"max_execution_time": 300,
"tool_approval_required": "respect_tool",
"tools": [],
"evaluators": [
{
"id": "<string>",
"execute_on": "input",
"sample_rate": 50
}
],
"guardrails": [
{
"id": "<string>",
"execute_on": "input",
"sample_rate": 50
}
]
},
"path": "Default",
"memory_stores": [
"<string>"
],
"knowledge_bases": [
{
"knowledge_id": "customer-knowledge-base"
}
],
"team_of_agents": [
{
"key": "<string>",
"role": "<string>"
}
],
"variables": {}
}
'

Example response:

{
"_id": "<string>",
"key": "<string>",
"display_name": "<string>",
"workspace_id": "<string>",
"project_id": "<string>",
"role": "<string>",
"description": "<string>",
"instructions": "<string>",
"status": "live",
"model": {
"id": "<string>",
"integration_id": "<string>",
"parameters": {
"audio": {
"voice": "alloy",
"format": "wav"
},
"frequency_penalty": 123,
"max_tokens": 123,
"max_completion_tokens": 123,
"logprobs": true,
"top_logprobs": 10,
"n": 2,
"presence_penalty": 123,
"response_format": {
"type": "text"
},
"reasoning_effort": "<string>",
"verbosity": "<string>",
"seed": 123,
"stop": "<string>",
"stream_options": {
"include_usage": true
},
"thinking": {
"type": "enabled",
"budget_tokens": 123,
"thinking_level": "low"
},
"temperature": 1,
"top_p": 123,
"top_k": 123,
"tool_choice": "none",
"parallel_tool_calls": true,
"modalities": [
"text"
]
},
"retry": {
"count": 3,
"on_codes": [
429,
500,
502,
503,
504
]
},
"fallback_models": [
"<string>"
]
},
"path": "Default",
"memory_stores": [
"<string>"
],
"team_of_agents": [
{
"key": "<string>",
"role": "<string>"
}
],
"created_by_id": "<string>",
"updated_by_id": "<string>",
"created": "<string>",
"updated": "<string>",
"system_prompt": "<string>",
"settings": {
"max_execution_time": 300,
"max_iterations": 100,
"tool_approval_required": "respect_tool",
"tools": []
},
"version_hash": "<string>",
"metrics": {
"total_cost": 0
},
"variables": {},
"knowledge_bases": [
{
"knowledge_id": "customer-knowledge-base"
}
]
}

Modifies an existing agent’s configuration with partial updates. Supports updating any aspect of the agent, including model assignments (primary and fallback), instructions, tools, knowledge bases, memory stores, and execution parameters. Only the fields provided in the request body are updated; all other fields remain unchanged. Changes take effect immediately for new agent invocations.
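For example, a partial update that swaps only the primary model and its fallbacks could look like the following sketch. The agent key and model IDs are placeholders; everything else on the agent is left untouched.

# Hypothetical partial update: only the fields present in --data are modified
curl --request PATCH \
  --url https://api.orq.ai/v2/agents/customer-support \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "<primary-model-id>",
    "fallback_models": ["<fallback-model-id>"]
  }'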
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
The unique key of the agent to update
Request body for updating an existing agent via the API. Uses simplified tool input format.
A custom system prompt template for the agent. If omitted, the default template is used.
Model configuration for agent execution. Can be a simple model ID string or a configuration object with optional behavior parameters and retry settings.
Optional array of fallback models used when the primary model fails. Fallbacks are attempted in order. All models must support tool calling.
Fallback model for automatic failover when primary model request fails. Supports optional parameter overrides. Can be a simple model ID string or a configuration object with model-specific parameters. Fallbacks are tried in order.
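As a sketch, the two accepted shapes can be mixed in one request. The model IDs below are placeholders, and the field names in the object form follow the model schema shown in the response section — an assumption for the request side:

"model": "<primary-model-id>",
"fallback_models": [
  "<simple-fallback-id>",
  { "id": "<configured-fallback-id>", "parameters": { "temperature": 0.2 } }
]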
Maximum number of iterations (LLM calls) before the agent stops executing.
x > 1

Maximum time (in seconds) for the agent thinking process. This does not include time spent on tool calls and sub-agent calls. The limit is loosely enforced: in-progress LLM calls are not terminated, and the last assistant message is returned.
x > 2

If all, the agent will require approval for all tools. If respect_tool, the agent will require approval for tools that have the requires_approval flag set to true. If none, the agent will not require approval for any tools.
all, respect_tool, none

Tools available to the agent. Built-in tools only need a type, while custom tools (http, code, function) must reference pre-created tools by key or id.
Tool configuration for agent create/update operations. Built-in tools only require a type, while custom tools (HTTP, Code, Function, MCP) must reference pre-created tools by key or id.
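A hypothetical settings fragment combining both shapes. The built-in tool type, the custom tool key, and the exact field layout of the custom entry are assumptions for illustration; they must match tools that exist in your workspace:

"settings": {
  "tool_approval_required": "respect_tool",
  "tools": [
    { "type": "<built-in-tool-type>" },
    { "type": "http", "key": "<your-http-tool-key>" }
  ]
}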
Configuration for an evaluator applied to the agent
Unique key or identifier of the evaluator
Determines whether the evaluator runs on the agent input (user message) or output (agent response).
input, output

The percentage of executions to evaluate with this evaluator (1-100). For example, a value of 50 means the evaluator will run on approximately half of the executions.
1 <= x <= 100

Configuration for a guardrail applied to the agent
Unique key or identifier of the guardrail
Determines whether the guardrail runs on the agent input (user message) or output (agent response).
input, output

The percentage of executions to check with this guardrail (1-100). For example, a value of 50 means the guardrail will run on approximately half of the executions.
1 <= x <= 100
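A sketch attaching one evaluator and one guardrail inside settings; the IDs are placeholders for resources that already exist in your workspace:

"settings": {
  "evaluators": [
    { "id": "<evaluator-id>", "execute_on": "output", "sample_rate": 25 }
  ],
  "guardrails": [
    { "id": "<guardrail-id>", "execute_on": "input", "sample_rate": 100 }
  ]
}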
Entity storage path in the format: project/folder/subfolder/...
The first element identifies the project, followed by nested folders (auto-created as needed).
With project-based API keys, the first element is treated as a folder name, as the project is predetermined by the API key.
"Default"
Array of memory store identifiers. Accepts both memory store IDs and keys.
The agents that are accessible to this orchestrator. The main agent can hand off to these agents to perform tasks.
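A sketch combining both fields; the store key, agent key, and role text are hypothetical and must refer to existing resources:

"memory_stores": ["<memory-store-key>"],
"team_of_agents": [
  { "key": "<sub-agent-key>", "role": "Handles escalated billing questions" }
]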
Agent configuration successfully updated. Returns the complete updated agent manifest reflecting all changes made.
Unique identifier for the agent within the workspace
The status of the agent. Live is the latest version of the agent. Draft is a version that is not yet published. Pending is a version that is pending approval. Published is a version that was live and has been replaced by a new version.
live, draft, pending, published
The database ID of the primary model
Optional integration ID for custom model configurations
Model behavior parameters (snake_case) stored as part of the agent configuration. These become the default parameters used when the agent is executed. Commonly used: temperature (0-1, controls randomness), max_completion_tokens (response length), top_p (nucleus sampling). Advanced: frequency_penalty, presence_penalty, response_format (JSON/structured output), reasoning_effort (for o1/thinking models), seed (reproducibility), stop sequences. Model-specific support varies. Runtime parameters in agent execution requests can override these defaults.
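As an illustrative sketch, a conservative set of stored defaults (values chosen arbitrarily; check which parameters your model actually supports):

"parameters": {
  "temperature": 0.2,
  "max_completion_tokens": 1024,
  "response_format": { "type": "text" }
}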
Parameters for audio output. Required when audio output is requested with modalities: ["audio"].
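For example, requesting spoken output alongside text might look like this fragment (a sketch; audio generation is only available on audio-capable models):

"modalities": ["text", "audio"],
"audio": { "voice": "alloy", "format": "wav" }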
The voice the model uses to respond. Supported voices are alloy, echo, fable, onyx, nova, and shimmer.
alloy, echo, fable, onyx, nova, shimmer

Specifies the output audio format. Must be one of wav, mp3, flac, opus, or pcm16.
wav, mp3, flac, opus, pcm16

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
[Deprecated]. The maximum number of tokens that can be generated in the chat completion. This value can be used to control costs for text generated via API.
This value is now deprecated in favor of max_completion_tokens, and is not compatible with o1 series models.
An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.
0 <= x <= 20

How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
x >= 1

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Constrains effort on reasoning for reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Adjusts response verbosity. Lower levels yield shorter answers.
If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
Up to 4 sequences where the API will stop generating further tokens.
Options for streaming response. Only set this when you set stream: true.
If set, an additional chunk will be streamed before the data: [DONE] message. The usage field on this chunk shows the token usage statistics for the entire request, and the choices field will always be an empty array. All other chunks will also include a usage field, but with a null value.
Enables or disables the thinking mode capability
enabled, disabled

Determines how many tokens the model can use for its internal reasoning process. Larger budgets can enable more thorough analysis for complex problems, improving response quality. Must be ≥1024 and less than max_tokens.
The level of reasoning the model should use. This setting is supported only by gemini-3 models. If budget_tokens is specified and thinking_level is available, budget_tokens will be ignored.
low, high
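A sketch of the two variants (values illustrative; support varies by model):

"thinking": { "type": "enabled", "budget_tokens": 2048 }

or, on gemini-3 models:

"thinking": { "type": "enabled", "thinking_level": "low" }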
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
0 <= x <= 2

An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass.
Limits the model to consider only the top k most likely tokens at each step.
Controls which (if any) tool is called by the model.
none, auto, required

Whether to enable parallel function calling during tool use.
Output types that you would like the model to generate. Most models are capable of generating text, which is the default: ["text"]. The gpt-4o-audio-preview model can also be used to generate audio. To request that this model generate both text and audio responses, you can use: ["text", "audio"].
text, audio

Retry configuration for model requests. Allows customizing retry count (1-5) and HTTP status codes that trigger retries. Default codes: [429]. Common codes: 500 (internal error), 429 (rate limit), 502/503/504 (gateway errors).
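For example, retrying up to three times on rate limits and gateway errors, mirroring the response example above:

"retry": { "count": 3, "on_codes": [429, 502, 503, 504] }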
Optional array of fallback models (string IDs or config objects) that will be used automatically in order if the primary model fails
Fallback model for automatic failover when primary model request fails. Supports optional parameter overrides. Can be a simple model ID string or a configuration object with model-specific parameters. Fallbacks are tried in order.
Entity storage path in the format: project/folder/subfolder/...
The first element identifies the project, followed by nested folders (auto-created as needed).
With project-based API keys, the first element is treated as a folder name, as the project is predetermined by the API key.
"Default"
Array of memory store identifiers. Accepts both memory store IDs and keys.
The agents that are accessible to this orchestrator. The main agent can hand off to these agents to perform tasks.
Maximum number of iterations (LLM calls) before the agent stops executing.
x > 1

Maximum time (in seconds) for the agent thinking process. This does not include time spent on tool calls and sub-agent calls. The limit is loosely enforced: in-progress LLM calls are not terminated, and the last assistant message is returned.
x > 2

If all, the agent will require approval for all tools. If respect_tool, the agent will require approval for tools that have the requires_approval flag set to true. If none, the agent will not require approval for any tools.
all, respect_tool, none
The id of the resource
Optional tool key for custom tools
Optional tool description
Nested tool ID for MCP tools (identifies specific tool within MCP server)
Tool execution timeout in seconds (default: 2 minutes, max: 10 minutes)
1 <= x <= 600

Configuration for an evaluator applied to the agent
Unique key or identifier of the evaluator
Determines whether the evaluator runs on the agent input (user message) or output (agent response).
input, output

The percentage of executions to evaluate with this evaluator (1-100). For example, a value of 50 means the evaluator will run on approximately half of the executions.
1 <= x <= 100

Configuration for a guardrail applied to the agent
Unique key or identifier of the guardrail
Determines whether the guardrail runs on the agent input (user message) or output (agent response).
input, output

The percentage of executions to check with this guardrail (1-100). For example, a value of 50 means the guardrail will run on approximately half of the executions.
1 <= x <= 100