# Retrieve agent

> Retrieves detailed information about a specific agent identified by its unique key or identifier. Returns the complete agent manifest including configuration settings, model assignments (primary and fallback), tools, knowledge bases, memory stores, instructions, and execution parameters. Use this endpoint to fetch the current state and configuration of an individual agent.
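As a minimal sketch of calling this endpoint, the Python below builds and sends a `GET /v2/agents/{agent_key}` request using only the standard library. The base URL and path come from the spec on this page. The `Authorization: Bearer …` header is an assumption, since the `ApiKey` security scheme definition is not shown here, and the helper names `build_retrieve_agent_request` and `retrieve_agent` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.orq.ai"  # taken from the `servers` entry in the spec


def build_retrieve_agent_request(agent_key: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for /v2/agents/{agent_key} without sending it."""
    # Percent-encode the key so no character in it can alter the URL path.
    path = f"/v2/agents/{urllib.parse.quote(agent_key, safe='')}"
    return urllib.request.Request(
        BASE_URL + path,
        method="GET",
        headers={
            # Assumption: the ApiKey security scheme is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )


def retrieve_agent(agent_key: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON agent manifest from the 200 response."""
    request = build_retrieve_agent_request(agent_key, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `retrieve_agent("my-agent", api_key)` with a valid key should return the manifest as a dictionary, so fields such as `status`, `version`, and `settings` can be read directly. Splitting request construction from sending keeps the URL and header logic checkable without network access.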


## OpenAPI

````yaml get /v2/agents/{agent_key}
openapi: 3.1.0
info:
  title: orq.ai API
  version: '2.0'
  description: orq.ai API documentation
servers:
  - url: https://api.orq.ai
security:
  - ApiKey: []
tags:
  - description: List models available through the AI Router.
    name: Models
  - name: Guardrail Rules
  - name: Policies
  - name: Routing Rules
  - name: API keys
    description: >-
      API keys authenticate programmatic access to the workspace. The unified
      key model exposes opaque tokens, per-domain access grants, and budget /
      rate-limit constraints (see ADR 0001 and ADR 0002).
  - name: Budgets
    description: >-
      Budgets govern spend, token usage, and request rate across six scopes:
      workspace, project, identity, api-key, provider, and model. A budget is
      hierarchical and defense-in-depth — every applicable budget is a hard
      gate, and the most restrictive one wins per dimension (see ADR 0007).
  - name: Documentation
    description: >-
      Search the orq.ai documentation. Proxies the workspace's query to the
      hosted docs search index.
  - name: Files
    description: File upload and retrieval operations.
  - name: Identities
    description: >-
      Identities represent end users from your system for usage and engagement
      tracking.
  - name: Projects
    description: Projects organize resources within a workspace.
  - name: Skills
    description: >-
      Skills are modular instructions you can use to codify processes and
      conventions.
  - name: Responses
  - description: >-
      Run agents on a cadence — cron, interval, or one-off. Minimum firing
      interval is 1 hour.
    name: Agent Schedules
  - name: Embeddings
  - name: Reporting
    description: >-
      GenAI reporting API over canonical analytics rollups. Accepts a metric
      name, time range, grain, group-by, and filters; returns a typed time
      series and optional totals.
externalDocs:
  url: https://docs.orq.ai
  description: orq.ai Documentation
paths:
  /v2/agents/{agent_key}:
    get:
      tags:
        - Agents
      summary: Retrieve agent
      description: >-
        Retrieves detailed information about a specific agent identified by its
        unique key or identifier. Returns the complete agent manifest including
        configuration settings, model assignments (primary and fallback), tools,
        knowledge bases, memory stores, instructions, and execution parameters.
        Use this endpoint to fetch the current state and configuration of an
        individual agent.
      operationId: RetrieveAgentRequest
      parameters:
        - schema:
            type: string
            description: The unique key of the agent to retrieve
          required: true
          description: The unique key of the agent to retrieve
          name: agent_key
          in: path
      responses:
        '200':
          description: >-
            Agent successfully retrieved. Returns the complete agent manifest
            with all configuration details, including models, tools, knowledge
            bases, and execution settings.
          content:
            application/json:
              schema:
                type: object
                properties:
                  _id:
                    type: string
                  key:
                    type: string
                    pattern: ^[A-Za-z][A-Za-z0-9]*([._-][A-Za-z0-9]+)*$
                    description: Unique identifier for the agent within the workspace
                  display_name:
                    type: string
                  project_id:
                    type: string
                  created_by_id:
                    type:
                      - string
                      - 'null'
                  updated_by_id:
                    type:
                      - string
                      - 'null'
                  created:
                    type: string
                  updated:
                    type: string
                  status:
                    type: string
                    enum:
                      - live
                      - draft
                      - pending
                      - published
                    description: >-
                      The status of the agent. `live` is the latest version of
                      the agent. `draft` is a version that is not yet published.
                      `pending` is a version that is pending approval.
                      `published` is a version that was live and has been
                      replaced by a new version.
                  version:
                    type: string
                    description: Current semantic version of the agent manifest.
                  path:
                    type: string
                    description: >-
                      Entity storage path.


                      With workspace-level API keys, use the format
                      `project/folder/subfolder/...`. The first element
                      identifies the project, followed by nested folders
                      (auto-created as needed). Example: `Default/agents`.


                      With project-level API keys, the project is predetermined
                      by the API key, so the path is relative to that project.
                      Example: `agents`. For backward compatibility, a leading
                      project name is ignored when it matches the scoped
                      project.
                    example: Default
                  memory_stores:
                    type: array
                    items:
                      type: string
                    default: []
                    description: >-
                      Array of memory store identifiers. Accepts both memory
                      store IDs and keys.
                  team_of_agents:
                    type: array
                    items:
                      type: object
                      properties:
                        key:
                          type: string
                          description: The unique key of the agent within the workspace
                        role:
                          type: string
                          description: >-
                            The role of the agent in this context. This is used
                            to give extra information to the leader to help it
                            decide which agent to hand off to.
                      required:
                        - key
                    default: []
                    description: >-
                      The agents that are accessible to this orchestrator. The
                      main agent can hand off to these agents to perform tasks.
                  skills:
                    type: array
                    items:
                      type: string
                    description: >-
                      List of skills that the agent can utilize. This field
                      allows you to specify which skills the agent has access
                      to, enabling more complex and dynamic behavior.
                  metrics:
                    type: object
                    properties:
                      total_cost:
                        type: number
                        minimum: 0
                        default: 0
                    default:
                      total_cost: 0
                  variables:
                    type: object
                    additionalProperties: {}
                    description: Extracted variables from agent instructions
                  knowledge_bases:
                    type: array
                    items:
                      type: object
                      properties:
                        knowledge_id:
                          type: string
                          description: Unique identifier of the knowledge base to search
                          example: customer-knowledge-base
                      required:
                        - knowledge_id
                    description: Agent knowledge bases reference
                  source:
                    type: string
                    enum:
                      - internal
                      - external
                      - experiment
                  engine:
                    type: string
                    enum:
                      - text
                      - jinja
                      - mustache
                    default: text
                  type:
                    type: string
                    enum:
                      - internal
                      - a2a
                    default: internal
                    description: >-
                      Agent type: internal (Orquesta-managed) or a2a (external
                      A2A-compliant)
                  role:
                    type: string
                    minLength: 1
                  description:
                    type: string
                  system_prompt:
                    type:
                      - string
                      - 'null'
                    minLength: 1
                  instructions:
                    type: string
                  settings:
                    type: object
                    properties:
                      max_iterations:
                        type: integer
                        exclusiveMinimum: 0
                        maximum: 100
                        minimum: 1
                        default: 100
                        description: >-
                          Maximum number of iterations (LLM calls) before the
                          agent stops executing.
                      max_execution_time:
                        type: integer
                        minimum: 2
                        exclusiveMinimum: 0
                        maximum: 600
                        default: 600
                        description: >-
                          Maximum time (in seconds) for the agent thinking
                          process. This does not include the time for tool calls
                          and sub-agent calls. The limit is loosely enforced:
                          in-progress LLM calls are not terminated, and the
                          last assistant message is returned.
                      max_cost:
                        type: number
                        minimum: 0
                        default: 0
                        description: >-
                          Maximum cost in USD for the agent execution. When the
                          accumulated cost exceeds this limit, the agent will
                          stop executing. Set to 0 for unlimited. Only supported
                          in v3 responses.
                      tool_approval_required:
                        type: string
                        enum:
                          - all
                          - respect_tool
                          - none
                        default: respect_tool
                        description: >-
                          If all, the agent will require approval for all tools.
                          If respect_tool, the agent will require approval for
                          tools that have the requires_approval flag set to
                          true. If none, the agent will not require approval for
                          any tools.
                      tools:
                        type: array
                        items:
                          type: object
                          properties:
                            id:
                              type: string
                              format: ulid
                              pattern: ^[0-9A-HJKMNP-TV-Z]{26}$
                              readOnly: true
                              description: The id of the resource
                            key:
                              type: string
                              description: Optional tool key for custom tools
                            action_type:
                              type: string
                            display_name:
                              type: string
                            description:
                              type: string
                              description: Optional tool description
                            requires_approval:
                              type: boolean
                              default: false
                            tool_id:
                              type: string
                              description: >-
                                Nested tool ID for MCP tools (identifies
                                specific tool within MCP server)
                            conditions:
                              type: array
                              items:
                                type: object
                                properties:
                                  condition:
                                    type: string
                                    description: The argument of the tool call to evaluate
                                  operator:
                                    type: string
                                    description: The operator to use
                                  value:
                                    type: string
                                    description: The value to compare against
                                required:
                                  - condition
                                  - operator
                                  - value
                              default: []
                            timeout:
                              type: number
                              minimum: 1
                              maximum: 600
                              default: 120
                              description: >-
                                Tool execution timeout in seconds (default: 2
                                minutes, max: 10 minutes)
                          required:
                            - id
                            - action_type
                        default: []
                      evaluators:
                        type: array
                        items:
                          type: object
                          properties:
                            id:
                              type: string
                              description: Unique key or identifier of the evaluator
                            sample_rate:
                              type: number
                              minimum: 1
                              maximum: 100
                              default: 50
                              description: >-
                                The percentage of executions to evaluate with
                                this evaluator (1-100). For example, a value of
                                50 means the evaluator will run on approximately
                                half of the executions.
                            execute_on:
                              type: string
                              enum:
                                - input
                                - output
                              description: >-
                                Determines whether the evaluator runs on the
                                agent input (user message) or output (agent
                                response).
                          required:
                            - id
                            - execute_on
                        title: Agent evaluator configuration
                        description: Configuration for an evaluator applied to the agent
                      guardrails:
                        type: array
                        items:
                          type: object
                          properties:
                            id:
                              type: string
                              description: Unique key or identifier of the evaluator
                            sample_rate:
                              type: number
                              minimum: 1
                              maximum: 100
                              default: 50
                              description: >-
                                The percentage of executions to evaluate with
                                this evaluator (1-100). For example, a value of
                                50 means the evaluator will run on approximately
                                half of the executions.
                            execute_on:
                              type: string
                              enum:
                                - input
                                - output
                              description: >-
                                Determines whether the evaluator runs on the
                                agent input (user message) or output (agent
                                response).
                          required:
                            - id
                            - execute_on
                        title: Agent guardrail configuration
                        description: Configuration for a guardrail applied to the agent
                    default:
                      max_execution_time: 600
                      max_iterations: 100
                      max_cost: 0
                      tool_approval_required: respect_tool
                      tools: []
                  model:
                    type: object
                    properties:
                      id:
                        type: string
                        description: The database ID of the primary model
                      integration_id:
                        type:
                          - string
                          - 'null'
                        description: >-
                          Optional integration ID for custom model
                          configurations
                      parameters:
                        type: object
                        properties:
                          name:
                            description: >-
                              The name to display on the trace. If not
                              specified, the default system name will be used.
                            type: string
                          frequency_penalty:
                            type:
                              - number
                              - 'null'
                            description: >-
                              Number between -2.0 and 2.0. Positive values
                              penalize new tokens based on their existing
                              frequency in the text so far, decreasing the
                              model's likelihood to repeat the same line
                              verbatim.
                          max_tokens:
                            type:
                              - integer
                              - 'null'
                            description: >-
                              `[Deprecated]`. The maximum number of tokens that
                              can be generated in the chat completion. This
                              value can be used to control costs for text
                              generated via the API.


                              This value is deprecated in favor of
                              `max_completion_tokens` and is not compatible
                              with o1 series models.
                          max_completion_tokens:
                            type:
                              - integer
                              - 'null'
                            exclusiveMinimum: 0
                            description: >-
                              An upper bound for the number of tokens that can
                              be generated for a completion, including visible
                              output tokens and reasoning tokens
                          presence_penalty:
                            type:
                              - number
                              - 'null'
                            description: >-
                              Number between -2.0 and 2.0. Positive values
                              penalize new tokens based on whether they appear
                              in the text so far, increasing the model's
                              likelihood to talk about new topics.
                          response_format:
                            oneOf:
                              - type: object
                                properties:
                                  type:
                                    type: string
                                    enum:
                                      - text
                                required:
                                  - type
                                title: Text
                                description: >-
                                  Default response format. Used to generate text
                                  responses.
                              - type: object
                                properties:
                                  type:
                                    type: string
                                    enum:
                                      - json_object
                                required:
                                  - type
                                title: JSON object
                                description: >-
                                  JSON object response format. An older method
                                  of generating JSON responses. Using
                                  `json_schema` is recommended for models that
                                  support it. Note that the model will not
                                  generate JSON without a system or user message
                                  instructing it to do so.
                              - type: object
                                properties:
                                  type:
                                    enum:
                                      - json_schema
                                    type: string
                                  json_schema:
                                    type: object
                                    properties:
                                      description:
                                        type: string
                                        description: >-
                                          A description of what the response
                                          format is for, used by the model to
                                          determine how to respond in the format.
                                      name:
                                        type: string
                                        description: >-
                                          The name of the response format. Must be
                                          a-z, A-Z, 0-9, or contain underscores
                                          and dashes, with a maximum length of 64.
                                      schema:
                                        description: >-
                                          The schema for the response format,
                                          described as a JSON Schema object.
                                      strict:
                                        type: boolean
                                        default: false
                                        description: >-
                                          Whether to enable strict schema
                                          adherence when generating the output. If
                                          set to true, the model will always
                                          follow the exact schema defined in the
                                          schema field. Only a subset of JSON
                                          Schema is supported when strict is true.
                                    required:
                                      - name
                                required:
                                  - type
                                  - json_schema
                                title: JSON schema
                                description: >-
                                  JSON Schema response format. Used to generate
                                  structured JSON responses.
                            description: >-
                              An object specifying the format that the model
                              must output
                          reasoning_effort:
                            type: string
                            enum:
                              - none
                              - minimal
                              - low
                              - medium
                              - high
                              - xhigh
                            description: >-
                              Constrains effort on reasoning for [reasoning
                              models](https://platform.openai.com/docs/guides/reasoning).
                              Currently supported values are `none`, `minimal`,
                              `low`, `medium`, `high`, and `xhigh`. Reducing
                              reasoning effort can result in faster responses
                              and fewer tokens used on reasoning in a response.


                              - `gpt-5.1` defaults to `none`, which does not
                              perform reasoning. The supported reasoning values
                              for `gpt-5.1` are `none`, `low`, `medium`, and
                              `high`. Tool calls are supported for all reasoning
                              values in `gpt-5.1`.

                              - All models before `gpt-5.1` default to `medium`
                              reasoning effort, and do not support `none`.

                              - The `gpt-5-pro` model defaults to (and only
                              supports) `high` reasoning effort.

                              - `xhigh` is currently only supported for
                              `gpt-5.1-codex-max`.


                          verbosity:
                            type: string
                            description: >-
                              Adjusts response verbosity. Lower levels yield
                              shorter answers.
                          seed:
                            type:
                              - number
                              - 'null'
                            description: >-
                              If specified, our system will make a best effort
                              to sample deterministically, such that repeated
                              requests with the same seed and parameters should
                              return the same result.
                          stop:
                            anyOf:
                              - type: string
                              - type: array
                                items:
                                  type: string
                                maxItems: 4
                              - type: 'null'
                            description: >-
                              Up to 4 sequences where the API will stop
                              generating further tokens.
                          thinking:
                            oneOf:
                              - $ref: >-
                                  #/components/schemas/ThinkingConfigDisabledSchema
                              - $ref: >-
                                  #/components/schemas/ThinkingConfigEnabledSchema
                              - $ref: >-
                                  #/components/schemas/ThinkingConfigAdaptiveSchema
                            discriminator:
                              propertyName: type
                              mapping:
                                disabled:
                                  $ref: >-
                                    #/components/schemas/ThinkingConfigDisabledSchema
                                enabled:
                                  $ref: >-
                                    #/components/schemas/ThinkingConfigEnabledSchema
                                adaptive:
                                  $ref: >-
                                    #/components/schemas/ThinkingConfigAdaptiveSchema
                          temperature:
                            type:
                              - number
                              - 'null'
                            minimum: 0
                            maximum: 2
                            description: >-
                              What sampling temperature to use, between 0 and 2.
                              Higher values like 0.8 will make the output more
                              random, while lower values like 0.2 will make it
                              more focused and deterministic.
                          top_p:
                            type:
                              - number
                              - 'null'
                            minimum: 0
                            maximum: 1
                            description: >-
                              An alternative to sampling with temperature,
                              called nucleus sampling, where the model considers
                              only the tokens comprising the top_p probability
                              mass.
                          top_k:
                            type:
                              - number
                              - 'null'
                            description: >-
                              Limits the model to consider only the top k most
                              likely tokens at each step.
                          tool_choice:
                            anyOf:
                              - type: string
                                enum:
                                  - none
                                  - auto
                                  - required
                              - type: object
                                properties:
                                  type:
                                    type: string
                                    enum:
                                      - function
                                    description: >-
                                      The type of the tool. Currently, only
                                      function is supported.
                                  function:
                                    type: object
                                    properties:
                                      name:
                                        type: string
                                        description: The name of the function to call.
                                    required:
                                      - name
                                required:
                                  - function
                            description: >-
                              Controls which (if any) tool is called by the
                              model.
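                          # Illustrative sketch, not part of the schema: two
                          # tool_choice values that would validate against the
                          # definition above. `lookup_order` is a hypothetical
                          # function name used only for this example.
                          #   tool_choice: auto
                          #   tool_choice:
                          #     type: function
                          #     function:
                          #       name: lookup_order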
                          parallel_tool_calls:
                            type: boolean
                            description: >-
                              Whether to enable parallel function calling during
                              tool use.
                          modalities:
                            type:
                              - array
                              - 'null'
                            items:
                              type: string
                              enum:
                                - text
                                - audio
                            description: >-
                              Output types that you would like the model to
                              generate. Most models are capable of generating
                              text, which is the default: ["text"]. The
                              gpt-4o-audio-preview model can also be used to
                              generate audio. To request that this model
                              generate both text and audio responses, you can
                              use: ["text", "audio"].
                          guardrails:
                            type: array
                            items:
                              type: object
                              properties:
                                id:
                                  anyOf:
                                    - type: string
                                      enum:
                                        - orq_pii_detection
                                        - orq_sexual_moderation
                                        - orq_harmful_moderation
                                      description: The key of the guardrail.
                                    - type: string
                                      description: >-
                                        Unique key or identifier of the
                                        evaluator
                                execute_on:
                                  type: string
                                  enum:
                                    - input
                                    - output
                                  description: >-
                                    Determines whether the guardrail runs on the
                                    input (user message) or output (model
                                    response).
                              required:
                                - id
                                - execute_on
                            description: A list of guardrails to apply to the request.
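                          # Illustrative sketch, not part of the schema: a
                          # guardrails list that would validate against the
                          # definition above, using one built-in key from the enum.
                          #   guardrails:
                          #     - id: orq_pii_detection
                          #       execute_on: input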
                          fallbacks:
                            type: array
                            items:
                              type: object
                              properties:
                                model:
                                  type: string
                                  description: Fallback model identifier
                                  example: openai/gpt-4o-mini
                              required:
                                - model
                            description: >-
                              Array of fallback models to use if the primary
                              model fails.
                          cache:
                            type: object
                            properties:
                              ttl:
                                type: number
                                minimum: 1
                                maximum: 259200
                                default: 1800
                                description: >-
                                  Time to live for cached responses in seconds.
                                  Maximum 259200 seconds (3 days).
                                example: 3600
                              type:
                                type: string
                                enum:
                                  - exact_match
                            required:
                              - type
                            description: Cache configuration for the request.
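                          # Illustrative sketch, not part of the schema: a cache
                          # object that would validate against the definition
                          # above. `type` is required; `ttl` is optional and
                          # defaults to 1800 seconds.
                          #   cache:
                          #     type: exact_match
                          #     ttl: 3600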
                          load_balancer:
                            oneOf:
                              - type: object
                                properties:
                                  type:
                                    type: string
                                    enum:
                                      - weight_based
                                  models:
                                    type: array
                                    items:
                                      type: object
                                      properties:
                                        model:
                                          type: string
                                          description: Model identifier for load balancing
                                          example: openai/gpt-4o
                                        weight:
                                          type: number
                                          minimum: 0.001
                                          maximum: 1
                                          default: 0.5
                                          description: >-
                                            Weight assigned to this model for load
                                            balancing
                                          example: 0.7
                                      required:
                                        - model
                                required:
                                  - type
                                  - models
                            description: Load balancer configuration for the request.
                            example:
                              type: weight_based
                              models:
                                - model: openai/gpt-4o
                                  weight: 0.7
                                - model: anthropic/claude-3-5-sonnet
                                  weight: 0.3
                          timeout:
                            type: object
                            properties:
                              call_timeout:
                                type: number
                                minimum: 1
                                description: Timeout value in milliseconds
                                example: 30000
                            required:
                              - call_timeout
                            description: >-
                              Timeout configuration to apply to the request. If
                              the request exceeds the timeout, it is retried or
                              falls back to the next model, if configured.
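                          # Illustrative sketch, not part of the schema: a
                          # timeout object using the example value above
                          # (30000 ms, i.e. 30 seconds).
                          #   timeout:
                          #     call_timeout: 30000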
                          cache_control:
                            type: object
                            properties:
                              type:
                                type: string
                                enum:
                                  - ephemeral
                                description: >-
                                  Create a cache control breakpoint at this
                                  content block. Accepts only the value
                                  "ephemeral".
                              ttl:
                                type: string
                                enum:
                                  - 5m
                                  - 1h
                                default: 5m
                                description: >-
                                  The time-to-live for the cache control
                                  breakpoint. This may be one of the following
                                  values:


                                  - `5m`: 5 minutes

                                  - `1h`: 1 hour


                                  Defaults to `5m`. Only supported by
                                  `Anthropic` Claude models.
                            required:
                              - type
                            description: >-
                              Provider-level prompt caching configuration
                              applied to the request. Creates a cache control
                              breakpoint covering the request content. Only
                              supported by `Anthropic` Claude models.
                          prompt_cache_key:
                            type: string
                            description: >-
                              Used by OpenAI to cache responses for similar
                              requests to optimize your cache hit rates.
                              Replaces the legacy `user` field for prompt
                              caching.
                        description: >-
                          Model behavior parameters (snake_case) stored as part
                          of the agent configuration. These become the default
                          parameters used when the agent is executed. Commonly
                          used: temperature (0-2, controls randomness),
                          max_completion_tokens (response length), top_p
                          (nucleus sampling). Advanced: frequency_penalty,
                          presence_penalty, response_format (JSON/structured
                          output), reasoning_effort (for o1/thinking models),
                          seed (reproducibility), stop (stop sequences).
                          Model-specific support varies. Runtime parameters in
                          agent execution requests can override these defaults.
                      retry:
                        type: object
                        properties:
                          count:
                            type: number
                            minimum: 1
                            maximum: 5
                            default: 3
                            description: Number of retry attempts (1-5)
                            example: 3
                          on_codes:
                            type: array
                            items:
                              type: number
                              minimum: 100
                              maximum: 599
                            minItems: 1
                            description: HTTP status codes that trigger retry logic
                            example:
                              - 429
                              - 500
                              - 502
                              - 503
                              - 504
                        description: >-
                          Retry configuration for model requests. Allows
                          customizing retry count (1-5) and HTTP status codes
                          that trigger retries. Default codes: [429]. Common
                          codes: 500 (internal error), 429 (rate limit),
                          502/503/504 (gateway errors).
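                      # Illustrative sketch, not part of the schema: a retry
                      # object built from the example values above.
                      #   retry:
                      #     count: 3
                      #     on_codes: [429, 500, 502, 503, 504]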
                      fallback_models:
                        type:
                          - array
                          - 'null'
                        items:
                          anyOf:
                            - type: string
                              description: >-
                                A fallback model ID string (e.g.,
                                `openai/gpt-4o-mini`). Will be used if the
                                primary model request fails. Must support tool
                                calling.
                            - type: object
                              properties:
                                id:
                                  type: string
                                  description: >-
                                    A fallback model ID string. Must support
                                    tool calling.
                                parameters:
                                  type: object
                                  properties:
                                    name:
                                      description: >-
                                        The name to display on the trace. If not
                                        specified, the default system name will
                                        be used.
                                      type: string
                                    frequency_penalty:
                                      type:
                                        - number
                                        - 'null'
                                      description: >-
                                        Number between -2.0 and 2.0. Positive
                                        values penalize new tokens based on
                                        their existing frequency in the text so
                                        far, decreasing the model's likelihood
                                        to repeat the same line verbatim.
                                    max_tokens:
                                      type:
                                        - integer
                                        - 'null'
                                      description: >-
                                        `[Deprecated]`. The maximum number of
                                        tokens that can be generated in the chat
                                        completion. This value can be used to
                                        control costs for text generated via
                                        API.


                                        This value is now `deprecated` in favor
                                        of `max_completion_tokens`, and is not
                                        compatible with o1 series models.
                                    max_completion_tokens:
                                      type:
                                        - integer
                                        - 'null'
                                      exclusiveMinimum: 0
                                      description: >-
                                        An upper bound for the number of tokens
                                        that can be generated for a completion,
                                        including visible output tokens and
                                        reasoning tokens.
                                    presence_penalty:
                                      type:
                                        - number
                                        - 'null'
                                      description: >-
                                        Number between -2.0 and 2.0. Positive
                                        values penalize new tokens based on
                                        whether they appear in the text so far,
                                        increasing the model's likelihood to
                                        talk about new topics.
                                    response_format:
                                      oneOf:
                                        - type: object
                                          properties:
                                            type:
                                              type: string
                                              enum:
                                                - text
                                          required:
                                            - type
                                          title: Text
                                          description: >-
                                            Default response format. Used to
                                            generate text responses.
                                        - type: object
                                          properties:
                                            type:
                                              type: string
                                              enum:
                                                - json_object
                                          required:
                                            - type
                                          title: JSON object
                                          description: >-
                                            JSON object response format. An older
                                            method of generating JSON responses.
                                            Using `json_schema` is recommended for
                                            models that support it. Note that the
                                            model will not generate JSON without a
                                            system or user message instructing it to
                                            do so.
                                        - type: object
                                          properties:
                                            type:
                                              enum:
                                                - json_schema
                                              type: string
                                            json_schema:
                                              type: object
                                              properties:
                                                description:
                                                  type: string
                                                  description: >-
                                                    A description of what the response
                                                    format is for, used by the model to
                                                    determine how to respond in the format.
                                                name:
                                                  type: string
                                                  description: >-
                                                    The name of the response format. May
                                                    contain only a-z, A-Z, 0-9, underscores,
                                                    and dashes, with a maximum length of 64.
                                                schema:
                                                  description: >-
                                                    The schema for the response format,
                                                    described as a JSON Schema object.
                                                strict:
                                                  type: boolean
                                                  default: false
                                                  description: >-
                                                    Whether to enable strict schema
                                                    adherence when generating the output. If
                                                    set to true, the model will always
                                                    follow the exact schema defined in the
                                                    schema field. Only a subset of JSON
                                                    Schema is supported when strict is true.
                                              required:
                                                - name
                                          required:
                                            - type
                                            - json_schema
                                          title: JSON schema
                                          description: >-
                                            JSON Schema response format. Used to
                                            generate structured JSON responses.
                                      description: >-
                                        An object specifying the format that the
                                        model must output
                                    reasoning_effort:
                                      type: string
                                      enum:
                                        - none
                                        - minimal
                                        - low
                                        - medium
                                        - high
                                        - xhigh
                                      description: >-
                                        Constrains effort on reasoning for
                                        [reasoning
                                        models](https://platform.openai.com/docs/guides/reasoning).
                                        Currently supported values are `none`,
                                        `minimal`, `low`, `medium`, `high`, and
                                        `xhigh`. Reducing reasoning effort can
                                        result in faster responses and fewer
                                        tokens used on reasoning in a response.


                                        - `gpt-5.1` defaults to `none`, which
                                        does not perform reasoning. The
                                        supported reasoning values for `gpt-5.1`
                                        are `none`, `low`, `medium`, and `high`.
                                        Tool calls are supported for all
                                        reasoning values in gpt-5.1.

                                        - All models before `gpt-5.1` default to
                                        `medium` reasoning effort, and do not
                                        support `none`.

                                        - The `gpt-5-pro` model defaults to (and
                                        only supports) `high` reasoning effort.

                                        - `xhigh` is currently only supported
                                        for `gpt-5.1-codex-max`.
                                    verbosity:
                                      type: string
                                      description: >-
                                        Adjusts response verbosity. Lower levels
                                        yield shorter answers.
                                    seed:
                                      type:
                                        - number
                                        - 'null'
                                      description: >-
                                        If specified, our system will make a
                                        best effort to sample deterministically,
                                        such that repeated requests with the
                                        same seed and parameters should return
                                        the same result.
                                    stop:
                                      anyOf:
                                        - type: string
                                        - type: array
                                          items:
                                            type: string
                                          maxItems: 4
                                        - type: 'null'
                                      description: >-
                                        Up to 4 sequences where the API will
                                        stop generating further tokens.
                                    thinking:
                                      oneOf:
                                        - $ref: >-
                                            #/components/schemas/ThinkingConfigDisabledSchema
                                        - $ref: >-
                                            #/components/schemas/ThinkingConfigEnabledSchema
                                        - $ref: >-
                                            #/components/schemas/ThinkingConfigAdaptiveSchema
                                      discriminator:
                                        propertyName: type
                                        mapping:
                                          disabled:
                                            $ref: >-
                                              #/components/schemas/ThinkingConfigDisabledSchema
                                          enabled:
                                            $ref: >-
                                              #/components/schemas/ThinkingConfigEnabledSchema
                                          adaptive:
                                            $ref: >-
                                              #/components/schemas/ThinkingConfigAdaptiveSchema
                                    temperature:
                                      type:
                                        - number
                                        - 'null'
                                      minimum: 0
                                      maximum: 2
                                      description: >-
                                        What sampling temperature to use,
                                        between 0 and 2. Higher values like 0.8
                                        will make the output more random, while
                                        lower values like 0.2 will make it more
                                        focused and deterministic.
                                    top_p:
                                      type:
                                        - number
                                        - 'null'
                                      minimum: 0
                                      maximum: 1
                                      description: >-
                                        An alternative to sampling with
                                        temperature, called nucleus sampling,
                                        where the model considers only the
                                        tokens that make up the top_p
                                        probability mass.
                                    top_k:
                                      type:
                                        - number
                                        - 'null'
                                      description: >-
                                        Limits the model to consider only the
                                        top k most likely tokens at each step.
                                    tool_choice:
                                      anyOf:
                                        - type: string
                                          enum:
                                            - none
                                            - auto
                                            - required
                                        - type: object
                                          properties:
                                            type:
                                              type: string
                                              enum:
                                                - function
                                              description: >-
                                                The type of the tool. Currently, only
                                                function is supported.
                                            function:
                                              type: object
                                              properties:
                                                name:
                                                  type: string
                                                  description: The name of the function to call.
                                              required:
                                                - name
                                          required:
                                            - function
                                      description: >-
                                        Controls which (if any) tool is called
                                        by the model.
                                    parallel_tool_calls:
                                      type: boolean
                                      description: >-
                                        Whether to enable parallel function
                                        calling during tool use.
                                    modalities:
                                      type:
                                        - array
                                        - 'null'
                                      items:
                                        type: string
                                        enum:
                                          - text
                                          - audio
                                      description: >-
                                        Output types that you would like the
                                        model to generate. Most models are
                                        capable of generating text, which is the
                                        default: ["text"]. The
                                        gpt-4o-audio-preview model can also be
                                        used to generate audio. To request that
                                        this model generate both text and audio
                                        responses, you can use: ["text",
                                        "audio"].
                                    guardrails:
                                      type: array
                                      items:
                                        type: object
                                        properties:
                                          id:
                                            anyOf:
                                              - type: string
                                                enum:
                                                  - orq_pii_detection
                                                  - orq_sexual_moderation
                                                  - orq_harmful_moderation
                                                description: The key of the guardrail.
                                              - type: string
                                                description: >-
                                                  Unique key or identifier of the
                                                  evaluator
                                          execute_on:
                                            type: string
                                            enum:
                                              - input
                                              - output
                                            description: >-
                                              Determines whether the guardrail runs on
                                              the input (user message) or output
                                              (model response).
                                        required:
                                          - id
                                          - execute_on
                                      description: >-
                                        A list of guardrails to apply to the
                                        request.
                                    fallbacks:
                                      type: array
                                      items:
                                        type: object
                                        properties:
                                          model:
                                            type: string
                                            description: Fallback model identifier
                                            example: openai/gpt-4o-mini
                                        required:
                                          - model
                                      description: >-
                                        Array of fallback models to use if the
                                        primary model fails
                                    cache:
                                      type: object
                                      properties:
                                        ttl:
                                          type: number
                                          minimum: 1
                                          maximum: 259200
                                          default: 1800
                                          description: >-
                                            Time to live for cached responses in
                                            seconds. Maximum 259200 seconds (3
                                            days).
                                          example: 3600
                                        type:
                                          type: string
                                          enum:
                                            - exact_match
                                      required:
                                        - type
                                      description: Cache configuration for the request.
                                    load_balancer:
                                      oneOf:
                                        - type: object
                                          properties:
                                            type:
                                              type: string
                                              enum:
                                                - weight_based
                                            models:
                                              type: array
                                              items:
                                                type: object
                                                properties:
                                                  model:
                                                    type: string
                                                    description: Model identifier for load balancing
                                                    example: openai/gpt-4o
                                                  weight:
                                                    type: number
                                                    minimum: 0.001
                                                    maximum: 1
                                                    default: 0.5
                                                    description: >-
                                                      Weight assigned to this model for load
                                                      balancing
                                                    example: 0.7
                                                required:
                                                  - model
                                          required:
                                            - type
                                            - models
                                      description: >-
                                        Load balancer configuration for the
                                        request.
                                      example:
                                        type: weight_based
                                        models:
                                          - model: openai/gpt-4o
                                            weight: 0.7
                                          - model: anthropic/claude-3-5-sonnet
                                            weight: 0.3
                                    timeout:
                                      type: object
                                      properties:
                                        call_timeout:
                                          type: number
                                          minimum: 1
                                          description: Timeout value in milliseconds
                                          example: 30000
                                      required:
                                        - call_timeout
                                      description: >-
                                        Timeout configuration to apply to the
                                        request. If the request exceeds the
                                        timeout, it is retried or falls back to
                                        the next model, if one is configured.
                                    cache_control:
                                      type: object
                                      properties:
                                        type:
                                          type: string
                                          enum:
                                            - ephemeral
                                          description: >-
                                            Create a cache control breakpoint at
                                            this content block. Accepts only the
                                            value "ephemeral".
                                        ttl:
                                          type: string
                                          enum:
                                            - 5m
                                            - 1h
                                          default: 5m
                                          description: >-
                                            The time-to-live for the cache control
                                            breakpoint. This may be one of the
                                            following values:


                                            - `5m`: 5 minutes

                                            - `1h`: 1 hour


                                            Defaults to `5m`. Only supported by
                                            `Anthropic` Claude models.
                                      required:
                                        - type
                                      description: >-
                                        Provider-level prompt caching
                                        configuration applied to the request.
                                        Creates a cache control breakpoint
                                        covering the request content. Only
                                        supported by `Anthropic` Claude models.
                                    prompt_cache_key:
                                      type: string
                                      description: >-
                                        Used by OpenAI to cache responses for
                                        similar requests to optimize your cache
                                        hit rates. Replaces the legacy `user`
                                        field for prompt caching.
                                  description: >-
                                    Optional model parameters specific to this
                                    fallback model. Overrides primary model
                                    parameters if this fallback is used.
                                retry:
                                  type: object
                                  properties:
                                    count:
                                      type: number
                                      minimum: 1
                                      maximum: 5
                                      default: 3
                                      description: Number of retry attempts (1-5)
                                      example: 3
                                    on_codes:
                                      type: array
                                      items:
                                        type: number
                                        minimum: 100
                                        maximum: 599
                                      minItems: 1
                                      description: >-
                                        HTTP status codes that trigger retry
                                        logic
                                      example:
                                        - 429
                                        - 500
                                        - 502
                                        - 503
                                        - 504
                                  description: >-
                                    Retry configuration for this fallback model.
                                    Allows customizing retry count (1-5) and
                                    HTTP status codes that trigger retries.
                              required:
                                - id
                              description: >-
                                Fallback model configuration with optional
                                parameters and retry settings.
                          title: Fallback Model Configuration
                          description: >-
                            Fallback model for automatic failover when the
                            primary model request fails. Supports optional
                            parameter overrides. Can be a simple model ID string
                            or a configuration object with model-specific
                            parameters. Fallbacks are tried in order.
                        description: >-
                          Optional array of fallback models (string IDs or
                          config objects) that will be used automatically in
                          order if the primary model fails
                    required:
                      - id
                required:
                  - _id
                  - key
                  - project_id
                  - status
                  - path
                  - skills
                  - role
                  - description
                  - instructions
                  - model
        '404':
          description: >-
            Agent not found. The specified agent key does not exist in the
            workspace or you do not have permission to access it.
          content:
            application/json:
              schema:
                type: object
                properties:
                  message:
                    type: string
                required:
                  - message
components:
  schemas:
    ThinkingConfigDisabledSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - disabled
          description: Disables the thinking mode capability
      required:
        - type
      title: Thinking config disabled
      description: Disables the thinking mode capability
    ThinkingConfigEnabledSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - enabled
          description: Enables the thinking mode capability
        budget_tokens:
          type: number
          description: >-
            Determines how many tokens the model can use for its internal
            reasoning process. Larger budgets can enable more thorough analysis
            for complex problems, improving response quality. Must be ≥1024 and
            less than `max_tokens`.
        thinking_level:
          type: string
          enum:
            - low
            - medium
            - high
          description: >-
            The level of reasoning the model should use. This setting is
            supported only by `gemini-3` models. If `budget_tokens` is specified
            and `thinking_level` is available, `budget_tokens` is ignored.
      required:
        - type
        - budget_tokens
      title: Thinking config enabled
      description: Enables the thinking mode capability
    ThinkingConfigAdaptiveSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - adaptive
          description: >-
            Lets the model dynamically determine when and how much to use
            extended thinking based on the complexity of each request. Supported
            on Claude Opus 4.6 and Sonnet 4.6.
      required:
        - type
      title: Thinking config adaptive
      description: >-
        Enables adaptive thinking mode where the model dynamically determines
        thinking depth
  securitySchemes:
    ApiKey:
      type: http
      scheme: bearer
      bearerFormat: JWT

````
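
The spec above defines a `GET /v2/agents/{agent_key}` operation on `https://api.orq.ai`, secured with an HTTP bearer scheme, plus bounds on the fallback `retry` object (`count` between 1 and 5, `on_codes` between 100 and 599 with at least one entry). The following is a minimal sketch of how a client might call the endpoint and check a retry configuration locally. It uses only the Python standard library. The helper names `build_agent_request`, `fetch_agent`, and `retry_problems` are hypothetical and not part of any orq.ai SDK, and returning `None` on a 404 is an illustrative choice rather than documented client behavior.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import List, Optional

BASE_URL = "https://api.orq.ai"  # server URL taken from the spec


def build_agent_request(agent_key: str, api_key: str) -> urllib.request.Request:
    """Build a GET /v2/agents/{agent_key} request with bearer authentication."""
    # Percent-encode the key so characters such as "/" cannot alter the path.
    path = "/v2/agents/" + urllib.parse.quote(agent_key, safe="")
    return urllib.request.Request(
        BASE_URL + path,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_agent(agent_key: str, api_key: str) -> Optional[dict]:
    """Return the agent manifest as a dict, or None when the API answers 404.

    Assumption: callers prefer None over an exception for a missing agent.
    Any other HTTP error is re-raised unchanged.
    """
    try:
        with urllib.request.urlopen(build_agent_request(agent_key, api_key)) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def retry_problems(retry: dict) -> List[str]:
    """List violations of the retry bounds stated in the spec (empty if valid)."""
    problems = []
    count = retry.get("count", 3)  # the spec gives 3 as the default
    if not 1 <= count <= 5:
        problems.append("count must be between 1 and 5")
    codes = retry.get("on_codes")
    if codes is not None:
        if len(codes) < 1:
            problems.append("on_codes must contain at least one status code")
        if any(not 100 <= code <= 599 for code in codes):
            problems.append("on_codes values must be between 100 and 599")
    return problems
```

A call such as `fetch_agent("support-bot", api_key)` would perform the network request, where `"support-bot"` stands in for a real agent key. `retry_problems` runs entirely offline, so it can be used to check a configuration before it is sent.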