# List agents

> Retrieves the agents configured in your workspace, sorted by creation date (newest first). Each agent in the response includes its complete configuration: model settings with fallback options, instructions, tools, knowledge bases, memory stores, and execution parameters. Use the `limit`, `starting_after`, and `ending_before` query parameters to page through large collections of agents.

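The cursor pagination described above can be sketched as a small Python generator. This is a minimal sketch under stated assumptions, not an official client: it assumes the `ApiKey` security scheme is sent as a Bearer token, that the cursor value is the `_id` of the last agent on the page, and that a page shorter than `limit` marks the end of the list. The helper names (`build_query`, `fetch_page_http`, `iter_agents`) are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator, Optional

BASE_URL = "https://api.orq.ai"  # taken from the `servers` entry of the spec


def build_query(limit: int, starting_after: Optional[str] = None) -> str:
    """Build the query string for GET /v2/agents."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params = {"limit": str(limit)}
    if starting_after is not None:
        params["starting_after"] = starting_after
    return urllib.parse.urlencode(params)


def fetch_page_http(api_key: str, query: str) -> Dict:
    """Fetch one page over HTTP.

    ASSUMPTION: the API key is passed as a Bearer token.
    """
    request = urllib.request.Request(
        f"{BASE_URL}/v2/agents?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_agents(fetch_page: Callable[[str], Dict], limit: int = 50) -> Iterator[Dict]:
    """Yield every agent, requesting one page of `limit` items at a time."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(build_query(limit, cursor))
        agents = page.get("data", [])
        yield from agents
        # ASSUMPTION: a short (or empty) page means there is nothing left.
        if len(agents) < limit:
            return
        # ASSUMPTION: the cursor is the `_id` of the last agent received.
        cursor = agents[-1]["_id"]
```

To run it against the live API, wrap the HTTP helper, for example `iter_agents(lambda query: fetch_page_http("YOUR_API_KEY", query))`. Passing the fetch function as an argument keeps the paging logic separate from the transport, so it can be exercised with a stub.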

## OpenAPI

````yaml get /v2/agents
openapi: 3.1.0
info:
  title: orq.ai API
  version: '2.0'
  description: orq.ai API documentation
servers:
  - url: https://api.orq.ai
security:
  - ApiKey: []
tags:
  - description: List models available through the AI Router.
    name: Models
  - name: Guardrail Rules
  - name: Policies
  - name: Routing Rules
  - name: API keys
    description: >-
      API keys authenticate programmatic access to the workspace. The unified
      key model exposes opaque tokens, per-domain access grants, and budget /
      rate-limit constraints (see ADR 0001 and ADR 0002).
  - name: Budgets
    description: >-
      Budgets govern spend, token usage, and request rate across six scopes:
      workspace, project, identity, api-key, provider, and model. A budget is
      hierarchical and defense-in-depth — every applicable budget is a hard
      gate, and the most restrictive one wins per dimension (see ADR 0007).
  - name: Documentation
    description: >-
      Search the orq.ai documentation. Proxies the workspace's query to the
      hosted docs search index.
  - name: Files
    description: File upload and retrieval operations.
  - name: Identities
    description: >-
      Identities represent end users from your system for usage and engagement
      tracking.
  - name: Projects
    description: Projects organize resources within a workspace.
  - name: Skills
    description: >-
      Skills are modular instructions you can use to codify processes and
      conventions.
  - name: Responses
  - description: >-
      Run agents on a cadence — cron, interval, or one-off. Minimum firing
      interval is 1 hour.
    name: Agent Schedules
  - name: Embeddings
  - name: Reporting
    description: >-
      GenAI reporting API over canonical analytics rollups. Accepts a metric
      name, time range, grain, group-by, and filters; returns a typed time
      series and optional totals.
externalDocs:
  url: https://docs.orq.ai
  description: orq.ai Documentation
paths:
  /v2/agents:
    get:
      tags:
        - Agents
      summary: List agents
      description: >-
        Retrieves the agents configured in your workspace, sorted by creation
        date (newest first). Each agent in the response includes its complete
        configuration: model settings with fallback options, instructions,
        tools, knowledge bases, memory stores, and execution parameters. Use
        the `limit`, `starting_after`, and `ending_before` query parameters to
        page through large collections of agents.
      operationId: ListAgents
      parameters:
        - schema:
            type: number
            minimum: 1
            maximum: 200
            description: >-
              A limit on the number of objects to be returned. Limit can range
              between 1 and 200. When not provided, returns all agents without
              pagination.
          required: false
          description: >-
            A limit on the number of objects to be returned. Limit can range
            between 1 and 200. When not provided, returns all agents without
            pagination.
          name: limit
          in: query
        - schema:
            type: string
            description: >-
              A cursor for use in pagination. `starting_after` is an object ID
              that defines your place in the list. For instance, if you make a
              list request and receive 20 objects, ending with
              `01JJ1HDHN79XAS7A01WB3HYSDB`, your subsequent call can include
              `starting_after=01JJ1HDHN79XAS7A01WB3HYSDB` in order to fetch the
              next page of the list.
          required: false
          description: >-
            A cursor for use in pagination. `starting_after` is an object ID
            that defines your place in the list. For instance, if you make a
            list request and receive 20 objects, ending with
            `01JJ1HDHN79XAS7A01WB3HYSDB`, your subsequent call can include
            `starting_after=01JJ1HDHN79XAS7A01WB3HYSDB` in order to fetch the
            next page of the list.
          name: starting_after
          in: query
        - schema:
            type: string
            description: >-
              A cursor for use in pagination. `ending_before` is an object ID
              that defines your place in the list. For instance, if you make a
              list request and receive 20 objects, starting with
              `01JJ1HDHN79XAS7A01WB3HYSDB`, your subsequent call can include
              `ending_before=01JJ1HDHN79XAS7A01WB3HYSDB` in order to fetch the
              previous page of the list.
          required: false
          description: >-
            A cursor for use in pagination. `ending_before` is an object ID that
            defines your place in the list. For instance, if you make a list
            request and receive 20 objects, starting with
            `01JJ1HDHN79XAS7A01WB3HYSDB`, your subsequent call can include
            `ending_before=01JJ1HDHN79XAS7A01WB3HYSDB` in order to fetch the
            previous page of the list.
          name: ending_before
          in: query
        - schema:
            type: string
            enum:
              - internal
            description: Filter agents by type
          required: false
          description: Filter agents by type
          name: type
          in: query
      responses:
        '200':
          description: >-
            Successfully retrieved the list of agents. Returns a paginated
            response containing agent manifests with complete configurations,
            including primary and fallback models, tools, knowledge bases, and
            execution settings.
          content:
            application/json:
              schema:
                type: object
                properties:
                  object:
                    type: string
                    enum:
                      - list
                  data:
                    type: array
                    items:
                      type: object
                      properties:
                        _id:
                          type: string
                        key:
                          type: string
                          pattern: ^[A-Za-z][A-Za-z0-9]*([._-][A-Za-z0-9]+)*$
                          description: Unique identifier for the agent within the workspace
                        display_name:
                          type: string
                        created_by_id:
                          type:
                            - string
                            - 'null'
                        updated_by_id:
                          type:
                            - string
                            - 'null'
                        created:
                          type: string
                        updated:
                          type: string
                        status:
                          type: string
                          enum:
                            - live
                            - draft
                            - pending
                            - published
                          description: >-
                            The status of the agent. `live` is the latest
                            version of the agent. `draft` is a version that is
                            not yet published. `pending` is a version that is
                            pending approval. `published` is a version that was
                            live and has been replaced by a new version.
                        version:
                          type: string
                          description: Current semantic version of the agent manifest.
                        path:
                          type: string
                          description: >-
                            Entity storage path.


                            With workspace-level API keys, use the format
                            `project/folder/subfolder/...`. The first element
                            identifies the project, followed by nested folders
                            (auto-created as needed). Example: `Default/agents`.


                            With project-level API keys, the project is
                            predetermined by the API key, so the path is
                            relative to that project. Example: `agents`. For
                            backward compatibility, a leading project name is
                            ignored when it matches the scoped project.
                          example: Default
                        memory_stores:
                          type: array
                          items:
                            type: string
                          default: []
                          description: >-
                            Array of memory store identifiers. Accepts both
                            memory store IDs and keys.
                        team_of_agents:
                          type: array
                          items:
                            type: object
                            properties:
                              key:
                                type: string
                                description: >-
                                  The unique key of the agent within the
                                  workspace
                              role:
                                type: string
                                description: >-
                                  The role of the agent in this context. This is
                                  used to give extra information to the leader
                                  to help it decide which agent to hand off to.
                            required:
                              - key
                          default: []
                          description: >-
                            The agents that are accessible to this orchestrator.
                            The main agent can hand off to these agents to
                            perform tasks.
                        skills:
                          type: array
                          items:
                            type: string
                          description: >-
                            List of skills that the agent can utilize. This
                            field allows you to specify which skills the agent
                            has access to, enabling more complex and dynamic
                            behavior.
                        metrics:
                          type: object
                          properties:
                            total_cost:
                              type: number
                              minimum: 0
                              default: 0
                          default:
                            total_cost: 0
                        variables:
                          type: object
                          additionalProperties: {}
                          description: Extracted variables from agent instructions
                        knowledge_bases:
                          type: array
                          items:
                            type: object
                            properties:
                              knowledge_id:
                                type: string
                                description: >-
                                  Unique identifier of the knowledge base to
                                  search
                                example: customer-knowledge-base
                            required:
                              - knowledge_id
                          description: Agent knowledge bases reference
                        source:
                          type: string
                          enum:
                            - internal
                            - external
                            - experiment
                        engine:
                          type: string
                          enum:
                            - text
                            - jinja
                            - mustache
                          default: text
                        type:
                          type: string
                          enum:
                            - internal
                            - a2a
                          default: internal
                          description: >-
                            Agent type: internal (Orquesta-managed) or a2a
                            (external A2A-compliant)
                        role:
                          type: string
                          minLength: 1
                        description:
                          type: string
                        system_prompt:
                          type:
                            - string
                            - 'null'
                          minLength: 1
                        instructions:
                          type: string
                        settings:
                          type: object
                          properties:
                            max_iterations:
                              type: integer
                              exclusiveMinimum: 0
                              maximum: 100
                              minimum: 1
                              default: 100
                              description: >-
                                Maximum number of iterations (LLM calls) before
                                the agent stops executing.
                            max_execution_time:
                              type: integer
                              minimum: 2
                              exclusiveMinimum: 0
                              maximum: 600
                              default: 600
                              description: >-
                                Maximum time (in seconds) for the agent's
                                thinking process, excluding time spent on tool
                                calls and sub-agent calls. The limit is loosely
                                enforced: in-progress LLM calls are not
                                terminated, and the last assistant message is
                                returned.
                            max_cost:
                              type: number
                              minimum: 0
                              default: 0
                              description: >-
                                Maximum cost in USD for the agent execution.
                                When the accumulated cost exceeds this limit,
                                the agent will stop executing. Set to 0 for
                                unlimited. Only supported in v3 responses.
                            tool_approval_required:
                              type: string
                              enum:
                                - all
                                - respect_tool
                                - none
                              default: respect_tool
                              description: >-
                                If `all`, the agent requires approval for all
                                tools. If `respect_tool`, the agent requires
                                approval only for tools that have the
                                `requires_approval` flag set to true. If `none`,
                                the agent does not require approval for any
                                tools.
                            tools:
                              type: array
                              items:
                                type: object
                                properties:
                                  id:
                                    type: string
                                    format: ulid
                                    pattern: ^[0-9A-HJKMNP-TV-Z]{26}$
                                    readOnly: true
                                    description: The id of the resource
                                  key:
                                    type: string
                                    description: Optional tool key for custom tools
                                  action_type:
                                    type: string
                                  display_name:
                                    type: string
                                  description:
                                    type: string
                                    description: Optional tool description
                                  requires_approval:
                                    type: boolean
                                    default: false
                                  tool_id:
                                    type: string
                                    description: >-
                                      Nested tool ID for MCP tools (identifies
                                      specific tool within MCP server)
                                  conditions:
                                    type: array
                                    items:
                                      type: object
                                      properties:
                                        condition:
                                          type: string
                                          description: >-
                                            The argument of the tool call to
                                            evaluate
                                        operator:
                                          type: string
                                          description: The operator to use
                                        value:
                                          type: string
                                          description: The value to compare against
                                      required:
                                        - condition
                                        - operator
                                        - value
                                    default: []
                                  timeout:
                                    type: number
                                    minimum: 1
                                    maximum: 600
                                    default: 120
                                    description: >-
                                      Tool execution timeout in seconds
                                      (default: 2 minutes, max: 10 minutes)
                                required:
                                  - id
                                  - action_type
                              default: []
                            evaluators:
                              type: array
                              items:
                                type: object
                                properties:
                                  id:
                                    type: string
                                    description: Unique key or identifier of the evaluator
                                  sample_rate:
                                    type: number
                                    minimum: 1
                                    maximum: 100
                                    default: 50
                                    description: >-
                                      The percentage of executions to evaluate
                                      with this evaluator (1-100). For example,
                                      a value of 50 means the evaluator will run
                                      on approximately half of the executions.
                                  execute_on:
                                    type: string
                                    enum:
                                      - input
                                      - output
                                    description: >-
                                      Determines whether the evaluator runs on
                                      the agent input (user message) or output
                                      (agent response).
                                required:
                                  - id
                                  - execute_on
                              title: Agent evaluator configuration
                              description: >-
                                Configuration for an evaluator applied to the
                                agent
                            guardrails:
                              type: array
                              items:
                                type: object
                                properties:
                                  id:
                                    type: string
                                    description: Unique key or identifier of the evaluator
                                  sample_rate:
                                    type: number
                                    minimum: 1
                                    maximum: 100
                                    default: 50
                                    description: >-
                                      The percentage of executions to evaluate
                                      with this evaluator (1-100). For example,
                                      a value of 50 means the evaluator will run
                                      on approximately half of the executions.
                                  execute_on:
                                    type: string
                                    enum:
                                      - input
                                      - output
                                    description: >-
                                      Determines whether the evaluator runs on
                                      the agent input (user message) or output
                                      (agent response).
                                required:
                                  - id
                                  - execute_on
                              title: Agent guardrail configuration
                              description: >-
                                Configuration for a guardrail applied to the
                                agent
                          default:
                            max_execution_time: 600
                            max_iterations: 100
                            max_cost: 0
                            tool_approval_required: respect_tool
                            tools: []
                        model:
                          type: object
                          properties:
                            id:
                              type: string
                              description: The database ID of the primary model
                            integration_id:
                              type:
                                - string
                                - 'null'
                              description: >-
                                Optional integration ID for custom model
                                configurations
                            parameters:
                              type: object
                              properties:
                                name:
                                  description: >-
                                    The name to display on the trace. If not
                                    specified, the default system name will be
                                    used.
                                  type: string
                                frequency_penalty:
                                  type:
                                    - number
                                    - 'null'
                                  description: >-
                                    Number between -2.0 and 2.0. Positive values
                                    penalize new tokens based on their existing
                                    frequency in the text so far, decreasing the
                                    model's likelihood to repeat the same line
                                    verbatim.
                                max_tokens:
                                  type:
                                    - integer
                                    - 'null'
                                  description: >-
                                    `[Deprecated]`. The maximum number of tokens
                                    that can be generated in the chat
                                    completion. This value can be used to
                                    control costs for text generated via API.


                                    This value is now `deprecated` in favor of
                                    `max_completion_tokens`, and is not
                                    compatible with o1 series models.
                                max_completion_tokens:
                                  type:
                                    - integer
                                    - 'null'
                                  exclusiveMinimum: 0
                                  description: >-
                                    An upper bound for the number of tokens that
                                    can be generated for a completion, including
                                    visible output tokens and reasoning tokens
                                presence_penalty:
                                  type:
                                    - number
                                    - 'null'
                                  description: >-
                                    Number between -2.0 and 2.0. Positive values
                                    penalize new tokens based on whether they
                                    appear in the text so far, increasing the
                                    model's likelihood to talk about new topics.
                                response_format:
                                  oneOf:
                                    - type: object
                                      properties:
                                        type:
                                          type: string
                                          enum:
                                            - text
                                      required:
                                        - type
                                      title: Text
                                      description: >-
                                        Default response format. Used to
                                        generate text responses
                                    - type: object
                                      properties:
                                        type:
                                          type: string
                                          enum:
                                            - json_object
                                      required:
                                        - type
                                      title: JSON object
                                      description: >-
                                        JSON object response format. An older
                                        method of generating JSON responses.
                                        Using `json_schema` is recommended for
                                        models that support it. Note that the
                                        model will not generate JSON without a
                                        system or user message instructing it to
                                        do so.
                                    - type: object
                                      properties:
                                        type:
                                          enum:
                                            - json_schema
                                          type: string
                                        json_schema:
                                          type: object
                                          properties:
                                            description:
                                              type: string
                                              description: >-
                                                A description of what the response
                                                format is for, used by the model to
                                                determine how to respond in the format.
                                            name:
                                              type: string
                                              description: >-
                                                The name of the response format. Must be
                                                a-z, A-Z, 0-9, or contain underscores
                                                and dashes, with a maximum length of 64.
                                            schema:
                                              description: >-
                                                The schema for the response format,
                                                described as a JSON Schema object.
                                            strict:
                                              type: boolean
                                              default: false
                                              description: >-
                                                Whether to enable strict schema
                                                adherence when generating the output. If
                                                set to true, the model will always
                                                follow the exact schema defined in the
                                                schema field. Only a subset of JSON
                                                Schema is supported when strict is true.
                                          required:
                                            - name
                                      required:
                                        - type
                                        - json_schema
                                      title: JSON schema
                                      description: >-
                                        JSON Schema response format. Used to
                                        generate structured JSON responses
                                  description: >-
                                    An object specifying the format that the
                                    model must output
                                reasoning_effort:
                                  type: string
                                  enum:
                                    - none
                                    - minimal
                                    - low
                                    - medium
                                    - high
                                    - xhigh
                                  description: >-
                                    Constrains effort on reasoning for
                                    [reasoning
                                    models](https://platform.openai.com/docs/guides/reasoning).
                                    Currently supported values are `none`,
                                    `minimal`, `low`, `medium`, `high`, and
                                    `xhigh`. Reducing reasoning effort can
                                    result in faster responses and fewer tokens
                                    used on reasoning in a response.


                                    - `gpt-5.1` defaults to `none`, which does
                                    not perform reasoning. The supported
                                    reasoning values for `gpt-5.1` are `none`,
                                    `low`, `medium`, and `high`. Tool calls are
                                    supported for all reasoning values in
                                    gpt-5.1.

                                    - All models before `gpt-5.1` default to
                                    `medium` reasoning effort, and do not
                                    support `none`.

                                    - The `gpt-5-pro` model defaults to (and
                                    only supports) `high` reasoning effort.

                                    - `xhigh` is currently only supported for
                                    `gpt-5.1-codex-max`.


                                    Any of "none", "minimal", "low", "medium",
                                    "high", "xhigh".
                                verbosity:
                                  type: string
                                  description: >-
                                    Adjusts response verbosity. Lower levels
                                    yield shorter answers.
                                seed:
                                  type:
                                    - number
                                    - 'null'
                                  description: >-
                                    If specified, our system will make a best
                                    effort to sample deterministically, such
                                    that repeated requests with the same seed
                                    and parameters should return the same
                                    result.
                                stop:
                                  anyOf:
                                    - type: string
                                    - type: array
                                      items:
                                        type: string
                                      maxItems: 4
                                    - type: 'null'
                                  description: >-
                                    Up to 4 sequences where the API will stop
                                    generating further tokens.
                                thinking:
                                  oneOf:
                                    - $ref: >-
                                        #/components/schemas/ThinkingConfigDisabledSchema
                                    - $ref: >-
                                        #/components/schemas/ThinkingConfigEnabledSchema
                                    - $ref: >-
                                        #/components/schemas/ThinkingConfigAdaptiveSchema
                                  discriminator:
                                    propertyName: type
                                    mapping:
                                      disabled:
                                        $ref: >-
                                          #/components/schemas/ThinkingConfigDisabledSchema
                                      enabled:
                                        $ref: >-
                                          #/components/schemas/ThinkingConfigEnabledSchema
                                      adaptive:
                                        $ref: >-
                                          #/components/schemas/ThinkingConfigAdaptiveSchema
                                temperature:
                                  type:
                                    - number
                                    - 'null'
                                  minimum: 0
                                  maximum: 2
                                  description: >-
                                    What sampling temperature to use, between 0
                                    and 2. Higher values like 0.8 will make the
                                    output more random, while lower values like
                                    0.2 will make it more focused and
                                    deterministic.
                                top_p:
                                  type:
                                    - number
                                    - 'null'
                                  minimum: 0
                                  maximum: 1
                                  description: >-
                                    An alternative to sampling with temperature,
                                    called nucleus sampling, where the model
                                    considers the results of the tokens with
                                    top_p probability mass. For example, 0.1
                                    means only the tokens comprising the top 10%
                                    probability mass are considered.
                                top_k:
                                  type:
                                    - number
                                    - 'null'
                                  description: >-
                                    Limits the model to consider only the top k
                                    most likely tokens at each step.
                                tool_choice:
                                  anyOf:
                                    - type: string
                                      enum:
                                        - none
                                        - auto
                                        - required
                                    - type: object
                                      properties:
                                        type:
                                          type: string
                                          enum:
                                            - function
                                          description: >-
                                            The type of the tool. Currently, only
                                            function is supported.
                                        function:
                                          type: object
                                          properties:
                                            name:
                                              type: string
                                              description: The name of the function to call.
                                          required:
                                            - name
                                      required:
                                        - function
                                  description: >-
                                    Controls which (if any) tool is called by
                                    the model.
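                                # Illustrative sketch (not part of the schema): values that would
                                # satisfy `tool_choice` as defined above, written as JSON. The
                                # function name `get_weather` is a hypothetical placeholder.
                                #   "auto"
                                #   {"type": "function", "function": {"name": "get_weather"}}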
                                parallel_tool_calls:
                                  type: boolean
                                  description: >-
                                    Whether to enable parallel function calling
                                    during tool use.
                                modalities:
                                  type:
                                    - array
                                    - 'null'
                                  items:
                                    type: string
                                    enum:
                                      - text
                                      - audio
                                  description: >-
                                    Output types that you would like the model
                                    to generate. Most models are capable of
                                    generating text, which is the default:
                                    ["text"]. The gpt-4o-audio-preview model can
                                    also be used to generate audio. To request
                                    that this model generate both text and audio
                                    responses, you can use: ["text", "audio"].
                                guardrails:
                                  type: array
                                  items:
                                    type: object
                                    properties:
                                      id:
                                        anyOf:
                                          - type: string
                                            enum:
                                              - orq_pii_detection
                                              - orq_sexual_moderation
                                              - orq_harmful_moderation
                                            description: The key of the guardrail.
                                          - type: string
                                            description: >-
                                              Unique key or identifier of the
                                              evaluator
                                      execute_on:
                                        type: string
                                        enum:
                                          - input
                                          - output
                                        description: >-
                                          Determines whether the guardrail runs on
                                          the input (user message) or output
                                          (model response).
                                    required:
                                      - id
                                      - execute_on
                                  description: >-
                                    A list of guardrails to apply to the
                                    request.
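                                # Illustrative sketch (not part of the schema): a `guardrails`
                                # array that would satisfy the definition above, written as JSON.
                                # The evaluator key `my_custom_evaluator` is a hypothetical
                                # placeholder for a workspace-specific evaluator.
                                #   [
                                #     {"id": "orq_pii_detection", "execute_on": "input"},
                                #     {"id": "my_custom_evaluator", "execute_on": "output"}
                                #   ]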
                                fallbacks:
                                  type: array
                                  items:
                                    type: object
                                    properties:
                                      model:
                                        type: string
                                        description: Fallback model identifier
                                        example: openai/gpt-4o-mini
                                    required:
                                      - model
                                  description: >-
                                    Array of fallback models to use if the
                                    primary model fails.
                                cache:
                                  type: object
                                  properties:
                                    ttl:
                                      type: number
                                      minimum: 1
                                      maximum: 259200
                                      default: 1800
                                      description: >-
                                        Time to live for cached responses in
                                        seconds. Maximum 259200 seconds (3
                                        days).
                                      example: 3600
                                    type:
                                      type: string
                                      enum:
                                        - exact_match
                                  required:
                                    - type
                                  description: Cache configuration for the request.
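                                # Illustrative sketch (not part of the schema): a `cache` object
                                # that would satisfy the definition above, written as JSON. Only
                                # `type` is required; per the schema, `ttl` defaults to 1800
                                # seconds when omitted.
                                #   {"type": "exact_match", "ttl": 3600}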
                                load_balancer:
                                  oneOf:
                                    - type: object
                                      properties:
                                        type:
                                          type: string
                                          enum:
                                            - weight_based
                                        models:
                                          type: array
                                          items:
                                            type: object
                                            properties:
                                              model:
                                                type: string
                                                description: Model identifier for load balancing
                                                example: openai/gpt-4o
                                              weight:
                                                type: number
                                                minimum: 0.001
                                                maximum: 1
                                                default: 0.5
                                                description: >-
                                                  Weight assigned to this model for load
                                                  balancing
                                                example: 0.7
                                            required:
                                              - model
                                      required:
                                        - type
                                        - models
                                  description: Load balancer configuration for the request.
                                  example:
                                    type: weight_based
                                    models:
                                      - model: openai/gpt-4o
                                        weight: 0.7
                                      - model: anthropic/claude-3-5-sonnet
                                        weight: 0.3
                                timeout:
                                  type: object
                                  properties:
                                    call_timeout:
                                      type: number
                                      minimum: 1
                                      description: Timeout value in milliseconds
                                      example: 30000
                                  required:
                                    - call_timeout
                                  description: >-
                                    Timeout configuration to apply to the
                                    request. If the request exceeds the timeout,
                                    it is retried or falls back to the next
                                    model, if configured.
                                cache_control:
                                  type: object
                                  properties:
                                    type:
                                      type: string
                                      enum:
                                        - ephemeral
                                      description: >-
                                        Create a cache control breakpoint at
                                        this content block. Accepts only the
                                        value "ephemeral".
                                    ttl:
                                      type: string
                                      enum:
                                        - 5m
                                        - 1h
                                      default: 5m
                                      description: >-
                                        The time-to-live for the cache control
                                        breakpoint. This may be one of the
                                        following values:


                                        - `5m`: 5 minutes

                                        - `1h`: 1 hour


                                        Defaults to `5m`. Only supported by
                                        `Anthropic` Claude models.
                                  required:
                                    - type
                                  description: >-
                                    Provider-level prompt caching configuration
                                    applied to the request. Creates a cache
                                    control breakpoint covering the request
                                    content. Only supported by `Anthropic`
                                    Claude models.
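                                # Illustrative sketch (not part of the schema): a `cache_control`
                                # object that would satisfy the definition above, written as JSON.
                                # Omitting `ttl` would fall back to the documented default `5m`.
                                #   {"type": "ephemeral", "ttl": "1h"}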
                                prompt_cache_key:
                                  type: string
                                  description: >-
                                    Used by OpenAI to cache responses for
                                    similar requests to optimize your cache hit
                                    rates. Replaces the legacy `user` field for
                                    prompt caching.
                              description: >-
                                Model behavior parameters (snake_case) stored as
                                part of the agent configuration. These become
                                the default parameters used when the agent is
                                executed. Commonly used: temperature (0-2,
                                controls randomness), max_completion_tokens
                                (response length), top_p (nucleus sampling).
                                Advanced: frequency_penalty, presence_penalty,
                                response_format (JSON/structured output),
                                reasoning_effort (for o1/thinking models), seed
                                (reproducibility), stop sequences.
                                Model-specific support varies. Runtime
                                parameters in agent execution requests can
                                override these defaults.
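                            # Illustrative sketch (not part of the schema): a `parameters`
                            # object using a few of the fields described above, written as
                            # JSON. The values are arbitrary examples, not recommended
                            # settings, and model-specific support varies.
                            #   {
                            #     "temperature": 0.2,
                            #     "top_p": 0.9,
                            #     "seed": 42,
                            #     "stop": ["END"],
                            #     "reasoning_effort": "low"
                            #   }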
                            retry:
                              type: object
                              properties:
                                count:
                                  type: number
                                  minimum: 1
                                  maximum: 5
                                  default: 3
                                  description: Number of retry attempts (1-5)
                                  example: 3
                                on_codes:
                                  type: array
                                  items:
                                    type: number
                                    minimum: 100
                                    maximum: 599
                                  minItems: 1
                                  description: HTTP status codes that trigger retry logic
                                  example:
                                    - 429
                                    - 500
                                    - 502
                                    - 503
                                    - 504
                              description: >-
                                Retry configuration for model requests. Allows
                                customizing retry count (1-5) and HTTP status
                                codes that trigger retries. Default codes:
                                [429]. Common codes: 500 (internal error), 429
                                (rate limit), 502/503/504 (gateway errors).
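                            # Illustrative sketch (not part of the schema): a `retry` object
                            # that would satisfy the definition above, written as JSON. It
                            # retries up to 3 times on rate limits and common server errors;
                            # the chosen codes are an example, not a recommendation.
                            #   {"count": 3, "on_codes": [429, 500, 502, 503, 504]}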
                            fallback_models:
                              type:
                                - array
                                - 'null'
                              items:
                                anyOf:
                                  - type: string
                                    description: >-
                                      A fallback model ID string (e.g.,
                                      `openai/gpt-4o-mini`). Will be used if the
                                      primary model request fails. Must support
                                      tool calling.
                                  - type: object
                                    properties:
                                      id:
                                        type: string
                                        description: >-
                                          A fallback model ID string. Must support
                                          tool calling.
                                      parameters:
                                        type: object
                                        properties:
                                          name:
                                            description: >-
                                              The name to display on the trace. If not
                                              specified, the default system name will
                                              be used.
                                            type: string
                                          frequency_penalty:
                                            type:
                                              - number
                                              - 'null'
                                            description: >-
                                              Number between -2.0 and 2.0. Positive
                                              values penalize new tokens based on
                                              their existing frequency in the text so
                                              far, decreasing the model's likelihood
                                              to repeat the same line verbatim.
                                          max_tokens:
                                            type:
                                              - integer
                                              - 'null'
                                            description: >-
                                              `[Deprecated]`. The maximum number of
                                              tokens that can be generated in the chat
                                              completion. This value can be used to
                                              control costs for text generated via the
                                              API.


                                              This value is deprecated in favor of
                                              `max_completion_tokens` and is not
                                              compatible with o1 series models.
                                          max_completion_tokens:
                                            type:
                                              - integer
                                              - 'null'
                                            exclusiveMinimum: 0
                                            description: >-
                                              An upper bound for the number of tokens
                                              that can be generated for a completion,
                                              including visible output tokens and
                                              reasoning tokens
                                          presence_penalty:
                                            type:
                                              - number
                                              - 'null'
                                            description: >-
                                              Number between -2.0 and 2.0. Positive
                                              values penalize new tokens based on
                                              whether they appear in the text so far,
                                              increasing the model's likelihood to
                                              talk about new topics.
                                          response_format:
                                            oneOf:
                                              - type: object
                                                properties:
                                                  type:
                                                    type: string
                                                    enum:
                                                      - text
                                                required:
                                                  - type
                                                title: Text
                                                description: >-
                                                  Default response format. Used to
                                                  generate text responses
                                              - type: object
                                                properties:
                                                  type:
                                                    type: string
                                                    enum:
                                                      - json_object
                                                required:
                                                  - type
                                                title: JSON object
                                                description: >-
                                                  JSON object response format. An older
                                                  method of generating JSON responses.
                                                  Using `json_schema` is recommended for
                                                  models that support it. Note that the
                                                  model will not generate JSON without a
                                                  system or user message instructing it to
                                                  do so.
                                              - type: object
                                                properties:
                                                  type:
                                                    enum:
                                                      - json_schema
                                                    type: string
                                                  json_schema:
                                                    type: object
                                                    properties:
                                                      description:
                                                        type: string
                                                        description: >-
                                                          A description of what the response
                                                          format is for, used by the model to
                                                          determine how to respond in the format.
                                                      name:
                                                        type: string
                                                        description: >-
                                                          The name of the response format. May
                                                          contain only a-z, A-Z, 0-9, underscores,
                                                          and dashes, with a maximum length of 64.
                                                      schema:
                                                        description: >-
                                                          The schema for the response format,
                                                          described as a JSON Schema object.
                                                      strict:
                                                        type: boolean
                                                        default: false
                                                        description: >-
                                                          Whether to enable strict schema
                                                          adherence when generating the output. If
                                                          set to true, the model will always
                                                          follow the exact schema defined in the
                                                          schema field. Only a subset of JSON
                                                          Schema is supported when strict is true.
                                                    required:
                                                      - name
                                                required:
                                                  - type
                                                  - json_schema
                                                title: JSON schema
                                                description: >-
                                                  JSON Schema response format. Used to
                                                  generate structured JSON responses
                                            description: >-
                                              An object specifying the format that the
                                              model must output
                                          reasoning_effort:
                                            type: string
                                            enum:
                                              - none
                                              - minimal
                                              - low
                                              - medium
                                              - high
                                              - xhigh
                                            description: >-
                                              Constrains effort on reasoning for
                                              [reasoning
                                              models](https://platform.openai.com/docs/guides/reasoning).
                                              Currently supported values are `none`,
                                              `minimal`, `low`, `medium`, `high`, and
                                              `xhigh`. Reducing reasoning effort can
                                              result in faster responses and fewer
                                              tokens used on reasoning in a response.


                                              - `gpt-5.1` defaults to `none`, which
                                              does not perform reasoning. The
                                              supported reasoning values for `gpt-5.1`
                                              are `none`, `low`, `medium`, and `high`.
                                              Tool calls are supported for all
                                              reasoning values in `gpt-5.1`.

                                              - All models before `gpt-5.1` default to
                                              `medium` reasoning effort, and do not
                                              support `none`.

                                              - The `gpt-5-pro` model defaults to (and
                                              only supports) `high` reasoning effort.

                                              - `xhigh` is currently only supported
                                              for `gpt-5.1-codex-max`.
                                          verbosity:
                                            type: string
                                            description: >-
                                              Adjusts response verbosity. Lower levels
                                              yield shorter answers.
                                          seed:
                                            type:
                                              - number
                                              - 'null'
                                            description: >-
                                              If specified, our system will make a
                                              best effort to sample deterministically,
                                              such that repeated requests with the
                                              same seed and parameters should return
                                              the same result.
                                          stop:
                                            anyOf:
                                              - type: string
                                              - type: array
                                                items:
                                                  type: string
                                                maxItems: 4
                                              - type: 'null'
                                            description: >-
                                              Up to 4 sequences where the API will
                                              stop generating further tokens.
                                          thinking:
                                            oneOf:
                                              - $ref: >-
                                                  #/components/schemas/ThinkingConfigDisabledSchema
                                              - $ref: >-
                                                  #/components/schemas/ThinkingConfigEnabledSchema
                                              - $ref: >-
                                                  #/components/schemas/ThinkingConfigAdaptiveSchema
                                            discriminator:
                                              propertyName: type
                                              mapping:
                                                disabled:
                                                  $ref: >-
                                                    #/components/schemas/ThinkingConfigDisabledSchema
                                                enabled:
                                                  $ref: >-
                                                    #/components/schemas/ThinkingConfigEnabledSchema
                                                adaptive:
                                                  $ref: >-
                                                    #/components/schemas/ThinkingConfigAdaptiveSchema
                                          temperature:
                                            type:
                                              - number
                                              - 'null'
                                            minimum: 0
                                            maximum: 2
                                            description: >-
                                              What sampling temperature to use,
                                              between 0 and 2. Higher values like 0.8
                                              will make the output more random, while
                                              lower values like 0.2 will make it more
                                              focused and deterministic.
                                          top_p:
                                            type:
                                              - number
                                              - 'null'
                                            minimum: 0
                                            maximum: 1
                                            description: >-
                                              An alternative to sampling with
                                              temperature, called nucleus sampling,
                                              where the model considers the results of
                                              the tokens with top_p probability mass.
                                          top_k:
                                            type:
                                              - number
                                              - 'null'
                                            description: >-
                                              Limits the model to consider only the
                                              top k most likely tokens at each step.
                                          tool_choice:
                                            anyOf:
                                              - type: string
                                                enum:
                                                  - none
                                                  - auto
                                                  - required
                                              - type: object
                                                properties:
                                                  type:
                                                    type: string
                                                    enum:
                                                      - function
                                                    description: >-
                                                      The type of the tool. Currently, only
                                                      function is supported.
                                                  function:
                                                    type: object
                                                    properties:
                                                      name:
                                                        type: string
                                                        description: The name of the function to call.
                                                    required:
                                                      - name
                                                required:
                                                  - function
                                            description: >-
                                              Controls which (if any) tool is called
                                              by the model.
                                          parallel_tool_calls:
                                            type: boolean
                                            description: >-
                                              Whether to enable parallel function
                                              calling during tool use.
                                          modalities:
                                            type:
                                              - array
                                              - 'null'
                                            items:
                                              type: string
                                              enum:
                                                - text
                                                - audio
                                            description: >-
                                              Output types that you would like the
                                              model to generate. Most models are
                                              capable of generating text, which is the
                                              default: ["text"]. The
                                              gpt-4o-audio-preview model can also be
                                              used to generate audio. To request that
                                              this model generate both text and audio
                                              responses, you can use: ["text",
                                              "audio"].
                                          guardrails:
                                            type: array
                                            items:
                                              type: object
                                              properties:
                                                id:
                                                  anyOf:
                                                    - type: string
                                                      enum:
                                                        - orq_pii_detection
                                                        - orq_sexual_moderation
                                                        - orq_harmful_moderation
                                                      description: The key of the guardrail.
                                                    - type: string
                                                      description: >-
                                                        Unique key or identifier of the
                                                        evaluator
                                                execute_on:
                                                  type: string
                                                  enum:
                                                    - input
                                                    - output
                                                  description: >-
                                                    Determines whether the guardrail runs on
                                                    the input (user message) or output
                                                    (model response).
                                              required:
                                                - id
                                                - execute_on
                                            description: >-
                                              A list of guardrails to apply to the
                                              request.
                                          fallbacks:
                                            type: array
                                            items:
                                              type: object
                                              properties:
                                                model:
                                                  type: string
                                                  description: Fallback model identifier
                                                  example: openai/gpt-4o-mini
                                              required:
                                                - model
                                            description: >-
                                              Array of fallback models to use if
                                              the primary model fails.
                                          cache:
                                            type: object
                                            properties:
                                              ttl:
                                                type: number
                                                minimum: 1
                                                maximum: 259200
                                                default: 1800
                                                description: >-
                                                  Time to live for cached responses in
                                                  seconds. Maximum 259200 seconds (3
                                                  days).
                                                example: 3600
                                              type:
                                                type: string
                                                enum:
                                                  - exact_match
                                            required:
                                              - type
                                            description: Cache configuration for the request.
                                          load_balancer:
                                            oneOf:
                                              - type: object
                                                properties:
                                                  type:
                                                    type: string
                                                    enum:
                                                      - weight_based
                                                  models:
                                                    type: array
                                                    items:
                                                      type: object
                                                      properties:
                                                        model:
                                                          type: string
                                                          description: Model identifier for load balancing
                                                          example: openai/gpt-4o
                                                        weight:
                                                          type: number
                                                          minimum: 0.001
                                                          maximum: 1
                                                          default: 0.5
                                                          description: >-
                                                            Weight assigned to this model for load
                                                            balancing
                                                          example: 0.7
                                                      required:
                                                        - model
                                                required:
                                                  - type
                                                  - models
                                            description: >-
                                              Load balancer configuration for the
                                              request.
                                            example:
                                              type: weight_based
                                              models:
                                                - model: openai/gpt-4o
                                                  weight: 0.7
                                                - model: anthropic/claude-3-5-sonnet
                                                  weight: 0.3
                                          timeout:
                                            type: object
                                            properties:
                                              call_timeout:
                                                type: number
                                                minimum: 1
                                                description: Timeout value in milliseconds
                                                example: 30000
                                            required:
                                              - call_timeout
                                            description: >-
                                              Timeout configuration to apply to the
                                              request. If the request exceeds the
                                              timeout, it is retried or falls back
                                              to the next model, if configured.
                                          cache_control:
                                            type: object
                                            properties:
                                              type:
                                                type: string
                                                enum:
                                                  - ephemeral
                                                description: >-
                                                  Create a cache control breakpoint at
                                                  this content block. Accepts only the
                                                  value "ephemeral".
                                              ttl:
                                                type: string
                                                enum:
                                                  - 5m
                                                  - 1h
                                                default: 5m
                                                description: >-
                                                  The time-to-live for the cache control
                                                  breakpoint. This may be one of the
                                                  following values:


                                                  - `5m`: 5 minutes

                                                  - `1h`: 1 hour


                                                  Defaults to `5m`. Only supported by
                                                  `Anthropic` Claude models.
                                            required:
                                              - type
                                            description: >-
                                              Provider-level prompt caching
                                              configuration applied to the request.
                                              Creates a cache control breakpoint
                                              covering the request content. Only
                                              supported by `Anthropic` Claude models.
                                          prompt_cache_key:
                                            type: string
                                            description: >-
                                              Used by OpenAI to cache responses for
                                              similar requests to optimize your cache
                                              hit rates. Replaces the legacy `user`
                                              field for prompt caching.
                                        description: >-
                                          Optional model parameters specific to
                                          this fallback model. Overrides primary
                                          model parameters if this fallback is
                                          used.
                                      retry:
                                        type: object
                                        properties:
                                          count:
                                            type: number
                                            minimum: 1
                                            maximum: 5
                                            default: 3
                                            description: Number of retry attempts (1-5)
                                            example: 3
                                          on_codes:
                                            type: array
                                            items:
                                              type: number
                                              minimum: 100
                                              maximum: 599
                                            minItems: 1
                                            description: >-
                                              HTTP status codes that trigger retry
                                              logic
                                            example:
                                              - 429
                                              - 500
                                              - 502
                                              - 503
                                              - 504
                                        description: >-
                                          Retry configuration for this fallback
                                          model. Allows customizing retry count
                                          (1-5) and HTTP status codes that trigger
                                          retries.
                                    required:
                                      - id
                                    description: >-
                                      Fallback model configuration with optional
                                      parameters and retry settings.
                                title: Fallback Model Configuration
                                description: >-
                                  Fallback model for automatic failover when
                                  the primary model request fails. Supports
                                  optional parameter overrides. Can be a simple
                                  model ID string or a configuration object with
                                  model-specific parameters. Fallbacks are tried
                                  in order.
                              description: >-
                                Optional array of fallback models (string IDs or
                                config objects) that will be used automatically
                                in order if the primary model fails
                          required:
                            - id
                      required:
                        - _id
                        - key
                        - status
                        - path
                        - skills
                        - role
                        - description
                        - instructions
                        - model
                  has_more:
                    type: boolean
                required:
                  - object
                  - data
                  - has_more
components:
  schemas:
    ThinkingConfigDisabledSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - disabled
          description: Disables the thinking mode capability
      required:
        - type
      title: Thinking config disabled
      description: Disables the thinking mode capability
    ThinkingConfigEnabledSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - enabled
          description: Enables the thinking mode capability
        budget_tokens:
          type: number
          description: >-
            Determines how many tokens the model can use for its internal
            reasoning process. Larger budgets can enable more thorough analysis
            for complex problems, improving response quality. Must be ≥1024 and
            less than `max_tokens`.
        thinking_level:
          type: string
          enum:
            - low
            - medium
            - high
          description: >-
            The level of reasoning the model should use. This setting is
            supported only by `gemini-3` models. If `budget_tokens` is specified
            and `thinking_level` is available, `budget_tokens` will be ignored.
      required:
        - type
        - budget_tokens
      title: Thinking config enabled
      description: Enables the thinking mode capability
    ThinkingConfigAdaptiveSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - adaptive
          description: >-
            Lets the model dynamically determine when and how much to use
            extended thinking based on the complexity of each request. Supported
            on Claude Opus 4.6 and Sonnet 4.6.
      required:
        - type
      title: Thinking config adaptive
      description: >-
        Enables adaptive thinking mode where the model dynamically determines
        thinking depth
  securitySchemes:
    ApiKey:
      type: http
      scheme: bearer
      bearerFormat: JWT

````
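
The numeric limits that the schema above places on fallback model parameters can be checked on the client before a configuration is saved. The following is a minimal sketch, not orq.ai's own validation: the function name `check_model_parameters` and its return shape are hypothetical, and it covers only the constraints visible in this part of the schema (`temperature`, `top_p`, `stop`, `cache`, `timeout`, and `thinking`). The requirement that `budget_tokens` be less than `max_tokens` is not checked here.

```python
from typing import Any


def check_model_parameters(params: dict[str, Any]) -> list[str]:
    """Hypothetical helper: return a list of problems found in a parameters dict.

    An empty list means none of the checked constraints were violated.
    """
    problems: list[str] = []

    def in_range(name: str, value: Any, low: float, high: float) -> None:
        # None is treated as "not set", matching the nullable fields in the schema.
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must be between {low} and {high}")

    in_range("temperature", params.get("temperature"), 0, 2)
    in_range("top_p", params.get("top_p"), 0, 1)

    # `stop` may be a string, a list of up to 4 strings, or null.
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")

    # `cache` requires `type`, and `exact_match` is the only listed value.
    cache = params.get("cache")
    if cache is not None:
        if cache.get("type") != "exact_match":
            problems.append("cache.type must be 'exact_match'")
        in_range("cache.ttl", cache.get("ttl"), 1, 259200)

    # `timeout` requires `call_timeout`, in milliseconds, with a minimum of 1.
    timeout = params.get("timeout")
    if timeout is not None:
        call_timeout = timeout.get("call_timeout")
        if call_timeout is None or call_timeout < 1:
            problems.append("timeout.call_timeout must be at least 1")

    # The enabled thinking config requires `budget_tokens` of at least 1024.
    thinking = params.get("thinking")
    if thinking is not None and thinking.get("type") == "enabled":
        budget = thinking.get("budget_tokens")
        if budget is None or budget < 1024:
            problems.append("thinking.budget_tokens must be at least 1024")

    return problems


valid = {
    "temperature": 0.2,
    "top_p": 0.9,
    "stop": ["END"],
    "cache": {"type": "exact_match", "ttl": 3600},
    "timeout": {"call_timeout": 30000},
    "thinking": {"type": "enabled", "budget_tokens": 2048},
}
print(check_model_parameters(valid))  # prints []
```

Collecting every problem in a list, rather than raising on the first one, lets a caller report all issues in a configuration at once.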