> ## Documentation Index
> Fetch the complete documentation index at: https://docs.orq.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Create chat completion

> Creates a model response for the given chat conversation with support for retries, fallbacks, prompts, and variables.


## OpenAPI

````yaml post /v2/router/chat/completions
openapi: 3.1.0
info:
  title: orq.ai API
  version: '2.0'
  description: orq.ai API documentation
servers:
  - url: https://api.orq.ai
security:
  - ApiKey: []
tags:
  - description: List models available through the AI Router.
    name: Models
  - name: Guardrail Rules
  - name: Policies
  - name: Routing Rules
  - name: API keys
    description: >-
      API keys authenticate programmatic access to the workspace. The unified
      key model exposes opaque tokens, per-domain access grants, and budget /
      rate-limit constraints (see ADR 0001 and ADR 0002).
  - name: Budgets
    description: >-
      Budgets govern spend, token usage, and request rate across six scopes:
      workspace, project, identity, api-key, provider, and model. A budget is
      hierarchical and defense-in-depth — every applicable budget is a hard
      gate, and the most restrictive one wins per dimension (see ADR 0007).
  - name: Documentation
    description: >-
      Search the orq.ai documentation. Proxies the workspace's query to the
      hosted docs search index.
  - name: Files
    description: File upload and retrieval operations.
  - name: Identities
    description: >-
      Identities represent end users from your system for usage and engagement
      tracking.
  - name: Projects
    description: Projects organize resources within a workspace
  - name: Skills
    description: >-
      Skills are modular instructions you can use to codify processes and
      conventions
  - name: Responses
  - description: >-
      Run agents on a cadence — cron, interval, or one-off. Minimum firing
      interval is 1 hour.
    name: Agent Schedules
  - name: Embeddings
  - name: Reporting
    description: >-
      GenAI reporting API over canonical analytics rollups. Accepts a metric
      name, time range, grain, group-by, and filters; returns a typed time
      series and optional totals.
externalDocs:
  url: https://docs.orq.ai
  description: orq.ai Documentation
paths:
  /v2/router/chat/completions:
    post:
      tags:
        - Chat
      summary: Create chat completion
      description: >-
        Creates a model response for the given chat conversation with support
        for retries, fallbacks, prompts, and variables.
      operationId: createChatCompletion
      requestBody:
        required: true
        content:
          application/json:
            schema:
              allOf:
                - type: object
                  properties:
                    messages:
                      type: array
                      items:
                        oneOf:
                          - type: object
                            properties:
                              role:
                                type: string
                                enum:
                                  - system
                                description: >-
                                  The role of the messages author, in this case
                                  `system`.
                              content:
                                anyOf:
                                  - type: string
                                    description: The contents of the system message.
                                  - type: array
                                    items:
                                      $ref: >-
                                        #/components/schemas/TextContentPartSchema
                                    minItems: 1
                                    description: >-
                                      An array of content parts with a defined
                                      type. For system messages, only type
                                      `text` is supported.
                                description: The contents of the system message.
                              name:
                                type: string
                                description: >-
                                  An optional name for the participant. Provides
                                  the model information to differentiate between
                                  participants of the same role.
                            required:
                              - role
                              - content
                            title: System message
                            description: >-
                              Developer-provided instructions that the model
                              should follow, regardless of messages sent by the
                              user.
                          - type: object
                            properties:
                              role:
                                type: string
                                enum:
                                  - developer
                                description: >-
                                  The role of the messages author, in this case 
                                  `developer`.
                              content:
                                anyOf:
                                  - type: string
                                    description: The contents of the system message.
                                  - type: array
                                    items:
                                      $ref: >-
                                        #/components/schemas/TextContentPartSchema
                                    minItems: 1
                                    description: >-
                                      An array of content parts with a defined
                                      type. For system messages, only type
                                      `text` is supported.
                                description: The contents of the developer message.
                              name:
                                type: string
                                description: >-
                                  An optional name for the participant. Provides
                                  the model information to differentiate between
                                  participants of the same role.
                            required:
                              - role
                              - content
                            title: Developer message
                          - type: object
                            properties:
                              role:
                                type: string
                                enum:
                                  - user
                                description: >-
                                  The role of the messages author, in this case
                                  `user`.
                              name:
                                type: string
                                description: >-
                                  An optional name for the participant. Provides
                                  the model information to differentiate between
                                  participants of the same role.
                              content:
                                anyOf:
                                  - type: string
                                    description: The text contents of the message.
                                  - type: array
                                    items:
                                      oneOf:
                                        - $ref: >-
                                            #/components/schemas/TextContentPartSchema
                                        - $ref: >-
                                            #/components/schemas/ImageContentPartSchema
                                        - $ref: >-
                                            #/components/schemas/AudioContentPartSchema
                                        - type: object
                                          properties:
                                            type:
                                              type: string
                                              enum:
                                                - file
                                              description: >-
                                                The type of the content part. Always
                                                `file`.
                                            cache_control:
                                              type: object
                                              properties:
                                                type:
                                                  type: string
                                                  enum:
                                                    - ephemeral
                                                  description: >-
                                                    Create a cache control breakpoint at
                                                    this content block. Accepts only the
                                                    value "ephemeral".
                                                ttl:
                                                  type: string
                                                  enum:
                                                    - 5m
                                                    - 1h
                                                  default: 5m
                                                  description: >-
                                                    The time-to-live for the cache control
                                                    breakpoint. This may be one of the
                                                    following values:


                                                    - `5m`: 5 minutes

                                                    - `1h`: 1 hour


                                                    Defaults to `5m`. Only supported by
                                                    `Anthropic` Claude models.
                                              required:
                                                - type
                                            file:
                                              $ref: >-
                                                #/components/schemas/FileContentPartSchema
                                          required:
                                            - type
                                            - file
                                    description: >-
                                      An array of content parts with a defined
                                      type. Supported options differ based on
                                      the model being used to generate the
                                      response. Can contain text, image, or
                                      audio inputs.
                                description: The contents of the user message.
                            required:
                              - role
                              - content
                            title: User message
                          - type: object
                            properties:
                              content:
                                anyOf:
                                  - type: string
                                    description: The contents of the assistant message.
                                  - type: array
                                    items:
                                      oneOf:
                                        - $ref: >-
                                            #/components/schemas/TextContentPartSchema
                                        - $ref: '#/components/schemas/RefusalPartSchema'
                                        - $ref: '#/components/schemas/ReasoningPartSchema'
                                        - $ref: >-
                                            #/components/schemas/RedactedReasoningPartSchema
                                      discriminator:
                                        propertyName: type
                                        mapping:
                                          text:
                                            $ref: >-
                                              #/components/schemas/TextContentPartSchema
                                          refusal:
                                            $ref: '#/components/schemas/RefusalPartSchema'
                                          reasoning:
                                            $ref: '#/components/schemas/ReasoningPartSchema'
                                          redacted_reasoning:
                                            $ref: >-
                                              #/components/schemas/RedactedReasoningPartSchema
                                    description: >-
                                      An array of content parts with a defined
                                      type. Can be one or more of type `text`,
                                      or exactly one of type `refusal`.
                                  - type: 'null'
                                description: >-
                                  The contents of the assistant message.
                                  Required unless `tool_calls` or
                                  `function_call` is specified.
                              refusal:
                                type:
                                  - string
                                  - 'null'
                                description: The refusal message by the assistant.
                              role:
                                type: string
                                enum:
                                  - assistant
                                description: >-
                                  The role of the messages author, in this case
                                  `assistant`.
                              name:
                                type: string
                                description: >-
                                  An optional name for the participant. Provides
                                  the model information to differentiate between
                                  participants of the same role.
                              audio:
                                type:
                                  - object
                                  - 'null'
                                properties:
                                  id:
                                    type: string
                                    description: >-
                                      Unique identifier for a previous audio
                                      response from the model.
                                required:
                                  - id
                                description: >-
                                  Data about a previous audio response from the
                                  model. 
                              tool_calls:
                                type: array
                                items:
                                  type: object
                                  properties:
                                    id:
                                      type: string
                                      description: The ID of the tool call.
                                    type:
                                      type: string
                                      enum:
                                        - function
                                      description: >-
                                        The type of the tool. Currently, only
                                        `function` is supported.
                                    function:
                                      type: object
                                      properties:
                                        name:
                                          type: string
                                          description: The name of the function to call.
                                        arguments:
                                          type: string
                                          description: >-
                                            The arguments to call the function with,
                                            as generated by the model in JSON
                                            format. Note that the model does not
                                            always generate valid JSON, and may
                                            hallucinate parameters not defined by
                                            your function schema. Validate the
                                            arguments in your code before calling
                                            your function.
                                    thought_signature:
                                      type: string
                                      description: >-
                                        Encrypted representation of the model
                                        internal reasoning state during function
                                        calling. Required by Gemini 3 models
                                        when continuing a conversation after a
                                        tool call.
                                  required:
                                    - id
                                    - type
                                    - function
                                description: >-
                                  The tool calls generated by the model, such as
                                  function calls.
                            required:
                              - role
                            title: Assistant message
                          - type: object
                            properties:
                              role:
                                type: string
                                enum:
                                  - tool
                                description: >-
                                  The role of the messages author, in this case
                                  tool.
                              content:
                                anyOf:
                                  - type: string
                                  - type: array
                                    items:
                                      oneOf:
                                        - $ref: >-
                                            #/components/schemas/TextContentPartSchema
                                      discriminator:
                                        propertyName: type
                                        mapping:
                                          text:
                                            $ref: >-
                                              #/components/schemas/TextContentPartSchema
                                description: The contents of the tool message.
                              tool_call_id:
                                type:
                                  - string
                                  - 'null'
                                description: Tool call that this message is responding to.
                              cache_control:
                                type: object
                                properties:
                                  type:
                                    type: string
                                    enum:
                                      - ephemeral
                                    description: >-
                                      Create a cache control breakpoint at this
                                      content block. Accepts only the value
                                      "ephemeral".
                                  ttl:
                                    type: string
                                    enum:
                                      - 5m
                                      - 1h
                                    default: 5m
                                    description: >-
                                      The time-to-live for the cache control
                                      breakpoint. This may be one of the
                                      following values:


                                      - `5m`: 5 minutes

                                      - `1h`: 1 hour


                                      Defaults to `5m`. Only supported by
                                      `Anthropic` Claude models.
                                required:
                                  - type
                            required:
                              - role
                              - content
                              - tool_call_id
                            title: Tool message
                      description: A list of messages comprising the conversation so far.
                    model:
                      type: string
                      description: >-
                        Model ID used to generate the response, like
                        `openai/gpt-4o` or
                        `anthropic/claude-haiku-4-5-20251001`. The AI Gateway
                        offers a wide range of models with different
                        capabilities, performance characteristics, and price
                        points. Refer to the (Supported
                        models)[/docs/proxy/supported-models] to browse
                        available models.
                    metadata:
                      type: object
                      additionalProperties:
                        type: string
                        maxLength: 512
                      description: >-
                        Set of 16 key-value pairs that can be attached to an
                        object. This can be useful for storing additional
                        information about the object in a structured format.
                        Keys can have a maximum length of 64 characters and
                        values can have a maximum length of 512 characters.
                    name:
                      description: >-
                        The name to display on the trace. If not specified, the
                        default system name will be used.
                      type: string
                    audio:
                      type:
                        - object
                        - 'null'
                      properties:
                        voice:
                          type: string
                          enum:
                            - alloy
                            - echo
                            - fable
                            - onyx
                            - nova
                            - shimmer
                          description: >-
                            The voice the model uses to respond. Supported
                            voices are alloy, echo, fable, onyx, nova, and
                            shimmer.
                        format:
                          type: string
                          enum:
                            - wav
                            - mp3
                            - flac
                            - opus
                            - pcm16
                          description: >-
                            Specifies the output audio format. Must be one of
                            wav, mp3, flac, opus, or pcm16.
                      required:
                        - voice
                        - format
                      description: >-
                        Parameters for audio output. Required when audio output
                        is requested with modalities: ["audio"]. Learn more.
                    frequency_penalty:
                      type:
                        - number
                        - 'null'
                      description: >-
                        Number between -2.0 and 2.0. Positive values penalize
                        new tokens based on their existing frequency in the text
                        so far, decreasing the model's likelihood to repeat the
                        same line verbatim.
                    max_tokens:
                      type:
                        - integer
                        - 'null'
                      description: >-
                        `[Deprecated]`. The maximum number of tokens that can be
                        generated in the chat completion. This value can be used
                        to control costs for text generated via API. 

                         This value is now `deprecated` in favor of `max_completion_tokens`, and is not compatible with o1 series models.
                    max_completion_tokens:
                      type:
                        - integer
                        - 'null'
                      exclusiveMinimum: 0
                      description: >-
                        An upper bound for the number of tokens that can be
                        generated for a completion, including visible output
                        tokens and reasoning tokens
                    logprobs:
                      type:
                        - boolean
                        - 'null'
                      description: >-
                        Whether to return log probabilities of the output tokens
                        or not. If true, returns the log probabilities of each
                        output token returned in the content of message.
                    top_logprobs:
                      type:
                        - integer
                        - 'null'
                      minimum: 0
                      maximum: 20
                      description: >-
                        An integer between 0 and 20 specifying the number of
                        most likely tokens to return at each token position,
                        each with an associated log probability. logprobs must
                        be set to true if this parameter is used.
                    'n':
                      type:
                        - integer
                        - 'null'
                      minimum: 1
                      description: >-
                        How many chat completion choices to generate for each
                        input message. Note that you will be charged based on
                        the number of generated tokens across all of the
                        choices. Keep n as 1 to minimize costs.
                    presence_penalty:
                      type:
                        - number
                        - 'null'
                      description: >-
                        Number between -2.0 and 2.0. Positive values penalize
                        new tokens based on whether they appear in the text so
                        far, increasing the model's likelihood to talk about new
                        topics.
                    response_format:
                      oneOf:
                        - type: object
                          properties:
                            type:
                              type: string
                              enum:
                                - text
                          required:
                            - type
                          title: Text
                          description: >-


                            Default response format. Used to generate text
                            responses
                        - type: object
                          properties:
                            type:
                              type: string
                              enum:
                                - json_object
                          required:
                            - type
                          title: JSON object
                          description: >-


                            JSON object response format. An older method of
                            generating JSON responses. Using `json_schema` is
                            recommended for models that support it. Note that
                            the model will not generate JSON without a system or
                            user message instructing it to do so.
                        - type: object
                          properties:
                            type:
                              enum:
                                - json_schema
                              type: string
                            json_schema:
                              type: object
                              properties:
                                description:
                                  type: string
                                  description: >-
                                    A description of what the response format is
                                    for, used by the model to determine how to
                                    respond in the format.
                                name:
                                  type: string
                                  description: >-
                                    The name of the response format. Must be
                                    a-z, A-Z, 0-9, or contain underscores and
                                    dashes, with a maximum length of 64.
                                schema:
                                  description: >-
                                    The schema for the response format,
                                    described as a JSON Schema object.
                                strict:
                                  type: boolean
                                  default: false
                                  description: >-
                                    Whether to enable strict schema adherence
                                    when generating the output. If set to true,
                                    the model will always follow the exact
                                    schema defined in the schema field. Only a
                                    subset of JSON Schema is supported when
                                    strict is true.
                              required:
                                - name
                          required:
                            - type
                            - json_schema
                          title: JSON schema
                          description: >-


                            JSON Schema response format. Used to generate
                            structured JSON responses
                      description: >-
                        An object specifying the format that the model must
                        output
                    reasoning_effort:
                      type: string
                      enum:
                        - none
                        - minimal
                        - low
                        - medium
                        - high
                        - xhigh
                      description: >-
                        Constrains effort on reasoning for [reasoning
                        models](https://platform.openai.com/docs/guides/reasoning).
                        Currently supported values are `none`, `minimal`, `low`,
                        `medium`, `high`, and `xhigh`. Reducing reasoning effort
                        can result in faster responses and fewer tokens used on
                        reasoning in a response.


                        - `gpt-5.1` defaults to `none`, which does not perform
                        reasoning. The supported reasoning values for `gpt-5.1`
                        are `none`, `low`, `medium`, and `high`. Tool calls are
                        supported for all reasoning values in gpt-5.1.

                        - All models before `gpt-5.1` default to `medium`
                        reasoning effort, and do not support `none`.

                        - The `gpt-5-pro` model defaults to (and only supports)
                        `high` reasoning effort.

                        - `xhigh` is currently only supported for
                        `gpt-5.1-codex-max`.


                        Any of "none", "minimal", "low", "medium", "high",
                        "xhigh".
                    verbosity:
                      type: string
                      description: >-
                        Adjusts response verbosity. Lower levels yield shorter
                        answers.
                    seed:
                      type:
                        - number
                        - 'null'
                      description: >-
                        If specified, our system will make a best effort to
                        sample deterministically, such that repeated requests
                        with the same seed and parameters should return the same
                        result.
                    stop:
                      anyOf:
                        - type: string
                        - type: array
                          items:
                            type: string
                          maxItems: 4
                        - type: 'null'
                      description: >-
                        Up to 4 sequences where the API will stop generating
                        further tokens.
                    stream_options:
                      type:
                        - object
                        - 'null'
                      properties:
                        include_usage:
                          type: boolean
                          description: >-
                            If set, an additional chunk will be streamed before
                            the data: [DONE] message. The usage field on this
                            chunk shows the token usage statistics for the
                            entire request, and the choices field will always be
                            an empty array. All other chunks will also include a
                            usage field, but with a null value.
                      description: >-
                        Options for streaming response. Only set this when you
                        set stream: true.
                    thinking:
                      oneOf:
                        - $ref: '#/components/schemas/ThinkingConfigDisabledSchema'
                        - $ref: '#/components/schemas/ThinkingConfigEnabledSchema'
                        - $ref: '#/components/schemas/ThinkingConfigAdaptiveSchema'
                      discriminator:
                        propertyName: type
                        mapping:
                          disabled:
                            $ref: '#/components/schemas/ThinkingConfigDisabledSchema'
                          enabled:
                            $ref: '#/components/schemas/ThinkingConfigEnabledSchema'
                          adaptive:
                            $ref: '#/components/schemas/ThinkingConfigAdaptiveSchema'
                    temperature:
                      type:
                        - number
                        - 'null'
                      minimum: 0
                      maximum: 2
                      description: >-
                        What sampling temperature to use, between 0 and 2.
                        Higher values like 0.8 will make the output more random,
                        while lower values like 0.2 will make it more focused
                        and deterministic.
                    top_p:
                      type:
                        - number
                        - 'null'
                      minimum: 0
                      maximum: 1
                      description: >-
                        An alternative to sampling with temperature, called
                        nucleus sampling, where the model considers the results
                        of the tokens with top_p probability mass. 
                    top_k:
                      type:
                        - number
                        - 'null'
                      description: >-
                        Limits the model to consider only the top k most likely
                        tokens at each step.
                    tools:
                      type: array
                      items:
                        type: object
                        properties:
                          type:
                            type: string
                            enum:
                              - function
                            description: >-
                              The type of the tool. Currently, only function is
                              supported.
                          function:
                            type: object
                            properties:
                              name:
                                type: string
                                description: The name of the function to call.
                              description:
                                type: string
                                description: >-
                                  A description of what the function does, used
                                  by the model to choose when and how to call
                                  the function.
                              parameters:
                                type: object
                                properties:
                                  type:
                                    type: string
                                    enum:
                                      - object
                                  properties:
                                    type: object
                                    additionalProperties: {}
                                  required:
                                    type: array
                                    items:
                                      type: string
                                  additionalProperties:
                                    type: boolean
                                required:
                                  - type
                                  - properties
                                description: >-
                                  The parameters the functions accepts,
                                  described as a JSON Schema object
                              strict:
                                type: boolean
                                description: >-
                                  Whether to enable strict schema adherence when
                                  generating the function call. 
                            required:
                              - name
                        required:
                          - function
                      description: A list of tools the model may call.
                    tool_choice:
                      anyOf:
                        - type: string
                          enum:
                            - none
                            - auto
                            - required
                        - type: object
                          properties:
                            type:
                              type: string
                              enum:
                                - function
                              description: >-
                                The type of the tool. Currently, only function
                                is supported.
                            function:
                              type: object
                              properties:
                                name:
                                  type: string
                                  description: The name of the function to call.
                              required:
                                - name
                          required:
                            - function
                      description: Controls which (if any) tool is called by the model.
                    parallel_tool_calls:
                      type: boolean
                      description: >-
                        Whether to enable parallel function calling during tool
                        use.
                    modalities:
                      type:
                        - array
                        - 'null'
                      items:
                        type: string
                        enum:
                          - text
                          - audio
                      description: >-
                        Output types that you would like the model to generate.
                        Most models are capable of generating text, which is the
                        default: ["text"]. The gpt-4o-audio-preview model can
                        also be used to generate audio. To request that this
                        model generate both text and audio responses, you can
                        use: ["text", "audio"].
                    guardrails:
                      type: array
                      items:
                        type: object
                        properties:
                          id:
                            anyOf:
                              - type: string
                                enum:
                                  - orq_pii_detection
                                  - orq_sexual_moderation
                                  - orq_harmful_moderation
                                description: The key of the guardrail.
                              - type: string
                                description: Unique key or identifier of the evaluator
                          execute_on:
                            type: string
                            enum:
                              - input
                              - output
                            description: >-
                              Determines whether the guardrail runs on the input
                              (user message) or output (model response).
                        required:
                          - id
                          - execute_on
                      description: A list of guardrails to apply to the request.
                    fallbacks:
                      type: array
                      items:
                        type: object
                        properties:
                          model:
                            type: string
                            description: Fallback model identifier
                            example: openai/gpt-4o-mini
                        required:
                          - model
                      description: Array of fallback models to use if primary model fails
                    retry:
                      type: object
                      properties:
                        count:
                          type: number
                          minimum: 1
                          maximum: 5
                          default: 3
                          description: Number of retry attempts (1-5)
                          example: 3
                        on_codes:
                          type: array
                          items:
                            type: number
                            minimum: 100
                            maximum: 599
                          minItems: 1
                          description: HTTP status codes that trigger retry logic
                          example:
                            - 429
                            - 500
                            - 502
                            - 503
                            - 504
                      description: Retry configuration for the request
                    cache:
                      type: object
                      properties:
                        ttl:
                          type: number
                          minimum: 1
                          maximum: 259200
                          default: 1800
                          description: >-
                            Time to live for cached responses in seconds.
                            Maximum 259200 seconds (3 days).
                          example: 3600
                        type:
                          type: string
                          enum:
                            - exact_match
                      required:
                        - type
                      description: Cache configuration for the request.
                    load_balancer:
                      oneOf:
                        - type: object
                          properties:
                            type:
                              type: string
                              enum:
                                - weight_based
                            models:
                              type: array
                              items:
                                type: object
                                properties:
                                  model:
                                    type: string
                                    description: Model identifier for load balancing
                                    example: openai/gpt-4o
                                  weight:
                                    type: number
                                    minimum: 0.001
                                    maximum: 1
                                    default: 0.5
                                    description: >-
                                      Weight assigned to this model for load
                                      balancing
                                    example: 0.7
                                required:
                                  - model
                          required:
                            - type
                            - models
                      description: Load balancer configuration for the request.
                      example:
                        type: weight_based
                        models:
                          - model: openai/gpt-4o
                            weight: 0.7
                          - model: anthropic/claude-3-5-sonnet
                            weight: 0.3
                    timeout:
                      type: object
                      properties:
                        call_timeout:
                          type: number
                          minimum: 1
                          description: Timeout value in milliseconds
                          example: 30000
                      required:
                        - call_timeout
                      description: >-
                        Timeout configuration to apply to the request. If the
                        request exceeds the timeout, it will be retried or
                        fallback to the next model if configured.
                    variables:
                      type: object
                      additionalProperties: {}
                      description: >-
                        Variables to substitute in message templates. Uses
                        f-string syntax ({{variableName}}) by default. For
                        advanced templating with Jinja or Mustache syntax, use
                        in conjunction with `template_engine`.
                      example:
                        customer_name: John Smith
                        product_name: Premium Plan
                    cache_control:
                      type: object
                      properties:
                        type:
                          type: string
                          enum:
                            - ephemeral
                          description: >-
                            Create a cache control breakpoint at this content
                            block. Accepts only the value "ephemeral".
                        ttl:
                          type: string
                          enum:
                            - 5m
                            - 1h
                          default: 5m
                          description: >-
                            The time-to-live for the cache control breakpoint.
                            This may be one of the following values:


                            - `5m`: 5 minutes

                            - `1h`: 1 hour


                            Defaults to `5m`. Only supported by `Anthropic`
                            Claude models.
                      required:
                        - type
                      description: >-
                        Provider-level prompt caching configuration applied to
                        the request. Creates a cache control breakpoint covering
                        the request content. Only supported by `Anthropic`
                        Claude models.
                    prompt_cache_key:
                      type: string
                      description: >-
                        Used by OpenAI to cache responses for similar requests
                        to optimize your cache hit rates. Replaces the legacy
                        `user` field for prompt caching.
                    orq:
                      type: object
                      properties:
                        name:
                          description: >-
                            The name to display on the trace. If not specified,
                            the default system name will be used.
                          type: string
                        retry:
                          type: object
                          properties:
                            count:
                              type: number
                              minimum: 1
                              maximum: 5
                              default: 3
                              description: Number of retry attempts (1-5)
                              example: 3
                            on_codes:
                              type: array
                              items:
                                type: number
                                minimum: 100
                                maximum: 599
                              minItems: 1
                              description: HTTP status codes that trigger retry logic
                              example:
                                - 429
                                - 500
                                - 502
                                - 503
                                - 504
                          description: Retry configuration for the request
                        fallbacks:
                          type: array
                          items:
                            type: object
                            properties:
                              model:
                                type: string
                                description: Fallback model identifier
                                example: openai/gpt-4o-mini
                            required:
                              - model
                          description: >-
                            Array of fallback models to use if primary model
                            fails
                        prompt:
                          type: object
                          properties:
                            id:
                              type: string
                              description: Unique identifier of the prompt to use
                              example: prompt_01ARZ3NDEKTSV4RRFFQ69G5FAV
                            version:
                              type: string
                              enum:
                                - latest
                              description: >-
                                Version of the prompt to use (currently only
                                "latest" supported)
                              example: latest
                          required:
                            - id
                            - version
                          description: Prompt configuration for the request
                        identity:
                          $ref: '#/components/schemas/PublicIdentity'
                        contact:
                          $ref: '#/components/schemas/PublicContact'
                        thread:
                          type: object
                          properties:
                            id:
                              type: string
                              description: >-
                                Unique thread identifier to group related
                                invocations.
                              example: thread_01ARZ3NDEKTSV4RRFFQ69G5FAV
                            tags:
                              type: array
                              items:
                                type: string
                              description: >-
                                Optional tags to differentiate or categorize
                                threads
                              example:
                                - customer-support
                                - priority-high
                          required:
                            - id
                          description: Thread information to group related requests
                        inputs:
                          anyOf:
                            - type: object
                              additionalProperties: {}
                            - type: array
                              items:
                                type: object
                                properties:
                                  key:
                                    type: string
                                  value: {}
                                  is_pii:
                                    type: boolean
                                required:
                                  - key
                          deprecated: true
                          description: >-
                            @deprecated Use top-level `variables` field instead.
                            Values to replace in the prompt messages using
                            {{variableName}} syntax.
                          example:
                            customer_name: John Smith
                            product_name: Premium Plan
                            issue_type: billing
                        cache:
                          type: object
                          properties:
                            ttl:
                              type: number
                              minimum: 1
                              maximum: 259200
                              default: 1800
                              description: >-
                                Time to live for cached responses in seconds.
                                Maximum 259200 seconds (3 days).
                              example: 3600
                            type:
                              type: string
                              enum:
                                - exact_match
                          required:
                            - type
                          description: Cache configuration for the request.
                        knowledge_bases:
                          type: array
                          items:
                            type: object
                            properties:
                              top_k:
                                type: integer
                                minimum: 1
                                maximum: 100
                                description: >-
                                  The number of results to return. If not
                                  provided, will default to the knowledge base
                                  configured `top_k`.
                              threshold:
                                type: number
                                minimum: 0
                                maximum: 1
                                description: >-
                                  The threshold to apply to the search. If not
                                  provided, will default to the knowledge base
                                  configured `threshold`
                              search_type:
                                type: string
                                enum:
                                  - vector_search
                                  - keyword_search
                                  - hybrid_search
                                default: hybrid_search
                                description: >-
                                  The type of search to perform. If not
                                  provided, will default to the knowledge base
                                  configured `retrieval_type`
                              filter_by:
                                anyOf:
                                  - type: object
                                    additionalProperties:
                                      anyOf:
                                        - type: object
                                          properties:
                                            eq:
                                              anyOf:
                                                - type: string
                                                  title: string
                                                  description: String
                                                - type: number
                                                  title: number
                                                  description: Number
                                                - type: boolean
                                                  title: boolean
                                                  description: Boolean
                                          required:
                                            - eq
                                          title: eq
                                          description: Equal to
                                        - type: object
                                          properties:
                                            ne:
                                              anyOf:
                                                - type: string
                                                  title: string
                                                  description: String
                                                - type: number
                                                  title: number
                                                  description: Number
                                                - type: boolean
                                                  title: boolean
                                                  description: Boolean
                                          required:
                                            - ne
                                          title: ne
                                          description: Not equal to
                                        - type: object
                                          properties:
                                            gt:
                                              type: number
                                          required:
                                            - gt
                                          title: gt
                                          description: Greater than
                                        - type: object
                                          properties:
                                            gte:
                                              type: number
                                          required:
                                            - gte
                                          title: gte
                                          description: Greater than or equal to
                                        - type: object
                                          properties:
                                            lt:
                                              type: number
                                          required:
                                            - lt
                                          title: lt
                                          description: Less than
                                        - type: object
                                          properties:
                                            lte:
                                              type: number
                                          required:
                                            - lte
                                          title: lte
                                          description: Less than or equal to
                                        - type: object
                                          properties:
                                            in:
                                              type: array
                                              items:
                                                anyOf:
                                                  - type: string
                                                    title: string
                                                    description: String
                                                  - type: number
                                                    title: number
                                                    description: Number
                                                  - type: boolean
                                                    title: boolean
                                                    description: Boolean
                                          required:
                                            - in
                                          title: in
                                          description: In
                                        - type: object
                                          properties:
                                            nin:
                                              type: array
                                              items:
                                                anyOf:
                                                  - type: string
                                                    title: string
                                                    description: String
                                                  - type: number
                                                    title: number
                                                    description: Number
                                                  - type: boolean
                                                    title: boolean
                                                    description: Boolean
                                          required:
                                            - nin
                                          title: nin
                                          description: Not in
                                        - type: object
                                          properties:
                                            exists:
                                              type: boolean
                                          required:
                                            - exists
                                          title: exists
                                          description: Exists
                                    title: Search operator
                                  - type: object
                                    properties:
                                      and:
                                        type: array
                                        items:
                                          type: object
                                          additionalProperties:
                                            anyOf:
                                              - type: object
                                                properties:
                                                  eq:
                                                    anyOf:
                                                      - type: string
                                                        title: string
                                                        description: String
                                                      - type: number
                                                        title: number
                                                        description: Number
                                                      - type: boolean
                                                        title: boolean
                                                        description: Boolean
                                                required:
                                                  - eq
                                                title: eq
                                                description: Equal to
                                              - type: object
                                                properties:
                                                  ne:
                                                    anyOf:
                                                      - type: string
                                                        title: string
                                                        description: String
                                                      - type: number
                                                        title: number
                                                        description: Number
                                                      - type: boolean
                                                        title: boolean
                                                        description: Boolean
                                                required:
                                                  - ne
                                                title: ne
                                                description: Not equal to
                                              - type: object
                                                properties:
                                                  gt:
                                                    type: number
                                                required:
                                                  - gt
                                                title: gt
                                                description: Greater than
                                              - type: object
                                                properties:
                                                  gte:
                                                    type: number
                                                required:
                                                  - gte
                                                title: gte
                                                description: Greater than or equal to
                                              - type: object
                                                properties:
                                                  lt:
                                                    type: number
                                                required:
                                                  - lt
                                                title: lt
                                                description: Less than
                                              - type: object
                                                properties:
                                                  lte:
                                                    type: number
                                                required:
                                                  - lte
                                                title: lte
                                                description: Less than or equal to
                                              - type: object
                                                properties:
                                                  in:
                                                    type: array
                                                    items:
                                                      anyOf:
                                                        - type: string
                                                          title: string
                                                          description: String
                                                        - type: number
                                                          title: number
                                                          description: Number
                                                        - type: boolean
                                                          title: boolean
                                                          description: Boolean
                                                required:
                                                  - in
                                                title: in
                                                description: In
                                              - type: object
                                                properties:
                                                  nin:
                                                    type: array
                                                    items:
                                                      anyOf:
                                                        - type: string
                                                          title: string
                                                          description: String
                                                        - type: number
                                                          title: number
                                                          description: Number
                                                        - type: boolean
                                                          title: boolean
                                                          description: Boolean
                                                required:
                                                  - nin
                                                title: nin
                                                description: Not in
                                              - type: object
                                                properties:
                                                  exists:
                                                    type: boolean
                                                required:
                                                  - exists
                                                title: exists
                                                description: Exists
                                    required:
                                      - and
                                    title: and
                                    description: And
                                  - type: object
                                    properties:
                                      or:
                                        type: array
                                        items:
                                          type: object
                                          additionalProperties:
                                            anyOf:
                                              - type: object
                                                properties:
                                                  eq:
                                                    anyOf:
                                                      - type: string
                                                        title: string
                                                        description: String
                                                      - type: number
                                                        title: number
                                                        description: Number
                                                      - type: boolean
                                                        title: boolean
                                                        description: Boolean
                                                required:
                                                  - eq
                                                title: eq
                                                description: Equal to
                                              - type: object
                                                properties:
                                                  ne:
                                                    anyOf:
                                                      - type: string
                                                        title: string
                                                        description: String
                                                      - type: number
                                                        title: number
                                                        description: Number
                                                      - type: boolean
                                                        title: boolean
                                                        description: Boolean
                                                required:
                                                  - ne
                                                title: ne
                                                description: Not equal to
                                              - type: object
                                                properties:
                                                  gt:
                                                    type: number
                                                required:
                                                  - gt
                                                title: gt
                                                description: Greater than
                                              - type: object
                                                properties:
                                                  gte:
                                                    type: number
                                                required:
                                                  - gte
                                                title: gte
                                                description: Greater than or equal to
                                              - type: object
                                                properties:
                                                  lt:
                                                    type: number
                                                required:
                                                  - lt
                                                title: lt
                                                description: Less than
                                              - type: object
                                                properties:
                                                  lte:
                                                    type: number
                                                required:
                                                  - lte
                                                title: lte
                                                description: Less than or equal to
                                              - type: object
                                                properties:
                                                  in:
                                                    type: array
                                                    items:
                                                      anyOf:
                                                        - type: string
                                                          title: string
                                                          description: String
                                                        - type: number
                                                          title: number
                                                          description: Number
                                                        - type: boolean
                                                          title: boolean
                                                          description: Boolean
                                                required:
                                                  - in
                                                title: in
                                                description: In
                                              - type: object
                                                properties:
                                                  nin:
                                                    type: array
                                                    items:
                                                      anyOf:
                                                        - type: string
                                                          title: string
                                                          description: String
                                                        - type: number
                                                          title: number
                                                          description: Number
                                                        - type: boolean
                                                          title: boolean
                                                          description: Boolean
                                                required:
                                                  - nin
                                                title: nin
                                                description: Not in
                                              - type: object
                                                properties:
                                                  exists:
                                                    type: boolean
                                                required:
                                                  - exists
                                                title: exists
                                                description: Exists
                                    required:
                                      - or
                                    title: or
                                    description: Or
                                description: >-
                                  The metadata filter to apply to the search.
                                  Check the [Searching a Knowledge
                                  Base](https://docs.orq.ai/docs/knowledge/api#knowledge-base-search)
                                  for more information.
                              search_options:
                                type: object
                                properties:
                                  include_vectors:
                                    type: boolean
                                    description: Whether to include the vector in the chunk
                                  include_metadata:
                                    type: boolean
                                    description: >-
                                      Whether to include the metadata in the
                                      chunk
                                  include_scores:
                                    type: boolean
                                    description: Whether to include the scores in the chunk
                                description: Additional search options
                              rerank_config:
                                type: object
                                properties:
                                  model:
                                    type: string
                                    description: >-
                                      The name of the rerank model to use. Refer
                                      to the [model
                                      list](https://docs.orq.ai/docs/proxy#/rerank-models).
                                    example: cohere/rerank-multilingual-v3.0
                                  threshold:
                                    type: number
                                    minimum: 0
                                    maximum: 1
                                    default: 0
                                    description: >-
                                      The threshold value used to filter the
                                      rerank results, only documents with a
                                      relevance score greater than the threshold
                                      will be returned
                                  top_k:
                                    type: integer
                                    minimum: 1
                                    maximum: 100
                                    default: 10
                                    description: >-
                                      The number of top results to return after
                                      reranking. If not provided, will default
                                      to the knowledge base configured `top_k`.
                                required:
                                  - model
                                description: >-
                                  Override the rerank configuration for this
                                  search. If not provided, will use the
                                  knowledge base configured rerank settings.
                              agentic_rag_config:
                                type: object
                                properties:
                                  model:
                                    type: string
                                    description: >-
                                      The name of the model for the Agent to
                                      use. Refer to the [model
                                      list](https://docs.orq.ai/docs/proxy#/chat-models).
                                required:
                                  - model
                                description: >-
                                  Override the agentic RAG configuration for
                                  this search. If not provided, will use the
                                  knowledge base configured agentic RAG
                                  settings.
                              knowledge_id:
                                type: string
                                description: >-
                                  Unique identifier of the knowledge base to
                                  search
                                example: customer-knowledge-base
                              query:
                                type: string
                                description: >-
                                  The query to use to search the knowledge base.
                                  If not provided we will use the last user
                                  message from the messages of the requests
                            required:
                              - knowledge_id
                        load_balancer:
                          oneOf:
                            - type: object
                              properties:
                                type:
                                  type: string
                                  enum:
                                    - weight_based
                                models:
                                  type: array
                                  items:
                                    type: object
                                    properties:
                                      model:
                                        type: string
                                        description: Model identifier for load balancing
                                        example: openai/gpt-4o
                                      weight:
                                        type: number
                                        minimum: 0.001
                                        maximum: 1
                                        default: 0.5
                                        description: >-
                                          Weight assigned to this model for load
                                          balancing
                                        example: 0.7
                                    required:
                                      - model
                              required:
                                - type
                                - models
                          description: >-
                            Array of models with weights for load balancing
                            requests
                          example:
                            type: weight_based
                            models:
                              - model: openai/gpt-4o
                                weight: 0.7
                              - model: anthropic/claude-3-5-sonnet
                                weight: 0.3
                        timeout:
                          type: object
                          properties:
                            call_timeout:
                              type: number
                              minimum: 1
                              description: Timeout value in milliseconds
                              example: 30000
                          required:
                            - call_timeout
                          description: >-
                            Timeout configuration to apply to the request. If
                            the request exceeds the timeout, it will be retried
                            or fallback to the next model if configured.
                      description: >-
                        Leverage Orq's intelligent routing capabilities to
                        enhance your AI application with enterprise-grade
                        reliability and observability. Orq provides automatic
                        request management including retries on failures, model
                        fallbacks for high availability, identity-level
                        analytics tracking, conversation threading, and dynamic
                        prompt templating with variable substitution.
                      deprecated: true
                      example:
                        retry:
                          count: 3
                          on_codes:
                            - 429
                            - 500
                            - 502
                        fallbacks:
                          - model: openai/gpt-5
                          - model: anthropic/claude-4-opus
                        identity:
                          id: identity_01ARZ3NDEKTSV4RRFFQ69G5FAV
                          display_name: Jane Doe
                          email: jane.doe@example.com
                        thread:
                          id: thread_01ARZ3NDEKTSV4RRFFQ69G5FAV
                          tags:
                            - customer-support
                        inputs:
                          customer_name: John Smith
                          issue_type: billing
                        cache:
                          ttl: 3600
                          type: exact_match
                        knowledge_bases:
                          - knowledge_id: knowledge_01ARZ3NDEKTSV4RRFFQ69G5FAV
                            top_k: 5
                        timeout:
                          call_timeout: 30000
                  required:
                    - messages
                    - model
                - type: object
                  properties:
                    stream:
                      type: boolean
                      default: false
      responses:
        '200':
          description: >-
            Returns a chat completion object, or a streamed sequence of chat
            completion chunk objects if the request is streamed.
          content:
            application/json:
              schema:
                type: object
                properties:
                  id:
                    type: string
                    description: A unique identifier for the chat completion.
                  choices:
                    type: array
                    items:
                      type: object
                      properties:
                        finish_reason:
                          type:
                            - string
                            - 'null'
                          enum:
                            - stop
                            - length
                            - tool_calls
                            - content_filter
                            - function_call
                            - null
                          description: The reason the model stopped generating tokens.
                        index:
                          type: number
                          default: 0
                          description: The index of the choice in the list of choices.
                        message:
                          type: object
                          properties:
                            content:
                              type:
                                - string
                                - 'null'
                            refusal:
                              type:
                                - string
                                - 'null'
                            tool_calls:
                              type: array
                              items:
                                type: object
                                properties:
                                  index:
                                    type: number
                                  id:
                                    type: string
                                  type:
                                    type: string
                                    enum:
                                      - function
                                  function:
                                    type: object
                                    properties:
                                      name:
                                        type: string
                                        description: >-
                                          The name of the function to be called.
                                          Must be a-z, A-Z, 0-9, or contain
                                          underscores and dashes, with a maximum
                                          length of 64.
                                      arguments:
                                        type: string
                                        description: >-
                                          The arguments to call the function with,
                                          as generated by the model in JSON
                                          format. Note that the model does not
                                          always generate valid JSON, and may
                                          hallucinate parameters not defined by
                                          your function schema. Validate the
                                          arguments in your code before calling
                                          your function.
                                  thought_signature:
                                    type: string
                                    description: >-
                                      Encrypted representation of the model
                                      internal reasoning state during function
                                      calling. Required by Gemini 3 models when
                                      continuing a conversation after a tool
                                      call.
                            role:
                              type: string
                              enum:
                                - assistant
                            reasoning:
                              type:
                                - string
                                - 'null'
                              description: Internal thought process of the model
                            reasoning_signature:
                              type:
                                - string
                                - 'null'
                              description: >-
                                The signature holds a cryptographic token which
                                verifies that the thinking block was generated
                                by the model, and is verified when thinking is
                                part of a multiturn conversation. This value
                                should not be modified and should always be sent
                                to the API when the reasoning is redacted.
                                Currently only supported by `Anthropic`.
                            redacted_reasoning:
                              type: string
                              description: >-
                                Occasionally the model's internal reasoning will
                                be flagged by the safety systems of the
                                provider. When this occurs, the provider will
                                encrypt the reasoning. These redacted reasoning
                                is decrypted when passed back to the API,
                                allowing the model to continue its response
                                without losing context.
                            audio:
                              type:
                                - object
                                - 'null'
                              properties:
                                id:
                                  type: string
                                expires_at:
                                  type: integer
                                data:
                                  type: string
                                transcript:
                                  type: string
                              required:
                                - id
                                - expires_at
                                - data
                                - transcript
                              description: >-
                                If the audio output modality is requested, this
                                object contains data about the audio response
                                from the model.
                          description: A chat completion message generated by the model.
                        logprobs:
                          type:
                            - object
                            - 'null'
                          properties:
                            content:
                              type:
                                - array
                                - 'null'
                              items:
                                type: object
                                properties:
                                  token:
                                    type: string
                                    description: The token.
                                  logprob:
                                    type: number
                                    description: >-
                                      The log probability of this token, if it
                                      is within the top 20 most likely tokens.
                                      Otherwise, the value -9999.0 is used to
                                      signify that the token is very unlikely.
                                  bytes:
                                    type:
                                      - array
                                      - 'null'
                                    items:
                                      type: number
                                    description: >-
                                      A list of integers representing the UTF-8
                                      bytes representation of the token.
                                  top_logprobs:
                                    type: array
                                    items:
                                      type: object
                                      properties:
                                        token:
                                          type: string
                                          description: The token.
                                        logprob:
                                          type: number
                                          description: >-
                                            The log probability of this token, if it
                                            is within the top 20 most likely tokens.
                                            Otherwise, the value -9999.0 is used to
                                            signify that the token is very unlikely.
                                        bytes:
                                          type:
                                            - array
                                            - 'null'
                                          items:
                                            type: number
                                          description: >-
                                            A list of integers representing the
                                            UTF-8 bytes representation of the token.
                                      required:
                                        - token
                                        - logprob
                                        - bytes
                                    description: >-
                                      List of the most likely tokens and their
                                      log probability, at this token position.
                                required:
                                  - token
                                  - logprob
                                  - bytes
                                  - top_logprobs
                              description: >-
                                A list of message content tokens with log
                                probability information.
                            refusal:
                              type:
                                - array
                                - 'null'
                              items:
                                type: object
                                properties:
                                  token:
                                    type: string
                                    description: The token.
                                  logprob:
                                    type: number
                                    description: >-
                                      The log probability of this token, if it
                                      is within the top 20 most likely tokens.
                                      Otherwise, the value -9999.0 is used to
                                      signify that the token is very unlikely.
                                  bytes:
                                    type:
                                      - array
                                      - 'null'
                                    items:
                                      type: number
                                    description: >-
                                      A list of integers representing the UTF-8
                                      bytes representation of the token.
                                  top_logprobs:
                                    type: array
                                    items:
                                      type: object
                                      properties:
                                        token:
                                          type: string
                                          description: The token.
                                        logprob:
                                          type: number
                                          description: >-
                                            The log probability of this token, if it
                                            is within the top 20 most likely tokens.
                                            Otherwise, the value -9999.0 is used to
                                            signify that the token is very unlikely.
                                        bytes:
                                          type:
                                            - array
                                            - 'null'
                                          items:
                                            type: number
                                          description: >-
                                            A list of integers representing the
                                            UTF-8 bytes representation of the token.
                                      required:
                                        - token
                                        - logprob
                                        - bytes
                                    description: >-
                                      List of the most likely tokens and their
                                      log probability, at this token position.
                                required:
                                  - token
                                  - logprob
                                  - bytes
                                  - top_logprobs
                              description: >-
                                A list of message refusal tokens with log
                                probability information.
                          required:
                            - content
                            - refusal
                          description: Log probability information for the choice.
                      required:
                        - finish_reason
                        - message
                    description: >-
                      A list of chat completion choices. Can be more than one if
                      n is greater than 1.
                  created:
                    type: number
                    description: >-
                      The Unix timestamp (in seconds) of when the chat
                      completion was created.
                  model:
                    type: string
                    description: The model used for the chat completion.
                  system_fingerprint:
                    type:
                      - string
                      - 'null'
                    description: >-
                      This fingerprint represents the backend configuration that
                      the model runs with.
                  usage:
                    type:
                      - object
                      - 'null'
                    properties:
                      completion_tokens:
                        type: number
                        description: Number of tokens in the generated completion.
                      prompt_tokens:
                        type: number
                        description: Number of tokens in the prompt.
                      total_tokens:
                        type: number
                        description: >-
                          Total number of tokens used in the request (prompt +
                          completion).
                      prompt_tokens_details:
                        type:
                          - object
                          - 'null'
                        properties:
                          cached_tokens:
                            type:
                              - integer
                              - 'null'
                          cache_creation_tokens:
                            type:
                              - integer
                              - 'null'
                          audio_tokens:
                            type:
                              - integer
                              - 'null'
                            description: >-
                              The number of audio input tokens consumed by the
                              request.
                      completion_tokens_details:
                        type:
                          - object
                          - 'null'
                        properties:
                          reasoning_tokens:
                            type:
                              - number
                              - 'null'
                          accepted_prediction_tokens:
                            type:
                              - number
                              - 'null'
                          rejected_prediction_tokens:
                            type:
                              - number
                              - 'null'
                          audio_tokens:
                            type:
                              - integer
                              - 'null'
                            description: >-
                              The number of audio output tokens produced by the
                              response.
                    description: Usage statistics for the completion request.
                  object:
                    type: string
                    enum:
                      - chat.completion
                required:
                  - id
                  - choices
                  - created
                  - model
                  - object
                description: >-
                  Represents a chat completion response returned by model, based
                  on the provided input.
            text/event-stream:
              schema:
                type: object
                properties:
                  id:
                    type: string
                    description: A unique identifier for the chat completion.
                  choices:
                    type: array
                    items:
                      type: object
                      properties:
                        finish_reason:
                          type:
                            - string
                            - 'null'
                          enum:
                            - stop
                            - length
                            - tool_calls
                            - content_filter
                            - function_call
                            - null
                          description: The reason the model stopped generating tokens.
                        index:
                          type: number
                          default: 0
                          description: The index of the choice in the list of choices.
                        logprobs:
                          type:
                            - object
                            - 'null'
                          properties:
                            content:
                              type:
                                - array
                                - 'null'
                              items:
                                type: object
                                properties:
                                  token:
                                    type: string
                                    description: The token.
                                  logprob:
                                    type: number
                                    description: >-
                                      The log probability of this token, if it
                                      is within the top 20 most likely tokens.
                                      Otherwise, the value -9999.0 is used to
                                      signify that the token is very unlikely.
                                  bytes:
                                    type:
                                      - array
                                      - 'null'
                                    items:
                                      type: number
                                    description: >-
                                      A list of integers representing the UTF-8
                                      bytes representation of the token.
                                  top_logprobs:
                                    type: array
                                    items:
                                      type: object
                                      properties:
                                        token:
                                          type: string
                                          description: The token.
                                        logprob:
                                          type: number
                                          description: >-
                                            The log probability of this token, if it
                                            is within the top 20 most likely tokens.
                                            Otherwise, the value -9999.0 is used to
                                            signify that the token is very unlikely.
                                        bytes:
                                          type:
                                            - array
                                            - 'null'
                                          items:
                                            type: number
                                          description: >-
                                            A list of integers representing the
                                            UTF-8 bytes representation of the token.
                                      required:
                                        - token
                                        - logprob
                                        - bytes
                                    description: >-
                                      List of the most likely tokens and their
                                      log probability, at this token position.
                                required:
                                  - token
                                  - logprob
                                  - bytes
                                  - top_logprobs
                              description: >-
                                A list of message content tokens with log
                                probability information.
                            refusal:
                              type:
                                - array
                                - 'null'
                              items:
                                type: object
                                properties:
                                  token:
                                    type: string
                                    description: The token.
                                  logprob:
                                    type: number
                                    description: >-
                                      The log probability of this token, if it
                                      is within the top 20 most likely tokens.
                                      Otherwise, the value -9999.0 is used to
                                      signify that the token is very unlikely.
                                  bytes:
                                    type:
                                      - array
                                      - 'null'
                                    items:
                                      type: number
                                    description: >-
                                      A list of integers representing the UTF-8
                                      bytes representation of the token.
                                  top_logprobs:
                                    type: array
                                    items:
                                      type: object
                                      properties:
                                        token:
                                          type: string
                                          description: The token.
                                        logprob:
                                          type: number
                                          description: >-
                                            The log probability of this token, if it
                                            is within the top 20 most likely tokens.
                                            Otherwise, the value -9999.0 is used to
                                            signify that the token is very unlikely.
                                        bytes:
                                          type:
                                            - array
                                            - 'null'
                                          items:
                                            type: number
                                          description: >-
                                            A list of integers representing the
                                            UTF-8 bytes representation of the token.
                                      required:
                                        - token
                                        - logprob
                                        - bytes
                                    description: >-
                                      List of the most likely tokens and their
                                      log probability, at this token position.
                                required:
                                  - token
                                  - logprob
                                  - bytes
                                  - top_logprobs
                              description: >-
                                A list of message refusal tokens with log
                                probability information.
                          required:
                            - content
                            - refusal
                          description: Log probability information for the choice.
                        delta:
                          type: object
                          properties:
                            content:
                              type:
                                - string
                                - 'null'
                              description: The contents of the chunk message.
                            refusal:
                              type:
                                - string
                                - 'null'
                            tool_calls:
                              type: array
                              items:
                                type: object
                                properties:
                                  index:
                                    type: number
                                    description: The index of the tool call.
                                  id:
                                    type: string
                                    description: The ID of the tool call.
                                  type:
                                    type: string
                                    enum:
                                      - function
                                    description: >-
                                      The type of the tool. Currently, only
                                      `function` is supported.
                                  function:
                                    type: object
                                    properties:
                                      name:
                                        description: The name of the function.
                                        type: string
                                      arguments:
                                        type: string
                                        description: >-
                                          The arguments to call the function with,
                                          as generated by the model in JSON
                                          format. Note that the model does not
                                          always generate valid JSON, and may
                                          hallucinate parameters not defined by
                                          your function schema. Validate the
                                          arguments in your code before calling
                                          your function.
                                  thought_signature:
                                    type: string
                                    description: >-
                                      Encrypted representation of the model
                                      internal reasoning state during function
                                      calling. Required by Gemini 3 models.
                            role:
                              type: string
                              enum:
                                - assistant
                            reasoning:
                              type: string
                              description: Internal thought process of the model
                            reasoning_signature:
                              type: string
                              description: >-
                                The signature holds a cryptographic token which
                                verifies that the thinking block was generated
                                by the model, and is verified when thinking is
                                part of a multiturn conversation. This value
                                should not be modified and should always be sent
                                to the API when the reasoning is redacted.
                                Currently only supported by `Anthropic`.
                            redacted_reasoning:
                              type: string
                              description: >-
                                Occasionally the model's internal reasoning will
                                be flagged by the safety systems of the
                                provider. When this occurs, the provider will
                                encrypt the reasoning. These redacted reasoning
                                is decrypted when passed back to the API,
                                allowing the model to continue its response
                                without losing context.
                            audio:
                              type:
                                - object
                                - 'null'
                              properties:
                                id:
                                  type: string
                                transcript:
                                  type: string
                                data:
                                  type: string
                                expires_at:
                                  type: integer
                              description: Audio response data in streaming mode.
                          description: >-
                            A chat completion delta generated by streamed model
                            responses.
                      required:
                        - finish_reason
                        - delta
                    description: >-
                      A list of chat completion choices. Can contain more than
                      one elements if n is greater than 1. Can also be empty for
                      the last chunk if you set stream_options:
                      {"include_usage": true}.
                  created:
                    type: number
                    description: >-
                      The Unix timestamp (in seconds) of when the chat
                      completion was created.
                  model:
                    type: string
                    description: The model used for the chat completion.
                  system_fingerprint:
                    type:
                      - string
                      - 'null'
                    description: >-
                      This fingerprint represents the backend configuration that
                      the model runs with.
                  usage:
                    type:
                      - object
                      - 'null'
                    properties:
                      completion_tokens:
                        type: number
                        description: Number of tokens in the generated completion.
                      prompt_tokens:
                        type: number
                        description: Number of tokens in the prompt.
                      total_tokens:
                        type: number
                        description: >-
                          Total number of tokens used in the request (prompt +
                          completion).
                      prompt_tokens_details:
                        type:
                          - object
                          - 'null'
                        properties:
                          cached_tokens:
                            type:
                              - integer
                              - 'null'
                          cache_creation_tokens:
                            type:
                              - integer
                              - 'null'
                          audio_tokens:
                            type:
                              - integer
                              - 'null'
                            description: >-
                              The number of audio input tokens consumed by the
                              request.
                      completion_tokens_details:
                        type:
                          - object
                          - 'null'
                        properties:
                          reasoning_tokens:
                            type:
                              - number
                              - 'null'
                          accepted_prediction_tokens:
                            type:
                              - number
                              - 'null'
                          rejected_prediction_tokens:
                            type:
                              - number
                              - 'null'
                          audio_tokens:
                            type:
                              - integer
                              - 'null'
                            description: >-
                              The number of audio output tokens produced by the
                              response.
                    description: Usage statistics for the completion request.
                  object:
                    type: string
                    enum:
                      - chat.completion.chunk
                required:
                  - id
                  - choices
                  - created
                  - model
                  - object
                description: >-
                  Represents a streamed chunk of a chat completion response
                  returned by model, based on the provided input.
components:
  schemas:
    TextContentPartSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - text
          description: The type of the content part.
        text:
          type: string
          description: The text content.
        cache_control:
          type: object
          properties:
            type:
              type: string
              enum:
                - ephemeral
              description: >-
                Create a cache control breakpoint at this content block. Accepts
                only the value "ephemeral".
            ttl:
              type: string
              enum:
                - 5m
                - 1h
              default: 5m
              description: >-
                The time-to-live for the cache control breakpoint. This may be
                one of the following values:


                - `5m`: 5 minutes

                - `1h`: 1 hour


                Defaults to `5m`. Only supported by `Anthropic` Claude models.
          required:
            - type
      required:
        - type
        - text
      title: Text content part
      description: The type of the content part.
    ImageContentPartSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - image_url
        image_url:
          type: object
          properties:
            url:
              type: string
              description: Either a URL of the image or the base64 encoded image data.
            detail:
              type: string
              enum:
                - low
                - high
                - auto
              description: Specifies the detail level of the image.
          required:
            - url
        cache_control:
          type: object
          properties:
            type:
              type: string
              enum:
                - ephemeral
              description: >-
                Create a cache control breakpoint at this content block. Accepts
                only the value "ephemeral".
            ttl:
              type: string
              enum:
                - 5m
                - 1h
              default: 5m
              description: >-
                The time-to-live for the cache control breakpoint. This may be
                one of the following values:


                - `5m`: 5 minutes

                - `1h`: 1 hour


                Defaults to `5m`. Only supported by `Anthropic` Claude models.
          required:
            - type
      required:
        - type
        - image_url
      title: Image content part
      description: An image content part
    AudioContentPartSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - input_audio
        input_audio:
          type: object
          properties:
            data:
              type: string
              description: Base64 encoded audio data.
            format:
              type: string
              enum:
                - mp3
                - wav
              description: >-
                The format of the encoded audio data. Currently supports `wav`
                and `mp3`.
          required:
            - data
            - format
      required:
        - type
        - input_audio
      title: Audio content part
      description: An audio content part
    FileContentPartSchema:
      type: object
      properties:
        file_data:
          type: string
          description: >-
            The file data as a data URI string in the format
            'data:<mime-type>;base64,<base64-encoded-data>'. Example:
            'data:image/png;base64,iVBORw0KGgoAAAANS...'
        uri:
          type: string
          description: >-
            URL to the file. Only supported by Anthropic Claude models for PDF
            files.
        mimeType:
          type: string
          description: MIME type of the file (e.g., application/pdf, image/png)
        filename:
          type: string
          description: >-
            The name of the file, used when passing the file to the model as a
            string.
      description: >-
        File data for the content part. Must contain either file_data or uri,
        but not both.
    RefusalPartSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - refusal
          description: The type of the content part. Always `refusal`.
        refusal:
          type: string
          description: The refusal message generated by the model.
      required:
        - type
      title: Refusal part
      description: A message part containing a refusal message.
    ReasoningPartSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - reasoning
          description: The type of the content part. Always `reasoning`.
        reasoning:
          type: string
          description: >-
            The reasoning or thought process behind the response. Used for
            chain-of-thought or extended thinking.
        signature:
          type: string
          description: >-
            Optional cryptographic signature to verify the authenticity and
            integrity of the reasoning content
      required:
        - type
        - reasoning
        - signature
      title: Reasoning Part
      description: A message part containing reasoning or chain-of-thought content
    RedactedReasoningPartSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - redacted_reasoning
          description: The type of the content part. Always `reasoning`.
        data:
          type: string
          description: >-
            The encrypted reasoning or thought process behind the response. Used
            for chain-of-thought or extended thinking.
      required:
        - type
        - data
      title: Reasoning Part
      description: A message part containing reasoning or chain-of-thought content
    ThinkingConfigDisabledSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - disabled
          description: Disables the thinking mode capability
      required:
        - type
      title: Thinking config disabled
      description: Disables the thinking mode capability
    ThinkingConfigEnabledSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - enabled
          description: Enables or disables the thinking mode capability
        budget_tokens:
          type: number
          description: >-
            Determines how many tokens the model can use for its internal
            reasoning process. Larger budgets can enable more thorough analysis
            for complex problems, improving response quality. Must be ≥1024 and
            less than `max_tokens`.
        thinking_level:
          type: string
          enum:
            - low
            - medium
            - high
          description: >-
            The level of reasoning the model should use. This setting is
            supported only by `gemini-3` models. If budget_tokens is specified
            and `thinking_level` is available, `budget_tokens` will be ignored.
      required:
        - type
        - budget_tokens
      title: Thinking config enabled
      description: Enables the thinking mode capability
    ThinkingConfigAdaptiveSchema:
      type: object
      properties:
        type:
          type: string
          enum:
            - adaptive
          description: >-
            Lets the model dynamically determine when and how much to use
            extended thinking based on the complexity of each request. Supported
            on Claude Opus 4.6 and Sonnet 4.6.
      required:
        - type
      title: Thinking config adaptive
      description: >-
        Enables adaptive thinking mode where the model dynamically determines
        thinking depth
    PublicIdentity:
      type: object
      properties:
        id:
          type: string
          description: Unique identifier for the contact
          example: contact_01ARZ3NDEKTSV4RRFFQ69G5FAV
        display_name:
          type: string
          description: Display name of the contact
          example: Jane Doe
        email:
          type: string
          format: email
          description: Email address of the contact
          example: jane.doe@example.com
        metadata:
          type: array
          items:
            type: object
            additionalProperties: {}
          description: >-
            A hash of key/value pairs containing any other data about the
            contact
          example:
            - department: Engineering
              role: Senior Developer
        logo_url:
          type: string
          description: URL to the contact's avatar or logo
          example: https://example.com/avatars/jane-doe.jpg
        tags:
          type: array
          items:
            type: string
          description: A list of tags associated with the contact
          example:
            - hr
            - engineering
      required:
        - id
      description: >-
        Information about the identity making the request. If the identity does
        not exist, it will be created automatically.
    PublicContact:
      type: object
      properties:
        id:
          type: string
          description: Unique identifier for the contact
          example: contact_01ARZ3NDEKTSV4RRFFQ69G5FAV
        display_name:
          type: string
          description: Display name of the contact
          example: Jane Doe
        email:
          type: string
          format: email
          description: Email address of the contact
          example: jane.doe@example.com
        metadata:
          type: array
          items:
            type: object
            additionalProperties: {}
          description: >-
            A hash of key/value pairs containing any other data about the
            contact
          example:
            - department: Engineering
              role: Senior Developer
        logo_url:
          type: string
          description: URL to the contact's avatar or logo
          example: https://example.com/avatars/jane-doe.jpg
        tags:
          type: array
          items:
            type: string
          description: A list of tags associated with the contact
          example:
            - hr
            - engineering
      required:
        - id
      description: >-
        @deprecated Use identity instead. Information about the contact making
        the request.
      deprecated: true
  securitySchemes:
    ApiKey:
      type: http
      scheme: bearer
      bearerFormat: JWT

````