Creates a model response for the given input. Returns a response object or a stream of server-sent events.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
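As a sketch, the header described above can be attached to a request like this. The endpoint URL and token are placeholders, not values from this reference:

```python
import json
import urllib.request

API_TOKEN = "my-auth-token"  # placeholder: substitute your real auth token

# Build a POST request carrying the Bearer authentication header.
req = urllib.request.Request(
    "https://api.example.com/v1/responses",  # hypothetical endpoint URL
    data=json.dumps({"model": "openai/gpt-4o", "input": "Hello"}).encode(),
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)
```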
Conversation context for multi-turn interactions.
Fallback models to try if the primary model fails. Each entry specifies a model in provider/model format.
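A request body using a fallback list might look like the following. The field name `models` and the model names themselves are assumptions for illustration; only the provider/model entry format comes from this reference:

```python
import json

# Primary model plus an ordered list of fallbacks, each in provider/model format.
payload = {
    "model": "openai/gpt-4o",
    "models": ["anthropic/claude-sonnet-4", "google/gemini-2.0-flash"],  # assumed field name
    "input": "Summarize this document.",
}
body = json.dumps(payload)
```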
Penalize new tokens based on their frequency in the text so far. Between -2.0 and 2.0.
Identity/contact information for the end-user.
Input to the model: a string or an array of input items (messages, files, etc.).
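The two input shapes described above can be sketched as follows; the message structure shown for the array form assumes chat-style role/content items:

```python
# Input as a plain string:
simple = {"model": "openai/gpt-4o", "input": "What is 2 + 2?"}

# Input as an array of input items (here, chat-style messages):
structured = {
    "model": "openai/gpt-4o",
    "input": [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "What is 2 + 2?"},
    ],
}
```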
System prompt / instructions for the model.
Maximum number of tokens in the response output.
Maximum number of tool call rounds in the agentic loop.
Attach a memory store entity to enable persistent memory across requests. See Memory Stores documentation for setup.
Developer-defined key-value pairs attached to the response.
The model to use in provider/model format (e.g. openai/gpt-4o). Use agent/
Whether to allow parallel tool calls.
Penalize new tokens based on their presence in the text so far. Between -2.0 and 2.0.
The ID of a previous response to continue from. Requires store to be true (default) on the original response.
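A two-turn exchange using this field might look like the sketch below. The response ID is a placeholder; in practice it comes from the `id` of the first response, which must have been stored (store defaults to true):

```python
# First turn: stored by default, so it can be continued later.
first = {"model": "openai/gpt-4o", "input": "Pick a number between 1 and 10."}

# Follow-up turn: continue from the stored response by its ID.
followup = {
    "model": "openai/gpt-4o",
    "previous_response_id": "resp_abc123",  # placeholder ID from the first response
    "input": "Now double it.",
}
```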
Key for prompt caching across requests.
Configure reasoning behavior. Set effort (none, minimal, low, medium, high, xhigh) to control how much the model thinks before answering. Higher effort means more reasoning tokens and better answers for complex tasks, at higher cost.
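A request selecting a high reasoning effort could be sketched like this, using the effort levels listed above:

```python
# Ask for more reasoning tokens on a task that benefits from deliberation.
payload = {
    "model": "openai/gpt-4o",
    "input": "Prove that the square root of 2 is irrational.",
    "reasoning": {"effort": "high"},  # one of: none, minimal, low, medium, high, xhigh
}
```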
Retry configuration. Specify the number of retries and which HTTP status codes should trigger a retry.
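The retry object might be shaped as below. The field names (`retry`, `max_retries`, `on_status_codes`) are assumptions, not confirmed by this reference; only the two concepts (retry count, triggering status codes) come from the description above:

```python
# Hypothetical retry configuration: field names are illustrative assumptions.
retry = {
    "retry": {
        "max_retries": 3,
        "on_status_codes": [429, 500, 502, 503],
    }
}
```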
Safety identifier for content filtering.
Whether to persist the response (default: true). When false, the response cannot be retrieved later and previous_response_id will not work for follow-up requests.
If true, returns a stream of server-sent events.
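A minimal sketch of consuming the stream: server-sent events frame each event as `data: <payload>` lines separated by blank lines. Real streams arrive incrementally over HTTP; here a complete text blob is parsed for illustration, and the delta payloads are hypothetical:

```python
def parse_sse(raw: str) -> list[str]:
    """Extract the data payload from each server-sent event in a stream."""
    events = []
    for block in raw.strip().split("\n\n"):  # events are separated by blank lines
        for line in block.splitlines():
            if line.startswith("data: "):
                events.append(line[len("data: "):])
    return events

sample = 'data: {"delta": "Hel"}\n\ndata: {"delta": "lo"}\n\ndata: [DONE]\n\n'
```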
Sampling temperature between 0 and 2.
Configuration for text output.
Thread for grouping related requests.
How the model should use the provided tools. Can be a string shorthand or a specific function selector.
auto, none, required
Tools available to the model.
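Both tool_choice forms can be sketched as follows. The function name and the exact tool/selector schema are assumptions in the style of Responses-API function tools, not confirmed by this reference:

```python
# Force a specific function via the selector form (hypothetical function):
forced = {
    "model": "openai/gpt-4o",
    "input": "What is the weather in Paris?",
    "tools": [
        {
            "type": "function",  # the "type" field determines the tool kind
            "name": "get_weather",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    "tool_choice": {"type": "function", "name": "get_weather"},
}

# Or use the string shorthand:
auto_choice = {"tool_choice": "auto"}
```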
A tool definition. The "type" field determines the tool kind.
Number of most likely tokens to return at each token position, each with an associated log probability.
Nucleus sampling parameter between 0 and 1: only tokens whose cumulative probability mass is within top_p are considered.
Template variables for prompt substitution. Plain values fill {{variable}} placeholders in instructions. For secrets, use {"secret": true, "value": "sensitive-data"} — secrets are automatically passed to platform tools (Python, HTTP, MCP) and redacted from traces.
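A payload mixing plain and secret variables could look like this. The top-level field name (`variables`) and the `{{endpoint}}` placeholder are illustrative assumptions; the `{"secret": true, "value": ...}` shape comes from the description above:

```python
payload = {
    "model": "openai/gpt-4o",
    "instructions": "Query {{endpoint}} using the provided key.",
    "variables": {  # assumed field name for template variables
        "endpoint": "https://internal.example.com/data",  # plain value: fills the placeholder
        "api_key": {"secret": True, "value": "sensitive-data"},  # secret: redacted from traces
    },
}
```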
Returns a response object or, when stream is true, a stream of server-sent events.
Array of input items (messages, function call outputs, etc.)
Always "response"
Array of output items (messages, function calls, reasoning, etc.)
auto, default, flex, priority
queued, in_progress, completed, failed, incomplete, requires_action
Text output configuration including format and verbosity
Tool choice setting: "auto", "none", "required", or a specific function
Array of tool configurations used in this response
disabled, auto