Deployments
Invoke a Deployment
Invoke a deployment with a given payload.

from orq_ai_sdk import Orq
import os

with Orq(
    environment="<value>",
    contact_id="<id>",
    api_key=os.getenv("ORQ_API_KEY", ""),
) as orq:
    res = orq.deployments.invoke(key="<key>", stream=False, identity={
        "id": "contact_01ARZ3NDEKTSV4RRFFQ69G5FAV",
        "display_name": "Jane Doe",
        "email": "jane.doe@example.com",
        "metadata": [
            {
                "department": "Engineering",
                "role": "Senior Developer",
            },
        ],
        "logo_url": "https://example.com/avatars/jane-doe.jpg",
        "tags": [
            "hr",
            "engineering",
        ],
    }, documents=[
        {
            "text": "The refund policy allows customers to return items within 30 days of purchase for a full refund.",
            "metadata": {
                "file_name": "refund_policy.pdf",
                "file_type": "application/pdf",
                "page_number": 1,
            },
        },
        {
            "text": "Premium members receive free shipping on all orders over $50.",
            "metadata": {
                "file_name": "membership_benefits.md",
                "file_type": "text/markdown",
            },
        },
    ])

    assert res is not None

    # Handle response
    print(res)
Show Parameters
The deployment key to invoke
If set, partial message content will be sent. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
Key-value pairs of variables to replace in your prompts. If a variable defined in the prompt is not provided, its default value is used.
Key-value pairs that match your data model and the fields declared in your deployment routing configuration.
A list of messages to include after the System message, but before the User and Assistant pairs configured in your deployment.
A list of messages to send to the deployment.
Information about the identity making the request. If the identity does not exist, it will be created automatically.
Show Properties of identity
A list of file IDs that are associated with the deployment request.
Key-value pairs that you want to attach to the log generated by this request.
Utilized for passing additional parameters to the model provider. Exercise caution when using this feature, as the included parameters will overwrite any parameters specified in the deployment prompt configuration.
A list of documents from your external knowledge base (e.g., chunks retrieved from your own vector database or RAG pipeline) that provide context for the model response. These documents can be used by evaluators and guardrails to assess the relevance and accuracy of the model output against the provided context.
Show Properties of documents
The text content of the document
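The stream parameter above describes data-only server-sent events terminated by a data: [DONE] message. The following is a minimal sketch of how such a stream could be consumed by hand, assuming each data line carries a JSON object. The "delta" field in the sample is hypothetical, and the SDK's own streaming helper may already do this parsing for you.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from data-only SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # the terminator carries no JSON, so stop here
        yield json.loads(payload)


# Hypothetical wire-format sample; real chunk fields may differ.
sample = [
    'data: {"delta": "Hel"}',
    "",
    'data: {"delta": "lo"}',
    "",
    "data: [DONE]",
    'data: {"delta": "ignored"}',
]

chunks = list(iter_sse_data(sample))
text = "".join(chunk["delta"] for chunk in chunks)
print(text)
```

Anything that arrives after the terminator is ignored, so a consumer cannot accidentally process trailing data.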
Show Response
A unique identifier for the response. Can be used to add metrics to the transaction.
A timestamp indicating when the object was created. Usually in a standardized format like ISO 8601
Indicates the type of model used to generate the response
The model used to generate the response
The provider used to generate the response
Indicates if the response is the final response
Indicates the integration ID used to generate the response
Show Properties of telemetry
The trace id for the request that generated this response
A timestamp indicating when the object was finalized. Usually in a standardized format like ISO 8601
Provider backed system fingerprint.
List of documents retrieved from the knowledge base. This property is only available when the include_retrievals flag is set to true in the invoke settings. When stream is set to true, the retrievals property will be returned in the last streamed chunk, where the property is_final is set to true.
Show Properties of retrievals
Content of the retrieved chunk from the knowledge base
Response returned by the model provider. This functionality is only supported when streaming is not used. If streaming is used, the provider_response property will be set to null.
List Deployments
Returns a list of your deployments. The deployments are returned sorted by creation date, with the most recent deployments appearing first.

from orq_ai_sdk import Orq
import os

with Orq(
    api_key=os.getenv("ORQ_API_KEY", ""),
) as orq:
    res = orq.deployments.list(limit=10)

    # Handle response
    print(res)
Show Parameters
A limit on the number of objects to be returned. Limit can range between 1 and 50, and the default is 10.
A cursor for use in pagination. starting_after is an object ID that defines your place in the list. For instance, if you make a list request and receive 20 objects, ending with 01JJ1HDHN79XAS7A01WB3HYSDB, your subsequent call can include starting_after=01JJ1HDHN79XAS7A01WB3HYSDB in order to fetch the next page of the list.
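The cursor behaviour described above can be sketched as a small generator. This is an illustration under assumptions, not the SDK's own pagination helper: fetch_page is a hypothetical callable that would wrap a list call taking limit and a starting_after cursor, each returned object is assumed to expose its identifier under an "id" key, and a page shorter than limit is treated as the last page.

```python
from typing import Callable, Iterator, List, Optional

# Hypothetical page fetcher: (limit, starting_after) -> list of objects.
FetchPage = Callable[[int, Optional[str]], List[dict]]


def iter_all(fetch_page: FetchPage, limit: int = 10) -> Iterator[dict]:
    """Walk every page by passing the last seen ID as the next cursor."""
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    cursor: Optional[str] = None
    while True:
        page = fetch_page(limit, cursor)
        yield from page
        if len(page) < limit:
            return  # a short (or empty) page means nothing is left
        cursor = page[-1]["id"]


# In-memory stand-in for the API, used only to exercise the loop.
FAKE_IDS = [f"id_{n:02d}" for n in range(25)]


def fake_fetch_page(limit: int, starting_after: Optional[str]) -> List[dict]:
    start = 0 if starting_after is None else FAKE_IDS.index(starting_after) + 1
    return [{"id": object_id} for object_id in FAKE_IDS[start:start + limit]]


all_ids = [item["id"] for item in iter_all(fake_fetch_page, limit=10)]
print(len(all_ids))
```

If the list response exposes an explicit "more results" indicator, prefer it over the short-page heuristic used here.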
Show Response
Show Properties of data
Unique identifier for the object.
Date in ISO 8601 format at which the object was created.
Date in ISO 8601 format at which the object was last updated.
The deployment unique key
An arbitrary string attached to the object. Often useful for displaying to users.
Show Properties of promptConfig
Show Properties of tools
The type of the tool. Currently, only function is supported.
Show Properties of function
The modality of the model
Model Parameters: Not all parameters apply to every model
Show Properties of modelParameters
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on image models.
Best effort deterministic seed for the model. Currently, only OpenAI models support this.
Only supported on image models.
Only supported on image models.
Only supported on image models.
Only supported on image models.
An object specifying the format that the model must output. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
The version of photoReal to use. Must be v1 or v2. Only available for the leonardoai provider.
The format to return the embeddings
Constrains effort on reasoning for reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Gives the model enhanced reasoning capabilities for complex tasks. A value of 0 disables thinking. The minimum thinking budget is 1024 tokens. The budget tokens should never exceed the max tokens parameter. Only supported by Anthropic.
Controls the verbosity of the model output.
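The response format description above distinguishes Structured Outputs from JSON mode and warns that content may be cut off when finish_reason is "length". Below is a minimal sketch of building both objects and guarding against truncated replies. The helper names are hypothetical, the inner json_schema object is passed through untouched because its shape is not spelled out here, and "stop" is only assumed as an example of a non-truncated finish reason.

```python
import json
from typing import Any, Dict

# JSON mode: the prompt itself must still ask the model to produce JSON.
JSON_MODE: Dict[str, Any] = {"type": "json_object"}


def structured_output(json_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a caller-supplied schema object in the Structured Outputs form."""
    return {"type": "json_schema", "json_schema": json_schema}


def parse_json_reply(content: str, finish_reason: str) -> Any:
    """Decode a JSON reply, refusing replies that were cut off."""
    if finish_reason == "length":
        # Generation hit max_tokens or the context limit, so the JSON
        # text is probably incomplete.
        raise ValueError("reply was truncated (finish_reason='length')")
    return json.loads(content)


reply = parse_json_reply('{"status": "ok"}', finish_reason="stop")
print(reply["status"])
```

Checking finish_reason before decoding turns a confusing JSON parse error into an explicit truncation error.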
Get Config
Retrieve the deployment configuration.

from orq_ai_sdk import Orq
import os

with Orq(
    api_key=os.getenv("ORQ_API_KEY", ""),
) as orq:
    res = orq.deployments.get_config(key="<key>", identity={
        "id": "contact_01ARZ3NDEKTSV4RRFFQ69G5FAV",
        "display_name": "Jane Doe",
        "email": "jane.doe@example.com",
        "metadata": [
            {
                "department": "Engineering",
                "role": "Senior Developer",
            },
        ],
        "logo_url": "https://example.com/avatars/jane-doe.jpg",
        "tags": [
            "hr",
            "engineering",
        ],
    }, documents=[
        {
            "text": "The refund policy allows customers to return items within 30 days of purchase for a full refund.",
            "metadata": {
                "file_name": "refund_policy.pdf",
                "file_type": "application/pdf",
                "page_number": 1,
            },
        },
        {
            "text": "Premium members receive free shipping on all orders over $50.",
            "metadata": {
                "file_name": "membership_benefits.md",
                "file_type": "text/markdown",
            },
        },
    ])

    assert res is not None

    # Handle response
    print(res)
Show Parameters
The deployment key to invoke
Key-value pairs of variables to replace in your prompts. If a variable defined in the prompt is not provided, its default value is used.
Key-value pairs that match your data model and the fields declared in your deployment routing configuration.
A list of messages to include after the System message, but before the User and Assistant pairs configured in your deployment.
A list of messages to send to the deployment.
Information about the identity making the request. If the identity does not exist, it will be created automatically.
Show Properties of identity
A list of file IDs that are associated with the deployment request.
Key-value pairs that you want to attach to the log generated by this request.
Utilized for passing additional parameters to the model provider. Exercise caution when using this feature, as the included parameters will overwrite any parameters specified in the deployment prompt configuration.
A list of documents from your external knowledge base (e.g., chunks retrieved from your own vector database or RAG pipeline) that provide context for the model response. These documents can be used by evaluators and guardrails to assess the relevance and accuracy of the model output against the provided context.
Show Properties of documents
The text content of the document
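The documents parameter above accepts chunks retrieved from your own vector database or RAG pipeline. The sketch below converts such chunks into the list shape used in the request example. The input chunk layout and the to_documents helper are hypothetical; the metadata keys (file_name, file_type, page_number) are taken from the example in this section.

```python
from typing import Any, Dict, Iterable, List

# Metadata keys that appear in the request example in this section.
METADATA_KEYS = ("file_name", "file_type", "page_number")


def to_documents(chunks: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Map retrieved chunks to documents entries, skipping empty text."""
    documents: List[Dict[str, Any]] = []
    for chunk in chunks:
        text = chunk.get("text", "").strip()
        if not text:
            continue  # a document without text adds no context
        document: Dict[str, Any] = {"text": text}
        metadata = {
            key: chunk[key] for key in METADATA_KEYS if chunk.get(key) is not None
        }
        if metadata:
            document["metadata"] = metadata
        documents.append(document)
    return documents


# Hypothetical output of a retrieval step.
retrieved = [
    {"text": " Returns are accepted within 30 days. ", "file_name": "refund_policy.pdf", "page_number": 1},
    {"text": "   "},
    {"text": "Free shipping over $50."},
]

documents = to_documents(retrieved)
print(documents)
```

The resulting list could then be passed as the documents argument of the call shown earlier in this section.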
Show Response
A unique identifier for the response. Can be used to add metrics to the transaction.
The provider of the model
The model of the configuration
The type of the model. Currently, chat, completion and image are supported.
The current version of the deployment
Show Properties of messages
The role of the prompt message
The contents of the user message. Either the text content of the message or an array of content parts with a defined type; each part can be of type text or image_url when passing in images. You can pass multiple images by adding multiple image_url content parts. Can be null for tool messages in certain scenarios.
Model Parameters: Not all parameters apply to every model
Show Properties of parameters
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on chat and completion models.
Only supported on image models.
Best effort deterministic seed for the model. Currently, only OpenAI models support this.
Only supported on image models.
Only supported on image models.
Only supported on image models.
Only supported on image models.
An object specifying the format that the model must output. Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Setting to { "type": "json_object" } enables JSON mode, which ensures the message the model generates is valid JSON. Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
The version of photoReal to use. Must be v1 or v2. Only available for the leonardoai provider.
The format to return the embeddings
Constrains effort on reasoning for reasoning models. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
Gives the model enhanced reasoning capabilities for complex tasks. A value of 0 disables thinking. The minimum thinking budget is 1024 tokens. The budget tokens should never exceed the max tokens parameter. Only supported by Anthropic.
Controls the verbosity of the model output.
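The thinking budget rules stated above (0 disables thinking, a minimum of 1024 tokens, never more than max tokens) can be checked before a request is sent. This is a small sketch of such a check; check_thinking_budget is a hypothetical helper, not part of the SDK.

```python
def check_thinking_budget(budget_tokens: int, max_tokens: int) -> bool:
    """Return True if thinking is enabled, False if disabled.

    Raises ValueError when the budget breaks one of the documented rules.
    """
    if budget_tokens == 0:
        return False  # a value of 0 disables thinking
    if budget_tokens < 1024:
        raise ValueError("thinking budget must be 0 or at least 1024 tokens")
    if budget_tokens > max_tokens:
        raise ValueError("thinking budget must not exceed max tokens")
    return True


print(check_thinking_budget(2048, max_tokens=4096))
print(check_thinking_budget(0, max_tokens=4096))
```

Validating locally gives a clearer error than waiting for the provider to reject the parameters.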
Stream a Deployment
Stream deployment generation. Only supported for completions and chat completions.

from orq_ai_sdk import Orq
import os

with Orq(
    environment="<value>",
    contact_id="<id>",
    api_key=os.getenv("ORQ_API_KEY", ""),
) as orq:
    res = orq.deployments.stream(key="<key>", identity={
        "id": "contact_01ARZ3NDEKTSV4RRFFQ69G5FAV",
        "display_name": "Jane Doe",
        "email": "jane.doe@example.com",
        "metadata": [
            {
                "department": "Engineering",
                "role": "Senior Developer",
            },
        ],
        "logo_url": "https://example.com/avatars/jane-doe.jpg",
        "tags": [
            "hr",
            "engineering",
        ],
    }, documents=[
        {
            "text": "The refund policy allows customers to return items within 30 days of purchase for a full refund.",
            "metadata": {
                "file_name": "refund_policy.pdf",
                "file_type": "application/pdf",
                "page_number": 1,
            },
        },
        {
            "text": "Premium members receive free shipping on all orders over $50.",
            "metadata": {
                "file_name": "membership_benefits.md",
                "file_type": "text/markdown",
            },
        },
    ])

    with res as event_stream:
        for event in event_stream:
            # handle event
            print(event, flush=True)
Show Parameters
The deployment key to invoke
Key-value pairs of variables to replace in your prompts. If a variable defined in the prompt is not provided, its default value is used.
Key-value pairs that match your data model and the fields declared in your deployment routing configuration.
A list of messages to include after the System message, but before the User and Assistant pairs configured in your deployment.
A list of messages to send to the deployment.
Information about the identity making the request. If the identity does not exist, it will be created automatically.
Show Properties of identity
A list of file IDs that are associated with the deployment request.
Key-value pairs that you want to attach to the log generated by this request.
Utilized for passing additional parameters to the model provider. Exercise caution when using this feature, as the included parameters will overwrite any parameters specified in the deployment prompt configuration.
A list of documents from your external knowledge base (e.g., chunks retrieved from your own vector database or RAG pipeline) that provide context for the model response. These documents can be used by evaluators and guardrails to assess the relevance and accuracy of the model output against the provided context.
Show Properties of documents
The text content of the document
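The Invoke response notes state that, when streaming, retrievals are returned in the last streamed chunk, where is_final is set to true. The sketch below shows one way to pick them out while consuming a stream. It treats each chunk as a plain dictionary, which is an assumption for illustration: the SDK's event objects may expose these fields as attributes instead, and the "content" key in the sample is hypothetical.

```python
from typing import Any, Dict, Iterable, List


def final_retrievals(chunks: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the retrievals attached to the chunk marked is_final, if any."""
    retrievals: List[Dict[str, Any]] = []
    for chunk in chunks:
        if chunk.get("is_final"):
            # Only the final chunk is expected to carry retrievals.
            retrievals = chunk.get("retrievals") or []
    return retrievals


# Hypothetical streamed chunks, reduced to the fields discussed above.
streamed = [
    {"is_final": False, "content": "Refunds are"},
    {"is_final": False, "content": " accepted within 30 days."},
    {"is_final": True, "content": "", "retrievals": [{"document": "refund_policy.pdf"}]},
]

found = final_retrievals(streamed)
print(found)
```

Remember that retrievals are only present when the include_retrievals flag is enabled in the invoke settings, so an empty list is a normal result.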