curl --request POST \
--url https://api.orq.ai/v2/deployments/invoke \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '
{
"key": "<string>",
"stream": false,
"inputs": {},
"context": {},
"prefix_messages": [
{
"role": "system",
"content": "<string>",
"name": "<string>"
}
],
"messages": [
{
"role": "system",
"content": "<string>",
"name": "<string>"
}
],
"file_ids": [
"<string>"
],
"metadata": {},
"extra_params": {},
"documents": [
{
"text": "<string>",
"metadata": {
"file_name": "<string>",
"file_type": "<string>",
"page_number": 123
}
}
],
"invoke_options": {
"include_retrievals": false,
"include_usage": false,
"mock_response": "<string>"
},
"thread": {
"id": "<string>",
"tags": [
"<string>"
]
},
"knowledge_filter": {}
}
'

{
"id": "<string>",
"created": "2023-11-07T05:31:56Z",
"object": "chat",
"model": "<string>",
"provider": "openai",
"is_final": true,
"choices": [
{
"index": 123,
"message": {
"type": "tool_calls",
"role": "system",
"tool_calls": [
{
"type": "function",
"function": {
"name": "<string>",
"arguments": "<string>"
},
"id": "<string>",
"index": 123
}
],
"content": "<string>",
"reasoning": "<string>",
"reasoning_signature": "<string>",
"redacted_reasoning": "<string>"
},
"finish_reason": "<string>"
}
],
"integration_id": "<string>",
"finalized": "2023-11-07T05:31:56Z",
"system_fingerprint": "<string>",
"retrievals": [
{
"document": "<string>",
"metadata": {
"file_name": "<string>",
"page_number": 123,
"file_type": "<string>",
"search_score": 123,
"rerank_score": 123
}
}
],
"provider_response": "<unknown>",
"usage": {
"input_tokens": 123,
"output_tokens": 123,
"total_tokens": 123
}
}

Invoke a deployment with a given payload
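As a rough illustration, the request body shown above can be assembled programmatically before being POSTed to /v2/deployments/invoke. This is a minimal Python sketch, not an official SDK: the build_invoke_payload helper and the sample deployment key are hypothetical, and only the field names come from the request example.

```python
import json

def build_invoke_payload(key, inputs=None, messages=None, stream=False):
    """Assemble a minimal invoke payload (hypothetical helper).

    Field names mirror the request example above; every field other
    than "key" is optional in this sketch.
    """
    payload = {"key": key, "stream": stream}
    if inputs:
        payload["inputs"] = inputs
    if messages:
        payload["messages"] = messages
    return payload

payload = build_invoke_payload(
    "customer-support-bot",          # hypothetical deployment key
    inputs={"customer_name": "Ada"}, # values for prompt variables
    messages=[{"role": "user", "content": "Hello"}],
)
body = json.dumps(payload)  # send as the --data of the curl call above
```

The serialized body would then be sent with the Authorization and Content-Type headers shown in the curl example.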
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
The deployment request payload
The deployment key to invoke
If set, partial message content will be sent. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message.
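When stream is set, tokens arrive as data-only server-sent events terminated by a data: [DONE] message, as described above. A minimal sketch of parsing such a stream line by line (the generic SSE framing is standard; the exact chunk shape used here is an assumption based on the response example):

```python
import json

def parse_sse_events(lines):
    """Collect JSON chunks from data-only SSE lines.

    Stops at the "data: [DONE]" sentinel; blank lines and
    non-data lines are skipped.
    """
    chunks = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunks.append(json.loads(data))
    return chunks

# Hypothetical two-chunk stream followed by the termination sentinel.
stream = [
    'data: {"is_final": false, "choices": [{"message": {"content": "Hel"}}]}',
    "",
    'data: {"is_final": true, "choices": [{"message": {"content": "lo"}}]}',
    "data: [DONE]",
]
chunks = parse_sse_events(stream)
text = "".join(c["choices"][0]["message"]["content"] for c in chunks)
```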
A list of messages to include after the System message, but before the User and Assistant pairs configured in your deployment.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
The role of the message's author, in this case system.
The contents of the system message.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
A list of messages to send to the deployment.
Developer-provided instructions that the model should follow, regardless of messages sent by the user.
The role of the message's author, in this case system.
The contents of the system message.
An optional name for the participant. Provides the model information to differentiate between participants of the same role.
A list of file IDs that are associated with the deployment request.
A list of relevant documents that evaluators and guardrails can cite to evaluate the user input or the model response based on your deployment settings.
The text content of the document
Whether to include the retrieved knowledge chunks in the response.
Whether to include the usage metrics in the response.
A mock response to use instead of calling the LLM API. This is useful for testing purposes. When provided, the system will return a response object with this content as the completion, without making an actual API call to the LLM provider. This works for both streaming and non-streaming requests. Mock responses will not generate logs, traces or be counted for your plan usage.
Successful operation
A unique identifier for the response. Can be used to add metrics to the transaction.
A timestamp indicating when the object was created, usually in a standardized format such as ISO 8601.
Indicates the type of model used to generate the response
Available options: chat, completion, image
The model used to generate the response
The provider used to generate the response
Available options: openai, groq, cohere, azure, aws, google, google-ai, huggingface, togetherai, perplexity, anthropic, leonardoai, fal, nvidia, jina, elevenlabs, litellm, cerebras, openailike, bytedance, mistral, deepseek, contextualai, moonshotai
Indicates if the response is the final response
A list of choices generated by the model
Available options: tool_calls
The role of the prompt message
Available options: system, assistant, user, exception, tool, prompt, correction, expected_output
Available options: function
Internal thought process of the model
The signature holds a cryptographic token that verifies the thinking block was generated by the model, and is checked when thinking is part of a multi-turn conversation. This value should not be modified and should always be sent to the API when the reasoning is redacted. Currently only supported by Anthropic.
Occasionally the model's internal reasoning will be flagged by the safety systems of the provider. When this occurs, the provider will encrypt the reasoning. This redacted reasoning is decrypted when passed back to the API, allowing the model to continue its response without losing context.
Indicates the integration ID used to generate the response
A timestamp indicating when the object was finalized, usually in a standardized format such as ISO 8601.
Provider-backed system fingerprint.
List of documents retrieved from the knowledge base. This property is only available when the include_retrievals flag is set to true in the invoke settings. When stream is set to true, the retrievals property will be returned in the last streamed chunk where the property is_final is set to true.
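Since retrievals only appear on the streamed chunk whose is_final flag is true, client code can pull them out of that final chunk. A small sketch under that assumption (the sample chunk contents are hypothetical):

```python
def retrievals_from_chunks(chunks):
    """Return the retrievals list from the final streamed chunk, if any.

    Per the description above, "retrievals" is only present on the
    chunk whose "is_final" flag is true.
    """
    for chunk in chunks:
        if chunk.get("is_final"):
            return chunk.get("retrievals", [])
    return []

chunks = [
    {"is_final": False, "choices": []},
    {"is_final": True, "retrievals": [
        {"document": "Sample chunk text",  # hypothetical retrieved content
         "metadata": {"file_name": "faq.pdf", "search_score": 0.91}},
    ]},
]
docs = retrievals_from_chunks(chunks)
```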
Content of the retrieved chunk from the knowledge base
Metadata of the retrieved chunk from the knowledge base
Name of the file
Page number of the chunk
Type of the file
Search scores are normalized to the range [0, 1]. The search score is calculated with the [Cosine Similarity](https://en.wikipedia.org/wiki/Cosine_similarity) algorithm. Scores close to 1 indicate the document is closer to the query, and scores closer to 0 indicate the document is farther from the query.
Rerank scores are normalized to the range [0, 1]. Scores close to 1 indicate high relevance to the query, and scores closer to 0 indicate low relevance. Note that scores are not linearly comparable: a score of 0.9 does not mean the document is twice as relevant as a document with a score of 0.45.
Response returned by the model provider. This functionality is only supported when streaming is not used. If streaming is used, the provider_response property will be set to null.