Webhook events
Currently, Orq.ai supports only one webhook event: the invocation of a deployment.
A webhook event is a mechanism that allows your application to receive real-time notifications and payloads whenever a deployment is invoked. When this happens, the service triggers an HTTP POST request to a predefined URL (the webhook endpoint) with details about the event. This enables seamless, instant data integration and communication between different platforms without the need for constant polling.
Key Components of Webhook Events:
- Event Trigger: The action that fires the event; currently, the invocation of a deployment.
- Webhook Endpoint URL: The URL on your server where the service will send the HTTP POST requests.
- Payload: The data sent to your webhook endpoint in JSON format, containing detailed information about the event.
- Security: Each webhook payload is signed with a secret token, so you can verify that incoming requests genuinely originate from Orq.ai (see the sketch below).
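As an illustration, here is a minimal sketch of a webhook endpoint in TypeScript (Express). The header name (x-signature) and HMAC-SHA256 signing scheme below are assumptions, not Orq.ai specifics; adjust them to match your actual webhook configuration.

```typescript
import express from "express";
import { createHmac, timingSafeEqual } from "crypto";

// Assumptions for this sketch: the signature header name and the
// HMAC-SHA256 hex signing scheme are placeholders, not confirmed Orq.ai behavior.
const WEBHOOK_SECRET = process.env.ORQ_WEBHOOK_SECRET ?? "";
const SIGNATURE_HEADER = "x-signature";

const app = express();

// Keep the raw body so the signature can be computed over the exact bytes received.
app.post("/webhooks/orq", express.raw({ type: "application/json" }), (req, res) => {
  const signature = req.header(SIGNATURE_HEADER) ?? "";
  const expected = createHmac("sha256", WEBHOOK_SECRET)
    .update(req.body)
    .digest("hex");

  const valid =
    signature.length === expected.length &&
    timingSafeEqual(Buffer.from(signature), Buffer.from(expected));
  if (!valid) {
    res.status(401).send("invalid signature");
    return;
  }

  const event = JSON.parse(req.body.toString("utf8"));
  console.log(`received ${event.type} for deployment ${event.metadata?.deployment_id}`);

  // Acknowledge quickly; do any heavy processing asynchronously.
  res.status(200).send("ok");
});

app.listen(3000);
```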
Events
Deployment Invocation
This event is sent whenever one of your deployments is invoked through the API. You can filter which deployments should trigger this webhook event. The event is delivered as an HTTP POST request to the configured webhook URL.
Here is an example payload for the event:
```json
{
  "id": "wk_log_id",
  "created": "2024-08-09T07:53:54.289Z",
  "type": "deployment.invoked",
  "metadata": {
    "deployment_id": "<deployment_id>",
    "deployment_variant_id": "<deployment_variant_id>",
    "deployment_log_id": "<log_id>",
    "deployment_url": "<deployment_url>",
    "deployment_variant_url": "<deployment_variant_url>",
    "deployment_log_url": "<deployment_log_url>"
  },
  "data": {
    "prompt_config": {
      "stream": false,
      "model": "claude-2.1",
      "model_db_id": "<model-db-id>",
      "model_type": "chat",
      "model_parameters": {
        "temperature": 0.7,
        "maxTokens": 256,
        "topK": 5,
        "topP": 0.7
      },
      "provider": "anthropic",
      "messages": [
        {
          "role": "system",
          "content": "<system_message>"
        },
        {
          "role": "user",
          "content": "<user_message>"
        }
      ]
    },
    "choices": [
      {
        "index": 0,
        "message": {
          "role": "assistant",
          "content": "<assistant_message>"
        },
        "finish_reason": "end_turn"
      }
    ],
    "variables": [
      {
        "key": "name",
        "value": "[PII]",
        "is_pii": true
      }
    ],
    "performance": {
      "latency": 3154.259751997888
    },
    "usage": {
      "total_tokens": 99,
      "prompt_tokens": 23,
      "completion_tokens": 76
    },
    "billing": {
      "total_cost": 0.002008,
      "input_cost": 0.000184,
      "output_cost": 0.001824
    },
    "tools": []
  }
}
```
deployment.invoked fields
Field | Description |
---|---|
id | The Webhook ID |
created | The date the webhook was created at |
type | The Webhook type, here deployment.invoked |
metadata | Metadata about the webhook call |
metadata.deployment_id | The ID of the invoked Deployment |
metadata.deployment_variant_id | The ID of the invoked variant |
metadata.deployment_log_id | The ID of the invoked log |
metadata.deployment_url | The URL of the invoked Deployment |
metadata.deployment_variant_url | The URL of the invoked Variant |
metadata.deployment_log_url | The URL of the invoked Log |
data.prompt_config.stream | Was the invocation streamed (boolean) |
data.prompt_config.model | The model invoked |
data.prompt_config.model_db_id | The ID of the invoked model |
data.prompt_config.model_type | The type of the invoked model (chat, vision, ...) |
data.prompt_config.model_parameters.temperature | The temperature parameter value for the model |
data.prompt_config.model_parameters.maxTokens | The maxTokens parameter value for the model |
data.prompt_config.model_parameters.topK | The topK parameter value for the model |
data.prompt_config.model_parameters.topP | The topP parameter value for the model |
data.prompt_config.provider | The model provider |
data.prompt_config.messages | The model prompt configuration messages. |
data.choices | The model responses to the request, containing messages sent back to the user. |
data.variables | The variables used during generation, including their values (unless omitted via Data Security & PII controls). |
data.performance.latency | The latency for the current invocation. |
data.usage.total_tokens | The number of tokens used during generation. |
data.usage.prompt_tokens | The number of tokens used by the prompt configuration. |
data.usage.completion_tokens | The number of tokens used by the model to generate the response to the user. |
data.billing.total_cost | The total cost incurred during generation. |
data.billing.input_cost | The cost incurred for inputs to the model. |
data.billing.output_cost | The cost incurred for outputs to the model. |
data.tools | The function calling tools used during generation. |
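As an illustration of how these fields might be consumed once the payload has been parsed, here is a minimal TypeScript sketch. The type definitions are hand-written for this example and only cover a subset of the documented fields; they are not an official client type.

```typescript
// Illustrative types covering only a subset of the fields documented above.
interface Variable {
  key: string;
  value: string;
  is_pii: boolean;
}

interface DeploymentInvokedEvent {
  id: string;
  created: string;
  type: string; // e.g. "deployment.invoked"
  metadata: {
    deployment_id: string;
    deployment_variant_id: string;
    deployment_log_id: string;
  };
  data: {
    variables: Variable[];
    performance: { latency: number };
    usage: { total_tokens: number; prompt_tokens: number; completion_tokens: number };
  };
}

// Example consumer: log latency and token usage, and skip PII-masked variables.
function recordInvocation(event: DeploymentInvokedEvent): void {
  const { usage, performance, variables } = event.data;
  console.log(
    `${event.metadata.deployment_id}: ${usage.total_tokens} tokens ` +
      `(${usage.prompt_tokens} prompt / ${usage.completion_tokens} completion) ` +
      `in ${performance.latency.toFixed(0)} ms`
  );
  for (const variable of variables) {
    if (!variable.is_pii) {
      console.log(`variable ${variable.key} = ${variable.value}`);
    }
  }
}
```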
Caching
It is possible that a Deployment's response is served from cache and sent through the webhook without a new model generation.
In that case, no LLM costs are incurred for the call, and the billing object is omitted from the Deployment Invocation webhook payload.
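If you consume the billing fields downstream, it is therefore safest to treat the object as optional. A minimal sketch, reusing the field names from the example payload above:

```typescript
interface Billing {
  total_cost: number;
  input_cost: number;
  output_cost: number;
}

// billing is absent when the Deployment response was served from cache,
// so a missing object is treated as zero cost here.
function invocationCost(data: { billing?: Billing }): number {
  return data.billing?.total_cost ?? 0;
}
```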