Webhook events

📘

Currently, Orq.ai supports only one webhook event: the invocation of a deployment.

A webhook event is a mechanism that lets your application receive real-time notifications and payloads whenever a deployment is invoked. When this happens, the service sends an HTTP POST request to a predefined URL (the webhook endpoint) with details about the event. This enables instant data integration between platforms without the need for constant polling.

Key Components of Webhook Events:

  1. Event Trigger: The event fires whenever a deployment is invoked.
  2. Webhook Endpoint URL: The URL on your server where the service will send the HTTP POST requests.
  3. Payload: The data sent to your webhook endpoint in JSON format, containing detailed information about the event.
  4. Security: Webhook payloads are signed with a secret token so your server can verify that incoming requests are authentic (see the receiver sketch below).
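
As an illustration, here is a minimal receiver sketch in Python. The endpoint path, the `X-Signature` header name, and the HMAC-SHA256 signing scheme are assumptions made for the example, not documented Orq.ai behavior; consult your webhook configuration for the actual signing details.

```python
import hashlib
import hmac
import os

from flask import Flask, abort, request

app = Flask(__name__)

# Hypothetical: the secret token configured for this webhook endpoint.
WEBHOOK_SECRET = os.environ["ORQ_WEBHOOK_SECRET"].encode()

@app.post("/webhooks/orq")
def handle_webhook():
    # Assumption: the payload is signed with HMAC-SHA256 and the hex digest
    # arrives in an "X-Signature" header; adjust to the actual scheme.
    expected = hmac.new(WEBHOOK_SECRET, request.get_data(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, request.headers.get("X-Signature", "")):
        abort(401)

    event = request.get_json()
    if event["type"] == "deployment.invoked":
        print(f"Deployment {event['metadata']['deployment_id']} was invoked")
    return "", 204
```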

Events

Deployment Invocation

This event is sent whenever one of your deployments is invoked through the API. You can choose the deployments for which this webhook event should be sent. The event is delivered as an HTTP POST call to the configured webhook endpoint URL.

Here is an example payload for the event:

```json
{
  "id": "wk_log_id",
  "created": "2024-08-09T07:53:54.289Z",
  "type": "deployment.invoked",
  "metadata": {
    "deployment_id": "<deployment_id>",
    "deployment_variant_id": "<deployment_variant_id>",
    "deployment_log_id": "<log_id>",
    "deployment_url": "<deployment_url>",
    "deployment_variant_url": "<deployment_variant_url>",
    "deployment_log_url": "<deployment_log_url>"
  },
  "data": {
    "prompt_config": {
      "stream": false,
      "model": "claude-2.1",
      "model_db_id": "<model-db-id>",
      "model_type": "chat",
      "model_parameters": {
        "temperature": 0.7,
        "maxTokens": 256,
        "topK": 5,
        "topP": 0.7
      },
      "provider": "anthropic",
      "messages": [
        {
          "role": "system",
          "content": "<system_message>"
        },
        {
          "role": "user",
          "content": "<user_message>"
        }
      ]
    },
    "choices": [
      {
        "index": 0,
        "message": {
          "role": "assistant",
          "content": "<assistant_message>"
        },
        "finish_reason": "end_turn"
      }
    ],
    "variables": [
      {
        "key": "name",
        "value": "[PII]",
        "is_pii": true
      }
    ],
    "performance": {
      "latency": 3154.259751997888
    },
    "usage": {
      "total_tokens": 99,
      "prompt_tokens": 23,
      "completion_tokens": 76
    },
    "billing": {
      "total_cost": 0.002008,
      "input_cost": 0.000184,
      "output_cost": 0.001824
    },
    "tools": []
  }
}
```
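
For instance, a receiver might pull a few of these fields out for logging. The helper below is a hypothetical sketch; the field names follow the example payload above.

```python
def summarize_invocation(event: dict) -> str:
    """Build a one-line log summary from a deployment.invoked payload."""
    meta = event["metadata"]
    data = event["data"]
    # Latency appears to be reported in milliseconds (see the example above).
    return (
        f"deployment={meta['deployment_id']} "
        f"model={data['prompt_config']['model']} "
        f"tokens={data['usage']['total_tokens']} "
        f"latency={data['performance']['latency']:.0f}ms"
    )
```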

deployment.invoked fields

| Field | Description |
| --- | --- |
| id | The webhook event ID |
| created | The date the webhook event was created |
| type | The webhook event type, here deployment.invoked |
| metadata | Metadata about the webhook call |
| metadata.deployment_id | The ID of the invoked Deployment |
| metadata.deployment_variant_id | The ID of the invoked variant |
| metadata.deployment_log_id | The ID of the invocation log |
| metadata.deployment_url | The URL of the invoked Deployment |
| metadata.deployment_variant_url | The URL of the invoked variant |
| metadata.deployment_log_url | The URL of the invocation log |
| data.prompt_config.stream | Whether the invocation was streamed (boolean) |
| data.prompt_config.model | The model invoked |
| data.prompt_config.model_db_id | The ID of the invoked model |
| data.prompt_config.model_type | The type of the invoked model (chat, vision, ...) |
| data.prompt_config.model_parameters.temperature | The temperature parameter value for the model |
| data.prompt_config.model_parameters.maxTokens | The maxTokens parameter value for the model |
| data.prompt_config.model_parameters.topK | The topK parameter value for the model |
| data.prompt_config.model_parameters.topP | The topP parameter value for the model |
| data.prompt_config.provider | The model provider |
| data.prompt_config.messages | The prompt configuration messages sent to the model |
| data.choices | The model's responses to the request, containing the messages sent back to the user |
| data.variables | The variables used during generation, including their values (unless omitted via Data Security & PII controls) |
| data.performance.latency | The latency of the invocation, in milliseconds |
| data.usage.total_tokens | The total number of tokens used during generation |
| data.usage.prompt_tokens | The number of tokens used by the prompt configuration |
| data.usage.completion_tokens | The number of tokens used by the model to generate the response |
| data.billing.total_cost | The total cost incurred during generation |
| data.billing.input_cost | The cost incurred for inputs to the model |
| data.billing.output_cost | The cost incurred for outputs to the model |
| data.tools | The function-calling tools used during generation |
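
The usage and billing objects can be combined for cost reporting. With the sample payload above, 0.002008 / 99 tokens ≈ $0.0203 per 1,000 tokens. A hypothetical helper:

```python
def cost_per_1k_tokens(event: dict) -> float:
    """Effective cost per 1,000 tokens for a single invocation.

    With the example payload above: 0.002008 / 99 * 1000 ≈ 0.0203.
    Note: cached responses omit "billing" entirely (see Caching below).
    """
    billing = event["data"]["billing"]
    usage = event["data"]["usage"]
    return billing["total_cost"] / usage["total_tokens"] * 1000
```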

Caching

It is possible that a Deployment's response is served from cache and sent back through the webhook without a new model generation. In this case, no LLM costs are incurred for the call, and the billing object is omitted from the Deployment Invocation webhook payload.
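
Webhook handlers should therefore treat billing as optional. A minimal sketch, assuming the payload structure shown earlier:

```python
def extract_total_cost(event: dict) -> float | None:
    """Return the invocation's total cost, or None for cached responses.

    Cached responses omit the "billing" object entirely, so the key
    must not be assumed to exist.
    """
    billing = event.get("data", {}).get("billing")
    if billing is None:
        return None  # served from cache; no LLM cost incurred
    return billing["total_cost"]
```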