The Reporting API gives you detailed visibility into AI activity on Orq.ai. Break down requests, tokens, cost, latency, evaluator outcomes, and guardrail outcomes by model, provider, project, agent, tool, tag, identity, credential type, and more, all with a single JSON request.
- **Spend dashboards**: Track cost by model, provider, project, or credential type. Build per-customer billing breakdowns.
- **Performance monitoring**: Watch p50, p95, and p99 latency over time. Catch regressions before users complain.
- **Evaluator quality**: Quantify pass rate, average score, and result distribution per Evaluator and version.
- **Guardrail enforcement**: Measure block rate, triggers, and outcomes by Policy and stage.
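As a concrete starting point, here is a minimal sketch of a cost-per-model query. The group_by, filters, and include_totals fields are documented on this page; the metric, from, to, and grain spellings (and their value formats) are inferred from the error cases listed below, so verify them against the full API reference.

```json
{
  "metric": "genai.cost",
  "from": "2024-06-01T00:00:00Z",
  "to": "2024-06-30T23:59:59Z",
  "grain": "day",
  "group_by": ["model"],
  "include_totals": true
}
```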
All requests require a Bearer token. Scope follows the API key: workspace and projects are derived from the token itself and are never passed in the body.
The HTTP JSON wire format is snake_case end-to-end (group_by,
include_totals, effective_grain, and so on). SDKs expose the same fields
using each language’s normal naming style, for example groupBy in TypeScript
and group_by in Python.
Up to 20 predicates, combined with AND. Each filter is { field, op, values }.
Supported filter fields depend on the selected metric and match the
dimensions listed below unless noted otherwise.
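For example, two predicates that must both match. eq is the only operator shown on this page, so no other operators are assumed here:

```json
{
  "filters": [
    { "field": "provider", "op": "eq", "values": ["openai"] },
    { "field": "status_code", "op": "eq", "values": ["OK"] }
  ]
}
```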
Map of metric or field name to numeric value. Single-metric requests carry one entry keyed by the requested metric name. Bundle metrics carry one entry per field. Numeric values are rounded to 10 decimal places.
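Illustrative fragments of that map, with the enclosing row shape elided. A single-metric request:

```json
{ "values": { "genai.cost": 0.0421 } }
```

A bundle metric carries one entry per field; the field names below are placeholders, not documented names:

```json
{ "values": { "prompt_tokens": 1200, "completion_tokens": 340 } }
```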
Evaluator and guardrail metrics read from the same source: every Evaluator
run is recorded as a guardrail check regardless of the Evaluator
implementation type. Pick the metric that matches the question being asked.
The API picks the right index for the query automatically. Asking for an entity dimension (agent, tool, tag, and so on) on a usage metric routes the query through entity attribution; everything else stays on the core usage path.
billing_billable is available as a filter field for usage metrics, but not
as a group_by dimension. Use credential_type when you want a public
breakdown of Orq-managed versus customer-owned credentials.
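A sketch that combines the two: filter on billing_billable, group by credential_type. This page does not say whether billing_billable values are encoded as booleans or strings, so the encoding below is a guess.

```json
{
  "metric": "genai.cost",
  "group_by": ["credential_type"],
  "filters": [
    { "field": "billing_billable", "op": "eq", "values": [true] }
  ]
}
```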
### Usage dimensions
Pair these with genai.requests, genai.tokens, genai.cost, genai.errors, genai.latency.*, or genai.usage.
| Dimension | Notes |
| --- | --- |
| project | |
| identity | End-user identifier from Identities |
| provider | For example, openai, anthropic |
| model | For example, gpt-4o-mini |
| product | Which Orq.ai product surface served the request |
| api_key | API key used to authenticate the request |
| status_code | High-level status (OK, ERROR) |
| http_status_code | HTTP status from the upstream model |
| credential_type | orq_managed or customer_byok (BYOK detection) |
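For example, an hourly p95 latency breakdown by model. The genai.latency.p95 name follows the genai.latency.* family named above; the from, to, and grain spellings are inferred as in the earlier sketch.

```json
{
  "metric": "genai.latency.p95",
  "from": "2024-06-01T00:00:00Z",
  "to": "2024-06-07T23:59:59Z",
  "grain": "hour",
  "group_by": ["model"]
}
```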
### Entity dimensions

Add an entity dimension to any usage metric to break it down by a product entity. The API routes the query to attribution storage automatically.
| Entity | Description |
| --- | --- |
| agent | Agent that issued the request |
| tool | Tool invoked during the request |
| deployment | Deployment ID |
| evaluator | Evaluator that ran on the trace |
| policy | Policy that matched the request |
| tag | Tags attached to the request |
| prompt | Prompt template |
| dataset | Dataset used in an experiment |
| conversation | Conversation grouping |
| thread | Thread grouping |
| memory_store | Memory Store |
| knowledge | Knowledge base |
| sheet | Sheet ID |
Two conflicting entity dimensions in the same request (for example, ["agent", "tool"]) return 400. Issue one request per entity dimension.
Advanced attribution queries can group by raw dimension, but must include a
dimension_type filter such as { "field": "dimension_type", "op": "eq", "values": ["agent"] }. Prefer the entity aliases above for most dashboards.
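The two forms side by side. The alias form, which most dashboards should use:

```json
{
  "metric": "genai.requests",
  "group_by": ["agent"]
}
```

And the advanced raw form; the dimension_type filter is quoted from above, but the raw group_by key (dimension here) is a guess:

```json
{
  "metric": "genai.requests",
  "group_by": ["dimension"],
  "filters": [
    { "field": "dimension_type", "op": "eq", "values": ["agent"] }
  ]
}
```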
### Evaluator and guardrail dimensions

Pair these with genai.evaluator.* and genai.guardrail.* metrics.
| Dimension | Notes |
| --- | --- |
| project | |
| identity | |
| api_key | |
| policy | |
| evaluator | Evaluator that produced the result |
| evaluator_name | Human-readable Evaluator name |
| evaluator_type | llm_eval, python_eval, function_eval, and so on |
| evaluator_version | Compare versions over time |
| result_type | boolean, number, categorical, string |
| result_label | Bounded result bucket (pass, fail, error, or category name) |
| evaluation_stage | input or output (aliases: guardrail_stage, evaluator_stage) |
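A sketch of a per-version pass-rate query restricted to output-stage checks. The concrete metric name under genai.evaluator.* is illustrative, not confirmed by this page.

```json
{
  "metric": "genai.evaluator.pass_rate",
  "group_by": ["evaluator_version"],
  "filters": [
    { "field": "evaluation_stage", "op": "eq", "values": ["output"] }
  ]
}
```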
### Errors

| Status | Code | Description |
| --- | --- | --- |
| 400 | | Unknown metric, dimension not allowed, range outside workspace retention, range over 90 days, from >= to, grain outside retention, or invalid filter operator. |
| 401 | 16 | Missing or invalid bearer token. |
| 403 | 7 | API key does not have access to the requested project. |
| 500 | 13 | Unexpected backend failure. Forward meta.request_id to support. |
Every successful response includes meta.request_id. Include it in support
tickets for fast log correlation.
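The identifier sits under meta; a minimal fragment, with sibling fields elided:

```json
{
  "meta": { "request_id": "..." }
}
```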
Trace events flow into reporting in near real time, with a small ingestion
delay measured in seconds. Very recent data may take a moment to appear.
Quantile metrics (genai.latency.p*) carry a ±1-2% error band, typical of
t-digest estimation at high cardinality.
Numeric responses are rounded to 10 decimal places server-side so dashboards
never display floating-point noise like 0.00009900000000000001.
credential_type is a public alias derived from billing setup at query time:
orq_managed when Orq.ai keys served the traffic, customer_byok when
customer-owned keys did.