Query reporting metrics
Returns time-series analytics for AI usage, cost, latency, evaluator results, and guardrail outcomes. Select a metric and time range, break results down by supported dimensions, apply filters, and optionally include totals for the full range.
Documentation Index
Fetch the complete documentation index at: https://docs.orq.ai/llms.txt
Use this file to discover all available pages before exploring further.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
Catalogue metric to query.
genai.requests, genai.tokens, genai.cost, genai.errors, genai.latency.p50, genai.latency.p95, genai.latency.p99, genai.evaluator.runs, genai.evaluator.pass_rate, genai.evaluator.score.avg, genai.guardrail.runs, genai.guardrail.block_rate, genai.guardrail.triggered, genai.usage Inclusive lower bound for the report window (RFC 3339, UTC).
Exclusive upper bound for the report window (RFC 3339, UTC).
Requested bucket grain. Use auto or omit the field to let the server choose based on the requested range.
auto, minute, hour, day Reporting dimensions to break down by. Valid dimensions depend on the selected metric.
project, identity, provider, model, product, api_key, status_code, http_status_code, credential_type, dimension, dimension_type, tag, agent, tool, deployment, evaluator, dataset, prompt, policy, conversation, thread, memory_store, knowledge, sheet, guardrail_origin, evaluator_name, evaluator_type, evaluator_version, result_type, evaluation_stage, guardrail_stage, evaluator_stage, guardrail_action, result_label Up to 20 allowlisted predicates combined with AND.
Maximum bucket rows returned. Defaults to 1000 and is capped at 5000.
IANA time zone applied to bucket boundaries, for example
America/New_York. Response timestamps remain UTC. Empty means UTC.
When true, include a totals block aggregated across the full
report window.
Response
OK
Object discriminator for typed SDKs and JSON parsers; always report.
report The request that produced this response, parroted back so caller code never has to remember its own payload to interpret the result.
Time-ordered buckets.
Totals across the whole window. Present only when
include_totals=true on the request.
Pagination contract. Always populated; currently false because the
server caps the response at limit instead of issuing cursors.
Diagnostic metadata.