Orq.ai Documentation - AI Gateway & LLM Collaboration Platform

Workspace Usage

Billing is available in the Organization panel, which gives an overview of your usage over the current billing cycle. A graph displays the number of Requests, Retrievals, and Cache events over time in your workspace. These are tracked across all Playgrounds, Experiments, and Deployments and shown as an aggregated view of all events. At the top-right of the graph, you can see your current usage against your plan capacity. If you go beyond capacity, the additional events are added to your billing cycle.

An additional view of the current events can be seen from your main Dashboard.

Understanding Trace Storage Usage

Our platform stores distributed traces received through OpenTelemetry:

Each trace is composed of one or more spans, which represent individual operations or segments of a request.
Each span is encoded as JSON data before being stored.
Once encoded, we apply an indexing layer on top of the raw data. This indexing allows for fast search and filtering across large trace datasets.

To account for both the raw and indexed data, we estimate total storage consumption as:

total_storage = (raw_bytes * 1.7)

In other words, the indexed representation adds roughly 70% to the storage footprint of the raw JSON data.
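The estimate above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample span are hypothetical, and the 1.7 multiplier is taken directly from the formula above rather than from any published API.

```python
import json

# Multiplier from the formula above: total_storage = raw_bytes * 1.7
INDEX_OVERHEAD_FACTOR = 1.7


def raw_span_bytes(span: dict) -> int:
    """Size of one span once encoded as UTF-8 JSON (hypothetical helper)."""
    return len(json.dumps(span).encode("utf-8"))


def estimate_total_storage(raw_bytes: int, factor: float = INDEX_OVERHEAD_FACTOR) -> float:
    """Estimate stored bytes (raw JSON plus index) from the raw JSON size."""
    if raw_bytes < 0:
        raise ValueError("raw_bytes must be non-negative")
    return raw_bytes * factor


# Illustrative span; the field names are assumptions, not a documented schema.
example_span = {"name": "llm.call", "attributes": {"prompt": "Hello", "model": "example"}}
raw = raw_span_bytes(example_span)
print(raw, round(estimate_total_storage(raw)))
```

Summing `raw_span_bytes` over all spans in a trace before applying the factor gives a rough per-trace estimate.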

Example Calculation

From our analytics:

Total traces: 100
Total storage consumed: ~10 MB
Average size per trace: ~120 KB

This average reflects the JSON encoding plus the indexing overhead.

With OpenTelemetry, some exporters send payloads that grow incrementally, which can cause sudden increases in storage usage.

This means each new span can contain data from previous messages as well.
This results in larger payloads over time and, consequently, sudden jumps in reported storage usage.
Check your exporter configuration when you see sudden increases.
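To see why repeated data inflates usage so quickly, the following sketch compares two hypothetical exporters: one where each span carries only its own message, and one where each span also carries every previous message. The message sizes are made-up numbers for illustration only.

```python
def stored_bytes(message_sizes: list[int], cumulative: bool) -> int:
    """Total raw bytes sent when one span is emitted per message.

    If cumulative is True, each span repeats all earlier messages as well.
    """
    total = 0
    running = 0
    for size in message_sizes:
        running += size  # bytes of all messages seen so far
        total += running if cumulative else size
    return total


sizes = [2_000] * 10  # ten messages of 2 KB each (illustrative)
print(stored_bytes(sizes, cumulative=False))  # 20000
print(stored_bytes(sizes, cumulative=True))   # 110000
```

With ten equal messages, the repeating exporter sends 5.5 times as many bytes, and the gap widens as conversations get longer.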

To reduce storage usage, consider:

Sampling fewer spans per trace.
Filtering out high-volume, low-value telemetry data.
Using compression or limiting payload size before sending.

These optimizations can help maintain observability while keeping your storage footprint efficient.
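As a rough sketch of the options listed above, the snippet below samples spans, drops low-value ones, and caps attribute size before anything is sent. It works on plain dictionaries and does not use the OpenTelemetry SDK; the span shape, the `healthcheck` name, and the limits are all assumptions chosen for illustration.

```python
import random

MAX_ATTR_CHARS = 1_000             # hypothetical cap on any string attribute
DROP_SPAN_NAMES = {"healthcheck"}  # hypothetical high-volume, low-value spans


def keep_span(span: dict, sample_rate: float, rng: random.Random) -> bool:
    """Filter out low-value spans, then keep a random fraction of the rest."""
    if span.get("name") in DROP_SPAN_NAMES:
        return False
    return rng.random() < sample_rate


def trim_span(span: dict) -> dict:
    """Return a copy of the span with long string attributes truncated."""
    attrs = {
        key: (value[:MAX_ATTR_CHARS] if isinstance(value, str) else value)
        for key, value in span.get("attributes", {}).items()
    }
    return {**span, "attributes": attrs}


def prepare_spans(spans: list[dict], sample_rate: float = 0.5, seed=None) -> list[dict]:
    """Apply filtering, sampling, and trimming before export."""
    rng = random.Random(seed)
    return [trim_span(s) for s in spans if keep_span(s, sample_rate, rng)]


spans = [
    {"name": "healthcheck", "attributes": {}},
    {"name": "llm.call", "attributes": {"prompt": "x" * 5_000, "tokens": 42}},
]
kept = prepare_spans(spans, sample_rate=1.0)
print(len(kept), len(kept[0]["attributes"]["prompt"]))  # 1 1000
```

In a real setup, the equivalent controls usually live in the exporter or SDK configuration, so check what your tracing library offers before writing custom code.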

Understanding Events

A single Deployment invocation contains multiple events, and each event incurs costs that are reflected in your Billing and Plan Usage. To better understand the events within your Deployments, open Analytics and explore the events embedded in each generation.

Rate Limits

Our APIs are protected by Rate Limits on a per-account basis to ensure fair and efficient use of the API. This helps maintain optimal performance and prevent server overload, while also protecting against potential abuse and limiting costs. When you reach the rate limit, API calls are denied with a 429 Too Many Requests response.

Subscription	Rate Limit
Developer	50 API calls/day
Enterprise	Custom
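A client can handle the 429 Too Many Requests response by waiting and retrying. The sketch below is a generic pattern, not an Orq.ai client: `send` is a hypothetical callable that performs one API call and returns a `(status, body)` pair, and the delays are arbitrary defaults.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send(), retrying with exponential backoff while it returns 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return status, body  # still rate limited after the last attempt


# Demo with a fake sender and a no-op sleep so nothing actually waits.
fake_responses = iter([(429, ""), (429, ""), (200, "ok")])
print(call_with_backoff(lambda: next(fake_responses), sleep=lambda _: None))  # (200, 'ok')
```

Because the Developer limit is expressed per day, short retries may keep failing once the daily quota is exhausted; in that case it is more useful to surface the error to the caller than to keep retrying.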

To learn more about Orq.ai pricing options or to upgrade your plan, see our Pricing page.
