Billing & Usage

The Billing & Usage page offers an overview of your billing cycle and model generation usage within your workspaces.

Workspace Usage

Here you can see an overview of your usage over the current billing cycle.

A graph displays the number of Requests, Retrievals, and Cache events over time in your workspace. These are tracked across all Playgrounds, Experiments, and Deployments, and the graph shows an aggregated view of all events.

At the top right of the graph, you can see your current usage against your plan capacity. If you go beyond that capacity, the additional events are added to your billing cycle.

Understanding Events

A single Deployment invocation contains multiple events, and each event incurs costs that are reflected in your Billing and Plan Usage.

To better understand the events within your Deployments, look up Traces and explore the events embedded in each generation.

Each trace and event detail includes usage and billing information.

Rate Limits

Our APIs are protected by rate limits applied on a per-account basis.

Rate limiting is applied to the orq.ai API to ensure fair and efficient use. It helps maintain optimal performance and prevent server overload. It also protects against potential abuse and helps keep costs under control.

When you reach the rate limit, API calls are rejected with a 429 Too Many Requests response.
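
A client can recover from a 429 response by waiting and retrying. The following is a minimal sketch of exponential backoff, not an official client. The helper name `call_with_backoff`, the `send` callable, and the default delays are hypothetical. Honoring a `Retry-After` header is also an assumption, since this page does not state whether the API returns one.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, response body)
Response = Tuple[int, Mapping[str, str], bytes]


def call_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` and retry with exponential backoff while it returns 429.

    `send` is a hypothetical callable that performs one API request.
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_retries:
            break  # retries exhausted, do not sleep again
        # Assumption: the server may send Retry-After in whole seconds.
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)
        else:
            # Double the wait on each attempt, capped at max_delay,
            # plus up to 10% jitter so clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            delay += random.uniform(0, delay * 0.1)
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

Passing the request as a callable keeps the retry logic independent of any particular HTTP library, so `send` can wrap whichever client you already use.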

Rate limits vary by subscription type.

Subscription	Rate Limit for Deployment API calls	Rate Limit for other API calls
Sandbox	100 API calls/minute	20 API calls/minute
Team (Legacy)	1000 API calls/minute	50 API calls/minute
Pro	1500 API calls/minute	100 API calls/minute
Enterprise	Custom	Custom
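
To stay under these limits rather than react to 429 responses, a client can throttle itself. The sketch below is one possible approach, assuming a sliding 60-second window. This page does not say how the server measures a minute, so the window model, the `SlidingWindowLimiter` class, and the `PLAN_LIMITS` dictionary are all illustrative. Only the numbers come from the table above.

```python
import time
from collections import deque
from typing import Callable, Deque

# Calls per minute, copied from the table above.
# Enterprise limits are custom, so they are not listed here.
PLAN_LIMITS = {
    "Sandbox": {"deployment": 100, "other": 20},
    "Team (Legacy)": {"deployment": 1000, "other": 50},
    "Pro": {"deployment": 1500, "other": 100},
}


class SlidingWindowLimiter:
    """Client-side limiter allowing at most `max_calls` per window."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock
        self._calls: Deque[float] = deque()  # timestamps of recent calls

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the current window."""
        now = self.clock()
        # Forget calls that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False


# Example: throttle Deployment calls on the Pro plan.
limiter = SlidingWindowLimiter(PLAN_LIMITS["Pro"]["deployment"])
```

Call `limiter.try_acquire()` before each request and delay the request when it returns `False`. Since limits apply per account, several processes sharing one account would each need a share of the budget.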

📘
To learn more about Orq.ai pricing options, see our Pricing page.