Use Orquesta's LLM Deployments as a single point of integration to operate your AI interactions across different LLMs.

Within Deployments, Orquesta handles all integration, operation, and monitoring complexity for you. Deployments also let you configure retries and fallback models.

When logging metrics, you only have to log feedback and your custom metadata. All other metrics are logged for you by Orquesta, and you can analyze these in the Observability tools.

Setting up a Deployment takes four steps:

  1. Activate the models supported by the AI Gateway
  2. Create your first Deployment
  3. Add a variant to the Deployment
  4. Configure your Deployment (retries and fallback model)

Activate your models in the Model Garden

To use Deployments, first activate the models that Orquesta's AI Gateway supports. Head to the Model Garden and activate the models you want to interact with.

Model Garden

Supported Models in Model Garden

Creating your first Deployment

In the Deployments section, click Add deployment.

Provide your Deployment Key and Domain, then click Create.

Variants based on custom context

For each Deployment, you can use the Business Rules Engine to create variants based on the custom context of your user or application. Click the Add Variant button, set up the prompt and model to use, and build business rules for the custom context.
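
To make the idea of context-based variants concrete, here is a minimal sketch of how rule matching could work. It is not Orquesta's implementation: the `Variant` class, the `select_variant` helper, the model names, and the first-match-wins ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Variant:
    """Hypothetical variant: a prompt and model guarded by business rules."""
    name: str
    model: str
    prompt: str
    # Assumed rule format: every key/value pair must equal the context value.
    rules: dict = field(default_factory=dict)


def select_variant(variants: list, context: dict) -> Optional[Variant]:
    """Return the first variant whose rules all match the context, else None."""
    for variant in variants:
        if all(context.get(key) == value for key, value in variant.rules.items()):
            return variant
    return None


variants = [
    Variant("dutch-premium", "model-a", "Antwoord in het Nederlands.",
            {"locale": "nl", "plan": "premium"}),
    # A variant with no rules matches any context, so it acts as a default.
    Variant("default", "model-b", "Answer in English.", {}),
]

chosen = select_variant(variants, {"locale": "nl", "plan": "premium"})
print(chosen.name, chosen.model)
```

In this sketch the order of the list matters: the rule-free default variant is placed last so that more specific variants are checked first.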


Retries

Retries are repeated attempts to execute an LLM call or a specific task after a failure. They handle situations where an operation fails for reasons such as network issues, system errors, or resource constraints.

When an LLM call fails, Orquesta automatically retries it. The number of retries determines how many times the platform reattempts the call before treating it as a permanent failure or switching to the fallback model. This improves the chances that the task completes successfully.
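
Orquesta performs retries for you, so you do not need to write this logic yourself. The sketch below only illustrates the general mechanism described above. The `call_with_retries` helper and the simulated `flaky_llm_call` are hypothetical names, not part of any Orquesta API.

```python
def call_with_retries(call, retries):
    """Run call() once, then reattempt up to `retries` more times on failure."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return call()
        except Exception as error:  # a real client would catch narrower errors
            last_error = error
    # All attempts failed: treat this as a permanent failure.
    raise RuntimeError(f"call failed after {retries} retries") from last_error


# Simulated LLM call that fails twice before succeeding.
attempts = {"count": 0}


def flaky_llm_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("simulated network issue")
    return "completion text"


result = call_with_retries(flaky_llm_call, retries=2)
print(result, attempts["count"])
```

With `retries=2` the call is attempted three times in total: the initial attempt plus two retries.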


Fallback model

A fallback model is a secondary model used as a backup when the primary model is unavailable or encounters issues. It ensures the continuity and reliability of language model operations during unexpected failures or downtime of the primary model.

To use a fallback model in Orquesta, click the Fallback Model button. Select the fallback model of your choice and configure it if necessary.
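
As a rough illustration of the fallback behavior, the sketch below tries a primary model and switches to a fallback once the primary has exhausted its attempts. The `complete` function and both model stand-ins are hypothetical; in Orquesta this behavior is configured in the Deployment rather than written by hand.

```python
def complete(prompt, primary, fallback, retries=1):
    """Try the primary model (with retries), then hand over to the fallback."""
    for _ in range(retries + 1):
        try:
            return primary(prompt)
        except Exception:
            continue  # reattempt the primary model
    # The primary failed on every attempt, so use the backup model.
    return fallback(prompt)


def unavailable_primary(prompt):
    # Stand-in for a primary model that is currently down.
    raise TimeoutError("primary model unavailable")


def backup_model(prompt):
    # Stand-in for the configured fallback model.
    return f"[fallback] {prompt}"


answer = complete("Hello", unavailable_primary, backup_model)
print(answer)
```

The caller receives a response either way, which is the continuity the fallback model is meant to provide.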


The Deployment API is a scalable and efficient way to interact with multiple LLMs using smart business rules. With a single key, you and your team can streamline operations and focus on what matters most. Continue reading in the API Documentation for Deployments.