Google Vertex AI provides enterprise-grade access to Gemini models with enhanced security, compliance, and control. Connecting Vertex AI to Orq.ai adds service account authentication, project-level billing, and data residency controls.
Set Up an API Key
To use Vertex AI with Orq.ai, create a service account with the appropriate permissions:
Create Service Account
Go to Google Cloud Console
Navigate to IAM & Admin > Service Accounts
Click Create Service Account
Enter a name (e.g., “orq-vertex-ai”)
Grant the following roles:
Service Account Token Creator
Vertex AI User
Click Create and Continue
Click Done
Create Service Account Key
Find the service account in the list
Click the Actions menu (three dots)
Select Manage Keys
Click Add Key > Create New Key
Select JSON format
Click Create to download the key file
Configure in Orq.ai
Navigate to AI Gateway > BYOK
Find Google Vertex AI in the list
Click the Configure button
Select Setup your own API Key
Enter a configuration name (e.g., “Vertex AI Production”)
Paste the service account JSON in the Deployment JSON field (see format below)
Click Save to complete the setup
The deployment JSON must include the service account credentials, project ID, and region:
{
  "projectId": "my-project-123456",
  "location": "us-central1",
  "serviceAccount": {
    "type": "service_account",
    "project_id": "my-project-123456",
    "private_key_id": "afd17083ecd5184b5ca880e70eb84c2e4c382f14",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...=\n-----END PRIVATE KEY-----\n",
    "client_email": "vertex-ai@my-project-123456.iam.gserviceaccount.com",
    "client_id": "000000000000000000000",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/vertex-ai%40my-project-123456.iam.gserviceaccount.com",
    "universe_domain": "googleapis.com"
  }
}
Project ID: Find the Google Cloud Project ID at the top of the Google Cloud Console.
Location: Common regions include us-central1, europe-west1, and asia-northeast1. Choose one based on data residency requirements.
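The wrapper structure above can also be assembled from the downloaded key file instead of by hand. The following is a minimal Python sketch: the function name `build_deployment_json` is hypothetical, and the only layout it assumes is the three top-level fields shown in the example (`projectId`, `location`, `serviceAccount`).

```python
import json


def build_deployment_json(service_account: dict, location: str) -> str:
    """Wrap a downloaded service account key in the deployment JSON layout.

    Hypothetical helper: the layout (projectId, location, serviceAccount)
    mirrors the example in this guide.
    """
    if service_account.get("type") != "service_account":
        raise ValueError("expected a service account key (type == 'service_account')")
    deployment = {
        # Reuse the project ID recorded in the key file itself.
        "projectId": service_account["project_id"],
        "location": location,
        "serviceAccount": service_account,
    }
    return json.dumps(deployment, indent=2)
```

Loading the key file with `json.load` and passing the result together with a region such as `"us-central1"` yields text that can be pasted into the Deployment JSON field.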
Available Models
The AI Gateway supports all current Vertex AI Gemini models. Here are the most commonly used:
Recommended Models
| Model | Context | Best For |
| --- | --- | --- |
| google/gemini-2.5-pro-preview | 1M | Latest preview, most advanced |
| google/gemini-2.5-pro | 1M | Latest stable, most capable |
| google/gemini-2.5-flash | 1M | Fast, balanced performance |
| google/gemini-2.0-flash-001 | 1M | Stable, reliable |
For a complete and up-to-date list of all available Vertex AI models, see Supported Models.
Use google/gemini-2.5-pro for the latest stable model, or google/gemini-2.5-flash for the best balance of performance and cost.
Quick Start
Access Vertex AI Gemini models through the AI Gateway.
cURL
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro",
    "input": "Explain quantum computing in simple terms"
  }'
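The same request can be sent from Python. The sketch below uses only the standard library, so it carries no SDK assumptions. `build_request` and `send` are hypothetical helper names, and because this section does not describe the response schema, `send` returns the decoded JSON body without interpreting it.

```python
import json
import urllib.request

GATEWAY_URL = "https://api.orq.ai/v3/router/responses"  # endpoint from the cURL example


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that the cURL example sends."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON body as-is."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_request("google/gemini-2.5-pro", "Explain quantum computing in simple terms", os.environ["ORQ_API_KEY"]))` (after `import os`) would issue the request shown above.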
Using the AI Gateway
Access Vertex AI Gemini models through the AI Gateway with enterprise-grade security, advanced chat completions, streaming, and intelligent model routing. All Vertex AI models are available with consistent formatting and automatic request logging.
Vertex AI models use the provider slug format google/model-name, for example google/gemini-2.5-pro.
Prerequisites
Before making requests to the AI Gateway, configure the environment and install the required SDKs.
Endpoint
POST https://api.orq.ai/v3/router/responses
Required Headers
Include the following headers in all requests:
Authorization: Bearer $ORQ_API_KEY
Content-Type: application/json
Getting an API Key:
Go to API Keys
Click Create API Key and copy it
Store it in your environment as ORQ_API_KEY
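In application code, the key is typically read from the environment once and turned into the two required headers. The following is a minimal sketch, assuming the variable name `ORQ_API_KEY` from the steps above; `gateway_headers` is a hypothetical helper.

```python
import os


def gateway_headers(env=os.environ) -> dict:
    """Return the headers required by the AI Gateway.

    Fails early with a clear message if ORQ_API_KEY is missing, rather than
    sending an unauthenticated request.
    """
    api_key = env.get("ORQ_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ORQ_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Stripping whitespace guards against a stray space or newline picked up when the key was copied, which would otherwise end up inside the Authorization header.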
SDK Installation
Install the OpenAI SDK for your language (compatible with Vertex AI models):
npm install openai
# or
yarn add openai
If you already have working OpenAI SDK code, only two settings need to change: point base_url at the AI Gateway and set api_key to your ORQ_API_KEY.
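As a sketch of that change in Python: the base URL below is an assumption, derived by dropping the final `/responses` segment from the endpoint above, so verify it against the API reference before relying on it. `gateway_client_kwargs` and `make_client` are hypothetical helpers; `OpenAI(base_url=..., api_key=...)` is the standard constructor of the `openai` Python package, which must be installed separately (`pip install openai`).

```python
import os

# Assumption: the SDK base URL is the endpoint without its final path segment.
ASSUMED_BASE_URL = "https://api.orq.ai/v3/router"


def gateway_client_kwargs(env=os.environ) -> dict:
    """Return the only two settings that differ from a stock OpenAI setup."""
    return {"base_url": ASSUMED_BASE_URL, "api_key": env["ORQ_API_KEY"]}


def make_client(env=os.environ):
    """Create an OpenAI SDK client pointed at the AI Gateway."""
    # Imported lazily so gateway_client_kwargs works without the SDK installed.
    from openai import OpenAI

    return OpenAI(**gateway_client_kwargs(env))
```

With a recent SDK version, `make_client().responses.create(model="google/gemini-2.5-pro", input="Explain machine learning")` would then correspond to the cURL requests in this guide, provided the assumed base URL is correct.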
Basic Usage
Send a message to a Vertex AI Gemini model, optionally with system instructions:
cURL
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro",
    "instructions": "You are a helpful assistant that explains complex concepts simply.",
    "input": "Explain machine learning"
  }'
Streaming
Stream responses for real-time output and improved user experience:
cURL
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro",
    "input": "Write a short poem about the ocean",
    "stream": true
  }'
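On the client side, a streamed response has to be consumed incrementally. The sketch below assumes the gateway streams server-sent events (lines prefixed with `data:` carrying JSON) in the style of OpenAI-compatible APIs, and that text arrives in events whose `type` is `response.output_text.delta` with a `delta` field. Neither detail is confirmed by this section, so treat both as assumptions. `iter_sse_events` and `collect_text` are hypothetical helpers.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, `event:` lines, and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(payload)


def collect_text(events: Iterable[dict]) -> str:
    """Concatenate text deltas (assumed event shape, see the note above)."""
    return "".join(
        event.get("delta", "")
        for event in events
        if event.get("type") == "response.output_text.delta"
    )
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body, and each delta would usually be printed as it arrives instead of being collected at the end.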
Function Calling
Vertex AI Gemini models support function calling for structured interactions:
cURL
curl -X POST https://api.orq.ai/v3/router/responses \
  -H "Authorization: Bearer $ORQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro",
    "input": "What is the weather in San Francisco?",
    "tools": [{
      "type": "function",
      "name": "get_weather",
      "description": "Get the current weather in a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" },
          "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }
        },
        "required": ["location"]
      }
    }]
  }'
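When the model decides to call the tool, the application has to run the function itself and return the result. The sketch below covers only that local dispatch step. It assumes a function call arrives as an object with a `name` and a JSON-encoded `arguments` string, as in OpenAI-compatible APIs; this section does not confirm that shape. `get_weather` returns placeholder data and, like `dispatch_tool_call`, is purely illustrative.

```python
import json


def get_weather(location: str, unit: str = "celsius") -> dict:
    """Stand-in implementation; a real version would query a weather service."""
    return {"location": location, "unit": unit, "temperature": None}


# Map tool names declared in the request to local Python callables.
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(call: dict) -> str:
    """Run the requested tool and return its result as a JSON string."""
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"model requested unknown tool: {name}")
    arguments = json.loads(call["arguments"])  # assumed to be a JSON string
    return json.dumps(TOOLS[name](**arguments))
```

Keeping an explicit name-to-function table means the model can only trigger functions the application has deliberately exposed.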
Automatic Request Logging
All requests made through the AI Gateway are automatically logged to the dashboard. The dashboard shows:
Request details: Model used, tokens, latency
Cost tracking: Per-request and aggregate costs
Error monitoring: Failed requests with error messages
Performance metrics: Response times and throughput
No additional configuration is needed.
Reference