# Google Vertex AI

> Connect Google Vertex AI to Orq.ai for enterprise Gemini access. Configure service account auth, project billing, and data residency controls.

Google Vertex AI provides enterprise access to Gemini models. Connecting Vertex AI to Orq.ai lets you call those models through the AI Gateway using service account authentication, with usage billed to your own Google Cloud project and requests served from the region you choose.

## Set Up Your API Key

To use Vertex AI with Orq.ai, you need to create a service account with appropriate permissions:

<Steps>
  <Step title="Create Service Account">
    1. Go to [Google Cloud Console](https://console.cloud.google.com)
    2. Navigate to **IAM & Admin** > **Service Accounts**
    3. Click **Create Service Account**
    4. Enter a name (e.g., "orq-vertex-ai")
    5. Click **Create and Continue**
    6. Grant the following roles:
       * **Service Account Token Creator**
       * **Vertex AI User**
    7. Click **Continue**, then **Done**
  </Step>

  <Step title="Create Service Account Key">
    1. Find your service account in the list
    2. Click the **Actions** menu (three dots)
    3. Select **Manage Keys**
    4. Click **Add Key** > **Create New Key**
    5. Select **JSON** format
    6. Click **Create** to download the key file
  </Step>

  <Step title="Configure in Orq.ai">
    1. Navigate to **AI Gateway** > **BYOK**
    2. Find **Google Vertex AI** in the list
    3. Click the **Configure** button
    4. Select <kbd className="key">Setup your own API Key</kbd>
    5. Enter a configuration name (e.g., "Vertex AI Production")
    6. Paste your deployment JSON, which wraps the service account key, in the **Deployment JSON** field (see format below)
    7. Click **Save** to complete the setup
  </Step>
</Steps>

<Frame caption="Vertex AI configuration modal">
  <img src="https://mintcdn.com/orqai/4YNqGRNpuZNyo0_T/images/vertex-ai-410.png?fit=max&auto=format&n=4YNqGRNpuZNyo0_T&q=85&s=2aa887c5db93a0f95eef9e96a9588fd9" alt="Vertex AI configuration modal" width="511" height="441" data-path="images/vertex-ai-410.png" />
</Frame>

### Deployment JSON Format

The deployment JSON wraps the contents of the downloaded key file in a `serviceAccount` object and adds the `projectId` and `location` (region) fields:

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
  "projectId": "my-project-123456",
  "location": "us-central1",
  "serviceAccount": {
    "type": "service_account",
    "project_id": "my-project-123456",
    "private_key_id": "afd17083ecd5184b5ca880e70eb84c2e4c382f14",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...=\n-----END PRIVATE KEY-----\n",
    "client_email": "vertex-ai@my-project-123456.iam.gserviceaccount.com",
    "client_id": "000000000000000000000",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/vertex-ai%40my-project-123456.iam.gserviceaccount.com",
    "universe_domain": "googleapis.com"
  }
}
```

<Info>
  **Project ID**: Find your Google Cloud Project ID at the top of the Google Cloud Console.

  **Location**: Common regions include `us-central1`, `europe-west1`, `asia-northeast1`. Choose based on your data residency requirements.
</Info>
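
The deployment JSON can also be assembled programmatically instead of by hand. The following is a minimal sketch, assuming the `projectId` / `location` / `serviceAccount` layout shown above. The helper name `build_deployment_json` is hypothetical and not part of any Orq.ai or Google SDK, and the example key is a shortened placeholder.

```python
import json


def build_deployment_json(service_account: dict, location: str) -> str:
    """Wrap a service account key in the deployment JSON layout shown above.

    Hypothetical helper for illustration only.
    """
    # The downloaded key already carries the project ID, so reuse it
    # instead of typing it a second time.
    project_id = service_account.get("project_id")
    if not project_id:
        raise ValueError("service account key has no 'project_id' field")
    if service_account.get("type") != "service_account":
        raise ValueError("expected a key of type 'service_account'")

    deployment = {
        "projectId": project_id,
        "location": location,
        "serviceAccount": service_account,
    }
    return json.dumps(deployment, indent=2)


# Shortened placeholder key; a real key file contains all the fields
# shown in the format above.
example_key = {
    "type": "service_account",
    "project_id": "my-project-123456",
    "client_email": "vertex-ai@my-project-123456.iam.gserviceaccount.com",
}
print(build_deployment_json(example_key, "us-central1"))
```

In practice you would load the key with `json.load` from the file downloaded in the **Create Service Account Key** step and paste the printed result into the **Deployment JSON** field.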

## Available Models

The **AI Gateway** supports all current Vertex AI Gemini models. Here are the most commonly used:

### Recommended Models

| Model                         | Context | Best For                      |
| ----------------------------- | ------- | ----------------------------- |
| `google/gemini-3-pro-preview` | 1M      | Latest preview, most advanced |
| `google/gemini-2.5-pro`       | 1M      | Latest stable, most capable   |
| `google/gemini-2.5-flash`     | 1M      | Fast, balanced performance    |
| `google/gemini-2.0-flash-001` | 1M      | Stable, reliable              |

For a complete and up-to-date list of all available Vertex AI models, see [Supported Models](/docs/ai-studio/ai-gateway/supported-models#chat-models).

<Tip>
  Use `google/gemini-2.5-pro` for the latest stable model, or `google/gemini-2.5-flash` for the best balance of performance and cost.
</Tip>

## Quick Start

Access Vertex AI Gemini models through the **AI Gateway**.

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/responses \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "google/gemini-2.5-pro",
      "input": "Explain quantum computing in simple terms"
    }'
  ```

  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.responses.create({
    model: "google/gemini-2.5-pro",
    input: "Explain quantum computing in simple terms",
  });

  console.log(response.output_text);
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.responses.create(
      model="google/gemini-2.5-pro",
      input="Explain quantum computing in simple terms",
  )

  print(response.output_text)
  ```

  ```typescript TypeScript (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.chat.completions.create({
    model: "google/gemini-2.5-pro",
    messages: [{ role: "user", content: "Explain quantum computing in simple terms" }],
  });

  console.log(response.choices[0].message.content);
  ```

  ```python Python (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.chat.completions.create(
      model="google/gemini-2.5-pro",
      messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}],
  )

  print(response.choices[0].message.content)
  ```
</CodeGroup>

## Using the AI Gateway

Access Vertex AI Gemini models through the **AI Gateway** with enterprise-grade security, advanced chat completions, streaming, and intelligent model routing. All Vertex AI models are available with consistent formatting and automatic request logging.

<Info>
  Vertex AI models use the provider slug format: `google/model-name`. For example: `google/gemini-2.5-pro`
</Info>

### Prerequisites

Before making requests to the **AI Gateway**, configure your environment and, if you plan to use an SDK, install it.

**Endpoints**

```
POST https://api.orq.ai/v3/router/responses
POST https://api.orq.ai/v3/router/chat/completions
```

**Required Headers**

Include the following headers in all requests:

```
Authorization: Bearer $ORQ_API_KEY
Content-Type: application/json
```

**Getting your API Key:**

1. Go to [API Keys](/docs/ai-studio/organization/api-keys)
2. Click <kbd className="key">Create API Key</kbd> and copy it
3. Store it in your environment as `ORQ_API_KEY`
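
If you call the endpoint without an SDK, a small check can catch a missing key before any request is sent. This is a minimal sketch that builds the two required headers listed above. The helper name `build_headers` is hypothetical, and it assumes the key is stored in `ORQ_API_KEY` as described in step 3.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return the required request headers (hypothetical helper)."""
    # Prefer an explicitly passed key, then fall back to the environment.
    key = api_key or os.environ.get("ORQ_API_KEY")
    if not key:
        # Fail early with a clear message instead of sending
        # an unauthenticated request.
        raise RuntimeError("ORQ_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


# Example with an explicit placeholder key.
print(build_headers("example-key"))
```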

**SDK Installation**

Install the OpenAI SDK for your language. The examples on this page use it to call Vertex AI models through the router endpoint:

<CodeGroup>
  ```bash Node.js/TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  npm install openai
  # or
  yarn add openai
  ```

  ```bash Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  pip install openai
  ```
</CodeGroup>

<Tip>
  If you already have working OpenAI SDK code, you only need to point `base_url` (`baseURL` in TypeScript) at `https://api.orq.ai/v3/router`, set `api_key` (`apiKey`) to your `ORQ_API_KEY`, and use a `google/` model slug.
</Tip>

### Basic Usage

Send a prompt, optionally with system instructions, to a Vertex AI Gemini model:

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/responses \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "google/gemini-2.5-pro",
      "instructions": "You are a helpful assistant that explains complex concepts simply.",
      "input": "Explain machine learning"
    }'
  ```

  ```bash cURL (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/chat/completions \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "google/gemini-2.5-pro",
      "messages": [
        {"role": "system", "content": "You are a helpful assistant that explains complex concepts simply."},
        {"role": "user", "content": "Explain machine learning"}
      ],
      "temperature": 0.7,
      "max_tokens": 500
    }'
  ```

  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.responses.create({
    model: "google/gemini-2.5-pro",
    instructions: "You are a helpful assistant that explains complex concepts simply.",
    input: "Explain machine learning",
  });

  console.log(response.output_text);
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.responses.create(
      model="google/gemini-2.5-pro",
      instructions="You are a helpful assistant that explains complex concepts simply.",
      input="Explain machine learning",
  )

  print(response.output_text)
  ```

  ```typescript TypeScript (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.chat.completions.create({
    model: "google/gemini-2.5-pro",
    messages: [
      {
        role: "system",
        content: "You are a helpful assistant that explains complex concepts simply.",
      },
      { role: "user", content: "Explain machine learning" },
    ],
    temperature: 0.7,
    max_tokens: 500,
  });

  console.log(response.choices[0].message.content);
  ```

  ```python Python (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.chat.completions.create(
      model="google/gemini-2.5-pro",
      messages=[
          {
              "role": "system",
              "content": "You are a helpful assistant that explains complex concepts simply.",
          },
          {"role": "user", "content": "Explain machine learning"},
      ],
      temperature=0.7,
      max_tokens=500,
  )

  print(response.choices[0].message.content)
  ```
</CodeGroup>

### Streaming

Stream responses for real-time output and improved user experience:

<CodeGroup>
  ```bash cURL theme={"theme":{"light":"github-light","dark":"github-dark"}}
  curl -X POST https://api.orq.ai/v3/router/responses \
    -H "Authorization: Bearer $ORQ_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "google/gemini-2.5-pro",
      "input": "Write a short poem about the ocean",
      "stream": true
    }'
  ```

  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const stream = await client.responses.create({
    model: "google/gemini-2.5-pro",
    input: "Write a short poem about the ocean",
    stream: true,
  });

  for await (const event of stream) {
    if (event.type === "response.output_text.delta") {
      process.stdout.write(event.delta);
    }
  }
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  stream = client.responses.create(
      model="google/gemini-2.5-pro",
      input="Write a short poem about the ocean",
      stream=True,
  )

  for event in stream:
      if event.type == "response.output_text.delta":
          print(event.delta, end="", flush=True)
  ```

  ```typescript TypeScript (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const stream = await client.chat.completions.create({
    model: "google/gemini-2.5-pro",
    messages: [{ role: "user", content: "Write a short poem about the ocean" }],
    stream: true,
  });

  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || "";
    process.stdout.write(content);
  }
  ```

  ```python Python (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  stream = client.chat.completions.create(
      model="google/gemini-2.5-pro",
      messages=[{"role": "user", "content": "Write a short poem about the ocean"}],
      stream=True,
  )

  for chunk in stream:
      if chunk.choices and chunk.choices[0].delta.content:
          print(chunk.choices[0].delta.content, end="", flush=True)
  ```
</CodeGroup>
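
The streaming examples above print each delta as it arrives. If you also need the full text afterwards, you can accumulate the deltas. The following is a minimal sketch for the Responses-style events, assuming each event exposes `type` and `delta` as in the loop above. The helper name `collect_output_text` is hypothetical, and the fake events stand in for a real stream so the snippet runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_output_text(events: Iterable) -> str:
    """Join all text deltas from a stream of events (hypothetical helper)."""
    parts = []
    for event in events:
        # Ignore lifecycle events; only text deltas carry output.
        if getattr(event, "type", None) == "response.output_text.delta":
            parts.append(event.delta)
    return "".join(parts)


# Stand-in events for illustration; a real stream comes from
# client.responses.create(..., stream=True).
fake_events = [
    SimpleNamespace(type="response.created"),
    SimpleNamespace(type="response.output_text.delta", delta="Waves "),
    SimpleNamespace(type="response.output_text.delta", delta="roll in."),
    SimpleNamespace(type="response.completed"),
]
print(collect_output_text(fake_events))
```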

### Function Calling

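Vertex AI Gemini models support function calling: you describe a function with a JSON Schema, and the model returns the function name and arguments when it decides a call is needed: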

<CodeGroup>
  ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.responses.create({
    model: "google/gemini-2.5-pro",
    input: "What's the weather in San Francisco?",
    tools: [
      {
        type: "function",
        name: "get_weather",
        description: "Get the current weather in a location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA",
            },
            unit: { type: "string", enum: ["celsius", "fahrenheit"] },
          },
          required: ["location"],
        },
      },
    ],
  });

  const toolCall = response.output.find((item) => item.type === "function_call");
  if (toolCall && toolCall.type === "function_call") {
    console.log(`Calling function: ${toolCall.name}`);
    console.log(`Arguments: ${toolCall.arguments}`);
  }
  ```

  ```python Python theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.responses.create(
      model="google/gemini-2.5-pro",
      input="What's the weather in San Francisco?",
      tools=[
          {
              "type": "function",
              "name": "get_weather",
              "description": "Get the current weather in a location",
              "parameters": {
                  "type": "object",
                  "properties": {
                      "location": {
                          "type": "string",
                          "description": "The city and state, e.g. San Francisco, CA",
                      },
                      "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                  },
                  "required": ["location"],
              },
          }
      ],
  )

  tool_call = next((item for item in response.output if item.type == "function_call"), None)
  if tool_call:
      print(f"Calling function: {tool_call.name}")
      print(f"Arguments: {tool_call.arguments}")
  ```

  ```typescript TypeScript (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  import OpenAI from "openai";

  const client = new OpenAI({
    apiKey: process.env.ORQ_API_KEY,
    baseURL: "https://api.orq.ai/v3/router",
  });

  const response = await client.chat.completions.create({
    model: "google/gemini-2.5-pro",
    messages: [{ role: "user", content: "What's the weather in San Francisco?" }],
    tools: [
      {
        type: "function",
        function: {
          name: "get_weather",
          description: "Get the current weather in a location",
          parameters: {
            type: "object",
            properties: {
              location: {
                type: "string",
                description: "The city and state, e.g. San Francisco, CA",
              },
              unit: { type: "string", enum: ["celsius", "fahrenheit"] },
            },
            required: ["location"],
          },
        },
      },
    ],
  });

  const choice = response.choices[0];
  if (choice.finish_reason === "tool_calls" && choice.message.tool_calls) {
    const toolCall = choice.message.tool_calls[0];
    console.log(`Calling function: ${toolCall.function.name}`);
    console.log(`Arguments: ${toolCall.function.arguments}`);
  }
  ```

  ```python Python (Chat Completions) theme={"theme":{"light":"github-light","dark":"github-dark"}}
  from openai import OpenAI
  import os

  client = OpenAI(
      api_key=os.environ.get("ORQ_API_KEY"),
      base_url="https://api.orq.ai/v3/router",
  )

  response = client.chat.completions.create(
      model="google/gemini-2.5-pro",
      messages=[{"role": "user", "content": "What's the weather in San Francisco?"}],
      tools=[
          {
              "type": "function",
              "function": {
                  "name": "get_weather",
                  "description": "Get the current weather in a location",
                  "parameters": {
                      "type": "object",
                      "properties": {
                          "location": {
                              "type": "string",
                              "description": "The city and state, e.g. San Francisco, CA",
                          },
                          "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                      },
                      "required": ["location"],
                  },
              },
          }
      ],
  )

  choice = response.choices[0]
  if choice.finish_reason == "tool_calls" and choice.message.tool_calls:
      tool_call = choice.message.tool_calls[0]
      print(f"Calling function: {tool_call.function.name}")
      print(f"Arguments: {tool_call.function.arguments}")
  ```
</CodeGroup>
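
The examples above stop at printing the requested call. Your application still has to run the function itself. The following is a minimal sketch of that step, assuming the arguments arrive as a JSON string as in the examples above. The `TOOLS` registry, the `run_tool_call` helper, and the canned `get_weather` implementation are all hypothetical.

```python
import json


def get_weather(location: str, unit: str = "celsius") -> dict:
    """Placeholder implementation that returns canned data."""
    return {"location": location, "temperature": 18, "unit": unit}


# Maps tool names declared in the request to local Python callables.
TOOLS = {"get_weather": get_weather}


def run_tool_call(name: str, arguments: str) -> str:
    """Execute a requested tool call and return its result as JSON."""
    handler = TOOLS.get(name)
    if handler is None:
        raise KeyError(f"no local handler for tool {name!r}")
    # The model supplies arguments as a JSON-encoded string.
    kwargs = json.loads(arguments)
    result = handler(**kwargs)
    return json.dumps(result)


print(run_tool_call("get_weather", '{"location": "San Francisco, CA"}'))
```

With the Responses examples you would pass `tool_call.name` and `tool_call.arguments`; with Chat Completions, `tool_call.function.name` and `tool_call.function.arguments`.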

## Automatic Request Logging

All requests made through the **AI Gateway** are automatically logged to your dashboard. You can view:

* **Request details**: Model used, tokens, latency
* **Cost tracking**: Per-request and aggregate costs
* **Error monitoring**: Failed requests with error messages
* **Performance metrics**: Response times and throughput

No additional configuration is needed.

## Reference

* [Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs)
* [Gemini API Reference](https://ai.google.dev/api/generate-content)
* [Service Account Setup](https://cloud.google.com/iam/docs/service-accounts-create)
* [Vertex AI Pricing](https://cloud.google.com/vertex-ai/pricing)
