Getting Started with GCP Vertex AI Gemini
This guide shows how to set up a minimal deployment to use the TensorZero Gateway with GCP Vertex AI Gemini.
Setup
For this minimal setup, you’ll need just two files in your project directory:
- config/
  - tensorzero.toml
- docker-compose.yml
For production deployments, see our Deployment Guide.
Configuration
Create a minimal configuration file that defines a model and a simple chat function:
[models.gemini_1_5_flash_001]
routing = ["gcp_vertex_gemini"]

[models.gemini_1_5_flash_001.providers.gcp_vertex_gemini]
type = "gcp_vertex_gemini"
model_id = "gemini-1.5-flash-001"
location = "us-central1"
project_id = "your-project-id" # change this

[functions.my_function_name]
type = "chat"

[functions.my_function_name.variants.my_variant_name]
type = "chat_completion"
model = "gemini_1_5_flash_001"
See the list of models available on GCP Vertex AI Gemini.
Credentials
You must generate a GCP service account key in JWT form (described here) and set the GCP_VERTEX_CREDENTIALS_PATH environment variable to the path of that key file.
You can customize the credential location by setting the credential_location to env::YOUR_ENVIRONMENT_VARIABLE.
See the Credential Management guide and Configuration Reference for more information.
Deployment (Docker Compose)
Create a minimal Docker Compose configuration:
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/gateway/deployment

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
      - ${GCP_VERTEX_CREDENTIALS_PATH:-/dev/null}:/app/gcp-credentials.json:ro
    environment:
      - GCP_VERTEX_CREDENTIALS_PATH=${GCP_VERTEX_CREDENTIALS_PATH:+/app/gcp-credentials.json}
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
You can start the gateway with docker compose up.
Inference
Make an inference request to the gateway:
curl -X POST http://localhost:3000/inference \
  -H "Content-Type: application/json" \
  -d '{
    "function_name": "my_function_name",
    "input": {
      "messages": [
        {
          "role": "user",
          "content": "What is the capital of Japan?"
        }
      ]
    }
  }'
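The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl command: same URL, same Content-Type header, same JSON body. build_inference_request is a hypothetical helper, not part of any TensorZero client, and the response is printed as returned without assuming its schema.

```python
import json
import urllib.request

# Port 3000 is the one published by the Compose file in this guide.
GATEWAY_URL = "http://localhost:3000/inference"


def build_inference_request(
    function_name: str, user_message: str, url: str = GATEWAY_URL
) -> urllib.request.Request:
    """Build a POST request carrying the same JSON body as the curl example."""
    payload = {
        "function_name": function_name,
        "input": {
            "messages": [
                {"role": "user", "content": user_message},
            ]
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_inference_request("my_function_name", "What is the capital of Japan?")
    # Requires the gateway to be running locally (docker compose up).
    with urllib.request.urlopen(request, timeout=30) as response:
        print(json.dumps(json.load(response), indent=2))
```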