Setup
This guide assumes that you are running Ollama locally with `ollama serve` and that you've pulled the `llama3.1` model in advance (e.g. `ollama pull llama3.1`).
Make sure to update the `api_base` and `model_name` in the configuration below to match your OpenAI-compatible endpoint and model.
For this minimal setup, you’ll need just two files in your project directory:
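Based on the filenames used in the rest of this guide, the layout looks like this:

```
├── config/
│   └── tensorzero.toml
└── docker-compose.yml
```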
You can also find the complete code for this example on GitHub.
Configuration
Create a minimal configuration file that defines a model and a simple chat function (`config/tensorzero.toml`):
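A configuration along these lines should work. It is a sketch: the model, provider, function, and variant names (`llama3_1`, `ollama`, `my_function_name`, `my_variant_name`) are placeholders, and the `host.docker.internal` address assumes the gateway runs in Docker while Ollama serves on the host's default port.

```toml
# config/tensorzero.toml (sketch: adjust names, api_base, and model_name as needed)

[models.llama3_1]
routing = ["ollama"]

[models.llama3_1.providers.ollama]
type = "openai"
# Assumes the gateway runs in Docker and Ollama listens on the host at port 11434
api_base = "http://host.docker.internal:11434/v1"
model_name = "llama3.1"
api_key_location = "none"  # Ollama doesn't require an API key by default

[functions.my_function_name]
type = "chat"

[functions.my_function_name.variants.my_variant_name]
type = "chat_completion"
model = "llama3_1"
```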
Credentials
The `api_key_location` field in your model provider configuration specifies how to handle API key authentication:
- If your endpoint does not require an API key (e.g. Ollama by default), use `api_key_location = "none"`.
- If your endpoint requires an API key, you have two options (sketched below):
  - Configure it in advance through an environment variable. You'll need to set the environment variable before starting the gateway.
  - Provide it at inference time. The API key can then be passed in the inference request.
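As a rough sketch of the latter two options (the variable and argument names are placeholders, and the `env::`/`dynamic::` syntax should be verified against the TensorZero configuration reference), the provider block would contain one of the following:

```toml
# Option 1: read the key from an environment variable when the gateway starts
# (the variable name here is a placeholder)
api_key_location = "env::MY_PROVIDER_API_KEY"

# Option 2: accept the key at inference time, passed with each request
# (the argument name here is a placeholder)
api_key_location = "dynamic::my_provider_api_key"
```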
Deployment (Docker Compose)
Create a minimal Docker Compose configuration (`docker-compose.yml`):
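A minimal sketch might look like the following. The image name, mount path, command flags, and port are assumptions to check against the TensorZero deployment documentation.

```yaml
# docker-compose.yml (sketch: mount path, flags, and port are assumptions)
services:
  gateway:
    image: tensorzero/gateway
    volumes:
      # Mount the local config directory into the container (read-only)
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    extra_hosts:
      # Lets the container reach Ollama running on the host
      - "host.docker.internal:host-gateway"
    ports:
      - "3000:3000"
```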
You can start the gateway with `docker compose up`.
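Once the gateway is running, a quick sanity check could look like the request below. This is a sketch: it assumes the gateway listens on port 3000 and reuses the placeholder function name from the configuration sketch above.

```bash
curl -X POST http://localhost:3000/inference \
  -H "Content-Type: application/json" \
  -d '{
    "function_name": "my_function_name",
    "input": {
      "messages": [
        {"role": "user", "content": "Write a haiku about artificial intelligence."}
      ]
    }
  }'
```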