
Configuration Reference

The configuration file is the backbone of TensorZero. It defines the behavior of the gateway, including the models and their providers, functions and their variants, tools, metrics, and more. Developers express the behavior of LLM calls by defining the relevant prompt templates, schemas, and other parameters in this configuration file.

You can see an example configuration file here.

The configuration file is a TOML file with a few major sections (TOML tables): gateway, clickhouse, models and their providers, functions and their variants, tools, and metrics.

[gateway]

The [gateway] section defines the behavior of the TensorZero Gateway.

bind_address

  • Type: string
  • Required: no (default: 0.0.0.0:3000)

Defines the socket address to bind the TensorZero Gateway to.

tensorzero.toml
[gateway]
# ...
bind_address = "0.0.0.0:3000"
# ...

disable_observability

  • Type: boolean
  • Required: no (default: false)

Disables the observability features of the TensorZero Gateway (not recommended).

tensorzero.toml
[gateway]
# ...
disable_observability = true # not recommended
# ...

[models.model_name]

The [models.model_name] section defines the behavior of a model. You can define multiple models by including multiple [models.model_name] sections.

A model is provider-agnostic: the relevant providers are defined in the providers sub-section (see below).

If your model_name is not a basic string, it can be escaped with quotation marks. For example, periods are not allowed in basic strings, so you can define llama-3.1-8b-instruct as [models."llama-3.1-8b-instruct"].

tensorzero.toml
[models.claude-3-haiku-20240307]
# fieldA = ...
# fieldB = ...
# ...
[models."llama-3.1-8b-instruct"]
# fieldA = ...
# fieldB = ...
# ...

routing

  • Type: array of strings
  • Required: yes

A list of provider names to route requests to. The providers must be defined in the providers sub-section (see below). The TensorZero Gateway will attempt to route a request to the first provider in the list, and fall back to subsequent providers in order if the request is not successful.

tensorzero.toml
[models.gpt-4o]
# ...
routing = ["openai", "azure"]
# ...
[models.gpt-4o.providers.openai]
# ...
[models.gpt-4o.providers.azure]
# ...

[models.model_name.providers.provider_name]

The providers sub-section defines the behavior of a specific provider for a model. You can define multiple providers by including multiple [models.model_name.providers.provider_name] sections.

If your provider_name is not a basic string, it can be escaped with quotation marks. For example, periods are not allowed in basic strings, so you can define vllm.internal as [models.model_name.providers."vllm.internal"].

tensorzero.toml
[models.gpt-4o]
# ...
routing = ["openai", "azure"]
# ...
[models.gpt-4o.providers.openai]
# ...
[models.gpt-4o.providers.azure]
# ...

type

  • Type: string
  • Required: yes

Defines the type of the provider. See Integrations » Model Providers for details.

The supported provider types are anthropic, aws_bedrock, azure, fireworks, gcp_vertex, mistral, openai, together, and vllm.

The other fields in the provider sub-section depend on the provider type.

tensorzero.toml
[models.gpt-4o.providers.azure]
# ...
type = "azure"
# ...
type: "anthropic"
model_name
  • Type: string
  • Required: yes

Defines the model name to use with the Anthropic API. See Anthropic’s documentation for the list of available model names.

tensorzero.toml
[models.claude-3-haiku.providers.anthropic]
# ...
type = "anthropic"
model_name = "claude-3-haiku-20240307"
# ...
type: "aws_bedrock"
model_id
  • Type: string
  • Required: yes

Defines the model ID to use with the AWS Bedrock API. See AWS Bedrock’s documentation for the list of available model IDs.

tensorzero.toml
[models.claude-3-haiku.providers.aws_bedrock]
# ...
type = "aws_bedrock"
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
# ...
region
  • Type: string
  • Required: no (default: based on credentials if set, otherwise us-east-1)

Defines the AWS region to use with the AWS Bedrock API.

tensorzero.toml
[models.claude-3-haiku.providers.aws_bedrock]
# ...
type = "aws_bedrock"
region = "us-east-2"
# ...
type: "azure"

The TensorZero Gateway handles the API version under the hood (currently 2024-06-01). You only need to set the deployment_id and endpoint fields.

deployment_id
  • Type: string
  • Required: yes

Defines the deployment ID of the Azure OpenAI deployment.

See Azure OpenAI’s documentation for the list of available models.

tensorzero.toml
[models.gpt-4o-mini.providers.azure]
# ...
type = "azure"
deployment_id = "gpt4o-mini-20240718"
# ...
endpoint
  • Type: string
  • Required: yes

Defines the endpoint of the Azure OpenAI deployment (protocol and hostname).

tensorzero.toml
[models.gpt-4o-mini.providers.azure]
# ...
type = "azure"
endpoint = "https://<your-endpoint>.openai.azure.com"
# ...
type: "fireworks"
model_name
  • Type: string
  • Required: yes

Defines the model name to use with the Fireworks API.

See Fireworks’ documentation for the list of available model names. You can also deploy your own models on Fireworks AI.

tensorzero.toml
[models."llama-3.1-8b-instruct".providers.fireworks]
# ...
type = "fireworks"
model_name = "accounts/fireworks/models/llama-v3p1-8b-instruct"
# ...
type: "gcp_vertex"
location
  • Type: string
  • Required: yes

Defines the location (region) of the GCP Vertex AI model.

tensorzero.toml
[models."gemini-1.5-flash".providers.gcp_vertex]
# ...
type = "gcp_vertex"
location = "us-central1"
# ...
model_id
  • Type: string
  • Required: yes

Defines the model ID of the GCP Vertex AI model.

See GCP Vertex AI’s documentation for the list of available model IDs.

tensorzero.toml
[models."gemini-1.5-flash".providers.gcp_vertex]
# ...
type = "gcp_vertex"
model_id = "gemini-1.5-flash-001"
# ...
project_id
  • Type: string
  • Required: yes

Defines the project ID of the GCP Vertex AI model.

tensorzero.toml
[models."gemini-1.5-flash".providers.gcp_vertex]
# ...
type = "gcp_vertex"
project_id = "your-project-id"
# ...
type: "mistral"
model_name
  • Type: string
  • Required: yes

Defines the model name to use with the Mistral API.

See Mistral’s documentation for the list of available model names.

tensorzero.toml
[models."open-mistral-nemo".providers.mistral]
# ...
type = "mistral"
model_name = "open-mistral-nemo-2407"
# ...
type: "openai"
api_base
  • Type: string
  • Required: no (default: https://api.openai.com/v1/)

Defines the base URL of the OpenAI API.

You can use the api_base field to use an API provider that is compatible with the OpenAI API. However, many providers are only “approximately compatible” with the OpenAI API, so you might need to use a specialized model provider in those cases.

tensorzero.toml
[models."gpt-4o".providers.openai]
# ...
type = "openai"
api_base = "https://api.openai.com/v1/"
# ...
model_name
  • Type: string
  • Required: yes

Defines the model name to use with the OpenAI API.

See OpenAI’s documentation for the list of available model names.

tensorzero.toml
[models.gpt-4o-mini.providers.openai]
# ...
type = "openai"
model_name = "gpt-4o-mini-2024-07-18"
# ...
type: "together"
model_name
  • Type: string
  • Required: yes

Defines the model name to use with the Together API.

See Together’s documentation for the list of available model names. You can also deploy your own models on Together AI.

tensorzero.toml
[models."mixtral-8x7b-instruct-v0.1".providers.together]
# ...
type = "together"
model_name = "mistralai/Mixtral-8x7B-Instruct-v0.1"
# ...
type: "vllm"
api_base
  • Type: string
  • Required: no (default: http://localhost:8000/v1/)

Defines the base URL of the vLLM API.

tensorzero.toml
[models."phi-3.5-mini-instruct".providers.vllm]
# ...
type = "vllm"
api_base = "http://localhost:8000/v1/"
# ...
model_name
  • Type: string
  • Required: yes

Defines the model name to use with the vLLM API.

tensorzero.toml
[models."phi-3.5-mini-instruct".providers.vllm]
# ...
type = "vllm"
model_name = "microsoft/Phi-3.5-mini-instruct"
# ...

[functions.function_name]

The [functions.function_name] section defines the behavior of a function. You can define multiple functions by including multiple [functions.function_name] sections.

A function can have multiple variants, and each variant is defined in the variants sub-section (see below). A function expresses the abstract behavior of an LLM call (e.g. the schemas for the messages), and its variants express concrete instantiations of that LLM call (e.g. specific templates and models).

If your function_name is not a basic string, it can be escaped with quotation marks. For example, periods are not allowed in basic strings, so you can define summarize-2.0 as [functions."summarize-2.0"].

tensorzero.toml
[functions.draft-email]
# fieldA = ...
# fieldB = ...
# ...
[functions.summarize-email]
# fieldA = ...
# fieldB = ...
# ...
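
A single function often has multiple variants that use different models or prompts. The sketch below is illustrative only (the variant names, weights, and model choices are hypothetical; the variant fields themselves are documented later in this section):

tensorzero.toml
[functions.draft-email]
type = "chat"

[functions.draft-email.variants.prompt-v1]
type = "chat_completion"
model = "gpt-4o-mini"
weight = 0.9

[functions.draft-email.variants.prompt-v2]
type = "chat_completion"
model = "claude-3-haiku"
weight = 0.1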

assistant_schema

  • Type: string (path)
  • Required: no

Defines the path to the assistant schema file. The path is relative to the configuration file.

If provided, the assistant schema file should contain a JSON Schema for the assistant messages. The variables in the schema are used for templating the assistant messages. If a schema is provided, all function variants must also provide an assistant template (see below).

tensorzero.toml
[functions.draft-email]
# ...
assistant_schema = "./functions/draft-email/assistant_schema.json"
# ...
[functions.draft-email.variants.prompt-v1]
# ...
assistant_template = "./functions/draft-email/prompt-v1/assistant_template.minijinja"
# ...
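
The schema file itself is a standard JSON Schema document. For illustration, an assistant schema might look like the sketch below (the email_draft variable is hypothetical, not something TensorZero requires); the system_schema and user_schema fields below follow the same pattern:

assistant_schema.json
{
  "type": "object",
  "properties": {
    "email_draft": { "type": "string" }
  },
  "required": ["email_draft"],
  "additionalProperties": false
}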

system_schema

  • Type: string (path)
  • Required: no

Defines the path to the system schema file. The path is relative to the configuration file.

If provided, the system schema file should contain a JSON Schema for the system message. The variables in the schema are used for templating the system message. If a schema is provided, all function variants must also provide a system template (see below).

tensorzero.toml
[functions.draft-email]
# ...
system_schema = "./functions/draft-email/system_schema.json"
# ...
[functions.draft-email.variants.prompt-v1]
# ...
system_template = "./functions/draft-email/prompt-v1/system_template.minijinja"
# ...

type

  • Type: string
  • Required: yes

Defines the type of the function.

The supported function types are chat and json.

Most other fields in the function section depend on the function type.

tensorzero.toml
[functions.draft-email]
# ...
type = "chat"
# ...
type: "chat"
parallel_tool_calls
  • Type: boolean
  • Required: no (default: false)

Determines whether the function should be allowed to call multiple tools in a single conversation turn.

Most model providers do not support this feature. In those cases, this field will be ignored.

tensorzero.toml
[functions.draft-email]
# ...
type = "chat"
parallel_tool_calls = true
# ...
tool_choice
  • Type: string
  • Required: no (default: auto)

Determines the tool choice strategy for the function.

The supported tool choice strategies are:

  • none: The function should not use any tools.
  • auto: The model decides whether or not to use a tool. If it decides to use a tool, it also decides which tools to use.
  • required: The model should use a tool. If multiple tools are available, the model decides which tool to use.
  • { specific = "tool_name" }: The model should use a specific tool. The tool must be defined in the tools field (see below).
tensorzero.toml
[functions.solve-math-problem]
# ...
type = "chat"
tool_choice = "auto"
tools = [
# ...
"run-python"
# ...
]
# ...
[tools.run-python]
# ...
tensorzero.toml
[functions.generate-query]
# ...
type = "chat"
tool_choice = { specific = "query-database" }
tools = [
# ...
"query-database"
# ...
]
# ...
[tools.query-database]
# ...
tools
  • Type: array of strings
  • Required: no (default: [])

Determines the tools that the function can use.

The supported tools are defined in [tools.tool_name] sections (see below).

tensorzero.toml
[functions.draft-email]
# ...
type = "chat"
tools = [
# ...
"query-database"
# ...
]
# ...
[tools.query-database]
# ...
type: "json"
output_schema
  • Type: string (path)
  • Required: no (default: {}, the empty JSON schema that accepts any valid JSON output)

Defines the path to the output schema file, which should contain a JSON Schema for the output of the function. The path is relative to the configuration file.

This schema is used for validating the output of the function.

tensorzero.toml
[functions.extract-customer-info]
# ...
type = "json"
output_schema = "./functions/extract-customer-info/output_schema.json"
# ...
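
For illustration, an output schema for the extract-customer-info function above might look like the following sketch (the name and email fields are hypothetical):

output_schema.json
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "email": { "type": "string" }
  },
  "required": ["name", "email"],
  "additionalProperties": false
}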

user_schema

  • Type: string (path)
  • Required: no

Defines the path to the user schema file. The path is relative to the configuration file.

If provided, the user schema file should contain a JSON Schema for the user messages. The variables in the schema are used for templating the user messages. If a schema is provided, all function variants must also provide a user template (see below).

tensorzero.toml
[functions.draft-email]
# ...
user_schema = "./functions/draft-email/user_schema.json"
# ...
[functions.draft-email.variants.prompt-v1]
# ...
user_template = "./functions/draft-email/prompt-v1/user_template.minijinja"
# ...

[functions.function_name.variants.variant_name]

The variants sub-section defines the behavior of a specific variant of a function. You can define multiple variants by including multiple [functions.function_name.variants.variant_name] sections.

If your variant_name is not a basic string, it can be escaped with quotation marks. For example, periods are not allowed in basic strings, so you can define llama-3.1-8b-instruct as [functions.function_name.variants."llama-3.1-8b-instruct"].

tensorzero.toml
[functions.draft-email]
# ...
[functions.draft-email.variants."llama-3.1-8b-instruct"]
# ...
[functions.draft-email.variants.claude-3-haiku]
# ...

type

  • Type: string
  • Required: yes

Defines the type of the variant.

At the moment, the only supported variant type is chat_completion.

tensorzero.toml
[functions.draft-email.variants.prompt-v1]
# ...
type = "chat_completion"
# ...
type: "chat_completion"
assistant_template
  • Type: string (path)
  • Required: no

Defines the path to the assistant template file. The path is relative to the configuration file.

This file should contain a MiniJinja template for the assistant messages. If the template uses any variables, the variables should be defined in the function’s assistant_schema field.

tensorzero.toml
[functions.draft-email]
# ...
assistant_schema = "./functions/draft-email/assistant_schema.json"
# ...
[functions.draft-email.variants.prompt-v1]
# ...
assistant_template = "./functions/draft-email/prompt-v1/assistant_template.minijinja"
# ...
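
For illustration, the assistant template is simply a MiniJinja file. A minimal sketch that interpolates the hypothetical email_draft variable from the schema example earlier:

assistant_template.minijinja
{{ email_draft }}
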
json_mode
  • Type: string
  • Required: no (default: on)

Defines the strategy for generating JSON outputs.

This parameter is only supported for variants of functions with type = "json".

The supported modes are:

  • off: Make a chat completion request without any special JSON handling (not recommended).
  • on: Make a chat completion request with JSON mode (if supported by the provider).
  • strict: Make a chat completion request with strict JSON mode (if supported by the provider). For example, the TensorZero Gateway uses Structured Outputs for OpenAI.
  • implicit_tool: Make a special-purpose tool use request under the hood, and convert the tool call into a JSON response.
tensorzero.toml
[functions.draft-email.variants.prompt-v1]
# ...
json_mode = "strict"
# ...
max_tokens
  • Type: integer
  • Required: no (default: null)

Defines the maximum number of tokens to generate.

tensorzero.toml
[functions.draft-email.variants.prompt-v1]
# ...
max_tokens = 100
# ...
model
  • Type: string
  • Required: yes

Defines the model to use for the variant. The model must be defined in the [models.model_name] section (see above).

tensorzero.toml
[models.gpt-4o-mini]
# ...
[functions.draft-email.variants.prompt-v1]
# ...
model = "gpt-4o-mini"
# ...
seed
  • Type: integer
  • Required: no (default: null)

Defines the seed to use for the variant.

tensorzero.toml
[functions.draft-email.variants.prompt-v1]
# ...
seed = 42
system_template
  • Type: string (path)
  • Required: no

Defines the path to the system template file. The path is relative to the configuration file.

This file should contain a MiniJinja template for the system messages. If the template uses any variables, the variables should be defined in the function’s system_schema field.

tensorzero.toml
[functions.draft-email]
# ...
system_schema = "./functions/draft-email/system_schema.json"
# ...
[functions.draft-email.variants.prompt-v1]
# ...
system_template = "./functions/draft-email/prompt-v1/system_template.minijinja"
# ...
temperature
  • Type: float
  • Required: no (default: null)

Defines the temperature to use for the variant.

tensorzero.toml
[functions.draft-email.variants.prompt-v1]
# ...
temperature = 0.5
# ...
user_template
  • Type: string (path)
  • Required: no

Defines the path to the user template file. The path is relative to the configuration file.

This file should contain a MiniJinja template for the user messages. If the template uses any variables, the variables should be defined in the function’s user_schema field.

tensorzero.toml
[functions.draft-email]
# ...
user_schema = "./functions/draft-email/user_schema.json"
# ...
[functions.draft-email.variants.prompt-v1]
# ...
user_template = "./functions/draft-email/prompt-v1/user_template.minijinja"
# ...
weight
  • Type: float
  • Required: yes

Defines the weight of the variant. When you call a function, the weight determines the relative importance of the variant when sampling.

Variants will be sampled with a probability proportional to their weight. For example, if variant A has a weight of 1.0 and variant B has a weight of 3.0, variant A will be sampled with probability 1.0 / (1.0 + 3.0) = 25% and variant B will be sampled with probability 3.0 / (1.0 + 3.0) = 75%.

You can disable a variant by setting its weight to 0. The variant will only be used if there are no other variants available for sampling or if the variant is requested explicitly in the request with variant_name. This is useful for defining fallback variants, which won’t be used unless no other variants are available.

tensorzero.toml
[functions.draft-email.variants.prompt-v1]
# ...
weight = 1.0
# ...
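
To make the sampling arithmetic above concrete, here is an illustrative sketch (the variant names and weights are hypothetical):

tensorzero.toml
[functions.draft-email.variants.prompt-v1]
# ...
weight = 1.0 # sampled with probability 1.0 / (1.0 + 3.0) = 25%

[functions.draft-email.variants.prompt-v2]
# ...
weight = 3.0 # sampled with probability 3.0 / (1.0 + 3.0) = 75%

[functions.draft-email.variants.fallback]
# ...
weight = 0.0 # disabled: only used when no other variant is available or when requested explicitly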

[metrics.metric_name]

The [metrics.metric_name] section defines the behavior of a metric. You can define multiple metrics by including multiple [metrics.metric_name] sections.

The metric name can’t be comment or demonstration, as those names are reserved for internal use.

If your metric_name is not a basic string, it can be escaped with quotation marks. For example, periods are not allowed in basic strings, so you can define beats-gpt-3.5 as [metrics."beats-gpt-3.5"].

tensorzero.toml
[metrics.task-completed]
# fieldA = ...
# fieldB = ...
# ...
[metrics.user-rating]
# fieldA = ...
# fieldB = ...
# ...

level

  • Type: string
  • Required: yes

Defines whether the metric applies to individual inferences or to entire episodes.

The supported levels are inference and episode.

tensorzero.toml
[metrics.valid-output]
# ...
level = "inference"
# ...
[metrics.task-completed]
# ...
level = "episode"
# ...

optimize

  • Type: string
  • Required: yes

Defines whether the metric should be maximized or minimized.

The supported values are max and min.

tensorzero.toml
[metrics.mistakes-made]
# ...
optimize = "min"
# ...
[metrics.user-rating]
# ...
optimize = "max"
# ...

type

  • Type: string
  • Required: yes

Defines the type of the metric.

The supported metric types are boolean and float.

tensorzero.toml
[metrics.user-rating]
# ...
type = "float"
# ...
[metrics.task-completed]
# ...
type = "boolean"
# ...

[tools.tool_name]

The [tools.tool_name] section defines the behavior of a tool. You can define multiple tools by including multiple [tools.tool_name] sections.

If your tool_name is not a basic string, it can be escaped with quotation marks. For example, periods are not allowed in basic strings, so you can define run-python-3.10 as [tools."run-python-3.10"].

You can enable a tool for a function by adding it to the function’s tools field.

tensorzero.toml
[functions.weather-chatbot]
# ...
type = "chat"
tools = [
# ...
"get-temperature"
# ...
]
# ...
[tools.get-temperature]
# ...

description

  • Type: string
  • Required: yes

Defines the description of the tool provided to the model.

You can typically improve the quality of responses materially by providing a detailed description of the tool.

tensorzero.toml
[tools.get-temperature]
# ...
description = "Get the current temperature in a given location (e.g. \"Tokyo\") using the specified unit (must be \"celsius\" or \"fahrenheit\")."
# ...

parameters

  • Type: string (path)
  • Required: yes

Defines the path to the parameters file. The path is relative to the configuration file.

This file should contain a JSON Schema for the parameters of the tool.

tensorzero.toml
[tools.get-temperature]
# ...
parameters = "./tools/get-temperature.json"
# ...
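
For illustration, the parameters file for the get-temperature tool above might contain a JSON Schema like the sketch below (based on the description shown earlier; the exact fields are hypothetical):

get-temperature.json
{
  "type": "object",
  "properties": {
    "location": {
      "type": "string",
      "description": "The location to get the temperature for (e.g. \"Tokyo\")"
    },
    "unit": {
      "type": "string",
      "enum": ["celsius", "fahrenheit"],
      "description": "The unit to use (\"celsius\" or \"fahrenheit\")"
    }
  },
  "required": ["location", "unit"],
  "additionalProperties": false
}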

strict

  • Type: boolean
  • Required: no (default: false)

If set to true, the TensorZero Gateway attempts to use strict JSON generation for the tool parameters. This typically improves the quality of responses.

Only a few providers support strict JSON generation. For example, the TensorZero Gateway uses Structured Outputs for OpenAI. If the provider does not support strict mode, the TensorZero Gateway ignores this field.

tensorzero.toml
[tools.get-temperature]
# ...
strict = true
# ...