Imagine we’re building an LLM application that writes haikus.
Today, our integration with OpenAI might look like this:
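Here is a minimal sketch of what before.py might contain, using the official openai Python package (the model and prompt are illustrative placeholders):

```python
# before.py: plain OpenAI integration (illustrative sketch)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "Write a haiku about artificial intelligence."}
    ],
)

print(response.choices[0].message.content)
```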
Sample Output
Migrating to TensorZero
TensorZero offers dozens of features covering inference, observability, optimization, and experimentation.
But the absolute minimum setup requires just a simple configuration file: tensorzero.toml.
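Here is a sketch of what that tensorzero.toml could look like; the generate_haiku function name and the gpt-4o-mini model are assumptions for this example:

```toml
# tensorzero.toml: minimal sketch (function and model names are illustrative)
[functions.generate_haiku]
type = "chat"

[functions.generate_haiku.variants.gpt_4o_mini]
type = "chat_completion"
model = "openai::gpt-4o-mini"
```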
This minimal configuration file tells the TensorZero Gateway everything it needs to replicate our original OpenAI call with added observability.
For now it’s not doing much else, but we could enable additional features with just a few extra lines of configuration. We’ll cover that later.
Deploying TensorZero
We’re almost ready to start making API calls.
Let’s launch the TensorZero Gateway.
You need to:
Set the environment variable OPENAI_API_KEY.
Download the following docker-compose.yml file.
Place your tensorzero.toml configuration file in ./config/tensorzero.toml.
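As a rough sketch, that docker-compose.yml might look something like the following; the image tags, ports, environment variable names, and gateway command here are assumptions, so prefer the file from the TensorZero docs:

```yaml
# docker-compose.yml: rough sketch; images, env vars, and the gateway command are assumptions
services:
  clickhouse:
    image: clickhouse/clickhouse-server
    ports:
      - "8123:8123"

  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - TENSORZERO_CLICKHOUSE_URL=http://clickhouse:8123/tensorzero
    ports:
      - "3000:3000"
    depends_on:
      - clickhouse
```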
This Docker Compose configuration sets up a development ClickHouse database and the TensorZero Gateway.
The gateway will store inference data in the database.
Your setup should look like:
config/
    tensorzero.toml
after.py (see below)
before.py
docker-compose.yml
Let’s launch everything!
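From the directory containing docker-compose.yml:

```bash
docker compose up
```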
Our First TensorZero API Call
The gateway will replicate our original OpenAI call and store the data in our database — with less than 1ms latency overhead thanks to Rust 🦀.
TensorZero can be used with its native Python client, with OpenAI’s client, or via standard HTTP requests in any programming language.
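For example, after.py could call the gateway over plain HTTP with the requests package. This is a sketch: the endpoint path, payload shape, and the generate_haiku function name follow the configuration assumed above.

```python
# after.py: calling the TensorZero Gateway over HTTP (illustrative sketch;
# endpoint path and payload shape reflect the config assumed above)
import requests

response = requests.post(
    "http://localhost:3000/inference",
    json={
        "function_name": "generate_haiku",
        "input": {
            "messages": [
                {"role": "user", "content": "Write a haiku about artificial intelligence."}
            ]
        },
    },
)

print(response.json())
```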
The gateway stored our inference data in ClickHouse.
Let’s query it.
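One way to inspect it is with the clickhouse-connect Python package. In this sketch, the host, port, database name, and the ChatInference table name are assumptions that depend on your Docker Compose setup and TensorZero's schema:

```python
# query_inferences.py: peek at stored inferences (sketch; connection details
# and table name are assumptions tied to the Docker Compose setup above)
import clickhouse_connect

client = clickhouse_connect.get_client(
    host="localhost",
    port=8123,
    database="tensorzero",
)

result = client.query("SELECT * FROM ChatInference LIMIT 1")
for row in result.result_rows:
    print(row)
```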
Sample Output
Conclusion & Next Steps
The Quick Start guide gives a tiny taste of what TensorZero is capable of.
We strongly encourage you to check out the section on prompt templates & schemas.
Though optional, they unlock many of the downstream features TensorZero offers in experimentation and optimization.
As we collect data with the gateway, we’ll start building a dataset we can use for optimization, especially if we incorporate metrics & feedback.
For example, we could use the haikus that received positive feedback to fine-tune a custom model with TensorZero Recipes.
What should we try next? We could dive deeper into the TensorZero Gateway, or skip ahead to optimizing our haiku generator.
Learn how to build better LLM applications with TensorZero. We’ll build
complete examples involving copilots, RAG, and data extraction. Along the
way, we’ll cover features like experimentation, routing & fallbacks, and
multi-step LLM workflows.
This complete runnable example fine-tunes GPT-4o Mini to generate haikus
tailored to a judge with hidden preferences. Continuous improvement over
successive fine-tuning runs demonstrates TensorZero’s data & learning
flywheel.