Quick Start — From 0 to Observability
This Quick Start guide shows how we’d upgrade an OpenAI wrapper to a minimal TensorZero deployment with built-in observability — in just 5 minutes.
From there, you can take advantage of dozens of features to build best-in-class LLM applications. Some of our favorites include built-in inference-time optimizations and experimentation (A/B testing).
Status Quo: OpenAI Wrapper
Imagine we’re building an LLM application that writes haikus.
Today, our integration with OpenAI might look like this:
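A minimal version of that script (run_with_openai.py) might look like the sketch below; the model and prompt are just examples.

```python
# run_with_openai.py: a minimal sketch of the status-quo integration.
# Assumes OPENAI_API_KEY is set in the environment; model and prompt are examples.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Write a haiku about artificial intelligence.",
        }
    ],
)

print(response)
```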
Sample Output
Migrating to TensorZero
TensorZero offers dozens of features covering inference, observability, optimization, and experimentation.
But the bare-minimum setup requires just a single configuration file: `tensorzero.toml`.
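A sketch of what that file contains: one chat function for our haiku generator, with a single variant pointing at an OpenAI model. The function and variant names are our own choices, and the `openai::gpt-4o-mini` model shorthand is an assumption to verify against the configuration reference.

```toml
# tensorzero.toml: a minimal sketch; names are ours, syntax may need adjusting.
[functions.generate_haiku]
type = "chat"

[functions.generate_haiku.variants.gpt_4o_mini]
type = "chat_completion"
model = "openai::gpt-4o-mini"
```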
This minimal configuration file tells the TensorZero Gateway everything it needs to replicate our original OpenAI call with added observability. For now it’s not doing much else, but we could enable additional features with just a few extra lines of configuration. We’ll cover that later.
Deploying TensorZero
We’re almost ready to start making API calls. Let’s launch the TensorZero Gateway.
You need to:

- Set the `OPENAI_API_KEY` environment variable.
- Download the `docker-compose.yml` file (sketched below).
- Place your `tensorzero.toml` configuration file at `./config/tensorzero.toml`.
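A sketch of the shape of that file; the image tags, ports, and environment variables here are assumptions, so prefer the official download.

```yaml
# docker-compose.yml: a rough sketch, not the official file.
# Image names, ports, and environment variables are assumptions.
services:
  clickhouse:
    image: clickhouse/clickhouse-server   # development database for observability data
    ports:
      - "8123:8123"

  gateway:
    image: tensorzero/gateway             # the TensorZero Gateway
    volumes:
      - ./config:/app/config:ro           # mounts ./config/tensorzero.toml
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}  # forwarded so the gateway can call OpenAI
    ports:
      - "3000:3000"
    depends_on:
      - clickhouse
```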
This Docker Compose configuration sets up a development ClickHouse database and the TensorZero Gateway. The gateway will store inference data in the database.
Your setup should look like:
- config/
  - tensorzero.toml
- docker-compose.yml
- run_with_openai.py
- run_with_tensorzero.py (see below)
Let’s launch everything!
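From the directory containing docker-compose.yml:

```bash
docker compose up
```

This starts ClickHouse and the gateway; keep it running in a separate terminal while you make API calls.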
Our First TensorZero API Call
The gateway will replicate our original OpenAI call and store the data in our database — with less than 1ms latency overhead thanks to Rust 🦀.
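One way to make the call is through the gateway's OpenAI-compatible endpoint, so the script barely changes from the original. The base URL path and the `tensorzero::` model string below are assumptions about our local setup; verify them against the gateway documentation.

```python
# run_with_tensorzero.py: a sketch using the gateway's OpenAI-compatible endpoint.
# The base URL path and the "tensorzero::function_name::..." model string are
# assumptions; verify them against the gateway documentation.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/openai/v1")

response = client.chat.completions.create(
    model="tensorzero::function_name::generate_haiku",
    messages=[
        {
            "role": "user",
            "content": "Write a haiku about artificial intelligence.",
        }
    ],
)

print(response)
```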
Sample Output
Querying Observability Data
The gateway stored our inference data in ClickHouse. Let’s query it.
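For example, a query along these lines retrieves the most recent inference. The database and table names assume the default TensorZero schema, so adjust them if yours differ.

```sql
-- A sketch: fetch the most recent chat inference stored by the gateway.
-- Database/table names assume the default TensorZero ClickHouse schema.
SELECT *
FROM tensorzero.ChatInference
ORDER BY id DESC  -- inference ids are UUIDv7s, so they sort roughly by time
LIMIT 1
FORMAT Vertical
```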
Sample Output
Conclusion & Next Steps
The Quick Start guide gives a tiny taste of what TensorZero is capable of.
We strongly encourage you to check out the section on prompt templates & schemas. Though optional, they unlock many of the downstream features TensorZero offers in experimentation and optimization.
From here, you can explore features like built-in support for experimentation (A/B testing) with prompts and models, inference-time optimizations, multi-step LLM workflows (episodes), routing & fallbacks, JSON generation, tool use, and a lot more.
As we collect data with the gateway, we’ll start building a dataset we can use for optimization, especially if we incorporate metrics & feedback. For example, we could use the haikus that received positive feedback to fine-tune a custom model with TensorZero Recipes.
What should we try next? We can dive deeper into the TensorZero Gateway, or skip ahead to optimizing our haiku generator.
Learn how to build better LLM applications with TensorZero. We’ll build complete examples involving copilots, RAG, and data extraction. Along the way, we’ll cover features like experimentation, routing & fallbacks, and multi-step LLM workflows.
This complete runnable example fine-tunes GPT-4o Mini to generate haikus tailored to a judge with hidden preferences. Continuous improvement over successive fine-tuning runs demonstrates TensorZero’s data & learning flywheel.