Vision & Roadmap
Near-term Roadmap
Our goal is to help engineers build, manage, and optimize the next generation of LLM applications: systems that learn from real-world experience. While in stealth, we achieved very positive technical results in commercial pilots and internal benchmarks. We’re now working towards open-sourcing our work over the coming months.
The planned Q4 & Q1 roadmap includes:
Optimization. The initial release includes simple offline recipes for optimization. We’re working towards open-sourcing more advanced recipes (e.g. automated prompt optimization, reinforcement learning) as well as inference-time optimizations (e.g. dynamic in-context learning, rejection sampling).
Evaluation & Experimentation. The TensorZero Gateway supports basic A/B testing. We’re working towards open-sourcing more advanced evaluation (e.g. reward modeling, importance sampling) and experimentation tools (e.g. asynchronous optimization for multi-armed bandits).
Observability. The TensorZero Gateway already stores comprehensive observability data in a ClickHouse database you control. We’re working towards open-sourcing a suite of user-friendly tools (e.g. dashboards) to support your LLM engineering workflows.
Examples. We’ve written a Quick Start guide and a comprehensive Tutorial about the TensorZero Gateway, as well as complete runnable examples for all recipes. We’re working towards adding more complete examples, especially for common workflows (e.g. RAG) and challenges (e.g. evaluations).
Integrations. We’ve integrated TensorZero with many popular LLM providers (see Integrations). We’re working towards integrating with more providers, as well as third-party LLMOps tools (e.g. for fine-tuning and observability).
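To make one of the inference-time optimizations named under Optimization more concrete, here is a minimal sketch of rejection sampling in its common best-of-N form: draw several candidate completions, score each one, and keep the highest-scoring candidate. This is an illustration only, not TensorZero's implementation or API. The names `best_of_n`, `toy_generate`, and `toy_score` are hypothetical, and the toy generator and scorer stand in for a real model call and a real reward signal.

```python
import itertools
from typing import Callable, Tuple


def best_of_n(
    generate: Callable[[str], str],
    score: Callable[[str, str], float],
    prompt: str,
    n: int = 4,
) -> Tuple[str, float]:
    """Draw n candidates for the prompt and return the best one with its score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sample n candidate completions for the same prompt.
    candidates = [generate(prompt) for _ in range(n)]
    # Score every candidate, then keep the one with the highest score.
    scored = [(candidate, score(prompt, candidate)) for candidate in candidates]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical stand-in for an LLM call: cycles through canned completions.
_canned = itertools.cycle(["ok", "a longer answer", "mid answer"])


def toy_generate(prompt: str) -> str:
    return next(_canned)


# Hypothetical stand-in for a reward model: longer completions score higher.
def toy_score(prompt: str, completion: str) -> float:
    return float(len(completion))


best, best_score = best_of_n(toy_generate, toy_score, "Summarize the ticket", n=3)
print(best, best_score)
```

In practice the scorer would be a reward model or a task-specific check, and the quality of the result depends on how well that score tracks the outcome you actually care about.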
Vision
As LLMs get smarter, one of the main engineering challenges will be to enable them to learn from real-world experience. The analogy we like here is, “If you take a smart person and throw them at a completely new job, they likely won’t be great at it at first, but will quickly learn the ropes through instruction or trial-and-error.” This same process is very challenging for LLMs today, especially as people try to tackle increasingly complex use cases (e.g. agents).
At some point, you won’t be able to judge business outcomes by evaluating individual models or staring at individual inferences, which is how most people approach LLM engineering today. You’ll have to reason about these end-to-end systems as a whole and iterate on them using the data they produce over time.
TensorZero is our answer to all this. We’re building a layer on top of models and other tools, and automating much of the low-level LLM engineering work. Grounded in real-world performance, TensorZero will ultimately enable AI systems to learn from experience. This is our long-term vision.