TensorZero is an open-source stack for industrial-grade LLM applications:
  • Gateway: access every LLM provider through a unified API, built for performance (<1ms p99 latency overhead)
  • Observability: store inferences and feedback in your database, available programmatically or in the UI
  • Optimization: collect metrics and human feedback to optimize prompts, models, and inference strategies
  • Evaluations: benchmark individual inferences or end-to-end workflows using heuristics, LLM judges, etc.
  • Experimentation: ship with confidence with built-in A/B testing, routing, fallbacks, retries, etc.
Take what you need, adopt incrementally, and complement with other tools.
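To make the routing ideas in the list above concrete, here is a minimal sketch of ordered fallbacks with retries and weighted A/B assignment. It is not TensorZero's API or implementation: `Provider`, `call_with_fallbacks`, and `pick_variant` are hypothetical names invented for this illustration, and the providers are assumed to be plain Python callables.

```python
from __future__ import annotations

import random
from typing import Callable, Sequence

# Hypothetical provider signature: takes a prompt, returns a completion string.
Provider = Callable[[str], str]


def call_with_fallbacks(prompt: str, providers: Sequence[Provider], retries: int = 2) -> str:
    """Try each provider in order, attempting each one up to `retries` times."""
    last_error: Exception | None = None
    for provider in providers:
        for _ in range(retries):
            try:
                return provider(prompt)
            except Exception as exc:  # a real system would catch narrower errors
                last_error = exc
    raise RuntimeError("all providers failed") from last_error


def pick_variant(variants: dict[str, float], rng: random.Random) -> str:
    """Weighted random choice among named variants (a simple A/B assignment)."""
    names = list(variants)
    weights = [variants[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

A caller would pass its preferred provider first and cheaper or more available ones after it; the variant weights play the role of traffic splits in an A/B test.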
Start building today. The Quickstart shows it’s easy to set up an LLM application with TensorZero. If you want to dive deeper, the Tutorial teaches how to build a simple chatbot, an email copilot, a weather RAG system, and a structured data extraction pipeline.
Questions? Ask us on Slack or Discord.
Using TensorZero at work? Email us at [email protected] to set up a Slack or Teams channel with your team (free).
Work with us. We’re hiring in NYC. We’d also welcome open-source contributions!