# Integrations
The TensorZero Gateway integrates with the major LLM providers.
## Model Providers
| Provider | Chat Functions | JSON Functions | Streaming | Tool Use | Embeddings | Batch |
| --- | --- | --- | --- | --- | --- | --- |
| Anthropic | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| AWS Bedrock | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Azure OpenAI Service | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| DeepSeek | ✅ | ✅ | ⚠️ | ❌ | ❌ | ❌ |
| Fireworks AI | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| GCP Vertex AI Anthropic | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| GCP Vertex AI Gemini | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Google AI Studio Gemini | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Hyperbolic | ✅ | ⚠️ | ✅ | ❌ | ❌ | ❌ |
| Mistral | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| OpenAI (& OpenAI-Compatible) | ✅ | ✅ | ✅ | ✅ | ⚠️ | ✅ |
| SGLang | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| TGI | ✅ | ✅ | ⚠️ | ❌ | ❌ | ❌ |
| Together AI | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| vLLM | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| xAI | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
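For reference, wiring one of these providers into the gateway takes only a few lines of configuration. The following is a minimal sketch in the TensorZero `tensorzero.toml` format; the model name is illustrative, and the provider guides below cover the exact fields and credentials each provider requires:

```toml
# tensorzero.toml — minimal sketch; verify exact fields in the provider guides
[models.gpt_4o_mini]
# Providers are tried in order; add more entries for automatic fallback
routing = ["openai"]

[models.gpt_4o_mini.providers.openai]
type = "openai"
model_name = "gpt-4o-mini"
```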
## Guides
See the following guides for more information on how to use the TensorZero Gateway with each provider:
- Getting Started with Anthropic
- Getting Started with AWS Bedrock
- Getting Started with Azure OpenAI Service
- Getting Started with Fireworks AI
- Getting Started with GCP Vertex AI Anthropic
- Getting Started with GCP Vertex AI Gemini
- Getting Started with Google AI Studio
- Getting Started with Hyperbolic
- Getting Started with Mistral
- Getting Started with OpenAI
- Getting Started with OpenAI-Compatible Endpoints (e.g. Ollama)
- Getting Started with SGLang
- Getting Started with TGI
- Getting Started with Together AI
- Getting Started with vLLM (+ Llama 3.1)
- Getting Started with xAI
## Limitations
The TensorZero Gateway makes a best effort to normalize configuration across providers.
For example, certain providers don’t support `tool_choice: required`; in these cases, the TensorZero Gateway will coerce the request to `tool_choice: auto` under the hood.
Currently, Fireworks AI and OpenAI are the only providers that support `parallel_tool_calls`.
Additionally, the TensorZero Gateway only supports `strict` JSON generation (commonly referred to as Structured Outputs, Guided Decoding, or similar names) for Azure, GCP Vertex AI Gemini, Google AI Studio, OpenAI, Together AI, vLLM, and xAI.
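As an illustrative sketch, enabling strict JSON generation for a JSON function variant might look like the following (assumes the TensorZero configuration format; the `extract_data` function, schema path, and model name are hypothetical):

```toml
# Hypothetical JSON function — the schema path is illustrative
[functions.extract_data]
type = "json"
output_schema = "functions/extract_data/output_schema.json"

[functions.extract_data.variants.gpt_4o_mini]
type = "chat_completion"
model = "gpt_4o_mini"
# "strict" is only honored by the providers listed above
json_mode = "strict"
```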
Below are the known limitations for each supported model provider.
- Anthropic
  - The Anthropic API doesn’t support consecutive messages from the same role.
  - The Anthropic API doesn’t support `tool_choice: none`.
  - The Anthropic API doesn’t support `seed`.
- AWS Bedrock
  - The TensorZero Gateway currently doesn’t support AWS Bedrock guardrails and traces.
  - The TensorZero Gateway uses a non-standard structure for storing `ModelInference.raw_response` for AWS Bedrock inference requests.
  - The AWS Bedrock API doesn’t support `tool_choice: none`.
  - The AWS Bedrock API doesn’t support `seed`.
- Azure OpenAI Service
  - The Azure OpenAI Service API doesn’t provide usage information when streaming.
  - The Azure OpenAI Service API doesn’t support `tool_choice: required`.
- DeepSeek
  - The `deepseek-chat` model doesn’t support tool use for production use cases.
  - The `deepseek-reasoner` model doesn’t support JSON mode or tool use.
  - The TensorZero Gateway doesn’t return `thought` blocks in the response (coming soon!).
- Fireworks AI
  - The Fireworks API doesn’t support `seed`.
- GCP Vertex AI
  - The TensorZero Gateway currently only supports the Gemini and Anthropic models.
  - The GCP Vertex AI API doesn’t support `tool_choice: required` for Gemini Flash models.
  - The Anthropic models have the same limitations as those listed under the Anthropic provider.
- Hyperbolic
  - The Hyperbolic provider doesn’t support JSON mode or tool use. JSON functions are supported with `json_mode = "off"` (not recommended).
- Mistral
  - The Mistral API doesn’t support `seed`.
- SGLang
  - The SGLang provider doesn’t support tool use.
- TGI
  - The TGI API doesn’t support streaming JSON mode.
  - Tool use support is very limited, so we don’t recommend using it.
- Together AI
  - The Together AI API doesn’t seem to respect `tool_choice` in many cases.
- xAI
  - The xAI provider doesn’t support JSON mode. JSON functions are supported with `json_mode = "implicit_tool"` (recommended) or `json_mode = "off"`.
  - The xAI API has issues with multi-turn tool use (bug report).
  - The xAI API has issues with `tool_choice: none` (bug report).
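Since xAI lacks native JSON mode, the recommended `implicit_tool` workaround can be sketched as a variant setting (assumes the TensorZero configuration format; the function and model names are hypothetical):

```toml
[functions.extract_data.variants.grok]
type = "chat_completion"
model = "grok_2"
# Emulate JSON mode via a hidden tool call, since xAI has no native JSON mode
json_mode = "implicit_tool"
```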