# Integrations
The TensorZero Gateway integrates with the major LLM providers.
## Model Providers
| Provider | Chat Functions | JSON Functions | Streaming | Tool Use | Embeddings | Batch |
| --- | --- | --- | --- | --- | --- | --- |
| Anthropic | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| AWS Bedrock | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Azure OpenAI Service | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Fireworks AI | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| GCP Vertex AI Anthropic | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| GCP Vertex AI Gemini | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Google AI Studio Gemini | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Hyperbolic | ✅ | ⚠️ | ✅ | ❌ | ❌ | ❌ |
| Mistral | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| OpenAI (& OpenAI-Compatible) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| TGI | ✅ | ✅ | ⚠️ | ❌ | ❌ | ❌ |
| Together AI | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| vLLM | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| xAI | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
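As an illustration, a model backed by one of these providers is declared in the gateway's TOML configuration. The following is a minimal sketch, assuming an Anthropic provider; the model and provider names are illustrative, so check the configuration reference for the exact fields your deployment needs:

```toml
# Hypothetical tensorzero.toml fragment (names are illustrative)
[models.my_claude_model]
routing = ["anthropic"]  # ordered list of providers to try

[models.my_claude_model.providers.anthropic]
type = "anthropic"
model_name = "claude-3-5-haiku-20241022"
```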
## Guides
See the following guides for more information on how to use the TensorZero Gateway with each provider:
- Getting Started with Anthropic
- Getting Started with AWS Bedrock
- Getting Started with Azure OpenAI Service
- Getting Started with Fireworks AI
- Getting Started with GCP Vertex AI Anthropic
- Getting Started with GCP Vertex AI Gemini
- Getting Started with Google AI Studio
- Getting Started with Hyperbolic
- Getting Started with Mistral
- Getting Started with OpenAI
- Getting Started with OpenAI-Compatible Endpoints (e.g. Ollama)
- Getting Started with TGI
- Getting Started with Together AI
- Getting Started with vLLM (+ Llama 3.1)
- Getting Started with xAI
## Limitations
The TensorZero Gateway makes a best effort to normalize configuration across providers.
For example, certain providers don’t support `tool_choice: required`; in these cases, the TensorZero Gateway will coerce the request to `tool_choice: auto` under the hood.
Currently, Fireworks AI and OpenAI are the only providers that support `parallel_tool_calls`.
Additionally, the TensorZero Gateway only supports `strict` for OpenAI (Structured Outputs) and vLLM (Guided Decoding).
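To make the coercion concrete, here is a hedged sketch of a chat function that requests `tool_choice = "required"`; the function name, tool name, and schema path are hypothetical. On a provider without native support (e.g. Azure OpenAI Service), the gateway would send the equivalent of `tool_choice: auto` instead:

```toml
# Hypothetical tensorzero.toml fragment (names and paths are illustrative)
[functions.solve_math_problem]
type = "chat"
tools = ["calculator"]
tool_choice = "required"  # coerced to "auto" on providers lacking native support

[tools.calculator]
description = "Evaluate an arithmetic expression"
parameters = "tools/calculator.json"  # JSON Schema for the tool's arguments
```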
Below are the known limitations for each supported model provider.
- Anthropic
  - The Anthropic API doesn’t support consecutive messages from the same role.
  - The Anthropic API doesn’t support `tool_choice: none`.
  - The Anthropic API doesn’t support `seed`.
- AWS Bedrock
  - The TensorZero Gateway currently doesn’t support AWS Bedrock guardrails and traces.
  - The TensorZero Gateway uses a non-standard structure for storing `ModelInference.raw_response` for AWS Bedrock inference requests.
  - The AWS Bedrock API doesn’t support `tool_choice: none`.
  - The AWS Bedrock API doesn’t support `seed`.
- Azure OpenAI Service
  - The Azure OpenAI Service API doesn’t provide usage information when streaming.
  - The Azure OpenAI Service API doesn’t support `tool_choice: required`.
- Fireworks AI
  - The Fireworks API doesn’t support `seed`.
- GCP Vertex AI
  - The TensorZero Gateway currently only supports the Gemini and Anthropic models.
  - The GCP Vertex AI API doesn’t support `tool_choice: required` for Gemini Flash models.
  - The Anthropic models have the same limitations as those listed under the Anthropic provider.
- Hyperbolic
  - The Hyperbolic provider doesn’t support JSON mode or tool use. JSON functions are supported with `json_mode = "off"` (not recommended).
- Mistral
  - The Mistral API doesn’t support `seed`.
- TGI
  - The TGI API doesn’t support streaming JSON mode.
  - Tool use support is very limited, so we don’t recommend using it.
- xAI
  - The xAI provider doesn’t support JSON mode. JSON functions are supported with `json_mode = "implicit_tool"` (recommended) or `json_mode = "off"`.
  - The xAI API has issues with multi-turn tool use (bug report).
  - The xAI API has issues with `tool_choice: none` (bug report).
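Since the `json_mode` setting comes up for several providers above, here is a hedged sketch of a JSON function variant that works around a provider without native JSON mode by using `implicit_tool`; the function name, variant name, model name, and schema path are all hypothetical:

```toml
# Hypothetical tensorzero.toml fragment (names and paths are illustrative)
[functions.extract_entities]
type = "json"
output_schema = "functions/extract_entities/output_schema.json"

[functions.extract_entities.variants.xai_variant]
type = "chat_completion"
model = "my_xai_model"
json_mode = "implicit_tool"  # produce JSON via a hidden tool call instead of native JSON mode
```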