# Integrations
The TensorZero Gateway integrates with the major LLM providers.
## Model Providers
| Provider | Chat Functions | JSON Functions | Streaming | Tool Use | Embeddings | Batch |
| --- | --- | --- | --- | --- | --- | --- |
| Anthropic | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| AWS Bedrock | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Azure OpenAI Service | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| DeepSeek | ✅ | ✅ | ⚠️ | ❌ | ❌ | ❌ |
| Fireworks AI | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| GCP Vertex AI Anthropic | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| GCP Vertex AI Gemini | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Google AI Studio Gemini | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| Hyperbolic | ✅ | ⚠️ | ✅ | ❌ | ❌ | ❌ |
| Mistral | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| OpenAI (& OpenAI-Compatible) | ✅ | ✅ | ✅ | ✅ | ⚠️ | ✅ |
| SGLang | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| TGI | ✅ | ✅ | ⚠️ | ❌ | ❌ | ❌ |
| Together AI | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| vLLM | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| xAI | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
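For reference, wiring one of these providers into the gateway takes only a few lines of configuration. The following is a minimal sketch in the TensorZero `tensorzero.toml` format; the model name is illustrative, and the provider guides below cover the exact fields and credentials each provider requires:

```toml
# tensorzero.toml — minimal sketch; verify exact fields in the provider guides
[models.gpt_4o_mini]
# Providers are tried in order; add more entries for automatic fallback
routing = ["openai"]

[models.gpt_4o_mini.providers.openai]
type = "openai"
model_name = "gpt-4o-mini"
```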
## Guides
See the following guides for more information on how to use the TensorZero Gateway with each provider:
- Getting Started with Anthropic
- Getting Started with AWS Bedrock
- Getting Started with Azure OpenAI Service
- Getting Started with Fireworks AI
- Getting Started with GCP Vertex AI Anthropic
- Getting Started with GCP Vertex AI Gemini
- Getting Started with Google AI Studio
- Getting Started with Hyperbolic
- Getting Started with Mistral
- Getting Started with OpenAI
- Getting Started with OpenAI-Compatible Endpoints (e.g. Ollama)
- Getting Started with SGLang
- Getting Started with TGI
- Getting Started with Together AI
- Getting Started with vLLM (+ Llama 3.1)
- Getting Started with xAI
## Limitations
The TensorZero Gateway makes a best effort to normalize configuration across providers.
For example, certain providers don’t support `tool_choice: required`; in these cases, the TensorZero Gateway will coerce the request to `tool_choice: auto` under the hood.
Currently, Fireworks AI and OpenAI are the only providers that support `parallel_tool_calls`.
Additionally, the TensorZero Gateway only supports `strict` JSON generation (commonly referred to as Structured Outputs, Guided Decoding, or similar names) for Azure, GCP Vertex AI Gemini, Google AI Studio, OpenAI, Together AI, vLLM, and xAI.
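As an illustrative sketch, enabling strict JSON generation for a JSON function variant might look like the following (assumes the TensorZero configuration format; the `extract_data` function, schema path, and model name are hypothetical):

```toml
# Hypothetical JSON function — the schema path is illustrative
[functions.extract_data]
type = "json"
output_schema = "functions/extract_data/output_schema.json"

[functions.extract_data.variants.gpt_4o_mini]
type = "chat_completion"
model = "gpt_4o_mini"
# "strict" is only honored by the providers listed above
json_mode = "strict"
```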
Below are the known limitations for each supported model provider.
- Anthropic
  - The Anthropic API doesn’t support consecutive messages from the same role.
  - The Anthropic API doesn’t support `tool_choice: none`.
  - The Anthropic API doesn’t support `seed`.
- AWS Bedrock
  - The TensorZero Gateway currently doesn’t support AWS Bedrock guardrails and traces.
  - The TensorZero Gateway uses a non-standard structure for storing `ModelInference.raw_response` for AWS Bedrock inference requests.
  - The AWS Bedrock API doesn’t support `tool_choice: none`.
  - The AWS Bedrock API doesn’t support `seed`.
- Azure OpenAI Service
  - The Azure OpenAI Service API doesn’t provide usage information when streaming.
  - The Azure OpenAI Service API doesn’t support `tool_choice: required`.
- DeepSeek
  - The `deepseek-chat` model doesn’t support tool use for production use cases.
  - The `deepseek-reasoner` model doesn’t support JSON mode or tool use.
  - The TensorZero Gateway doesn’t return `thought` blocks in the response (coming soon!).
- Fireworks AI
  - The Fireworks API doesn’t support `seed`.
- GCP Vertex AI
  - The TensorZero Gateway currently only supports the Gemini and Anthropic models.
  - The GCP Vertex AI API doesn’t support `tool_choice: required` for Gemini Flash models.
  - The Anthropic models have the same limitations as those listed under the Anthropic provider.
- Hyperbolic
  - The Hyperbolic provider doesn’t support JSON mode or tool use. JSON functions are supported with `json_mode = "off"` (not recommended).
- Mistral
  - The Mistral API doesn’t support `seed`.
- SGLang
  - The SGLang provider doesn’t support tool use.
- TGI
  - The TGI API doesn’t support streaming JSON mode.
  - Tool use support is very limited, so we don’t recommend using it.
- Together AI
  - The Together AI API doesn’t seem to respect `tool_choice` in many cases.
- xAI
  - The xAI provider doesn’t support JSON mode. JSON functions are supported with `json_mode = "implicit_tool"` (recommended) or `json_mode = "off"`.
  - The xAI API has issues with multi-turn tool use (bug report).
  - The xAI API has issues with `tool_choice: none` (bug report).
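Since xAI lacks native JSON mode, the recommended `implicit_tool` workaround can be sketched as a variant setting (assumes the TensorZero configuration format; the function and model names are hypothetical):

```toml
[functions.extract_data.variants.grok]
type = "chat_completion"
model = "grok_2"
# Emulate JSON mode via a hidden tool call, since xAI has no native JSON mode
json_mode = "implicit_tool"
```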