Additional Providers
These providers are useful when you want a specific hosted model, a router, or an OpenAI-compatible endpoint without needing a dedicated first-class provider guide. For Grok models, see the dedicated xAI / Grok guide. For Hugging Face Inference Providers, see the dedicated Hugging Face guide.
Most entries use the same small configuration shape:
<provider>:
  api_key: "${PROVIDER_API_KEY}"
  # base_url: "https://api.example.com/v1"  # optional override
  # default_model: "model-name"             # optional
  # default_headers:                        # optional
  #   X-Custom-Header: "value"
Run fast-agent check after adding credentials to confirm they are visible to fast-agent.
Quick reference
| Provider | Config key | API key environment variable | Default endpoint | Model string examples |
|---|---|---|---|---|
| Groq | groq | GROQ_API_KEY | https://api.groq.com/openai/v1 | groq.openai/gpt-oss-120b |
| DeepSeek | deepseek | DEEPSEEK_API_KEY | Provider default | deepseek, deepseek.deepseek-chat |
| Aliyun | aliyun | ALIYUN_API_KEY | https://dashscope-intl.aliyuncs.com/compatible-mode/v1 | qwen-turbo, aliyun.qwen3-max |
| OpenRouter | openrouter | OPENROUTER_API_KEY | https://openrouter.ai/api/v1 | openrouter.google/gemini-2.5-pro-exp-03-25:free |
| Open Responses | openresponses | OPENRESPONSES_API_KEY | Your Open Responses endpoint | openresponses.openai/gpt-oss-120b:groq |
| Generic OpenAI-compatible | generic | GENERIC_API_KEY | http://localhost:11434/v1 for Ollama-style local use | generic.llama3.2:latest |
| TensorZero | tensorzero | None; configure provider credentials in the TensorZero Gateway | http://localhost:3000 | tensorzero.test_chat |
Capabilities vary by provider and model
Structured outputs, tool calling, reasoning controls, multimodal input, and provider-managed web tools are all model-dependent. Use the Models Reference for fast-agent's known capability metadata.
OpenAI-compatible hosted providers
Use these when the provider exposes an OpenAI-compatible API but has its own credentials, model catalog, or small behavior differences.
Groq
Groq is optimized for fast hosted inference. It uses OpenAI-compatible request handling in fast-agent.
The shortcut gpt-oss currently resolves through the Hugging Face provider; use the explicit groq. prefix when you want Groq.
No aliases defined.
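As a sketch of selecting Groq explicitly, the fragment below supplies credentials using the shared configuration shape and sets a Groq-prefixed model string taken from the quick reference table. It assumes a top-level default_model setting in fastagent.config.yaml that accepts a full provider-prefixed model string.

```yaml
groq:
  api_key: "${GROQ_API_KEY}"

# Assumption: the top-level default_model accepts a provider-prefixed model string.
default_model: "groq.openai/gpt-oss-120b"
```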
DeepSeek
DeepSeek uses the official OpenAI-format API. fast-agent handles provider-specific reasoning streams where supported.
| Model Alias | Maps to |
|---|---|
| deepseek | deepseek.deepseek-v4-pro |
| deepseek-chat | deepseek-chat |
| deepseek-direct | deepseek.deepseek-v4-pro |
| deepseek-reasoner | deepseek.deepseek-reasoner |
| deepseek-v4-flash | deepseek-v4-flash |
| deepseek-v4-pro | deepseek-v4-pro |
| deepseek4 | deepseek.deepseek-v4-pro |
| deepseek4flash | deepseek.deepseek-v4-flash |
| deepseek4pro | deepseek.deepseek-v4-pro |
| deepseek4pro-direct | deepseek.deepseek-v4-pro |
| deepseekv4pro | deepseek.deepseek-v4-pro |
Aliyun
Aliyun uses the DashScope compatible-mode endpoint by default. Override base_url only when you need a different Aliyun region, gateway, or compatible endpoint.
| Model Alias | Maps to |
|---|---|
| qwen-long | qwen-long |
| qwen-max | qwen-max |
| qwen-plus | qwen-plus |
| qwen-turbo | qwen-turbo |
| qwen3-max | qwen3-max |
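A sketch of the base_url override for Aliyun, assuming the shared configuration shape applies to the aliyun key. The URL shown is the mainland China DashScope compatible-mode endpoint, used here only as an illustration; confirm the correct endpoint for your account and region before relying on it.

```yaml
aliyun:
  api_key: "${ALIYUN_API_KEY}"
  # Assumption: replace with the compatible-mode endpoint for your region or gateway.
  base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
```

Leave base_url unset to keep the international default listed in the quick reference table.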
OpenRouter
OpenRouter routes requests to many upstream providers. Model names and capabilities are controlled by OpenRouter and the selected upstream model.
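An OpenRouter entry would likely follow the shared configuration shape. The sketch below assumes default_headers is forwarded unchanged to OpenRouter; HTTP-Referer and X-Title are OpenRouter's optional app-attribution headers, and the values shown are placeholders.

```yaml
openrouter:
  api_key: "${OPENROUTER_API_KEY}"
  # Optional attribution headers; both values are hypothetical placeholders.
  default_headers:
    HTTP-Referer: "https://example.com"
    X-Title: "my-fast-agent-app"
```

Model strings then carry the openrouter. prefix followed by the OpenRouter model name, as in the quick reference table.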
Open Responses endpoints
Open Responses is an open standard for interoperable LLM interfaces. Use the openresponses provider for compatible endpoints:
openresponses:
  api_key: "${OPENRESPONSES_API_KEY}"
  base_url: "https://api.example.com"
  reasoning: "medium"  # minimal, low, medium, high
Provider-managed MCP is not supported by openresponses. Use the OpenAI responses provider when you need management: provider.
TensorZero
TensorZero is an open-source framework for production LLM applications. It combines an LLM gateway, observability, optimization, evaluations, and experimentation.
Use TensorZero when you want fast-agent to call task-specific TensorZero functions while the gateway owns model selection, fallbacks, retries, prompt templates, observability, and provider credentials.
The fastest way to start is the bundled quickstart. It creates a dockerized example with a TensorZero Gateway, a custom MCP server, MinIO-backed multimodal support, and a ready-to-run fast-agent example.
Configure the gateway endpoint if you are not using the default http://localhost:3000:
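A minimal sketch, assuming the tensorzero entry accepts the same optional base_url override as the shared configuration shape. The host and port are hypothetical and stand in for wherever your gateway is listening.

```yaml
tensorzero:
  # Assumption: hypothetical non-default gateway address.
  base_url: "http://localhost:3030"
```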
Call a TensorZero function with the tensorzero. model prefix:
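One way to do this from configuration, assuming a top-level default_model setting in fastagent.config.yaml, is sketched below. The function name test_chat comes from the quick reference table and must match a function defined in your gateway.

```yaml
# Assumption: test_chat is a function defined in your TensorZero Gateway configuration.
default_model: "tensorzero.test_chat"
```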
Provider credentials should normally be configured in the TensorZero Gateway, not in fast-agent.
Generic OpenAI-compatible endpoints
Use generic for local or self-hosted OpenAI-compatible APIs, including Ollama-style endpoints.
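For the Ollama-style case, a configuration sketch might look like the following. It assumes the generic key follows the shared configuration shape and that the local server does not validate the API key, so the value shown is only a placeholder.

```yaml
generic:
  # Assumption: the local server ignores the key, so any placeholder value works.
  api_key: "ollama"
  base_url: "http://localhost:11434/v1"
```

With this in place, a model string such as generic.llama3.2:latest selects a locally served model.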
For reusable local names, defaults, metadata, and authentication behavior, prefer Model Overlays.