Additional Providers

These providers are useful when you want a specific hosted model, a router, or an OpenAI-compatible endpoint without needing a dedicated first-class provider guide. For Grok models, see the dedicated xAI / Grok guide. For Hugging Face Inference Providers, see the dedicated Hugging Face guide.

Most entries use the same small configuration shape:

<provider>:
  api_key: "${PROVIDER_API_KEY}"
  # base_url: "https://api.example.com/v1" # optional override
  # default_model: "model-name"             # optional
  # default_headers:                        # optional
  #   X-Custom-Header: "value"

Run fast-agent check after adding credentials to confirm they are visible to fast-agent.

Quick reference

Provider	Config key	API key environment variable	Default endpoint	Model string examples
Groq	`groq`	`GROQ_API_KEY`	`https://api.groq.com/openai/v1`	`groq.openai/gpt-oss-120b`
DeepSeek	`deepseek`	`DEEPSEEK_API_KEY`	Provider default	`deepseek`, `deepseek.deepseek-chat`
Aliyun	`aliyun`	`ALIYUN_API_KEY`	`https://dashscope-intl.aliyuncs.com/compatible-mode/v1`	`qwen-turbo`, `aliyun.qwen3-max`
OpenRouter	`openrouter`	`OPENROUTER_API_KEY`	`https://openrouter.ai/api/v1`	`openrouter.google/gemini-2.5-pro-exp-03-25:free`
Open Responses	`openresponses`	`OPENRESPONSES_API_KEY`	Your Open Responses endpoint	`openresponses.openai/gpt-oss-120b:groq`
Generic OpenAI-compatible	`generic`	`GENERIC_API_KEY`	`http://localhost:11434/v1` for Ollama-style local use	`generic.llama3.2:latest`
TensorZero	`tensorzero`	None; configure provider credentials in the TensorZero Gateway	`http://localhost:3000`	`tensorzero.test_chat`

Capabilities vary by provider and model

Structured outputs, tool calling, reasoning controls, multimodal input, and provider-managed web tools are all model-dependent. Use the Models Reference for fast-agent's known capability metadata.

OpenAI-compatible hosted providers

Use these when the provider exposes an OpenAI-compatible API but has its own credentials, model catalog, or small behavior differences.

Groq

groq:
  api_key: "${GROQ_API_KEY}"

fast-agent --model groq.openai/gpt-oss-120b

Groq is optimized for fast hosted inference. It uses OpenAI-compatible request handling in fast-agent. The shortcut gpt-oss currently resolves through the Hugging Face provider; use the explicit groq. prefix when you want Groq.

Model Alias	Maps to
`qwen/qwen3.6-27b`	`qwen/qwen3.6-27b`
`qwen3-32b`	`groq.qwen/qwen3-32b`
`qwen3.6-27b`	`groq.qwen/qwen3.6-27b`

DeepSeek

deepseek:
  api_key: "${DEEPSEEK_API_KEY}"

fast-agent --model deepseek
fast-agent --model deepseek.deepseek-chat

DeepSeek uses the official OpenAI-format API. fast-agent handles provider-specific reasoning streams where supported.

Model Alias	Maps to
`deepseek`	`deepseek.deepseek-v4-pro`
`deepseek-chat`	`deepseek-chat`
`deepseek-direct`	`deepseek.deepseek-v4-pro`
`deepseek-reasoner`	`deepseek.deepseek-reasoner`
`deepseek-v4-flash`	`deepseek-v4-flash`
`deepseek-v4-pro`	`deepseek-v4-pro`
`deepseek4`	`deepseek.deepseek-v4-pro`
`deepseek4flash`	`deepseek.deepseek-v4-flash`
`deepseek4pro`	`deepseek.deepseek-v4-pro`
`deepseek4pro-direct`	`deepseek.deepseek-v4-pro`
`deepseekv4pro`	`deepseek.deepseek-v4-pro`

Aliyun

aliyun:
  api_key: "${ALIYUN_API_KEY}"

fast-agent --model qwen-turbo
fast-agent --model aliyun.qwen3-max

Aliyun uses the DashScope compatible-mode endpoint by default. Override base_url only when you need a different Aliyun region, gateway, or compatible endpoint.

Model Alias	Maps to
`qwen-long`	`qwen-long`
`qwen-max`	`qwen-max`
`qwen-plus`	`qwen-plus`
`qwen-turbo`	`qwen-turbo`
`qwen3-max`	`qwen3-max`

OpenRouter

openrouter:
  api_key: "${OPENROUTER_API_KEY}"

fast-agent --model openrouter.google/gemini-2.5-pro-exp-03-25:free

OpenRouter routes requests to many upstream providers. Model names and capabilities are controlled by OpenRouter and the selected upstream model.

Open Responses endpoints

Open Responses is an open standard for interoperable LLM interfaces. Use the openresponses provider for compatible endpoints:

openresponses:
  api_key: "${OPENRESPONSES_API_KEY}"
  base_url: "https://api.example.com"
  reasoning: "medium" # minimal, low, medium, high

fast-agent --model openresponses.openai/gpt-oss-120b:groq

Provider-managed MCP is not supported by openresponses. Use the OpenAI responses provider when you need management: provider.

TensorZero

TensorZero is an open-source framework for production LLM applications. It combines an LLM gateway, observability, optimization, evaluations, and experimentation.

Use TensorZero when you want fast-agent to call task-specific TensorZero functions while the gateway owns model selection, fallbacks, retries, prompt templates, observability, and provider credentials.

The fastest way to start is the bundled quickstart:

fast-agent quickstart tensorzero

That creates a dockerized example with a TensorZero Gateway, a custom MCP server, MiniIO-backed multimodal support, and a ready-to-run fast-agent example.

Configure the gateway endpoint if you are not using the default http://localhost:3000:

tensorzero:
  base_url: "http://localhost:3000"

Call a TensorZero function with the tensorzero. model prefix:

uv run agent.py --model=tensorzero.test_chat

Provider credentials should normally be configured in the TensorZero Gateway, not in fast-agent.

Generic OpenAI-compatible endpoints

Use generic for local or self-hosted OpenAI-compatible APIs, including Ollama-style endpoints.

generic:
  base_url: "http://localhost:11434/v1"
  api_key: "ollama"

fast-agent --model generic.llama3.2:latest

For reusable local names, defaults, metadata, and authentication behavior, prefer Model Overlays.