Configuring LLM Providers
The BYOLLM model alias system — configuring OpenAI, Anthropic, and Ollama providers and mapping them to agent-facing aliases.
AEGIS uses a BYOLLM (Bring Your Own LLM) system. Agent manifests reference model aliases (e.g., default, fast, smart), not provider names. Node configuration maps these aliases to concrete providers and models. This decouples agent code from infrastructure choices — you can swap providers or models without changing any agent manifest.
How the Alias System Works
```
Agent manifest: spec.runtime.model = "fast"
        ↓
bootstrap.py: POST /v1/dispatch-gateway { "model": "fast" }
        ↓
Node config: spec.llm_providers[].models[].alias = "fast"
        → provider: ollama, model: qwen2.5-coder:7b
        ↓
LLM API call to http://ollama:11434
```

Credential resolution happens in the orchestrator. The agent process never receives API keys.
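The resolution step in the flow above amounts to a lookup over the node config. The sketch below illustrates this (the `resolve_alias` function and dict shape mirror the YAML schema documented on this page, but are illustrative, not part of the AEGIS API):

```python
def resolve_alias(config: dict, alias: str) -> tuple[str, str]:
    """Map an agent-facing alias to a (provider, model) pair.

    Mirrors spec.llm_providers[].models[].alias in aegis-config.yaml.
    """
    for provider in config["spec"]["llm_providers"]:
        for entry in provider["models"]:
            if entry["alias"] == alias:
                return provider["name"], entry["model"]
    raise KeyError(f"no provider defines alias {alias!r}")

config = {
    "spec": {
        "llm_providers": [
            {"name": "local", "type": "ollama",
             "models": [{"alias": "fast", "model": "qwen2.5-coder:7b"}]},
        ]
    }
}

print(resolve_alias(config, "fast"))  # ('local', 'qwen2.5-coder:7b')
```

Because the mapping is resolved at dispatch time, an agent asking for "fast" never needs to know which provider ultimately serves the call.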
Standard Aliases
AEGIS defines five conventional aliases. You can map any of these (or create your own) in your node config:
| Alias | Intent | Typical Mapping |
|---|---|---|
default | General-purpose, used when no model is specified | GPT-4o, Claude Sonnet, phi3 |
fast | Low latency, high throughput | GPT-4o-mini, Claude Haiku, qwen2.5-coder:7b |
smart | Maximum capability for complex tasks | GPT-4o, Claude Sonnet, deepseek-coder-v2 |
cheap | Minimize API cost | GPT-4o-mini, Ollama local models |
local | Air-gapped / zero API cost | Any Ollama model |
Configuring Providers
All provider configuration lives under spec.llm_providers in aegis-config.yaml.
Ollama (Local / Air-Gapped)
Ollama runs a local inference server. No API key is required. This is ideal for air-gapped environments or cost-sensitive workloads.
```yaml
spec:
  llm_providers:
    - name: "local"
      type: "ollama"
      endpoint: "http://ollama:11434"
      models:
        - alias: "default"
          model: "phi3:mini"
          capabilities: ["code", "reasoning"]
          context_window: 4096
        - alias: "fast"
          model: "qwen2.5-coder:7b"
          capabilities: ["code"]
          context_window: 4096
        - alias: "local"
          model: "phi3:mini"
          capabilities: ["code", "reasoning"]
          context_window: 4096
```

Ensure the Ollama server is running and the model is pulled before starting the AEGIS daemon:
```shell
ollama pull phi3:mini
ollama pull qwen2.5-coder:7b
ollama serve
```

OpenAI
```yaml
spec:
  llm_providers:
    - name: "openai"
      type: "openai"
      api_key: "env:OPENAI_API_KEY"
      models:
        - alias: "default"
          model: "gpt-4o"
          capabilities: ["code", "reasoning"]
          context_window: 128000
        - alias: "fast"
          model: "gpt-4o-mini"
          capabilities: ["code"]
          context_window: 128000
        - alias: "smart"
          model: "gpt-4o"
          capabilities: ["code", "reasoning"]
          context_window: 128000
        - alias: "cheap"
          model: "gpt-4o-mini"
          capabilities: ["code"]
          context_window: 128000
```

For Azure OpenAI, add base_url and api_version:
```yaml
    - name: "azure-openai"
      type: "openai"
      api_key: "env:AZURE_OPENAI_KEY"
      base_url: "https://my-resource.openai.azure.com/openai/deployments/gpt-4o"
      api_version: "2024-02-01"
      models:
        - alias: "default"
          model: "gpt-4o"
          capabilities: ["code", "reasoning"]
          context_window: 128000
```

Anthropic
```yaml
spec:
  llm_providers:
    - name: "anthropic"
      type: "anthropic"
      api_key: "env:ANTHROPIC_API_KEY"
      models:
        - alias: "default"
          model: "claude-sonnet-4-5"
          capabilities: ["code", "reasoning"]
          context_window: 200000
        - alias: "fast"
          model: "claude-3-5-haiku-20241022"
          capabilities: ["code"]
          context_window: 200000
        - alias: "smart"
          model: "claude-sonnet-4-5"
          capabilities: ["code", "reasoning"]
          context_window: 200000
```

OpenAI-Compatible Providers
Any provider that exposes an OpenAI-compatible API (e.g., vLLM, LM Studio, Together AI) can be used with type: "openai" plus a custom base_url:
```yaml
    - name: "together"
      type: "openai"
      api_key: "env:TOGETHER_API_KEY"
      base_url: "https://api.together.xyz/v1"
      models:
        - alias: "cheap"
          model: "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo"
          capabilities: ["code"]
          context_window: 8192
```

Selection Strategy
The spec.llm_selection block controls how the orchestrator resolves aliases when multiple providers define the same alias:
```yaml
spec:
  llm_selection:
    strategy: "prefer-local"   # Prefer local providers, fall back to cloud
    default_provider: "local"  # Provider to try first for all aliases
    max_retries: 3             # Retry count on transient failures
    retry_delay_ms: 1000       # Delay between retries
```

| Strategy | Behavior |
|---|---|
prefer-local | Use the local provider if it defines the requested alias; fall back to cloud providers
round-robin | Distribute requests across providers that define the alias
failover | Always use default_provider; switch to next provider only on error
Using Aliases in Agent Manifests
Set spec.runtime.model in your agent manifest to select a model alias:
```yaml
apiVersion: 100monkeys.ai/v1
kind: Agent
metadata:
  name: "my-agent"
  version: "1.0.0"
spec:
  runtime:
    language: "python"
    version: "3.11"
    model: "smart"   # ← Model alias
  task:
    instruction: "..."
```

If model is omitted, it defaults to "default".
Hot-Swapping Models
To change which model backs a given alias:
1. Update aegis-config.yaml:

   ```yaml
   spec:
     llm_providers:
       - name: "openai"
         type: "openai"
         api_key: "env:OPENAI_API_KEY"
         models:
           - alias: "default"
             model: "gpt-4o"   # Change this to swap models
   ```

2. Restart the AEGIS daemon:

   ```shell
   aegis daemon stop
   aegis daemon start --config aegis-config.yaml
   ```
Running executions continue using the provider that was active when they started. New executions use the updated alias mapping.
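The in-flight behavior described above is the familiar snapshot pattern: each execution pins the alias mapping that was live when it started, while reloads only affect future executions. A minimal sketch of the idea (not AEGIS internals; class and method names are invented for illustration):

```python
class AliasRegistry:
    """Live alias→model mapping; executions pin the mapping they started with."""

    def __init__(self, mapping: dict[str, str]):
        self._mapping = dict(mapping)

    def reload(self, mapping: dict[str, str]) -> None:
        # Called when the daemon comes back up with a new config.
        self._mapping = dict(mapping)

    def start_execution(self) -> dict[str, str]:
        # A running execution keeps this snapshot even if reload() happens.
        return dict(self._mapping)

registry = AliasRegistry({"default": "gpt-4o"})
running = registry.start_execution()          # pins gpt-4o
registry.reload({"default": "gpt-4o-mini"})   # hot-swap
print(running["default"])                     # gpt-4o (unchanged)
print(registry.start_execution()["default"])  # gpt-4o-mini
```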
Provider Configuration Reference
Provider Fields
| Field | Type | Required | Description |
|---|---|---|---|
name | string | Yes | Internal identifier for the provider. |
type | string: one of openai, anthropic, or ollama | Yes | Provider adapter to use. Use openai for any OpenAI-compatible API.
api_key | string | Except Ollama | API key value or env:VAR_NAME reference. Never hardcode keys. |
endpoint / base_url | string | Ollama, Azure | Override the default API endpoint. |
api_version | string | Azure only | API version string for Azure OpenAI. |
Model Entry Fields
| Field | Type | Required | Description |
|---|---|---|---|
alias | string | Yes | Agent-facing name (e.g., default, fast, smart). |
model | string | Yes | Model identifier as accepted by the provider's API. |
capabilities | string[] | No | Tags like code, reasoning, vision. |
context_window | integer | No | Provider's context window size in tokens. |
Credential Security
API keys configured as env:VAR_NAME are read from the daemon's environment at startup. They are stored in memory and never written to disk or logs. When OpenBao secrets management is configured (see Secrets Management), credentials can also be sourced from OpenBao with the secret:path/to/secret syntax.
Agents never receive API keys. All LLM API calls are made by the orchestrator host process using the resolved credential.
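Resolving an env: or secret: reference could look like the sketch below (illustrative only; the real daemon's OpenBao integration is stubbed here as a caller-supplied read_secret function):

```python
import os

def resolve_credential(ref, read_secret=None):
    """Resolve an api_key reference from aegis-config.yaml.

    - "env:VAR"     -> read from the daemon's environment at startup
    - "secret:path" -> fetched from OpenBao via read_secret(path)
    - anything else is rejected, so keys are never hardcoded in config
    """
    if ref is None:
        return None  # e.g. Ollama needs no key
    scheme, _, value = ref.partition(":")
    if scheme == "env":
        return os.environ[value]
    if scheme == "secret" and read_secret is not None:
        return read_secret(value)
    raise ValueError(f"unsupported credential reference: {ref!r}")

os.environ["OPENAI_API_KEY"] = "sk-demo"  # demo value only
print(resolve_credential("env:OPENAI_API_KEY"))  # sk-demo
```

Rejecting bare strings is the design point: a literal key in aegis-config.yaml would end up on disk and in version control, which the env:/secret: indirection exists to prevent.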