MCP Tool Routing

The three tool routing paths — FSAL host file operations, SEAL External host MCP servers, and Dispatch Protocol in-container execution.

The AEGIS orchestrator mediates every tool call an agent makes. No tool call travels directly from the agent container to a tool server. Instead, the SealMiddleware receives all tool call envelopes, verifies them, and routes them via one of three paths based on the tool's type.


Security Guarantees

The Orchestrator Proxy Pattern provides three hard guarantees regardless of which routing path is used, plus an optional semantic review gate that applies only when a judge is configured:

Credential isolation — API keys, OAuth tokens, and database credentials are held by the orchestrator (sourced from environment variables or OpenBao) and injected into Tool Server processes on the host. They never travel into agent containers.

Policy-before-action — The ToolPolicy is evaluated against every tool call before any file is read, any process is spawned, or any external API is contacted. Calls that violate policy are rejected synchronously and never forwarded.

Full audit trail — Every invocation, routing decision, and policy violation is published as a domain event on the internal event bus, giving the Cortex learning system and your external audit infrastructure a complete, tamper-resistant record of agent actions.

Semantic pre-dispatch review — When execution.tool_validation is configured for a tool, the orchestrator runs that judge before the call enters the routing pipeline. The judge is a semantic gate, not a replacement for authorization: approved calls still pass through SEAL and policy enforcement, while rejected calls are blocked synchronously without side effects.

If an operator explicitly disables the judge with skip_judge, the tool still remains subject to the normal policy-before-action path. The bypass only removes the semantic review step; it does not weaken routing or authorization checks.


The Three Routing Paths

Agent Container                  Orchestrator Host                   External
    │                                   │                              │
    │  POST /v1/dispatch-gateway        │                              │
    │  (SealEnvelope wrapping           │                              │
    │   tool call in messages[])        │                              │
    │──────────────────────────────────►│                              │
    │                                   │                              │
    │                         SealMiddleware                           │
    │                           1. Verify Ed25519 signature            │
    │                           2. Validate SecurityToken (JWT)        │
    │                           3. PolicyEngine: check capabilities    │
    │                           4. Route to ToolRouter                 │
    │                                   │                              │
    │                    ┌──────────────┼──────────────┐               │
    │                    │              │              │               │
    │               PATH 1:        PATH 2:        PATH 3:              │
    │               FSAL           SEAL           Dispatch             │
    │               Host File      External       Protocol             │
    │               Ops            Host MCP                            │
    │                    │              │              │               │
    │                    ▼              ▼              │               │
    │               AegisFSAL      Tool Server    ◄────┘               │
    │               (SeaweedFS)    Process        OrchestratorMessage  │
    │                    │         (e.g.,          {type:"dispatch"}   │
    │                    │         web-search,      │                  │
    │                    │         gmail)           │                  │
    │                    │              │           ▼                  │
    │                    │              └──────────►│ External API     │
    │                    │                          │                  │
    │◄───────────────────┴──────────────────────────┘                  │
    │              Tool result injected into LLM context               │

Tool Policy Enforcement Pipeline

Before any routing happens, the ToolPolicy evaluates the invocation in order. Failure at any step rejects the call immediately and publishes a ToolPolicyViolation event.

1. Allowlist check
   Is the tool name in spec.tools[]?
   → NO  ‣ reject: ToolNotAllowed

2. Explicit deny check
   Is the tool name in the SecurityContext deny_list?
   → YES ‣ reject: ToolExplicitlyDenied

3. Rate limit check
   Has this execution exceeded max_calls_per_execution?
   → YES ‣ reject: RateLimitExceeded

4a. Filesystem path check  (for fs.* tools only)
   Does the requested path start with an entry in path_allowlist?
   Is the path free of ".." components?
   → FAIL ‣ reject: PathOutsideBoundary | PathTraversalAttempt

4b. Domain check  (for web.* / email.* tools only)
   Is the target domain in domain_allowlist?
   → NO  ‣ reject: DomainNotAllowed

4c. Subcommand check  (for cmd.run only)
   Is the base command in subcommand_allowlist?
   Is the first positional argument in the allowed args for that command?
   → NO  ‣ reject: CommandNotAllowed | SubcommandNotAllowed

5. Forward to ToolRouter

The subcommand_allowlist is a map of command → [allowed_first_positional_arguments] defined in the agent manifest and bounded by the node-level ceiling in builtin_dispatchers.cmd. For example, allowing cargo with [build, test] permits cargo build and cargo test but rejects cargo publish.
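The ordered checks above can be sketched in Python as follows. This is a minimal illustration, not the orchestrator's implementation: the `ToolPolicy` dataclass layout, the `evaluate` function, the `calls_so_far` counter, and the assumption that web tools carry a `url` argument are all hypothetical. The `email.*` domain check is omitted because the source does not say which argument carries the target domain.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from urllib.parse import urlparse


class PolicyViolation(Exception):
    """Carries the name of the violation type that fired."""


@dataclass
class ToolPolicy:
    # Field names mirror the document; the dataclass itself is hypothetical.
    tools: set[str]
    deny_list: set[str] = field(default_factory=set)
    max_calls_per_execution: int = 100
    path_allowlist: list[str] = field(default_factory=list)
    domain_allowlist: set[str] = field(default_factory=set)
    subcommand_allowlist: dict[str, list[str]] = field(default_factory=dict)


def evaluate(policy: ToolPolicy, name: str, args: dict, calls_so_far: int) -> None:
    """Run the checks in order; the first failure raises and nothing is forwarded."""
    if name not in policy.tools:                          # 1. allowlist
        raise PolicyViolation("ToolNotAllowed")
    if name in policy.deny_list:                          # 2. explicit deny
        raise PolicyViolation("ToolExplicitlyDenied")
    if calls_so_far >= policy.max_calls_per_execution:    # 3. rate limit
        raise PolicyViolation("RateLimitExceeded")

    if name.startswith("fs."):                            # 4a. path check
        path = PurePosixPath(args["path"])
        # is_relative_to compares whole components, so "/workspace-x"
        # does not match an allowlist entry of "/workspace".
        if not any(path.is_relative_to(root) for root in policy.path_allowlist):
            raise PolicyViolation("PathOutsideBoundary")
        if ".." in path.parts:
            raise PolicyViolation("PathTraversalAttempt")
    elif name.startswith("web."):                         # 4b. domain check
        if urlparse(args["url"]).hostname not in policy.domain_allowlist:
            raise PolicyViolation("DomainNotAllowed")
    elif name == "cmd.run":                               # 4c. subcommand check
        allowed = policy.subcommand_allowlist.get(args["command"])
        if allowed is None:
            raise PolicyViolation("CommandNotAllowed")
        first = args["args"][0] if args["args"] else None
        if first not in allowed:
            raise PolicyViolation("SubcommandNotAllowed")
    # 5. No exception raised: the caller forwards the call to the ToolRouter.
```

With a policy that allows `cargo` with `[build, test]`, `evaluate` returns normally for `cargo build` and raises `SubcommandNotAllowed` for `cargo publish`, matching the example in the paragraph above.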


Path 1: FSAL — Host File Operations

Used for: fs.read, fs.write, fs.create, fs.delete, fs.list

File system operations execute directly on the orchestrator host via the AegisFSAL (File System Abstraction Layer). No dispatch message goes to the container — the result is returned synchronously as part of the inner-loop LLM API response. Changes made by AegisFSAL are immediately visible to the agent container through its NFS volume mount.

Agent LLM output:
  tool call {name: "fs.write",
             arguments: {path: "/workspace/solution.py", content: "..."}}


SealMiddleware
  · Verify SealEnvelope (Ed25519 signature, SecurityToken JWT, expiry)
  · PolicyEngine: is /workspace/solution.py in path_allowlist? YES


AegisFSAL.write(file_handle, offset, data)
  · file_handle encodes execution_id + volume_id
    (48 bytes raw, ≤ 64-byte NFSv3 hard limit after bincode overhead)
  · Path canonicalization: all ".." and "." components rejected before
    the path reaches the StorageProvider — no exceptions
  · UID/GID squashing: file ownership set to agent container's UID/GID
    so kernel permission checks never interfere
  · Write forwarded to SeaweedFS (or local fallback) via StorageProvider
  · Publish StorageEvent::FileWritten to event bus


Result injected into LLM context:
  {success: true, bytes_written: 234}

AegisFileHandle encoding

Each NFS file handle encodes exactly two fields — execution_id (UUID, 16 bytes) and volume_id (UUID, 16 bytes) — serialised with bincode. The orchestrator uses this handle on every subsequent operation to verify that the requesting execution owns the volume before proceeding. Requests with handles it did not issue are rejected with UnauthorizedVolumeAccess.

The NFSv3 protocol imposes a 64-byte hard limit on file handles. The current encoding occupies 48 bytes raw (plus approximately 4 bytes bincode overhead), keeping total size safely within the limit.
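As a rough illustration of the ownership check described above, the sketch below models a handle as a pair of UUIDs and rejects any handle the registry did not issue or that belongs to a different execution. `FileHandle`, `HandleRegistry`, and their methods are hypothetical names, and the sketch deliberately leaves out the bincode byte layout and size accounting.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


class UnauthorizedVolumeAccess(Exception):
    """Raised when a handle was not issued here or belongs to another execution."""


@dataclass(frozen=True)
class FileHandle:
    # The two fields named in the document; serialisation is out of scope here.
    execution_id: UUID
    volume_id: UUID


class HandleRegistry:
    """Hypothetical in-memory record of the handles the orchestrator has issued."""

    def __init__(self) -> None:
        self._issued: set[FileHandle] = set()

    def issue(self, execution_id: UUID, volume_id: UUID) -> FileHandle:
        handle = FileHandle(execution_id, volume_id)
        self._issued.add(handle)
        return handle

    def verify(self, handle: FileHandle, caller_execution_id: UUID) -> None:
        # Reject handles that were never issued, and handles presented by
        # an execution other than the one they were issued to.
        if handle not in self._issued or handle.execution_id != caller_execution_id:
            raise UnauthorizedVolumeAccess(str(handle.volume_id))
```

Because the dataclass is frozen and compared by value, a forged handle that guesses a volume ID still fails unless the exact execution and volume pair was issued.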

For full details on the NFS Server Gateway and storage backend, see Storage Gateway.


Path 2: SEAL External — Host MCP Servers

Used for: web.fetch, web.search, email.send, email.read, and any other tool that requires external network access

Long-running MCP server processes (Tool Servers) run on the orchestrator host. They hold credentials and make outbound API calls; credentials never enter agent containers.

Agent tool call:
  {name: "web.fetch", arguments: {url: "https://api.github.com/repos/..."}}


SealMiddleware
  · Verify SealEnvelope
  · PolicyEngine: is domain api.github.com in domain_allowlist? YES
  · Rate limit: under 30 calls/60s? YES


ToolRouter
  · Capability index lookup: exact match "web.fetch" → server_id "web-search"
  · Fallback: prefix match "web.*" if exact match not found
  · Verify server status == Running


Tool Server process (host, JSON-RPC stdio)
  · Credentials resolved at server startup from env: or secret: references
  · Makes HTTP request to api.github.com using GITHUB_TOKEN
  · Returns JSON-RPC response


Result injected into LLM context as tool result message
Publish MCPToolEvent::InvocationCompleted to event bus

Credential isolation

Credentials are declared in aegis-config.yaml under mcp_servers[].credentials using env: or secret: prefixes. The orchestrator resolves these values from the host environment or OpenBao at server startup time and passes them to the server process as environment variables. The values are never written to disk, never logged, and never sent to the agent container.

mcp_servers:
  - name: web-search
    executable: "node"
    args: ["/opt/aegis-tools/web-search/index.js"]
    capabilities: [web.search, web.fetch]
    credentials:
      SEARCH_API_KEY: "secret:aegis-system/tools/search-api-key"
    health_check:
      interval_seconds: 60
      method: "tools/list"

See Node Configuration Reference for the full mcp_servers schema.
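The resolution step described above can be sketched as a small function that maps each declared reference to a value at server startup. This is an assumption-laden illustration: `resolve_credentials` and the `read_secret` callback are hypothetical, and the callback stands in for whatever OpenBao client the orchestrator actually uses.

```python
import os
from typing import Callable, Mapping


def resolve_credentials(
    declared: Mapping[str, str],
    read_secret: Callable[[str], str],
    environ: Mapping[str, str] = os.environ,
) -> dict[str, str]:
    """Turn 'env:NAME' / 'secret:path' references into concrete values.

    The result is meant to be handed to the Tool Server process as its
    environment; it is held in memory only and never written out.
    """
    resolved: dict[str, str] = {}
    for variable, reference in declared.items():
        scheme, _, target = reference.partition(":")
        if scheme == "env":
            resolved[variable] = environ[target]      # host environment lookup
        elif scheme == "secret":
            resolved[variable] = read_secret(target)  # hypothetical secret-store lookup
        else:
            raise ValueError(f"unsupported credential reference for {variable}")
    return resolved
```

The returned mapping could then be passed as the `env` argument of `subprocess.Popen` when the Tool Server is launched, which keeps the values out of the agent container entirely.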

Capability routing index

The ToolRouter maintains an in-memory capability index mapping each registered tool name to a ToolServerId. On every routing call it first tries an exact name match ("web.fetch"), then falls back to a prefix match ("web.*"). If no server is found, the call is rejected with a ToolNotFound error before reaching any external system.
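A minimal sketch of the exact-then-prefix lookup is shown below. The class layout and the `register` method are hypothetical, as is the assumption that a server advertises a prefix by listing a wildcard capability such as `web.*`.

```python
class ToolNotFound(Exception):
    """No registered server advertises the requested tool."""


class ToolRouter:
    def __init__(self) -> None:
        self._exact: dict[str, str] = {}    # "web.fetch" -> server id
        self._prefix: dict[str, str] = {}   # "web."      -> server id

    def register(self, server_id: str, capabilities: list[str]) -> None:
        for capability in capabilities:
            if capability.endswith(".*"):
                # Store "web.*" as the prefix "web." for startswith matching.
                self._prefix[capability[:-1]] = server_id
            else:
                self._exact[capability] = server_id

    def route(self, tool_name: str) -> str:
        # 1. Exact name match takes priority.
        server_id = self._exact.get(tool_name)
        if server_id is None:
            # 2. Fall back to the first registered prefix that matches.
            server_id = next(
                (sid for prefix, sid in self._prefix.items()
                 if tool_name.startswith(prefix)),
                None,
            )
        if server_id is None:
            raise ToolNotFound(tool_name)
        return server_id
```

Keeping the exact and prefix entries in separate maps makes the priority order explicit: a wildcard registration can never shadow a server that registered the tool by name.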


Path 3: Dispatch Protocol — In-Container Execution

Used for: cmd.run — tools that require spawning a subprocess inside the agent container

Files written through the NFS mount are visible to code running in the container, but commands that operate on those files must still execute inside the container. The Dispatch Protocol solves this by sending a structured command message from the orchestrator back to bootstrap.py running inside the container. bootstrap.py spawns the subprocess locally and re-posts the result.

Wire format

The orchestrator sends this as the HTTP response to the POST /v1/dispatch-gateway call that bootstrap.py made:

{
  "type": "dispatch",
  "dispatch_id": "c7a3f1e2-84b1-4a3c-9d7e-1f2a3b4c5d6e",
  "action": "exec",
  "command": "cargo",
  "args": ["test", "--", "--test-output", "immediate"]
}

bootstrap.py runs the subprocess and re-posts to POST /v1/dispatch-gateway:

{
  "type": "dispatch_result",
  "execution_id": "exec-3f4e29...",
  "dispatch_id": "c7a3f1e2-84b1-4a3c-9d7e-1f2a3b4c5d6e",
  "exit_code": 0,
  "stdout": "running 12 tests\ntest result: ok. 12 passed\n",
  "stderr": "",
  "duration_ms": 1834,
  "truncated": false
}

The dispatch_id UUID is echoed from request to response. The orchestrator rejects any response whose dispatch_id does not match the outstanding dispatch. bootstrap.py re-posts immediately after the subprocess exits; it does not batch or defer results.
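The exchange above can be sketched from both sides: a function that executes a dispatch message and builds the matching dispatch_result, and a check for the dispatch_id echo. This is not the actual bootstrap.py; `run_dispatch`, `matches_outstanding`, and the `output_cap` parameter (used here to decide the `truncated` flag) are assumptions made for illustration.

```python
import subprocess
import time


def run_dispatch(message: dict, execution_id: str, output_cap: int = 64 * 1024) -> dict:
    """Execute one pre-validated dispatch message and build its result payload."""
    if message.get("type") != "dispatch" or message.get("action") != "exec":
        raise ValueError("expected an exec dispatch message")

    started = time.monotonic()
    # Argument list form: the command is run directly, without a shell.
    proc = subprocess.run([message["command"], *message["args"]], capture_output=True)
    duration_ms = int((time.monotonic() - started) * 1000)

    # Hypothetical cap: each stream is cut at output_cap bytes.
    truncated = len(proc.stdout) > output_cap or len(proc.stderr) > output_cap
    return {
        "type": "dispatch_result",
        "execution_id": execution_id,
        "dispatch_id": message["dispatch_id"],  # echoed unchanged
        "exit_code": proc.returncode,
        "stdout": proc.stdout[:output_cap].decode("utf-8", errors="replace"),
        "stderr": proc.stderr[:output_cap].decode("utf-8", errors="replace"),
        "duration_ms": duration_ms,
        "truncated": truncated,
    }


def matches_outstanding(result: dict, outstanding_dispatch_id: str) -> bool:
    """Orchestrator-side check: accept only the result for the pending dispatch."""
    return (
        result.get("type") == "dispatch_result"
        and result.get("dispatch_id") == outstanding_dispatch_id
    )
```

The payload returned by `run_dispatch` has the same fields as the dispatch_result example above and would be sent as the body of the next POST to /v1/dispatch-gateway.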

BuiltinDispatcher and subcommand_allowlist

cmd.run is not handled by an MCP server process. It is routed to the BuiltinDispatcher — an in-process orchestrator component. Before the dispatch message is sent, the BuiltinDispatcher checks the call against the subcommand_allowlist specified in the agent manifest and bounded by the node-level ceiling in builtin_dispatchers.cmd.

bootstrap.py acts as a trusted executor inside the container. It receives only pre-validated, policy-cleared dispatch messages from the orchestrator and executes them locally using standard system subprocess calls.

cmd.run {command: "cargo", args: ["publish"]}


subcommand_allowlist check:
  cargo → allowed first args: [build, test, fmt, clippy, check, run]
  "publish" not in list


Reject: SealViolationType::SubcommandNotAllowed
Publish: CommandExecutionEvent::CommandPolicyViolation
Return 403 to agent

cmd.run {command: "cargo", args: ["build"]}


subcommand_allowlist check: PASS


OrchestratorMessage {type:"dispatch", dispatch_id:"...", action:"exec",
                     command:"cargo", args:["build"]}
  sent to bootstrap.py as HTTP response

bootstrap.py (trusted executor) spawns subprocess: cargo build

AgentMessage {type:"dispatch_result", dispatch_id:"...", exit_code:0, ...}
  re-posted to orchestrator


Result injected into LLM context
Publish: CommandExecutionEvent::CommandExecutionCompleted

For the full builtin_dispatchers and subcommand_allowlist field schemas, see Node Configuration Reference and Agent Manifest Reference.


Policy Violation Types

When a tool call is rejected, the orchestrator returns a structured error to the agent and publishes a ToolPolicyViolation or CommandPolicyViolation event. The variant describes the specific rule that fired:

| Violation | Cause |
|---|---|
| ToolNotAllowed | Tool name not present in the agent's spec.tools list |
| ToolExplicitlyDenied | Tool name matches an entry in the SecurityContext deny_list |
| RateLimitExceeded | Execution has hit max_calls_per_execution or the per-tool rate window |
| PathOutsideBoundary | Filesystem path is not prefixed by any entry in path_allowlist |
| PathTraversalAttempt | Path contains .. components; rejected unconditionally |
| DomainNotAllowed | Target domain not in domain_allowlist |
| SubcommandNotAllowed | First positional argument not in subcommand_allowlist for the command |
| CommandNotAllowed | Base command not present in subcommand_allowlist at all |
| OutputSizeLimitExceeded | Subprocess stdout+stderr exceeds max_output_bytes |

Tool Routing Decision Table

| Tool pattern | SEAL policy check | Routing path | Executor |
|---|---|---|---|
| fs.read, fs.write, fs.create, fs.delete, fs.list | Filesystem path allowlist + traversal check | Path 1: FSAL | AegisFSAL on orchestrator host; writes visible via NFS mount |
| cmd.run | subcommand_allowlist | Path 3: Dispatch Protocol | subprocess inside agent container via bootstrap.py |
| web.fetch, web.search | Domain allowlist + rate limit | Path 2: SEAL External | web-search Tool Server on host |
| email.send, email.read | Domain and account allowlist | Path 2: SEAL External | gmail-tools Tool Server on host |
| Any unregistered tool | Rejected before routing | (none) | ToolPolicyViolation event emitted, 403 returned |

Event Audit Trail

Every routing decision publishes a domain event on the internal event bus. These events feed the Cortex learning system and can be forwarded to external audit sinks. See Event Bus for the subscription and consumer model.

| Event | Trigger |
|---|---|
| MCPToolEvent::InvocationRequested | SealMiddleware accepts and validates an envelope |
| MCPToolEvent::InvocationCompleted | Tool result returned to agent |
| MCPToolEvent::InvocationFailed | Tool server error or subprocess non-zero exit |
| MCPToolEvent::ToolPolicyViolation | PolicyEngine denies the call before routing |
| CommandExecutionEvent::CommandExecutionStarted | Dispatch message sent to bootstrap.py |
| CommandExecutionEvent::CommandExecutionCompleted | bootstrap.py re-posts result with exit code |
| CommandExecutionEvent::CommandExecutionFailed | Subprocess timeout or OS error |
| CommandExecutionEvent::CommandPolicyViolation | SubcommandAllowlist check failed |
| StorageEvent::FileWritten | AegisFSAL completes a write to a volume |
| StorageEvent::FileRead | AegisFSAL completes a read from a volume |
| StorageEvent::PathTraversalBlocked | AegisFSAL rejects a ..-containing path |
| StorageEvent::UnauthorizedVolumeAccess | File handle execution_id does not match caller |
| SealEvent::PolicyViolationBlocked | SecurityContext policy blocks a tool call |
| SealEvent::SignatureVerificationFailed | Ed25519 signature on envelope is invalid |
