Storage Gateway
AegisFSAL architecture, user-space NFSv3, FileHandle structure, UID/GID squashing, path canonicalization, and SeaweedFS integration.
The AEGIS Storage Gateway is the security boundary for all filesystem access by agent containers. It is implemented as a user-space NFSv3 server (AegisFSAL) running on the orchestrator host, with agent containers mounting their volumes via the kernel NFS client.
Design Philosophy
Traditional container volume mounts (bind mounts, CAP_SYS_ADMIN FUSE mounts) give agent containers unrestricted access to mounted storage once the mount is established. AEGIS takes a different approach:
Every POSIX operation is routed through the orchestrator-controlled AegisFSAL before reaching SeaweedFS. This means:
- Per-operation authorization: The orchestrator validates every read, write, create, and delete against the execution's manifest policies.
- Full audit trail: Every file operation is published as a `StorageEvent` domain event.
- Path traversal prevention: Server-side path canonicalization blocks `../` attempts before they reach SeaweedFS.
- No elevated privileges: Agent containers require zero special capabilities (`CAP_SYS_ADMIN` is not needed).
Component Hierarchy
Agent Container (Docker)
│ kernel NFS client
│ mount: addr=orchestrator_host, nfsvers=3, proto=tcp, nolock
│ /workspace → NFS server
▼
Orchestrator Host: NFS Server Gateway (user-space, tcp, port 2049)
│ NFSv3 protocol handler (nfsserve Rust crate)
▼
AegisFSAL (File System Abstraction Layer)
│ receive: LOOKUP, READ, WRITE, READDIR, GETATTR, CREATE, REMOVE
├──► Decode FileHandle → extract execution_id + volume_id
├──► Authorize: does execution own this volume?
├──► Canonicalize path: reject ".." components
├──► Enforce FilesystemPolicy (manifest allowlists)
├──► Apply UID/GID squashing (return agent container's UID/GID, not real ownership)
├──► Enforce quota (size_limit_bytes)
├──► Publish StorageEvent to Event Bus
▼
StorageProvider trait (via StorageRouter)
├── SeaweedFS POSIX API client (default)
├── OpenDalStorageProvider (S3, GCS, Azure)
├── LocalHostStorageProvider (NVMe, bind mounts)
    └── SealStorageProvider (Remote Node execution coordination)

AegisFileHandle
The NFSv3 protocol requires servers to return an opaque FileHandle for each file and directory. AEGIS encodes authorization information directly into the FileHandle:
FileHandle layout (48 bytes raw, ~52 bytes serialized, ≤64 bytes NFSv3 limit):
┌──────────────────────────────────────────────────┐
│ execution_id (UUID binary, 16 bytes) │
│ volume_id (UUID binary, 16 bytes) │
│ path_hash (u64, 8 bytes) — FNV hash of path │
│ created_at (i64, 8 bytes) — Unix timestamp │
└──────────────────────────────────────────────────┘

Because NFSv3's 64-byte limit does not allow storing a full file path in the handle, `path_hash` contains a hash of the canonical path. The NFS server maintains a bidirectional in-memory `FileHandleTable` (`fileid3` ↔ `AegisFileHandle`) that maps numeric file IDs to handles and reconstructs paths on demand. This table is per-execution and discarded when the execution ends.
On every NFS operation, AegisFSAL decodes the FileHandle, extracts execution_id and volume_id, and verifies that the requesting execution is authorized to access that volume. If the execution does not own the volume, the operation fails with NFS3ERR_ACCES and an UnauthorizedVolumeAccess event is published.
The 64-byte NFSv3 FileHandle size limit is a hard protocol constraint enforced by the kernel NFS client. The current layout serializes to ~52 bytes via bincode, safely within the limit.
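The layout above can be sketched in std-only Rust. This is illustrative: the real implementation serializes with bincode and UUID types, whereas this sketch packs the 48-byte raw layout by hand; the struct and method names are assumptions.

```rust
// Illustrative sketch of the AegisFileHandle layout (16 + 16 + 8 + 8 bytes).
// The production code uses bincode; manual packing here keeps the example
// dependency-free.
pub struct AegisFileHandle {
    pub execution_id: [u8; 16], // UUID bytes
    pub volume_id: [u8; 16],    // UUID bytes
    pub path_hash: u64,         // FNV hash of the canonical path
    pub created_at: i64,        // Unix timestamp
}

impl AegisFileHandle {
    /// Pack into the 48-byte raw layout, well under the 64-byte NFSv3 limit.
    pub fn encode(&self) -> [u8; 48] {
        let mut buf = [0u8; 48];
        buf[0..16].copy_from_slice(&self.execution_id);
        buf[16..32].copy_from_slice(&self.volume_id);
        buf[32..40].copy_from_slice(&self.path_hash.to_be_bytes());
        buf[40..48].copy_from_slice(&self.created_at.to_be_bytes());
        buf
    }

    /// Decode a raw handle; returns None for malformed (wrong-length) input.
    pub fn decode(raw: &[u8]) -> Option<Self> {
        if raw.len() != 48 {
            return None;
        }
        Some(AegisFileHandle {
            execution_id: raw[0..16].try_into().ok()?,
            volume_id: raw[16..32].try_into().ok()?,
            path_hash: u64::from_be_bytes(raw[32..40].try_into().ok()?),
            created_at: i64::from_be_bytes(raw[40..48].try_into().ok()?),
        })
    }
}
```

Because authorization data travels inside the handle itself, a decoded handle is all AegisFSAL needs to run the ownership check before touching storage.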
UID/GID Squashing
When SeaweedFS stores files, they carry a real POSIX UID/GID. Agent containers run as varying user IDs. Without squashing, file ownership mismatches would cause permission errors.
AegisFSAL overrides all file metadata returned by GETATTR to report the agent container's UID/GID rather than the real file ownership:
- All `GETATTR` responses return `uid = agent_container_uid`, `gid = agent_container_gid`.
- POSIX permission bit checks (e.g. `chmod 600`) are not enforced by the NFS server.
- Authorization is handled entirely by the manifest `FilesystemPolicy`, not kernel permission bits.
The agent_container_uid and agent_container_gid are stored in the Execution metadata when the container is created and retrieved by AegisFSAL during each operation.
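The squashing rule reduces to a single metadata override. A minimal sketch, assuming illustrative struct and function names (the real attribute type is the NFSv3 `fattr3` structure):

```rust
// Sketch of GETATTR squashing: whatever ownership SeaweedFS reports, the
// attributes returned to the agent carry the container's UID/GID.
// FileAttrs and squash_attrs are illustrative names, not the AEGIS types.
#[derive(Clone, Copy, PartialEq, Debug)]
pub struct FileAttrs {
    pub uid: u32,
    pub gid: u32,
    pub size: u64,
    pub mode: u32,
}

/// Override ownership with the agent container's identity, which is stored
/// in the Execution metadata at container-creation time.
pub fn squash_attrs(real: FileAttrs, agent_uid: u32, agent_gid: u32) -> FileAttrs {
    FileAttrs { uid: agent_uid, gid: agent_gid, ..real }
}
```

Everything except ownership (size, mode, timestamps in the real type) passes through untouched, so the agent sees consistent metadata it "owns".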
Path Canonicalization
All incoming paths are canonicalized before reaching the StorageProvider:
- Resolve any `.` components.
- Detect any `..` components.
- If `..` is detected, reject the entire operation with NFS3ERR_ACCES and publish a PathTraversalBlocked event.
- Strip the volume's root prefix to produce a path relative to the SeaweedFS bucket.
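The steps above can be sketched as a small std-only function. The function name, error representation, and the exact root-stripping behavior are assumptions for illustration:

```rust
// Minimal sketch of server-side canonicalization: "." and empty components
// are resolved, any ".." rejects the whole operation, and the result is a
// path relative to the volume root.
pub fn canonicalize(path: &str) -> Result<String, &'static str> {
    let mut parts: Vec<&str> = Vec::new();
    for comp in path.split('/') {
        match comp {
            // Resolve "." and empty components ("//", leading "/").
            "" | "." => continue,
            // Reject ".." outright; the server never resolves it.
            ".." => return Err("NFS3ERR_ACCES: PathTraversalBlocked"),
            other => parts.push(other),
        }
    }
    Ok(parts.join("/"))
}
```

Note that `..` is rejected rather than resolved: even a path that would stay inside the volume after resolution (e.g. `/workspace/a/../b`) is refused, which keeps the check trivially auditable.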
Example:
Incoming: /workspace/../etc/passwd
Step 2:   ".." detected
Step 3:   REJECTED → NFS3ERR_ACCES
          PathTraversalBlocked event published

Filesystem Policy Enforcement
Each WRITE, CREATE, and REMOVE operation is validated against the manifest's FilesystemPolicy:
spec:
  security:
    filesystem:
      read:
        - /workspace
        - /agent
      write:
        - /workspace

If an agent attempts to write to /agent/config.py but only /workspace is in write, the operation is blocked with NFS3ERR_PERM and a FilesystemPolicyViolation event is published.
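A sketch of the allowlist check, assuming prefix-based matching on canonical paths (field names mirror the YAML above; the matching rule is an assumption):

```rust
// Illustrative FilesystemPolicy allowlist check. A path is permitted if it
// equals an allowlisted entry or lives underneath it; the trailing "/" in
// the prefix test prevents "/workspace" from matching "/workspace-evil".
pub struct FilesystemPolicy {
    pub read: Vec<String>,
    pub write: Vec<String>,
}

impl FilesystemPolicy {
    fn allowed(list: &[String], path: &str) -> bool {
        list.iter()
            .any(|p| path == p.as_str() || path.starts_with(&format!("{}/", p)))
    }

    /// Checked on WRITE, CREATE, and REMOVE.
    pub fn can_write(&self, path: &str) -> bool {
        Self::allowed(&self.write, path)
    }

    /// Checked on READ, LOOKUP, and READDIR.
    pub fn can_read(&self, path: &str) -> bool {
        Self::allowed(&self.read, path)
    }
}
```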
Quota Enforcement
When size_limit is set in the volume declaration, AegisFSAL tracks cumulative bytes written to the volume. Before each WRITE:
current_volume_size + write_size > parsed(size_limit)?
→ YES: fail with NFS3ERR_NOSPC, emit VolumeQuotaExceeded event
→ NO: proceed with write

Quota accounting is maintained in-memory per execution and persisted to PostgreSQL. It is not affected by file deletions in Phase 1 (quota only tracks bytes written, not net storage used).
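The pre-write gate can be sketched as follows; type and method names are illustrative, and the append-only accounting matches the Phase 1 behavior described above:

```rust
// Sketch of the quota gate run before every WRITE. Accounting is
// append-only in Phase 1: deletions never reclaim quota.
pub struct VolumeQuota {
    pub written_bytes: u64,
    pub size_limit_bytes: u64, // parsed from the volume's size_limit
}

impl VolumeQuota {
    /// Returns Err (mapped to NFS3ERR_NOSPC, with a VolumeQuotaExceeded
    /// event) if the write would exceed the limit; otherwise records the
    /// bytes and lets the write proceed.
    pub fn try_write(&mut self, write_size: u64) -> Result<(), &'static str> {
        if self.written_bytes + write_size > self.size_limit_bytes {
            Err("NFS3ERR_NOSPC: VolumeQuotaExceeded")
        } else {
            self.written_bytes += write_size;
            Ok(())
        }
    }
}
```

Because a rejected write does not mutate `written_bytes`, a smaller subsequent write can still succeed up to the exact limit.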
Storage Routing
The StorageRouter sits between AegisFSAL and the concrete backends, allowing different volumes to use distinct backends (such as OpenDAL or LocalHost). For every POSIX operation, AegisFSAL asks the StorageRouter to resolve the correct StorageProvider for the volume_id encoded in the FileHandle.
For details on each backend's configuration and operational behavior, see Storage Backends.
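As a sketch, the router indirection might look like the following. The trait methods, the `InMemoryProvider` stand-in, and the keying scheme are assumptions; the real trait covers the full POSIX surface and the real backends are SeaweedFS, OpenDAL, LocalHost, and Seal:

```rust
// Hypothetical shape of the StorageProvider/StorageRouter indirection:
// the router maps each volume_id to the backend that serves it.
use std::collections::HashMap;

pub trait StorageProvider {
    fn name(&self) -> &'static str;
    /// One representative operation; the real trait covers the POSIX surface.
    fn read(&self, path: &str, offset: u64, len: u32) -> Result<Vec<u8>, String>;
}

/// Stand-in backend used only for this illustration.
pub struct InMemoryProvider;
impl StorageProvider for InMemoryProvider {
    fn name(&self) -> &'static str { "in-memory" }
    fn read(&self, _path: &str, _offset: u64, len: u32) -> Result<Vec<u8>, String> {
        Ok(vec![0; len as usize])
    }
}

pub struct StorageRouter {
    providers: HashMap<String, Box<dyn StorageProvider>>, // volume_id → backend
}

impl StorageRouter {
    pub fn new() -> Self {
        StorageRouter { providers: HashMap::new() }
    }
    pub fn register(&mut self, volume_id: &str, p: Box<dyn StorageProvider>) {
        self.providers.insert(volume_id.to_string(), p);
    }
    /// Resolve the backend for the volume_id decoded from the FileHandle.
    pub fn provider_for(&self, volume_id: &str) -> Option<&dyn StorageProvider> {
        self.providers.get(volume_id).map(|b| b.as_ref())
    }
}
```

Keeping the routing behind a trait object is what lets AegisFSAL stay backend-agnostic: authorization, canonicalization, and quota logic run before the router is ever consulted.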
AegisFSAL is designed as a transport-agnostic core. The NFSv3 server is the Phase 1 transport for Docker-based deployments. In Phase 2 (Firecracker), a virtio-fs frontend will use the same AegisFSAL security and authorization logic with zero code duplication:
Phase 1: Docker
NFSv3 Frontend → AegisFSAL → StorageProvider
Phase 2: Firecracker
virtio-fs Frontend → AegisFSAL → StorageProvider

The FSAL authorization logic, path canonicalization, UID/GID squashing, quota tracking, and event publishing are written once in AegisFSAL and shared across both transports.
Volume Lifecycle
Volumes follow a deterministic state machine managed by the orchestrator:
Creating ──► Available ──► Attached ──► Detached
│ │ │
│ └─────────────┤
│ │
└──────────────────────────────────► Deleting ──► Deleted
Any non-terminal state ──► Failed

| State | Meaning |
|---|---|
| Creating | Directory being provisioned in SeaweedFS and quota being set |
| Available | SeaweedFS directory ready; no container has mounted it yet |
| Attached | NFS export active; container is mounted and I/O is live |
| Detached | Container stopped; NFS export removed; volume data intact in SeaweedFS |
| Deleting | Delete request accepted; SeaweedFS directory removal in progress |
| Deleted | SeaweedFS directory confirmed removed; record retained briefly for audit trail |
| Failed | A state transition failed (e.g., SeaweedFS unreachable during creation or deletion) |
Available → Attached occurs when the container starts and the NFS mount is confirmed active. Attached → Detached occurs when the container stops or is killed. Ephemeral volumes with no active execution proceed immediately from Detached to Deleting. Persistent volumes remain in Detached until explicitly deleted via the CLI or API.
Failed volumes are surfaced through volume management APIs and can be retried by reissuing the delete request through the orchestrator API.
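The state machine above can be encoded as a small transition check. The exact set of delete-eligible states and the Failed-retry edge are assumptions read from the diagram and the retry note; names are illustrative:

```rust
// Sketch of the volume state machine. Variants mirror the table above.
// Assumed rules: Deleted is the only fully terminal state, any other
// state may move to Failed, and Failed can retry deletion.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum VolumeState {
    Creating,
    Available,
    Attached,
    Detached,
    Deleting,
    Deleted,
    Failed,
}

use VolumeState::*;

pub fn can_transition(from: VolumeState, to: VolumeState) -> bool {
    matches!(
        (from, to),
        (Creating, Available)
            | (Available, Attached)
            | (Attached, Detached)
            | (Available, Deleting)   // never-mounted volume deleted directly
            | (Detached, Deleting)
            | (Deleting, Deleted)
            | (Failed, Deleting)      // retry by reissuing the delete request
    ) || (to == Failed && from != Deleted && from != Failed)
}
```

An Attached volume cannot jump straight to Deleting in this sketch; it must detach first, matching the "container stopped" semantics of Detached.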
Phase 1 Constraints
nolock Mount Option
All NFS mounts in Phase 1 use nolock. This disables the NLM (Network Lock Manager) protocol, meaning POSIX advisory file locks (flock, fcntl) are not coordinated across agents.
This is safe for the common case of single-agent-per-volume. For multi-agent coordination (swarms), use the ResourceLock mechanism provided by the swarm coordination context instead of POSIX locks.
Single-Writer Constraint
Persistent volumes with ReadWrite access can only be mounted by one execution at a time. Attempting a second ReadWrite mount on the same volume returns VolumeAlreadyMounted. Multiple executions may hold ReadOnly mounts simultaneously.
SRE & Performance Tuning
To optimize the AegisFSAL NFS Server Gateway for varied agent workloads, operators should tune the kernel NFS client mount options. By default, the orchestrator mounts volumes with the following options:
addr=<orchestrator_host>,nfsvers=3,proto=tcp,soft,timeo=10,nolock,acregmin=3,acregmax=60

- Graceful Degradation (soft,timeo=10): A soft mount ensures that if the NFS server crashes or becomes unreachable, the agent container's I/O operations return an EIO error rather than hanging indefinitely in a D-state. The timeo=10 parameter specifies a 1-second timeout (measured in deciseconds).
- Client Caching (acregmin, acregmax): The kernel NFS client caches file attributes. Lowering these values (e.g. acregmin=1) reduces cache staleness at the cost of more GETATTR calls to the orchestrator. For high-throughput artifact generation, keeping the defaults (acregmin=3, acregmax=60) reduces load on the orchestrator.
- Latency Overhead: Because every POSIX operation routes through the AegisFSAL orchestrator process for authorization, expect 1-2 ms of added latency per operation compared to a direct FUSE mount. This is generally acceptable for agent-driven code generation, but may affect high-frequency I/O workloads.
Export Path Routing
Each volume gets a unique NFS export path derived from its tenant and volume identifiers:
/{tenant_id}/{volume_id}

The orchestrator maintains a runtime NfsVolumeRegistry, a concurrent map of VolumeId → NfsVolumeContext. When an execution mounts a volume, its export path is registered. When the execution ends and the volume is detached, the entry is removed. Volumes that are not currently mounted have no active export and cannot be reached via NFS.
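A minimal sketch of the registry, assuming an `RwLock`-guarded map and illustrative field names (the real `NfsVolumeContext` presumably carries more than the export path):

```rust
// Sketch of the NfsVolumeRegistry: a concurrent map from volume ID to its
// export context, populated on mount and cleared on detach. Field names
// and the locking strategy are assumptions.
use std::collections::HashMap;
use std::sync::RwLock;

pub struct NfsVolumeContext {
    pub tenant_id: String,
    pub export_path: String, // "/{tenant_id}/{volume_id}"
}

pub struct NfsVolumeRegistry {
    map: RwLock<HashMap<String, NfsVolumeContext>>,
}

impl NfsVolumeRegistry {
    pub fn new() -> Self {
        NfsVolumeRegistry { map: RwLock::new(HashMap::new()) }
    }

    /// Called when an execution mounts the volume.
    pub fn register(&self, volume_id: &str, tenant_id: &str) {
        let ctx = NfsVolumeContext {
            tenant_id: tenant_id.to_string(),
            export_path: format!("/{}/{}", tenant_id, volume_id),
        };
        self.map.write().unwrap().insert(volume_id.to_string(), ctx);
    }

    /// Called on detach; an unregistered volume has no export and is
    /// unreachable via NFS.
    pub fn unregister(&self, volume_id: &str) {
        self.map.write().unwrap().remove(volume_id);
    }

    pub fn export_path(&self, volume_id: &str) -> Option<String> {
        self.map.read().unwrap().get(volume_id).map(|c| c.export_path.clone())
    }
}
```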
The agent container's NFS mount is configured to target the orchestrator host at the volume's export path:
addr=<orchestrator_host>,nfsvers=3,proto=tcp,soft,timeo=10,nolock
device: :/<tenant_id>/<volume_id>
target: /workspace (or mount_path from manifest)

Storage Events
Every file operation handled by AegisFSAL publishes a StorageEvent to the event bus. These events are persisted to PostgreSQL by a background StorageEventPersister task and form the complete file-level audit trail for each execution.
| Event | Trigger |
|---|---|
| FileOpened | Agent opens a file (open() / create()) |
| FileRead | Bytes are read from a file — includes offset and bytes_read |
| FileWritten | Bytes are written to a file — includes offset and bytes_written |
| FileClosed | File handle is released |
| DirectoryListed | readdir is called on a directory |
| FileCreated | A new file is created |
| FileDeleted | A file is removed |
| PathTraversalBlocked | A .. component was detected in the incoming path |
| FilesystemPolicyViolation | An operation violated a manifest read/write allowlist |
| QuotaExceeded | A write would exceed the volume's size_limit |
| UnauthorizedVolumeAccess | The requesting execution does not own the volume |
All events carry execution_id, volume_id, and a timestamp. File operation events additionally carry the canonicalized path, byte counts, and latency in milliseconds.
SeaweedFS Integration
SeaweedFS is the default StorageProvider. The orchestrator communicates with SeaweedFS through the HTTP Filer API (port 8888) for two distinct purposes:

| Usage | Operations |
|---|---|
| Volume lifecycle | Directory lifecycle (create, delete, set quota, get usage, list), called by VolumeManager during volume provisioning and GC |
| POSIX file operations | open, read, write, stat, readdir, create, rename, delete, called by AegisFSAL on every NFS LOOKUP, READ, WRITE, READDIR, GETATTR, etc. |
Volume data is stored in SeaweedFS at the following path structure:
/{tenant_id}/{volume_id}/{file_path}

For example, a file /workspace/main.py written by an execution exec-abc on volume vol-xyz in the default single-tenant environment is stored at:

/00000000-0000-0000-0000-000000000001/vol-xyz/main.py

Replication
SeaweedFS replication is configured independently of AEGIS at the SeaweedFS layer. The AEGIS orchestrator does not set the replication factor on volume directories — this is controlled by the SeaweedFS default replication setting and can be overridden in SeaweedFS collection configuration.
A common convention for AEGIS deployments is:
| Storage Class | SeaweedFS Replication | Rationale |
|---|---|---|
| Ephemeral | 000 (no replication) | TTL-based; durability not required |
| Persistent | 001 (one replica on a different server) | Survives single node failure |
Health Checks
The orchestrator checks SeaweedFS health via the Filer API on startup and periodically thereafter. If SeaweedFS is unreachable and fallback_to_local is enabled in the node configuration, the orchestrator falls back to a local filesystem StorageProvider. The local fallback does not support S3 artifact inspection or multi-node access.