--gpu sandboxes can't initialize CUDA — seccomp blocks memfd_create (CUDA error 304)

### Summary
NVML/`nvidia-smi` and Vulkan work inside a `--gpu` sandbox, but **any CUDA workload fails at init with `cudaErrorOperatingSystem (304)`**, so RTX/OptiX/ML workloads can't run. Root cause: OpenShell's seccomp profile blocks **`memfd_create`**, which the NVIDIA CUDA driver requires during initialization.

### Environment
- OpenShell **0.0.53 and 0.0.54** (same behavior)
- **Docker driver**, GPU via CDI (`nvidia.com/gpu=all`)
- Ubuntu 24.04, **kernel 6.8**, NVIDIA driver **570.169**, GPU **RTX 5080**
- Workload: NVIDIA OVRTX (Omniverse RTX) Python SDK

### Symptom
```
[Error] [omni.rtx] CUDA error 304: cudaErrorOperatingSystem - OS call failed or operation not supported on this OS
[Error] [omni.rtx] Failed to query CUDA device count.
[Error] [ovrtx] createDevices failed -> Failed to setup graphics!
```

### Root cause (confirmed)
- GPU is correctly attached — all `/dev/nvidia*` nodes are present in the container (`docker exec -u 0`), and the **same workload renders successfully via `docker exec` into the same container** (bypassing the process-sandbox). So GPU/driver/image are fine.
- Seccomp is default-allow with ~20 targeted blocks; CUDA works under Docker's default seccomp, so the culprit is one of those blocks.
- **Direct test:** `ctypes` calling `memfd_create` (syscall 319) -> **`EPERM` under OpenShell, succeeds via `docker exec`** — matches the block in `seccomp.rs` (commented re: fileless execution / landlock bypass).
- Ruled out as causes: landlock (disabling it didn't help) and OpenShell version (0.0.54 behaves the same).
- Diagnostic routes that were unavailable: LD_PRELOAD interposition (static binary) and strace (ptrace blocked).
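
For anyone who wants to repeat the direct test, here is a minimal sketch of a `ctypes` probe along the lines described above. It assumes Linux on x86_64 (where `memfd_create` is syscall 319, as cited); the function name `probe_memfd_create` and the memfd name `probe` are illustrative, not part of OpenShell. Run it once inside the sandbox and once via `docker exec` and compare the results.

```python
import ctypes
import errno
import os

# Assumption: x86_64 Linux. The syscall number differs on other architectures.
SYS_MEMFD_CREATE = 319
MFD_CLOEXEC = 0x0001
MFD_NOEXEC_SEAL = 0x0008  # only understood by kernels >= 6.3


def probe_memfd_create(flags=MFD_CLOEXEC):
    """Call memfd_create directly and return (succeeded, errno_name_or_None)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ctypes.set_errno(0)
    fd = libc.syscall(
        ctypes.c_long(SYS_MEMFD_CREATE),
        ctypes.c_char_p(b"probe"),
        ctypes.c_uint(flags),
    )
    if fd < 0:
        err = ctypes.get_errno()
        return False, errno.errorcode.get(err, str(err))
    os.close(fd)  # the probe only checks whether the call is permitted
    return True, None


if __name__ == "__main__":
    for label, flags in [
        ("MFD_CLOEXEC", MFD_CLOEXEC),
        ("MFD_CLOEXEC|MFD_NOEXEC_SEAL", MFD_CLOEXEC | MFD_NOEXEC_SEAL),
    ]:
        ok, err = probe_memfd_create(flags)
        print(f"memfd_create({label}): {'ok' if ok else err}")
```

An `EPERM` result points at a seccomp (or similar policy) denial rather than a kernel that lacks the syscall, which would typically report `ENOSYS`.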

### Why CUDA needs it
The NVIDIA driver/CUDA runtime uses `memfd_create` during initialization. With it blocked, CUDA can't initialize -> error 304.

### Proposed fix (security-preserving), in order of preference
1. **Allow `memfd_create` only with `MFD_NOEXEC_SEAL`** (kernel >= 6.3; ours is 6.8). Permits non-executable in-memory files (what CUDA needs) while still blocking executable memfds — the fileless-execution vector stays closed.
2. Or relax the block only when the sandbox is created with `--gpu`.
3. Or expose a policy knob to opt in.
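
To make option (1) concrete, the sketch below models the decision such a rule would make on the `flags` argument. It is plain Python, not the code in `seccomp.rs`; the function name `would_allow` is hypothetical, and the flag values are taken from the kernel's `linux/memfd.h`. It mirrors a masked-equality comparison on the second syscall argument, which is the kind of test a seccomp filter can express.

```python
# Flag values from include/uapi/linux/memfd.h
MFD_CLOEXEC = 0x0001
MFD_ALLOW_SEALING = 0x0002
MFD_NOEXEC_SEAL = 0x0008  # kernel >= 6.3
MFD_EXEC = 0x0010         # kernel >= 6.3


def would_allow(flags: int) -> bool:
    """Model of option (1): allow only calls that request a non-executable memfd.

    Equivalent to a masked-equality check: (flags & MFD_NOEXEC_SEAL) == MFD_NOEXEC_SEAL.
    """
    return (flags & MFD_NOEXEC_SEAL) == MFD_NOEXEC_SEAL


if __name__ == "__main__":
    cases = {
        "MFD_CLOEXEC": MFD_CLOEXEC,
        "MFD_CLOEXEC|MFD_NOEXEC_SEAL": MFD_CLOEXEC | MFD_NOEXEC_SEAL,
        "MFD_CLOEXEC|MFD_EXEC": MFD_CLOEXEC | MFD_EXEC,
    }
    for label, flags in cases.items():
        print(f"{label}: {'allow' if would_allow(flags) else 'deny'}")
```

One caveat with this approach: a call that sets neither `MFD_NOEXEC_SEAL` nor `MFD_EXEC` would still be denied under this rule, so it is worth confirming which flags the NVIDIA driver actually passes before relying on option (1) alone. That has not been verified here.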

### Security assessment
Regression from option (1) is essentially nil — executable memfds remain blocked, and the primary containment (network egress proxy + allowlist, credential isolation, non-root, no ptrace, the other seccomp blocks, landlock FS) is unchanged. `memfd_create` grants in-memory file creation, not network/secret/host access.

### Impact
Unblocks the entire class of CUDA/RTX/ML agent workloads, uniformly across all backends (Docker, Podman, Kubernetes, MicroVM). Happy to validate a patched build, and can attempt a PR.

