A deep dive on agent sandboxes
# September 26, 2025
The modern generation of coding agents is powerful in part because of its access permissions. These agents have a few built-in functions for the most obvious operations (writing files, editing strings), but beyond that they mostly generate what they need ad hoc. Their `bash` tool is by far their most powerful because it's the most expressive. You can run some ad hoc Python to test out some logic, run a compilation pass, or curl a website. It's Turing complete, for gosh sake.
It's also by far the most dangerous. You probably wouldn't give your new intern access to the prod credentials. But an arbitrary bash session could certainly provide that kind of permission escalation, if not delete your home directory from disk outright.
The safest way to run any coding agent is within virtualization. Boot up a container with a limited scope, a git-branched workspace of what you're working on, and all the tools it will need to get the job done. I'm willing to wager a large sum that almost no one does that. `--dangerously-skip-permissions` is well named, but most people just shrug and let it run.
One way to balance permissions against danger is the command whitelist. You see this in Claude Code and Cursor when run in the normal mode. Every command execution asks you, a real human being, whether you feel comfortable with the agent running that same command in the future. A `swift build`? Perfectly safe. An `rm -rf`? Nah, let's skip that one for now. But command whitelists are brittle and a bit annoying. If you need to run the command in a new directory, sometimes the `cd` command will block because it hasn't been approved yet. And the whole approach is fully impractical if you step away from your computer and aren't available in the loop for approvals.
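To make the brittleness concrete, here's a minimal sketch of an exact-match whitelist. The `Allowlist` type and its methods are hypothetical names of mine, not how Claude Code or Cursor actually store approvals; the point is only that matching on the full argument vector means any small variation needs a fresh approval.

```rust
use std::collections::HashSet;

/// Hypothetical exact-match allowlist: a command counts as approved only if
/// the exact same argument vector was approved earlier.
struct Allowlist {
    approved: HashSet<Vec<String>>,
}

impl Allowlist {
    fn new() -> Self {
        Self { approved: HashSet::new() }
    }

    /// Record a command the user said yes to.
    fn approve(&mut self, argv: &[&str]) {
        self.approved
            .insert(argv.iter().map(|s| s.to_string()).collect());
    }

    /// True only when the full argv matches a previously approved command.
    fn is_approved(&self, argv: &[&str]) -> bool {
        let key: Vec<String> = argv.iter().map(|s| s.to_string()).collect();
        self.approved.contains(&key)
    }
}
```

Approving `swift build` here says nothing about `cd Sources` or `swift build -c release`, which is exactly the kind of friction described above.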
But the tides are a-changing.
Codex Permissions
I've been trying out the Codex CLI with `gpt-5-codex high` for the last couple of weeks. So far I like it.[1] I was specifically having it iterate on a Swift project to resolve some compilation issues by continuously running `swift build` before it yielded back. The command was running but it kept failing with a clang permissions error. Clang was installed and working fine when I ran it manually, so I wasn't immediately sure what code path or environment variable was causing the issue.
Turns out Codex launches by default in a mode where it only has access to the current folder. Running `/approvals` shows you what mode you're actually in. It supports three different ones:
- Read Only - Codex can read files and answer questions. Requires approval for edits, commands, or network access
- Auto (current) - Can read files, make edits, and run commands in the workspace. Requires approval for workspace-external or network access
- Full Access - Can read files, make edits, and run commands with network access, without approval
Read Only and Full Access are pretty obvious implementations. Read Only basically only allows access to `grep` and `cat`. And Full Access throws caution to the wind and just gives your agent full access to the terminal.
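One way to read the three modes is as a small decision table. The sketch below just encodes the descriptions from the `/approvals` list; the `Mode` and `Action` enums and `needs_approval` are names I made up for illustration, not types from the Codex codebase.

```rust
/// The three approval modes as listed by `/approvals`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    ReadOnly,
    Auto,
    FullAccess,
}

/// Hypothetical coarse categories of what an agent might try to do.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    ReadFile,
    EditInWorkspace,
    RunInWorkspace,
    TouchOutsideWorkspace,
    UseNetwork,
}

/// Returns true when the mode description says the user must approve first.
fn needs_approval(mode: Mode, action: Action) -> bool {
    match (mode, action) {
        // Full Access never asks.
        (Mode::FullAccess, _) => false,
        // Reading files is allowed in every mode.
        (_, Action::ReadFile) => false,
        // Read Only asks for edits, commands, and network access.
        (Mode::ReadOnly, _) => true,
        // Auto allows edits and commands inside the workspace...
        (Mode::Auto, Action::EditInWorkspace | Action::RunInWorkspace) => false,
        // ...but asks for anything outside it or on the network.
        (Mode::Auto, _) => true,
    }
}
```

Under this reading, a `swift build` that only touches the workspace sails through in Auto, while anything reaching outside the workspace would trigger a prompt.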
Auto is the most interesting one. Limiting network access and workspace-external access (even when running within a script itself) smelled like it was doing something more interesting, perhaps even containerized. But it's certainly not using Docker, so what's going on here?
Since everything's open source, you can take a look yourself. When running on macOS, it uses the native sandboxing APIs to enforce these restrictions.[2] The Swift toolchain lives outside the project directory, so of course the agent couldn't access it. I spent some time digging into how this actually works.
The execution pipeline
Every tool call flows through a centralized execution system that decides whether to run commands raw, under macOS Seatbelt, or through a Linux seccomp helper. The CLI bootstraps this by wrapping the main entry point in `arg0_dispatch_or_else`:
// 160:163:codex-rs/cli/src/main.rs
fn main() -> anyhow::Result<()> {
    arg0_dispatch_or_else(|codex_linux_sandbox_exe| async move {
        cli_main(codex_linux_sandbox_exe).await?;
        Ok(())
    })
}
This captures the executable path and passes it to the async runtime, so downstream components can spawn the Linux sandbox helper when they need it. The dispatcher checks whether the binary was invoked via the `codex-linux-sandbox` alias:
// 48:49:codex-rs/arg0/src/lib.rs
if exe_name == LINUX_SANDBOX_ARG0 {
    // Safety: [`run_main`] never returns.
If it was, it jumps straight into sandbox execution. Otherwise it loads environment variables, patches PATH, and calls the main CLI logic with the sandbox executable path available.
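The arg0 trick is easy to sketch in isolation: one binary, two personalities, picked by the name it was invoked under. The `Entry` enum and `dispatch_on_arg0` function below are hypothetical, and I'm assuming the `LINUX_SANDBOX_ARG0` constant holds the alias name mentioned above.

```rust
use std::path::Path;

/// Assumed value of the constant, based on the alias name in the article.
const LINUX_SANDBOX_ARG0: &str = "codex-linux-sandbox";

/// Hypothetical labels for the two personalities of the binary.
#[derive(Debug, PartialEq, Eq)]
enum Entry {
    SandboxHelper,
    MainCli,
}

/// Decide which code path to take from the name the binary was invoked as.
fn dispatch_on_arg0(argv0: &str) -> Entry {
    // Only the final path component matters, so a symlink named
    // `codex-linux-sandbox` in any directory selects the helper.
    let exe_name = Path::new(argv0)
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or("");
    if exe_name == LINUX_SANDBOX_ARG0 {
        Entry::SandboxHelper
    } else {
        Entry::MainCli
    }
}
```

In a real `main` you would feed it `std::env::args().next()`; the upside of this design is that the sandbox helper ships inside the same executable rather than as a second artifact to install.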
This architecture also makes sandboxing effectively opt-out, not opt-in. Every command execution goes through `process_exec_tool_call`, which routes based on `SandboxType`:
// 92:99:codex-rs/core/src/exec.rs
let raw_output_result: std::result::Result<RawExecToolCallOutput, CodexErr> = match sandbox_type
{
    SandboxType::None => exec(params, sandbox_policy, stdout_stream.clone()).await,
    SandboxType::MacosSeatbelt => {
        let ExecParams {
            command, cwd, env, ..
        } = params;
        let child = spawn_command_under_seatbelt(
By default, you're sandboxed.
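Here's a rough sketch of what "opt-out" means in practice: the platform sandbox is chosen unless something explicitly turns it off. The `select_sandbox` function and the `LinuxSeccomp` variant name are my assumptions (the excerpt above only shows `None` and `MacosSeatbelt`), and the fallback for other platforms is a guess rather than confirmed behavior.

```rust
#[derive(Debug, PartialEq, Eq)]
enum SandboxType {
    None,
    MacosSeatbelt,
    // Assumed variant name for the Linux helper path.
    LinuxSeccomp,
}

/// Pick a sandbox for a command. `os` is expected to look like
/// `std::env::consts::OS` ("macos", "linux", ...).
fn select_sandbox(os: &str, user_approved_unsandboxed: bool) -> SandboxType {
    // The only way out of the sandbox is an explicit approval.
    if user_approved_unsandboxed {
        return SandboxType::None;
    }
    match os {
        "macos" => SandboxType::MacosSeatbelt,
        "linux" => SandboxType::LinuxSeccomp,
        // Assumption: no native sandbox is available elsewhere.
        _ => SandboxType::None,
    }
}
```

Calling `select_sandbox(std::env::consts::OS, false)` gives you the default for whatever machine you're on.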
An aside on OS sandboxing
There are a bunch of different ways to isolate processes on modern operating systems. Linux gives you the most options, while macOS keeps things simpler (but more limited).
Linux combines several kernel features that have been available for a while now:
- Landlock (Linux 5.13+): Capability-based filesystem access control that's actually pretty straightforward to use
- Seccomp-BPF: System call filtering with programmable filters
- Namespaces: Process, network, mount, and user isolation
- Cgroups: Resource limiting and accounting
macOS has three main approaches, each with pretty significant limitations for agents:
- App Sandbox: Apple's preferred approach, designed for App Store apps with static entitlements
- Seatbelt: The older framework that App Sandbox is built on top of
- Virtualization.framework: Actually implementing virtualization/container technologies yourself. This is what Docker uses as its default backend these days.
Despite Apple marking Seatbelt as deprecated, it's still pretty widely used across the ecosystem. This thread on HN was a good read. Some choice insights:
"the sandbox subsystem is what all of Apple's system software uses for sandboxing, as well as many security-conscious third-party programs such as web browsers. It's not going anywhere anytime soon."
Some other issues. I'm quoting here since I haven't spent a whole lot of time poking around the edges of Seatbelt myself:
- Package management restrictions: Agents need to install dependencies on the fly. "If you use a native Mac sandbox then it'd need to ask permission to use homebrew. Maybe you can find a way to run homebrew inside a Mac sandbox but it won't be straightforward."
- Policy complexity leads to security holes: Custom Seatbelt policies are easy to mess up: "The bug I reported (in an AI agent sandbox using Seatbelt) was that it failed to properly block off access to the user's home directory dotfiles and ~/Library. This sort of mistake is easy to make with Seatbelt but harder to make with containers."
- Limited network control: Neither App Sandbox nor Seatbelt gives you the granular network controls you'd want. You can't easily restrict access to specific domains or protocols, or intercept traffic to block some bad actors. It's an all-or-nothing approach.
OS-level sandboxes work at the syscall level. They understand files, network sockets, and processes, but not higher-level concepts like "only allow HTTPS to api.openai.com" or "permit git operations but block arbitrary network access." So you're stuck making pretty coarse-grained decisions.
Platform-specific implementations
Now back to Codex. The `SandboxPolicy` struct drives both platform implementations. It specifies writable roots, network access, and other constraints that get translated into platform-specific restrictions.
macOS Seatbelt
On macOS, they're using Apple's Seatbelt framework. The implementation hardcodes `/usr/bin/sandbox-exec` and injects environment variables to identify sandboxed execution:
// 19:38:codex-rs/core/src/seatbelt.rs
pub async fn spawn_command_under_seatbelt(
    command: Vec<String>,
    sandbox_policy: &SandboxPolicy,
    cwd: PathBuf,
    stdio_policy: StdioPolicy,
    mut env: HashMap<String, String>,
) -> std::io::Result<Child> {
    let args = create_seatbelt_command_args(command, sandbox_policy, &cwd);
    let arg0 = None;
    env.insert(CODEX_SANDBOX_ENV_VAR.to_string(), "seatbelt".to_string());
    spawn_child_async(
        PathBuf::from(MACOS_PATH_TO_SEATBELT_EXECUTABLE),
        args,
        arg0,
        cwd,
        sandbox_policy,
        stdio_policy,
        env,
    )
    .await
}
They enumerate writable roots from the policy configuration but carve out `.git` directories as read-only:
// 59:73:codex-rs/core/src/seatbelt.rs
for (index, wr) in writable_roots.iter().enumerate() {
    // Canonicalize to avoid mismatches like /var vs /private/var on macOS.
    let canonical_root = wr.root.canonicalize().unwrap_or_else(|_| wr.root.clone());
    let root_param = format!("WRITABLE_ROOT_{index}");
    cli_args.push(format!(
        "-D{root_param}={}",
        canonical_root.to_string_lossy()
    ));
    if wr.read_only_subpaths.is_empty() {
        writable_folder_policies.push(format!("(subpath (param \"{root_param}\"))"));
    } else {
        // Add parameters for each read-only subpath and generate
        // the `(require-not ...)` clauses.
        let mut require_parts: Vec<String> = Vec::new();
Pretty clever. Agents can modify your workspace but can't mess up your git history.
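The excerpt cuts off before the read-only branch, so here's my guess at the shape of the clause it builds, as a standalone sketch. `writable_root_clause` is a hypothetical helper, the `_RO_` parameter naming is an assumption, and I'm relying on the `require-all` / `require-not` filter combinators from Seatbelt's profile language.

```rust
/// Build the `-D` parameters and the policy clause for one writable root,
/// carving out read-only subpaths such as `.git`.
/// Returns (extra CLI args, clause text).
fn writable_root_clause(index: usize, root: &str, read_only: &[&str]) -> (Vec<String>, String) {
    let root_param = format!("WRITABLE_ROOT_{index}");
    let mut cli_args = vec![format!("-D{root_param}={root}")];

    // No carve-outs: the whole subtree is writable.
    if read_only.is_empty() {
        return (cli_args, format!("(subpath (param \"{root_param}\"))"));
    }

    // With carve-outs: inside the root AND NOT inside any read-only subpath.
    let mut parts = vec![format!("(subpath (param \"{root_param}\"))")];
    for (i, subpath) in read_only.iter().enumerate() {
        // Hypothetical naming scheme for the read-only parameters.
        let ro_param = format!("{root_param}_RO_{i}");
        cli_args.push(format!("-D{ro_param}={subpath}"));
        parts.push(format!("(require-not (subpath (param \"{ro_param}\")))"));
    }
    (cli_args, format!("(require-all {})", parts.join(" ")))
}
```

For a workspace at `/work` with `/work/.git` protected, this yields a clause that permits writes anywhere under the root except the git directory. Passing paths as `-D` parameters instead of interpolating them into the policy text also sidesteps quoting problems with odd path names.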
Network access is straightforward - either you have it or you don't:
// 106:110:codex-rs/core/src/seatbelt.rs
let network_policy = if sandbox_policy.has_full_network_access() {
    "(allow network-outbound)\n(allow network-inbound)\n(allow system-socket)"
} else {
    ""
};
When network is disabled, they just omit the network permissions. Seatbelt denies by default, so that's all you need.
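To see why omitting the rules is enough, here's a toy version of how a deny-by-default profile might be stitched together. `build_policy` is a hypothetical function and the resulting text is deliberately incomplete: a real profile needs many more allowances (process execution, for starters) before anything would actually run under it.

```rust
/// Assemble a deny-by-default policy. Network rules are appended only when
/// network access is granted; leaving them out is what blocks the network.
fn build_policy(writable_clauses: &[String], network: bool) -> String {
    // Everything not explicitly allowed below is denied.
    let mut policy = String::from("(version 1)\n(deny default)\n(allow file-read*)\n");

    if !writable_clauses.is_empty() {
        policy.push_str(&format!(
            "(allow file-write* {})\n",
            writable_clauses.join(" ")
        ));
    }

    if network {
        // Same three rules as the excerpt above.
        policy.push_str(
            "(allow network-outbound)\n(allow network-inbound)\n(allow system-socket)\n",
        );
    }
    policy
}
```

With `network` set to false the word "network" never appears in the output, and the `(deny default)` line does the rest.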
Linux Landlock + seccomp
The Linux implementation is a bit more complex - they're using Landlock for filesystem restrictions and seccomp for system call filtering.
The `codex-linux-sandbox` helper is a separate binary that gets spawned as a subprocess. It parses the serialized policy and target command, applies restrictions, then exec's the actual command:
// 20:35:codex-rs/linux-sandbox/src/linux_run_main.rs
pub fn run_main() -> ! {
    let LandlockCommand {
        sandbox_policy_cwd,
        sandbox_policy,
        command,
    } = LandlockCommand::parse();
    if let Err(e) = apply_sandbox_policy_to_current_thread(&sandbox_policy, &sandbox_policy_cwd) {
        panic!("error running landlock: {e:?}");
    }
    if command.is_empty() {
        panic!("No command specified to execute.");
    }
    #[expect(clippy::expect_used)]
    let c_command =
This keeps the sandbox restrictions isolated to just the child process.
They apply both filesystem and network restrictions before the exec:
// 30:50:codex-rs/linux-sandbox/src/landlock.rs
pub(crate) fn apply_sandbox_policy_to_current_thread(
    sandbox_policy: &SandboxPolicy,
    cwd: &Path,
) -> Result<()> {
    if !sandbox_policy.has_full_network_access() {
        install_network_seccomp_filter_on_current_thread()?;
    }
    if !sandbox_policy.has_full_disk_write_access() {
        let writable_roots = sandbox_policy
            .get_writable_roots_with_cwd(cwd)
            .into_iter()
            .map(|writable_root| writable_root.root)
            .collect();
        install_filesystem_landlock_rules_on_current_thread(writable_roots)?;
    }
    // TODO(ragona): Add appropriate restrictions if
    // `sandbox_policy.has_full_disk_read_access()` is `false`.
    Ok(())
}
Landlock grants read access everywhere but restricts writes to whitelisted roots (plus `/dev/null`). The key is applying these rules before `execvp` - once the target command starts, it's already sandboxed.
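The effective write rule is simple enough to model in a few lines. To be clear, this is a userspace model of the policy, not how Landlock enforces it: the kernel does the real check against resolved paths, while the hypothetical `write_allowed` below just compares path components and ignores symlinks and `..`.

```rust
use std::path::{Path, PathBuf};

/// Model of the write policy: writes succeed only under a writable root,
/// with `/dev/null` as the one exception. Reads are not modeled because
/// they are allowed everywhere.
fn write_allowed(target: &Path, writable_roots: &[PathBuf]) -> bool {
    if target == Path::new("/dev/null") {
        return true;
    }
    // `Path::starts_with` matches whole components, so `/a/project-other`
    // is not considered to be inside `/a/project`.
    writable_roots.iter().any(|root| target.starts_with(root))
}
```

This is also a decent mental model for the clang failure earlier: if a build step tries to write somewhere outside the workspace, it gets a permissions error even though the same command works fine in your own shell.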
For network isolation, they use seccomp to block outbound network sockets (except AF_UNIX for local IPC):
// 87:109:codex-rs/linux-sandbox/src/landlock.rs
fn install_network_seccomp_filter_on_current_thread() -> std::result::Result<(), SandboxErr> {
    // Build rule map.
    let mut rules: BTreeMap<i64, Vec<SeccompRule>> = BTreeMap::new();
    // Helper – insert unconditional deny rule for syscall number.
    let mut deny_syscall = |nr: i64| {
        rules.insert(nr, vec![]); // empty rule vec = unconditional match
    };
    deny_syscall(libc::SYS_connect);
    deny_syscall(libc::SYS_accept);
    deny_syscall(libc::SYS_accept4);
    deny_syscall(libc::SYS_bind);
    deny_syscall(libc::SYS_listen);
    deny_syscall(libc::SYS_getpeername);
    deny_syscall(libc::SYS_getsockname);
    deny_syscall(libc::SYS_shutdown);
    deny_syscall(libc::SYS_sendto);
    deny_syscall(libc::SYS_sendmsg);
    deny_syscall(libc::SYS_sendmmsg);
    // NOTE: allowing recvfrom allows some tools like: `cargo clippy` to run
    // with their socketpair + child processes for sub-proc management
This is more granular than what you get with Seatbelt on macOS.
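Since the excerpt is truncated before the socket rule, here's a plain-Rust model of the decision I believe the filter ends up making. `NetCall`, `Verdict`, and `network_verdict` are hypothetical, and the AF_UNIX handling is my assumption based on the description above, not something visible in the snippet. Nothing here installs a real seccomp filter.

```rust
/// Address family constant as defined on Linux.
const AF_UNIX: i32 = 1;

#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Allow,
    Deny,
}

/// A small, hypothetical subset of the network-related syscalls.
#[derive(Debug, Clone, Copy)]
enum NetCall {
    Socket { domain: i32 },
    Connect,
    Bind,
    Listen,
    Accept,
    SendTo,
    RecvFrom,
}

/// Model of the filter's decision for a single syscall.
fn network_verdict(call: NetCall) -> Verdict {
    match call {
        // Assumption: socket creation is permitted only for local IPC.
        NetCall::Socket { domain } if domain == AF_UNIX => Verdict::Allow,
        NetCall::Socket { .. } => Verdict::Deny,
        // Left open so socketpair-based tools keep working (see the NOTE).
        NetCall::RecvFrom => Verdict::Allow,
        // Unconditional denies, mirroring the `deny_syscall` list.
        NetCall::Connect
        | NetCall::Bind
        | NetCall::Listen
        | NetCall::Accept
        | NetCall::SendTo => Verdict::Deny,
    }
}
```

The extra granularity comes from being able to decide per syscall, and even per argument, rather than flipping a single network switch.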
Child process management
All sandboxed processes go through `spawn_child_async`, which handles environment cleanup and stdio management. They completely clear the environment and rebuild it with only the variables you actually want:
// 38:61:codex-rs/core/src/spawn.rs
pub(crate) async fn spawn_child_async(
    program: PathBuf,
    args: Vec<String>,
    #[cfg_attr(not(unix), allow(unused_variables))] arg0: Option<&str>,
    cwd: PathBuf,
    sandbox_policy: &SandboxPolicy,
    stdio_policy: StdioPolicy,
    env: HashMap<String, String>,
) -> std::io::Result<Child> {
    trace!(
        "spawn_child_async: {program:?} {args:?} {arg0:?} {cwd:?} {sandbox_policy:?} {stdio_policy:?} {env:?}"
    );
    let mut cmd = Command::new(&program);
    #[cfg(unix)]
    cmd.arg0(arg0.map_or_else(|| program.to_string_lossy().to_string(), String::from));
    cmd.args(args);
    cmd.current_dir(cwd);
    cmd.env_clear();
    cmd.envs(env);
    if !sandbox_policy.has_full_network_access() {
        cmd.env(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR, "1");
    }
This prevents leaking sensitive environment variables. Network-disabled runs get tagged with `CODEX_SANDBOX_NETWORK_DISABLED=1` so downstream tools know what's happening. Stdio can be either piped (for capturing output) or inherited (for interactive commands).
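The same clear-then-rebuild pattern works with the synchronous `std::process::Command`, which makes it easy to try out without an async runtime. This is a simplified sketch, not the Codex function: `build_command` is a hypothetical helper, and it takes a plain `network_enabled` flag in place of the real `SandboxPolicy`.

```rust
use std::collections::HashMap;
use std::process::Command;

/// Variable name taken from the article.
const NETWORK_DISABLED_VAR: &str = "CODEX_SANDBOX_NETWORK_DISABLED";

/// Build a command whose environment contains only the variables in `env`,
/// plus a marker when the network is disabled. Nothing is spawned here.
fn build_command(
    program: &str,
    args: &[&str],
    env: &HashMap<String, String>,
    network_enabled: bool,
) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args);
    // Drop everything inherited from the parent, then add back the allowlist.
    cmd.env_clear();
    cmd.envs(env);
    if !network_enabled {
        cmd.env(NETWORK_DISABLED_VAR, "1");
    }
    cmd
}
```

You can check the result with `cmd.get_envs()` before spawning. An API key exported in your shell never reaches the child unless you put it in `env` yourself.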
On Linux, they request a parent-death signal using `prctl(PR_SET_PDEATHSIG)`:
// 68:69:codex-rs/core/src/spawn.rs
#[cfg(target_os = "linux")]
unsafe {
This ensures sandboxed children die if the main process gets killed - you don't want orphaned processes running around.
Command whitelisting
Before executing any command, they run it through `assess_command_safety`:
// 81:99:codex-rs/core/src/safety.rs
pub fn assess_command_safety(
    command: &[String],
    approval_policy: AskForApproval,
    sandbox_policy: &SandboxPolicy,
    approved: &HashSet<Vec<String>>,
    with_escalated_permissions: bool,
) -> SafetyCheck {
    // A command is "trusted" because either:
    // - it belongs to a set of commands we consider "safe" by default, or
    // - the user has explicitly approved the command for this session
    //
    // Currently, whether a command is "trusted" is a simple boolean, but we
    // should include more metadata on this command test to indicate whether it
    // should be run inside a sandbox or not. (This could be something the user
    // defines as part of `execpolicy`.)
    //
    // For example, when `is_known_safe_command(command)` returns `true`, it
    // would probably be fine to run the command in a sandbox, but when
    // `approved.contains(command)` is `true`, the user may have approved it for
This uses trust lists and approval policies to categorize commands: safe to auto-run, needs user approval, or needs to run unsandboxed.
The trust list is session-scoped - once you approve a command, it's trusted for the rest of that session.
If a sandboxed command fails due to permissions, the system asks if you want to retry unsandboxed. If you approve, it marks the command as trusted and re-executes with `SandboxType::None`. Pretty reasonable escape hatch for when you actually need broader access.
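The retry flow reads roughly like the sketch below. All the names are hypothetical (`Session`, `Outcome`, `run_with_escalation`, and the two-variant `SandboxType`), the executor and the approval prompt are passed in as closures so the logic is testable, and running already-trusted commands unsandboxed on later calls is my assumption about what "trusted" implies.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Ok,
    PermissionDenied,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxType {
    None,
    Platform,
}

/// Session-scoped trust list keyed by the full argument vector.
struct Session {
    approved: HashSet<Vec<String>>,
}

impl Session {
    /// Run sandboxed first; on a permission failure, ask the user and
    /// retry unsandboxed if they agree.
    fn run_with_escalation(
        &mut self,
        command: &[String],
        mut exec: impl FnMut(&[String], SandboxType) -> Outcome,
        mut ask_user: impl FnMut(&[String]) -> bool,
    ) -> Outcome {
        // Assumption: previously trusted commands skip the sandbox.
        let sandbox = if self.approved.contains(command) {
            SandboxType::None
        } else {
            SandboxType::Platform
        };

        let first = exec(command, sandbox);
        if first == Outcome::PermissionDenied
            && sandbox == SandboxType::Platform
            && ask_user(command)
        {
            // Remember the approval for the rest of the session.
            self.approved.insert(command.to_vec());
            return exec(command, SandboxType::None);
        }
        first
    }
}
```

The nice property is that the prompt only shows up after the sandbox has actually gotten in the way, so a command that never trips a restriction never costs you an interruption.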
Debugging support
While I was poking around the code I also noticed that they expose some debug commands to help test sandboxing. `codex debug seatbelt` and `codex debug landlock` let you test arbitrary commands through the sandbox. These honor the same `--full-auto` flag as the main CLI.
Conclusion
This implementation is quite nice considering the main goal of Codex is improving agents and not containerization.[3] OS-native enforcement lets them avoid a lot of overhead while still getting pretty good isolation.
The key design choices that stood out to me:
- Platform-specific implementations unified behind a common policy abstraction
- Default-sandbox execution with selective escalation when needed
- Session-based trust lists that reduce approval fatigue
- Debug tooling that actually helps you understand what's happening
As more people start using agents to write and execute code, this kind of sandboxing is going to be pretty important. The alternative, asking the agent pretty please not to `rm -rf` your home directory, doesn't seem like a great long-term plan.
[1] It's a SOTA architecture. All these models are basically in the same ballpark.
[2] Linux, flexible in all things, lets you do this customization with even lower level permissioning. Half the reason why Docker thrives on Linux.
[3] But I suppose it's an acknowledgement that one is growing more tightly bound to the other.