Protect first-time project .codex creation across Linux and macOS sandboxes#15067
Protect first-time project .codex creation across Linux and macOS sandboxes#15067rreichel3-oai merged 12 commits intomainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dd61525cf7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| @@ -637,31 +654,24 @@ fn create_seatbelt_args_with_read_only_git_and_codex_subpaths() { | |||
| .map(std::string::ToString::to_string) | |||
| .collect(); | |||
| let args = create_seatbelt_command_args(shell_command.clone(), &policy, &cwd, false, None); | |||
|
|
|||
There was a problem hiding this comment.
Can we get back to this? It's much clearer to have one assert_eq!().
| @@ -142,8 +150,10 @@ fn explicit_unreadable_paths_are_excluded_from_full_disk_read_and_write_access() | |||
| "expected read carveout parameter in args: {args:#?}" | |||
| ); | |||
| assert!( | |||
| args.iter() | |||
| .any(|arg| arg == &format!("-DWRITABLE_ROOT_0_RO_0={}", unreadable_root.display())), | |||
| args.iter().any(|arg| { | |||
There was a problem hiding this comment.
Is there any reason the arg list is unstable? Shouldn't we be able to match arg exactly instead of using prefix/suffix checks?
There was a problem hiding this comment.
Yes, fixed. This now matches the generated arg exactly instead of using prefix/suffix checks.
codex-rs/core/src/seatbelt.rs
Outdated
| @@ -388,6 +388,12 @@ fn build_seatbelt_access_policy( | |||
| normalize_path_for_sandbox(excluded_subpath.as_path()).unwrap_or(excluded_subpath); | |||
| let excluded_param = format!("{param_prefix}_{index}_RO_{excluded_index}"); | |||
There was a problem hiding this comment.
Hmm, so _RO_ means "read-only," but are we also using it for "unreadable" subpaths here?
codex-rs/core/src/seatbelt_tests.rs
Outdated
| .map(std::string::ToString::to_string) | ||
| .collect(); | ||
| let args = | ||
| create_seatbelt_command_args(shell_command, &policy, repo_root.as_path(), false, None); |
There was a problem hiding this comment.
This is a reminder to myself that we really need to get away from SandboxPolicy in favor of FileSystemPermissions.
| @@ -797,6 +812,82 @@ fn create_seatbelt_args_with_read_only_git_and_codex_subpaths() { | |||
| ); | |||
| } | |||
|
|
|||
| #[test] | |||
| fn create_seatbelt_args_block_first_time_dot_codex_creation() { | |||
There was a problem hiding this comment.
I really like this test, though I feel it should ideally live in codex-rs/exec/tests/suite/sandbox.rs because both our Mac and Linux sandboxing should be enforcing this, right?
| if top_level_codex.as_path().is_dir() { | ||
| subpaths.push(top_level_codex); | ||
| } | ||
| #[allow(clippy::expect_used)] |
There was a problem hiding this comment.
For these, we should check to see if the user has an existing rule.
Though I'll be honest: I think I need to go back and change FileSystemPermissions to use a map like we do in the config file.
There was a problem hiding this comment.
Updated!
I think that would be helpful but would prefer to keep this PR focused
7319031 to
c121825
Compare
ead5148 to
aa6bf08
Compare
Reserve missing top-level project .codex paths in the protocol layer so first-time creation still goes through protected-path approval. Update macOS Seatbelt to deny both the exact protected path and descendants, and add regression coverage for protocol, seatbelt, and apply_patch safety behavior.
Update codex-rs/protocol/src/permissions.rs to skip proactive cwd-root protection when legacy workspace-write conversion receives a relative cwd and add regression coverage for the relative-path case.
Co-authored-by: Michael Bolin <mbolin@openai.com>
Rename Seatbelt excluded-subpath parameters to avoid read-only wording, tighten the generated-arg assertions, and keep the first-time .codex Seatbelt unit test focused on policy shape. Move runtime first-time .codex creation coverage into the Unix sandbox integration suite and skip default protocol carveouts when an explicit user rule already exists for that path.
Assert the full writable-root exclusion slice in the full-disk unreadable-root test so the default /.codex carveout and the explicit unreadable carveout are both checked in stable order.
Treat bwrap's masked .codex path as a non-directory instead of requiring the path to be absent, while still asserting that .codex/config.toml was not created.
aa6bf08 to
c2acfa6
Compare
| #[cfg(target_os = "linux")] | ||
| let sandbox_env = match linux_sandbox_test_env().await { | ||
| Some(env) => env, | ||
| None => return, |
There was a problem hiding this comment.
Is this an error: should it panic!?
There was a problem hiding this comment.
I don’t think this should panic.
None here means the Linux capability probe determined that Landlock is not actually enforceable on the current host, so this test should skip rather than fail. The helper is being used as a skip gate, not as an assertion that the environment must support Linux sandbox enforcement.
| "/".to_string(), | ||
| "--ro-bind".to_string(), | ||
| "/dev/null".to_string(), | ||
| "/.codex".to_string(), |
There was a problem hiding this comment.
This is .codex in the filesystem root?
There was a problem hiding this comment.
Incidentally, I find this hard to read: we should probably have a comment before each group of three to clarify what it's doing?
There was a problem hiding this comment.
Yes. In this test the writable root is /, so the default protected subpath under that root shows up as /.codex in the generated bwrap args.
Added some comments!
| #[allow(clippy::expect_used)] | ||
| let top_level_agents = writable_root.join(".agents").expect("valid relative path"); | ||
| if top_level_agents.as_path().is_dir() { | ||
| subpaths.push(top_level_agents); | ||
| } |
There was a problem hiding this comment.
Can/should we treat .agents the same way if it is missing as we do with .codex?
There was a problem hiding this comment.
Afaik, we don't exec code directly out of .agents the same way MCP's can be spawned in config.toml. The config allows a sandbox-bypass.
| .iter() | ||
| .map(std::string::ToString::to_string) | ||
| .collect(); | ||
| let args = create_seatbelt_command_args_with_extensions( |
There was a problem hiding this comment.
Should we run the command and verify that it's blocked? Or that's covered by codex-rs/exec/tests/suite/sandbox.rs?
There was a problem hiding this comment.
That runtime behavior is covered by codex-rs/exec/tests/suite/sandbox.rs.
The Seatbelt test here is intentionally narrower: it only checks the policy shape for denying both the exact .codex path and descendants. The end-to-end “command is actually blocked” coverage lives in the exec sandbox test.
| &policy, | ||
| repo_root.as_path(), | ||
| /*enforce_managed_network*/ false, | ||
| None, |
There was a problem hiding this comment.
Hmm, why doesn't the linter flag this None as needing a comment. Will have to ask Codex. I've seen this other times, too.
There was a problem hiding this comment.
I think this is just a lint gap rather than an intentional exemption.
I added a comment for clarity.
Problem
Codex already treated an existing top-level project
./.codexdirectory as protected, but there was a gap on first creation.If
./.codexdid not exist yet, a turn could create files under it, such as./.codex/config.toml, without going through the same approval path as later modifications. That meant the initial write could bypass the intended protection for project-local Codex state.What this changes
This PR closes that first-creation gap in the Unix enforcement layers:
codex-protocol./.codexpath as a protected carveout even when it does not exist yet./.codexitself is blocked in addition to writes inside it./.codex./.codexand explicit-rule collisions./.codexcreationScope
This change is intentionally scoped to protecting the top-level project
.codexsubtree from agent writes.It does not make
.codexunreadable, and it does not change the product behavior around loading project skills from.codexwhen project config is untrusted.Why this shape
The fix is pointed rather than broad:
.codexis protected from writes”Validation
cargo test -p codex-protocolcargo test -p codex-sandboxing seatbeltcargo test -p codex-exec --test all sandbox_blocks_first_time_dot_codex_creation -- --nocapture