Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a547507e48
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| &["Write-Output", "(Set-Content", "foo6.txt", "abc)"][..], | ||
| &["Write-Host", "(Remove-Item", "foo.txt)"][..], | ||
| &["Get-Content", "(New-Item", "bar.txt)"][..], |
There was a problem hiding this comment.
Keep nested side-effect cases on the parser code path
These cases were moved from is_safe_command_windows(..., "-Command", script) to is_safe_powershell_command with hand-built tokens, so they no longer validate the AST parser + safelist integration that production uses. If parser tokenization changes for parenthesized expressions (for example, returning one combined argument), this test can still pass while is_safe_command_windows starts allowing a side-effecting script. Please keep these nested-expression examples in a parser-path test.
Useful? React with 👍 / 👎.
| for words in [ | ||
| &["git", "-c", "core.pager=cat", "show", "HEAD:foo.rs"][..], | ||
| &[ |
There was a problem hiding this comment.
Preserve end-to-end test for unsafe git global options
Testing these only as pre-tokenized words bypasses PowerShell parsing, which is where option splitting/normalization happens for real -Command input. A regression in parser output (e.g., collapsing option+value or dropping a token) could make an unsafe git invocation appear safe, and this test would no longer catch it. At least one representative case should still go through is_safe_command_windows with a script string.
Useful? React with 👍 / 👎.
|
This was not the right fix: this is "cheating" in that it does the work for the powershell parser and therefore no longer verifies the powershell parser. Abandoning in favor of #16057. |
## Why `//codex-rs/shell-command:shell-command-unit-tests` became a real bottleneck in the Windows Bazel lane because repeated calls to `is_safe_command_windows()` were starting a fresh PowerShell parser process for every `powershell.exe -Command ...` assertion. PR #16056 was motivated by that same bottleneck, but its test-only shortcut was the wrong layer to optimize because it weakened the end-to-end guarantee that our runtime path really asks PowerShell to parse the command the way we expect. This PR attacks the actual cost center instead: it keeps the real PowerShell parser in the loop, but turns that parser into a long-lived helper process so both tests and the runtime safe-command path can reuse it across many requests. ## What Changed - add `shell-command/src/command_safety/powershell_parser.rs`, which keeps one mutex-protected parser process per PowerShell executable path and speaks a simple JSON-over-stdio request/response protocol - turn `shell-command/src/command_safety/powershell_parser.ps1` into a long-running parser server with comments explaining the protocol, the AST-shape restrictions, and why unsupported constructs are rejected conservatively - keep request ids and a one-time respawn path so a dead or desynchronized cached child fails closed instead of silently returning mixed parser output - preserve separate parser processes for `powershell.exe` and `pwsh.exe`, since they do not accept the same language surface - avoid a direct `PipelineChainAst` type reference in the PowerShell script so the parser service still runs under Windows PowerShell 5.1 as well as newer `pwsh` - make `shell-command/src/command_safety/windows_safe_commands.rs` delegate to the new parser utility instead of spawning a fresh PowerShell process for every parse - add a Windows-only unit test that exercises multiple sequential requests against the same parser process ## Testing - adds a Windows-only parser-reuse unit test in `powershell_parser.rs` - the main end-to-end verification for this change is the Windows CI lane, because the new service depends on real `powershell.exe` / `pwsh.exe` behavior
Why
//codex-rs/shell-command:shell-command-unit-testsbecame a real bottleneck in the Windows Bazel lane because thewindows_safe_commandstests were repeatedly callingis_safe_command_windows(), and that helper shells out to a real PowerShell parser process for everypowershell.exe -Command ...assertion.In the Windows Bazel run that motivated this extraction, the target hit Bazel's 300s timeout while several tests in this module were still reported as "running for over 60 seconds". That makes this a key performance improvement: it removes repeated PowerShell startup cost from a hot unit-test target without weakening the safety checks we actually care about, which should materially improve the Windows Bazel feedback loop.
This change was split out of #16053 so it can land immediately on its own.
What
is_safe_powershell_command()directly on already-parsed wordsis_safe_command_windows(), but limit it to cases that still need a real PowerShell AST parseThe net effect is that we still verify both layers:
but we stop paying interpreter startup cost for every single negative test case.
Testing
cargo test -p codex-shell-commandjust argument-comment-lint