Skip to content

shell-command: speed up Windows PowerShell safety tests#16056

Closed
bolinfest wants to merge 1 commit intomainfrom
pr16056
Closed

shell-command: speed up Windows PowerShell safety tests#16056
bolinfest wants to merge 1 commit intomainfrom
pr16056

Conversation

@bolinfest
Copy link
Copy Markdown
Collaborator

@bolinfest bolinfest commented Mar 27, 2026

Why

//codex-rs/shell-command:shell-command-unit-tests became a real bottleneck in the Windows Bazel lane because the windows_safe_commands tests were repeatedly calling is_safe_command_windows(), and that helper shells out to a real PowerShell parser process for every powershell.exe -Command ... assertion.

In the Windows Bazel run that motivated this extraction, the target hit Bazel's 300s timeout while several tests in this module were still reported as "running for over 60 seconds". That makes this a key performance improvement: it removes repeated PowerShell startup cost from a hot unit-test target without weakening the safety checks we actually care about, which should materially improve the Windows Bazel feedback loop.

This change was split out of #16053 so it can land immediately on its own.

What

  • add a small test helper that asserts is_safe_powershell_command() directly on already-parsed words
  • move parser-independent negative cases, including the unsafe git global override checks and obvious side-effecting PowerShell commands, onto that direct helper path
  • keep the true parser-integration coverage on is_safe_command_windows(), but limit it to cases that still need a real PowerShell AST parse

The net effect is that we still verify both layers:

  • the parser-independent safelist logic
  • the PowerShell parser integration points

but we stop paying interpreter startup cost for every single negative test case.

Testing

  • cargo test -p codex-shell-command
  • just argument-comment-lint
  • the changed test module is Windows-only, so the main end-to-end verification for the extracted improvement is this PR's Windows Bazel CI
@bolinfest bolinfest changed the title fix: speed up test that uses PowerShell Mar 27, 2026
Copy link
Copy Markdown
Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a547507e48

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +488 to +490
&["Write-Output", "(Set-Content", "foo6.txt", "abc)"][..],
&["Write-Host", "(Remove-Item", "foo.txt)"][..],
&["Get-Content", "(New-Item", "bar.txt)"][..],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep nested side-effect cases on the parser code path

These cases were moved from is_safe_command_windows(..., "-Command", script) to is_safe_powershell_command with hand-built tokens, so they no longer validate the AST parser + safelist integration that production uses. If parser tokenization changes for parenthesized expressions (for example, returning one combined argument), this test can still pass while is_safe_command_windows starts allowing a side-effecting script. Please keep these nested-expression examples in a parser-path test.

Useful? React with 👍 / 👎.

Comment on lines +451 to +453
for words in [
&["git", "-c", "core.pager=cat", "show", "HEAD:foo.rs"][..],
&[
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve end-to-end test for unsafe git global options

Testing these only as pre-tokenized words bypasses PowerShell parsing, which is where option splitting/normalization happens for real -Command input. A regression in parser output (e.g., collapsing option+value or dropping a token) could make an unsafe git invocation appear safe, and this test would no longer catch it. At least one representative case should still go through is_safe_command_windows with a script string.

Useful? React with 👍 / 👎.

@bolinfest
Copy link
Copy Markdown
Collaborator Author

This was not the right fix: this is "cheating" in that it does the work for the powershell parser and therefore no longer verifies the powershell parser. Abandoning in favor of #16057.

@bolinfest bolinfest closed this Mar 28, 2026
bolinfest added a commit that referenced this pull request Mar 28, 2026
## Why

`//codex-rs/shell-command:shell-command-unit-tests` became a real
bottleneck in the Windows Bazel lane because repeated calls to
`is_safe_command_windows()` were starting a fresh PowerShell parser
process for every `powershell.exe -Command ...` assertion.

PR #16056 was motivated by that same bottleneck, but its test-only
shortcut was the wrong layer to optimize because it weakened the
end-to-end guarantee that our runtime path really asks PowerShell to
parse the command the way we expect.

This PR attacks the actual cost center instead: it keeps the real
PowerShell parser in the loop, but turns that parser into a long-lived
helper process so both tests and the runtime safe-command path can reuse
it across many requests.

## What Changed

- add `shell-command/src/command_safety/powershell_parser.rs`, which
keeps one mutex-protected parser process per PowerShell executable path
and speaks a simple JSON-over-stdio request/response protocol
- turn `shell-command/src/command_safety/powershell_parser.ps1` into a
long-running parser server with comments explaining the protocol, the
AST-shape restrictions, and why unsupported constructs are rejected
conservatively
- keep request ids and a one-time respawn path so a dead or
desynchronized cached child fails closed instead of silently returning
mixed parser output
- preserve separate parser processes for `powershell.exe` and
`pwsh.exe`, since they do not accept the same language surface
- avoid a direct `PipelineChainAst` type reference in the PowerShell
script so the parser service still runs under Windows PowerShell 5.1 as
well as newer `pwsh`
- make `shell-command/src/command_safety/windows_safe_commands.rs`
delegate to the new parser utility instead of spawning a fresh
PowerShell process for every parse
- add a Windows-only unit test that exercises multiple sequential
requests against the same parser process

## Testing

- adds a Windows-only parser-reuse unit test in `powershell_parser.rs`
- the main end-to-end verification for this change is the Windows CI
lane, because the new service depends on real `powershell.exe` /
`pwsh.exe` behavior
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

1 participant