Skip to content

[ci] Bug Fix: Trusted-publishing follow-ups (TTY-aware setup script, unify nightly, drop NPM_TOKEN)#8597

Merged
etrepum merged 6 commits into
mainfrom
claude/trusted-release-workflow
May 31, 2026
Merged

[ci] Bug Fix: Trusted-publishing follow-ups (TTY-aware setup script, unify nightly, drop NPM_TOKEN)#8597
etrepum merged 6 commits into
mainfrom
claude/trusted-release-workflow

Conversation

@etrepum

@etrepum etrepum commented May 30, 2026

Copy link
Copy Markdown
Collaborator

Description

Follow-up to #8587. The first real dev-channel test publish under OIDC failed end-to-end because the helper script's pre-check was wrong, npm enforces a constraint I didn't account for, and the dual-auth scaffolding hid the actual failure mode. Three changes:

1. Fix setup-trusted-publishing.mjs pre-check (scripts/npm/setup-trusted-publishing.mjs)

pnpm run setup-trusted-publishing --setup-trust --replace was supposed to detect each package's existing call-release.yml trust config, revoke it, and add the new pre-release.yml one. Instead it reported unable to check for every package, fell back to "try anyway", which skipped the revoke step, and returned E409 Conflict for all 31 packages.

The pre-check shells out to npm trust list <pkg> --json to read the existing config. npm trust list requires auth and can require OTP — I had incorrectly documented it as read-only / no-OTP. npm only runs its interactive web-auth/OTP recovery when both process.stdin.isTTY and process.stdout.isTTY are true. The pre-check was capturing stdout (necessary to parse JSON), which makes stdout a non-TTY, so npm fell through to throwing EOTP immediately. The script's catch then collapsed every distinct failure into a single "unable to check" log line, hiding the real cause.

Fix:

  • New warmTrustAuth(): runs one npm trust list <firstPkg> with stdio: 'inherit' (full TTY) before the per-package loop. Lets the user complete web-auth once and choose "Skip 2FA for the next 5 minutes"; subsequent captured reads then succeed inside the window because they hit the warmed session instead of a fresh OTP challenge.
  • fetchTrustConfigs now returns {configs} or {error} instead of an array-or-null, and the loop prints the real npm message (unable to check (401 Unauthorized …)) instead of a bare string.
  • Deleted the old hand-rolled token resolver (fetchAuthToken, readNpmrcAuthTokens, escapedName) that read ~/.npmrc manually — npm's own auth resolution is what works, and that's what's used now via the CLI.

2. Unify the nightly schedule into pre-release.yml

After running real test publishes I confirmed two npm constraints that change the workflow shape:

  • npm matches trust against the OIDC workflow_ref claim (the calling workflow), same as PyPI. The earlier setup pointed trust at the reusable call-release.yml, which never appears in that claim, so every publish 404'd. The setup script now defaults --workflow to pre-release.yml.
  • npm enforces one trust configuration per package. POSTing a second one (even for a different filename) returns E409 while the old one is present. The setup script's --replace flag now revokes the existing non-matching config by id and adds the new one, so the migration from call-release.yml to pre-release.yml is a single command instead of 30+ manual revoke/add pairs.

To make the one-config rule actually workable, every publish has to come through the same caller workflow. Deleted nightly-release.yml and folded its schedule: trigger into pre-release.yml, branching on github.event_name:

  • schedule path hardcodes nightly behavior (increment-version=true, channel=nightly, ref=main, ignore-previously-published=true)
  • workflow_dispatch path threads the inputs through as before
  • guard job's "no increment-version + latest" check is restricted to workflow_dispatch (schedule is always nightly)
  • channel choice list adds nightly so a manual nightly retry is possible

3. Drop NPM_TOKEN fallback and the use-trusted-publishing toggle

Now that OIDC is confirmed working, the dual-auth scaffolding is dead weight that hides failures. Removed from call-release.yml and pre-release.yml:

  • use-trusted-publishing workflow_dispatch input (forced OIDC when NPM_TOKEN was set)
  • secrets.NPM_TOKEN declaration on call-release.yml and the passthrough from pre-release.yml
  • The Configure npm auth step (NODE_AUTH_TOKEN=$NPM_TOKEN setup, mode-detection logging)
  • The Verify npm authentication step (npm whoami preflight — only meaningful for the persistent-token path; OIDC mints a per-call token at publish time)

What stays: permissions: id-token: write, actions/setup-node with registry-url, and pnpm publish --provenance. A misconfigured trust setup now fails loudly at publish instead of silently falling through to token auth.

Defaults review

Confirmed pre-release.yml's workflow_dispatch defaults still match step 4 of the Release Procedure ("publish to NPM after the version PR is merged to main"):

Input Default Step-4 expectation
ref main publish whatever main has now
channel latest real release goes to default dist-tag
increment-version unchecked version.yml already bumped on the merged PR
ignore-previously-published unchecked fail loudly on duplicate version

So a normal release run is "Actions → Publish to NPM → Run workflow → Run", no toggling.

Test plan

Before

  • pnpm run setup-trusted-publishing --setup-trust --replace reported unable to check; will try anyway for every package, skipped the revoke step, and got E409 on every add.
  • The dev-channel manual test publish from a PR branch produced 31 E404 Not Found - PUT https://registry.npmjs.org/<pkg> errors after provenance signing succeeded — npm matched the OIDC token's workflow_ref (pre-release.yml) against a stored config that said call-release.yml.
  • The publish workflow accepted NPM_TOKEN as a silent fallback, so a misconfigured OIDC setup wouldn't necessarily be visible.

After

  • pnpm install --frozen-lockfile clean on pnpm 10.34.1
  • pnpm run ci-check — green (tsc, tsc-scripts, tsc-extension, tsc-website, flow, prettier, lint)
  • Both edited workflow YAMLs parse cleanly (yaml.safe_load)
  • scripts/npm/setup-trusted-publishing.mjsnode --check clean; symbol audit shows zero orphan helpers from the old token-resolver path
  • pnpm run setup-trusted-publishing --setup-trust --replace on a laptop with npm whoami working: warmTrustAuth opens web-auth once; after ticking "Skip 2FA for 5 minutes" the per-package loop reports CONFLICT: existing config(s) for [call-release.yml] for each, revokes them, and registers pre-release.yml. Subsequent runs report already configured for all.
  • Re-run the dev-channel test publish (Actions → Publish to NPM, branch + ref = PR branch, channel = dev, increment-version = ✓) — should publish 0.X.Y-dev.0 under the dev dist-tag via OIDC with provenance; latest untouched.
  • Confirm the next scheduled nightly (Mon 02:30 UTC) goes through pre-release.yml's schedule path with event_name == 'schedule' and produces the same 0.X.Y-nightly.YYYYMMDD artifact as before.
…'s own auth)

The pre-check resolved its own npm token by hand (NPM_TOKEN env, then
parsing ./.npmrc and ~/.npmrc, then `npm config get`) and made a raw
fetch() to /-/package/<name>/trust. After a browser-based `npm login`
on macOS that lookup comes up empty, so every package returned "unable
to check; will try anyway" with existing=[]. In --replace mode that
empty existing meant nothing got revoked, so `npm trust github` then
hit E409 against the still-present call-release.yml config for every
package — the migration never happened.

Replace the hand-rolled token + fetch with `npm trust list <pkg>
--json`, which uses npm's own auth resolution (the same path npm whoami
and npm trust github already use) and is a read, so no OTP. Parse
defensively: 404 / empty => no config, JSON error body or non-zero
exit with stderr => unknown, otherwise the returned config(s).

config matching reads both the flattened shape from `npm trust list
--json` ({file, repository, environment}) and the raw registry shape
({claims:{workflow_ref:{file}}}) via configFile/configRepo/
configEnvironment helpers, so conflicts are detected and their ids are
available for --replace to revoke. Removes the now-unused
fetchAuthToken / readNpmrcAuthTokens / escapedName helpers and warns
when any package couldn't be pre-checked.

https://claude.ai/code/session_01UEAeW4ZkfApZnrJTGc6Xm9
@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 30, 2026
@vercel

vercel Bot commented May 30, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
lexical Ready Ready Preview, Comment May 30, 2026 4:52pm
lexical-playground Ready Ready Preview, Comment May 30, 2026 4:52pm

Request Review

claude added 2 commits May 30, 2026 16:15
Root cause of the persistent "unable to check" for every package:
`npm trust list` DOES require auth and may require OTP. npm only runs
its interactive web-auth/OTP recovery when both stdin AND stdout are
TTYs (see otplease in npm's source). Both the original raw-fetch
pre-check and the `npm trust list` rewrite ran under exec()/piped
stdio, so npm hit the OTP branch, couldn't recover, and threw — which
the pre-check swallowed into "unable to check". (Run by hand in a
terminal it works because both streams are TTYs and the cached "skip
2FA" session satisfies it instantly.)

Fixes:
- Add warmTrustAuth(): one fully-interactive `npm trust list` (stdio
  inherited) before the per-package loop so npm can do web-auth and the
  user can pick "skip 2FA for 5 minutes". The captured reads that
  follow then succeed within the window.
- runNpm() gains captureStdout so the same spawn helper can both stream
  the OTP URL (inherited) and capture stdout for JSON parsing.
- fetchTrustConfigs() returns {configs}|{error} instead of array|null,
  and the caller prints the real error (e.g. the npm stderr summary)
  instead of a bare "unable to check", so failures are never silent
  again.

https://claude.ai/code/session_01UEAeW4ZkfApZnrJTGc6Xm9
…stConfigs

Follow-up to the TTY/OTP fix — the JSDoc still claimed `npm trust list`
never triggers OTP, which is exactly the wrong assumption that caused
the bug. Document that it captures stdout and relies on warmTrustAuth
having established the session.

https://claude.ai/code/session_01UEAeW4ZkfApZnrJTGc6Xm9
claude added 3 commits May 30, 2026 16:40
…orkflow)

npm enforces one trust configuration per package, so every publish has
to come through the same caller workflow for OIDC to keep working.
This was done before but was lost when the previous PR landed as a
squash that kept nightly-release.yml — redoing it on top of main.

Delete nightly-release.yml; fold its schedule trigger into
pre-release.yml with a github.event_name branch that hardcodes the
nightly behavior on the schedule path (increment-version=true,
channel=nightly, ref=main, ignore-previously-published=true) and
otherwise threads inputs through.

The channel choice list adds 'nightly' for manual dispatch; the guard
job's "increment-version + latest is forbidden" check is restricted
to workflow_dispatch (the schedule path always uses nightly).

https://claude.ai/code/session_01UEAeW4ZkfApZnrJTGc6Xm9
Trusted publishing is confirmed working, so the dual-auth scaffolding
is no longer needed. Removed:

  - \`use-trusted-publishing\` workflow_dispatch input on
    pre-release.yml and the matching input on call-release.yml
  - \`secrets.NPM_TOKEN\` declaration on call-release.yml and the
    passthrough from pre-release.yml
  - \`Configure npm auth\` step in call-release.yml's publish job
    (set NODE_AUTH_TOKEN from NPM_TOKEN, log which mode is active)
  - \`Verify npm authentication\` step (the \`npm whoami\` preflight
    only checked the persistent NPM_TOKEN; OIDC mints a per-call
    token at publish time)
  - Mentions of NPM_TOKEN / use-trusted-publishing in the
    maintainers' guide and setup-trusted-publishing.mjs error
    message

The publish job still grants \`id-token: write\` (for OIDC) and the
setup-node \`registry-url\` setting (writes the registry into .npmrc
so pnpm publish targets the right host). \`pnpm publish
--provenance\` does the rest.

Misconfigured trust now fails loudly at publish time instead of
silently falling through to token auth.

https://claude.ai/code/session_01UEAeW4ZkfApZnrJTGc6Xm9
Picks up #8593 (linear-time regexes / prototype-pollution guards /
faster CSS parsing) and #8598 (TabNode.setTextContent Safari IME
fix) — both touch lexical/extension/markdown/playground/yjs and
don't intersect with the release tooling on this branch, so the
merge is clean.

https://claude.ai/code/session_01UEAeW4ZkfApZnrJTGc6Xm9
@etrepum etrepum added the extended-tests Run extended e2e tests on a PR label May 30, 2026
@etrepum etrepum changed the title [ci] setup-trusted-publishing: pre-check via npm trust list (uses npm's own auth) May 30, 2026
@etrepum etrepum marked this pull request as ready for review May 30, 2026 16:55
@etrepum etrepum enabled auto-merge May 30, 2026 17:14
@potatowagon

Copy link
Copy Markdown
Contributor

Review by Navi (potatowagon's AI assistant)

This is a well-structured follow-up to #8587 that resolves real issues encountered during the first OIDC test publish. I reviewed the full 900-line diff across workflow YAML and the setup script.

What I verified:

  1. Workflow consolidation logic is correctpre-release.yml now handles both scheduled nightly and manual dispatch via github.event_name branching. The schedule path hardcodes nightly defaults (increment-version=true, channel=nightly, ref=main, ignore-previously-published=true), while dispatch threads inputs through unchanged. The guard job's "no increment + latest" check is correctly gated to workflow_dispatch only.

  2. NPM_TOKEN removal is safe — since OIDC is confirmed working, removing the dual-auth path eliminates a failure mode where misconfigured trust silently falls through to token auth. The id-token: write permission and registry-url setup-node config are preserved (both required for OIDC).

  3. warmTrustAuth() TTY fix addresses the root cause — npm's OTP recovery requires both stdin and stdout to be TTYs. The pre-check was piping stdout to parse JSON, which broke OTP. Running one fully-interactive read first (stdio: 'inherit') lets the user auth once, then captured reads succeed within the "skip 2FA for 5 minutes" window. Clever fix.

  4. fetchTrustConfigs() refactor — replaces hand-rolled .npmrc parsing with shelling out to npm trust list --json, which uses npm's own auth resolution. Returns {configs} or {error} instead of the old null sentinel, so failures are never silently swallowed. Handles 404 (no config), JSON parse errors, and non-zero exits distinctly.

  5. --replace flow — correctly detects conflicting configs (different workflow file), revokes by id, then registers the new one. The E409 failure mode from the first attempt is fully addressed.

  6. Fork guardgithub.repository_owner == 'facebook' added to the release job prevents the scheduled trigger from running on forks. Correct placement.

  7. Documentation updates — maintainers guide reflects the simplified workflow (no more use-trusted-publishing toggle, no NPM_TOKEN mentions).

CI status: ALL GREEN (30+ jobs including all e2e matrices, core tests, integration tests, CodeQL).

Verdict: Safe to approve. The changes are logically sound, CI confirms no regressions, and the test plan in the PR description is thorough (real E2E validation on the npm registry).

@etrepum etrepum added this pull request to the merge queue May 31, 2026
Merged via the queue into main with commit b08ebf7 May 31, 2026
43 checks passed

@potatowagon potatowagon left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed by Navi (Tater Thoughts Bobblehead) on behalf of @potatowagon.

LGTM ✅ — CI infrastructure cleanup: consolidates release workflows and fully adopts npm trusted publishing.

What this does:

  1. Removes the separate nightly-release.yml workflow — merges nightly schedule into pre-release.yml (now the single entry point for all publishes).
  2. Removes the NPM_TOKEN secret and use-trusted-publishing input from call-release.yml — OIDC trusted publishing is now the ONLY auth method.
  3. Removes the "Configure npm auth" and "Verify npm authentication" steps (no longer needed with OIDC).
  4. Adds id-token: write permission to the publish job (required for OIDC).
  5. Adds github.repository_owner == 'facebook' guard to prevent forks from running scheduled publishes.

What I checked:

  • ✅ Logic correctness: npm trusted publishing via OIDC requires exactly: (a) id-token: write permission, (b) the workflow file path matching what's configured on npmjs.com. By consolidating to a single workflow file, all publishes share one OIDC identity — correct approach since npm allows only one trusted-publisher config per package.
  • ✅ Schedule preserved: The nightly cron: '30 2 * * 1-5' schedule is preserved in the consolidated workflow. The schedule path correctly forces channel=nightly, increment-version=true, ref=main, ignore-previously-published=true.
  • ✅ Guard logic correct: The "refuse increment+latest" guard now correctly gates on github.event_name == 'workflow_dispatch' so scheduled runs don't hit it.
  • ✅ Fork protection: github.repository_owner == 'facebook' in the release job prevents scheduled trigger from running on forks.
  • ✅ Backward compat: The NPM_TOKEN removal only affects this workflow (CI infra). No library API changes. No www impact.
  • ✅ CI: All checks green (including CodeQL).

Safe to land. Clean consolidation of release infrastructure with proper OIDC adoption.

@etrepum etrepum mentioned this pull request Jun 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. extended-tests Run extended e2e tests on a PR

4 participants