Skip to content

fix(lexical): reconcile DOM mutations targeting the root element#8613

Merged
potatowagon merged 2 commits into
facebook:mainfrom
potatowagon:navi/fix/root-targeted-mutation-reconcile
Jun 3, 2026
Merged

fix(lexical): reconcile DOM mutations targeting the root element#8613
potatowagon merged 2 commits into
facebook:mainfrom
potatowagon:navi/fix/root-targeted-mutation-reconcile

Conversation

@potatowagon

Copy link
Copy Markdown
Contributor

Description

#8588 removed the dom === rootElement → RootNode carveout from both $getNearestManagedNodePairFromDOMNode (the MutationObserver flush path) and $getNodeFromDOM, replacing it with a __lexicalKey_* stash set on the root element in resetEditor.

In practice the stash is not reliably observable when a DOM mutation targets the root element directly — which the browser does when typing into a fresh / near-empty contenteditable, before a managed child has received the input. In that window:

  • $getNearestManagedNodePairFromDOMNode walks strayNode → rootElement, finds no key, and (with the carveout removed) returns undefined
  • the mutation is silently skipped
  • the typed text never reconciles into the editor state — no JS exception, the editor just appears to ignore the first input(s)

This surfaced downstream as: type N words into an empty Lexical editor → editor word count stays 0 → a word-count-gated control never enables. The failure is at the typing/reconcile step, not at any later editing step.

Fix

Restore the root-element fallback in both functions as a belt-and-suspenders guard:

  • $getNearestManagedNodePairFromDOMNode: re-add the rootElement param and the else if (dom === rootElement) → [rootElement, internalGetRoot(editorState)] branch.
  • $getNodeFromDOM: re-add the dom === editor.getRootElement() → $getNodeByKey('root') fallback when the keyed tree walk misses.

The __lexicalKey_* root stash and captureSelection behavior from #8588 are preserved untouched — the fallback only fires when the stash lookup misses on a root-targeted node, so there's no regression to the root-key-stash / captureSelection path.

Test plan

Adds LexicalRootMutation.test.tsx:

  1. Asserts the root element carries the __lexicalKey_root stash after mount ([Breaking Change][lexical] Feature: Generalize captured selection via setDOMUnmanaged({captureSelection}) + root __lexicalKey_* stash #8588 invariant).
  2. Drives a root-targeted childList mutation with the root key stash cleared (simulating the real-world window where the stash isn't observable on the mutation target) and asserts the foreign DOM is reverted — i.e. the mutation resolves to the RootNode.

The second test fails on current main (foreign DOM left in place → mutation skipped) and passes with this change.

  • npx vitest run --project unit packages/lexical/src/__tests__/unit/ → all 568 unit tests pass (incl. the new ones).
  • tsc -p tsconfig.build.json --noEmit → clean.
  • prettier + eslint clean (via lint-staged).

cc @mayrang (author of #8588 / #8519) — would appreciate a sanity check that the fallback doesn't conflict with the intended root-key-stash design; happy to instead make the stash lookup robust at the source if you'd prefer that over the carveout.

facebook#8588 removed the `dom === rootElement` -> RootNode carveout from both
`$getNearestManagedNodePairFromDOMNode` (mutation flush) and
`$getNodeFromDOM`, replacing it with a `__lexicalKey_*` stash set on the
root element in `resetEditor`. When a DOM mutation targets the root
element directly — which the browser does when typing into a fresh /
near-empty contenteditable, before a managed child has received the
input — the stash is not reliably observable on the mutation target, so
`$getNearestManagedNodePairFromDOMNode` returns undefined, the mutation
is silently skipped, and the typed text never reconciles into the editor
state (e.g. a word-count gate stays at 0 with no JS exception).

Restore the root-element fallback in both functions as a belt-and-
suspenders guard. The `__lexicalKey_*` stash from facebook#8588 is preserved, so
the captureSelection / root-key-stash behavior is unchanged; the fallback
only fires when the stash lookup misses on a root-targeted node.

Adds a regression test that drives a root-targeted childList mutation
with the root key stash cleared and asserts the foreign DOM is reverted
(i.e. the mutation resolves to the RootNode). The test fails without this
change and passes with it.
@vercel

vercel Bot commented Jun 2, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
lexical Ready Ready Preview, Comment Jun 2, 2026 4:34pm
lexical-playground Ready Ready Preview, Comment Jun 2, 2026 4:34pm

Request Review

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 2, 2026
@mayrang

mayrang commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Author of #8588 here, did the sanity check you asked for.

I don't doubt you hit this in your app, but I can't reproduce it here — not the mechanism, and not even the symptom (first input dropped, word count stuck at 0). On a normally-mounted editor the root carries the __lexicalKey_* stash, and on current main a root-targeted childList mutation already resolves to the RootNode and reverts the foreign DOM with no carveout. I confirmed by appending a stray text node directly to the root without touching the stash, and it reverts on main as-is. The new test only fails because it calls clearNodeKeyOnDOMNode first to reach a "stash absent" state, which the typing flow never does. I also couldn't find a path that clears the stash during normal typing (only resetEditor on a root swap does), and the stash is set before the observer attaches, so there's no mount-time race.

Could you share a repro where the stash is genuinely absent through the real typing/mount flow (or the app scenario you hit)? If it truly can't be observed there, knowing why is what decides between making the lookup robust at the source vs re-adding the carveout. This carveout re-introduces the per-site root special-case that #8588 removed to unify on the single key lookup, so I'd want to confirm the stash is actually unobservable before reverting that.

@potatowagon

Copy link
Copy Markdown
Contributor Author

Status + a verification ask.

Quick recap of where this stands, since I want to be transparent about what's confirmed vs. still open:

Confirmed (culprit + mechanism):

Confirmed downstream: the internal e2e that triggered the auto-revert reproduces on a clean OnDemand against the synced (post-#8588) Lexical — typing into a fresh editor doesn't register, leaving a word-count-gated control disabled. Exactly the signature.

Where I'm blocked: I haven't been able to verify the fix-forward against that internal e2e locally. The synced Lexical in www is a generated bundle, and hand-patching it doesn't take effect — the dev server serves from a different source root than my checkout, so the change has to flow through the actual Lexical→www sync to be exercised by the e2e. We're keeping the sync reverted, so I can't easily re-run the failing test with the fix applied without re-syncing.

Ask: does the JSDOM-level repro here look like it captures the right invariant to you, given you authored the stash? If you're confident in the root-mapping restoration, I'd propose we land this and let the next sync attempt serve as the e2e confirmation. If you'd rather make the stash lookup robust at the source instead of re-adding the carveout, happy to revise. Either way I don't want to merge a "fix" that hasn't been run against the actual failing test without flagging that clearly.

cc @mayrang

@mayrang

mayrang commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Agreed on the mechanism — #8588 swapping the carveout for the root stash isn't in question.

Where I'm stuck is the link between your two pieces of evidence. The test gets to the failing state by clearing the stash up front — exactly the window we're after. What I can't place is a normal typing/mount path that produces that same window: the stash is stamped on the root the moment it's attached and only cleared on a root swap, for the old root. So the test pins down "stash absent → mutation skipped," and the open part is whether the real flow ever reaches stash-absent. The e2e shows the symptom but not the stash state at the mutation, which is the piece that connects the two.

Since the sync is reverted, the cleanest read doesn't need a fix-forward at all: on the failing OnDemand, log whether the root carries __lexicalKey_${editor._key} when the mutation fires (or whether $getNodeFromDOM(root) resolves to the RootNode there). It's a read on the already-broken state.

Present → root-mapping isn't the gap and the cause is elsewhere. Absent → we've found the real window the stash goes missing, and that's what decides re-adding the carveout vs making the lookup robust at the source.

@mayrang

mayrang commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

One refinement on what to log: the same root-key lookup also feeds $handleInput, which early-returns when isDOMCapturingSelection(event.target, editor) is true. That check walks ancestors and stops at the first Lexical-keyed DOM, so if the root key isn't resolvable there the walk escapes the editor — meaning the symptom could surface on the input path before (or without) any mutation. Worth logging what isDOMCapturingSelection returns on the first keystroke alongside the mutation-time check; a mutation-only probe would miss that path. Same reason this points at the lookup itself rather than the mutation-path carveout — isDOMCapturingSelection reads the key directly, not through $getNearestManagedNodePairFromDOMNode.

@mayrang

mayrang commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Also flagging in case I've framed this wrong — I've been pushing on the stash-absent repro, but I could just as easily be misreading the failure path, or missing something about how #8588 reshaped the lookup. If that's the case, point it out and I'll rework it rather than keep digging on a wrong premise.

@etrepum etrepum added the extended-tests Run extended e2e tests on a PR label Jun 2, 2026

@etrepum etrepum left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be great to see is a reproduction for how the __lexicalKey_${editor._key} prop is reset on a DOM node while preserving the identity of that DOM node. I can only think of two ways:

  • Explicitly clearing it from JavaScript, like this unit test, which would be very strange to do intentionally
  • Mutating the editor._key to some other value after initial mount, also quite unusual

Porting the internal e2e test more directly would be best. If you want to merge and test this with the sync that's fine, but if it does happen to work I think all it's really doing it covering up a different issue.

@etrepum

etrepum commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator

I guess a third potential root cause would be directly manipulating editor._rootElement instead of going through the setter

@potatowagon

Copy link
Copy Markdown
Contributor Author

Thanks for the approve, and for the steady pushback — it kept me honest. Here's where my testing landed.

What I could verify (against committed revisions):

So I have a clean broken↔passing signal — but only across whole committed revisions.

What I could NOT cleanly isolate: "broken sync + only this PR's diff." Our internal Lexical is the generated/bundled sync artifact, and the test harness serves it from the built/committed state, not my working-copy edit to that bundle. I confirmed this the hard way: I dropped both the broken bundle and console.error probes into the working copy, and the run still served the committed (fixed) bundle — test passed, zero probe output. Net: I can't hand-patch the artifact to get either the isolated-fix read or the stash-state instrumentation you (rightly) asked for; that has to ride a real sync.

On your three "how does the stash go missing" candidates (explicit clear / editor._key mutation after mount / direct editor._rootElement assignment bypassing the setter): I couldn't find any of those in the XDS editor path either, which is consistent with your read that the carveout may be masking something rather than fixing root cause. The isDOMCapturingSelection$handleInput early-return remains my leading suspect for "first input dropped," and that path doesn't go through the mutation carveout at all — so if that's it, this PR wouldn't be the real fix.

Proposal, given the above: since I can't reach a strong enough isolated conclusion locally, we'll merge this, carry it on the next sync attempt, and run the real e2e there. If it greens the test, great (with the caveat that it may be covering a deeper issue, which I'll keep chasing via the $handleInput path). If it doesn't fix the regression, we revert this PR — no harm done upstream. I'll report the sync result back here either way.

@mayrang

mayrang commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Makes sense — with the harness serving the committed artifact, merge-then-sync is the right call, and the revert path keeps it cheap if it doesn't hold.

One thing I'd keep front of mind for that run: the $handleInput early-return on isDOMCapturingSelection short-circuits before the rest of the handler, so if that's where the first input is being dropped, it sits outside the mutation carveout this PR touches — which would make a passing e2e the carveout masking it rather than fixing it. That's still a suspicion I haven't isolated, not a conclusion, but it's the open question I'd want the sync to actually settle.

If it greens, probably worth a targeted test driving that early-return directly, so the thread doesn't get lost behind a passing e2e.

…y-editor typing smoke test

Replace LexicalRootMutation.test.tsx (which manufactured the failing state by
clearing the root `__lexicalKey_*` stash up front — a state normal typing
never reaches) with a black-box test that types into a freshly-mounted empty
editor via real `beforeinput` insertText events and asserts the text
reconciles.

Honest scoping (documented in the test): this guards the basic
"type into an empty editor -> text reconciles" contract, but it does NOT
reproduce the facebook#8588 regression at the jsdom level — under jsdom the root stash
is present and resolution succeeds with or without the rootElement carveout,
so it passes on both pre- and post-facebook#8588 code. The regression only surfaces in
a real browser, consistent with maintainer feedback that the failure depends on
browser-specific mutation/selection timing.

The rootElement-carveout fix in LexicalMutations/LexicalUtils is retained; the
internal sync e2e is the authoritative verification and will confirm or refute it.
@potatowagon

Copy link
Copy Markdown
Contributor Author

Pushed a test change based on your feedback, and it turned up a useful (if slightly anticlimactic) result.

What changed:

  • Removed the white-box LexicalRootMutation.test.tsx — you were both right that manufacturing the failing state via clearNodeKeyOnDOMNode doesn't correspond to anything the real flow does, so it overclaimed.
  • Added LexicalEmptyEditorTyping.test.ts: a black-box test shaped like the internal e2e — mount a fresh empty editor, type words via real beforeinput insertText events, assert the text reconciles and the word count adds up. No internals poking.

The finding: that black-box test passes on both pre- and post-#8588 code. I verified by reverting just the carveout and re-running — still green. So the regression does not reproduce at the jsdom level: under jsdom the root __lexicalKey_* stash is present and resolution succeeds with or without the carveout. That lines up with exactly what you've both been saying — the failure depends on real-browser mutation/selection timing, not the jsdom code path, which is why it only ever showed up in the browser e2e. I've documented that scoping directly in the test so it's not mistaken for a regression guard for this bug; it stands as a behavioral smoke test for the typing contract.

So I'm treating the internal sync e2e as the authoritative check, per the earlier plan: merge → carry on the next sync → if it greens the e2e, keep it (while I keep chasing whether isDOMCapturingSelection/$handleInput is the deeper cause); if it doesn't, revert. I'll report the sync result back here.

Thanks again for the careful review — it kept this honest and stopped me shipping a misleading test.

@potatowagon potatowagon added this pull request to the merge queue Jun 3, 2026
Merged via the queue into facebook:main with commit 4d3598c Jun 3, 2026
35 checks passed
@potatowagon

Copy link
Copy Markdown
Contributor Author

Update: this wasn't the fix, and you both called it. Reverting the carveout in #8616 (keeping the typing smoke test).

The real root cause was #8519, not #8588. Once this merged and rode the sync, the e2e still failed — and the browser console had the actual throw: LEXICAL:ERROR_INTERNAL from $assumeActiveEditor ← $generateHtmlFromNodes ← documentToGFM, fired by a command handler on typing.

The culprit is a downstream serialization helper (our mlcToGFM convertor):

const headless = createHeadlessClone(editor);   // active editor inside the update = headless
headless.update(() => {
  const selection = ...;
  html = $generateHtmlFromNodes(editor, selection);   // ← passes the ORIGINAL editor
});

#8519 made $generateHtmlFromNodes scope-sensitive via $assumeActiveEditor(editor), which throws when the passed editor ≠ the active-update editor. The helper opens a headless.update() scope but passes the original editor → mismatch → throw → the update aborts → typed text never commits → word count stays 0 → the gated button stays disabled. Pre-#8519 this was harmless (no active-editor assertion), so it's classic BC fallout — and the same class as the #8589 shim, just at a call site that shim doesn't cover.

Fix is a one-liner at that call site (pass the active/headless editor). Verified: the failing e2e (both configs) goes green with that fix, and stays green with this PR's carveout reverted.

So exactly as you both said — the carveout was masking nothing and fixing nothing here; the regressor was #8519 surfacing in a downstream caller. Thanks for holding the line on the repro instead of letting me ship it as "the fix." Reverted in #8616.

pull Bot pushed a commit to code/app-lexical that referenced this pull request Jun 3, 2026
@etrepum etrepum mentioned this pull request Jun 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. extended-tests Run extended e2e tests on a PR

3 participants