[lexical][lexical-yjs] Bug Fix: avoid empty-paragraph echo and splice crash on collab undo (#6614) and don't preserve tags on non-dirty update#8651
Conversation
|
Hi @ivanscm! Thank you for your pull request and welcome to our community. Action RequiredIn order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks! |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
etrepum
left a comment
There was a problem hiding this comment.
It would be good to see unit tests for this, I understand that the iframe e2e tests are non-deterministic but I don't see why we couldn't put together unit tests for them?
…raph echo (facebook#8651) Adds a regression test that reproduces the empty-paragraph echo on collaborative undo without the synchronous test harness (which serializes away the mid-reconcile transient). Two peers are wired with the exported @lexical/yjs binding infrastructure over a mock async transport; a fuzzer-found, minimized sequence of ordinary edits + undos drives the binding into the buggy state. Without the fix the undo recovery echoes a spurious empty paragraph into the shared document (the test sees 2 paragraphs); with the fix it sees 1. https://claude.ai/code/session_017fxm6RcQegBrGL2kXDJBo8
…st removed nodes
Fuzzing concurrent collaborative editing (2-3 peers, async delivery, edits + undo)
surfaced reconciliation invariants that crash and corrupt the session when an
event arrives for a node that was concurrently removed, or a delta references a
shared type that was concurrently deleted:
- syncChildrenFromYjs / syncPropertiesFromYjs (element & decorator) /
syncPropertiesAndTextFromYjs: getNode() is null
- applyChildrenYjsDelta: an insert references a deleted shared type with no
__type ("Expected shared type to include type attribute")
These are now skipped gracefully (the collab tree still updates and self-heals
when the parent reconciles), so the peers converge instead of throwing. A fuzz
sweep of 1400 runs goes from ~12 crashes to 0 with peers still converging.
This also removes the root cause of the empty-paragraph echo (facebook#8651): that
transient only arose downstream of one of these crashes, so it no longer occurs.
https://claude.ai/code/session_017fxm6RcQegBrGL2kXDJBo8
…urrent-edit crashes, and update-tag desync (facebook#8651) Hardens collaborative editing against three related failure modes, found while building deterministic repros for facebook#8651 and then fuzzing the binding. lexical-yjs: - Avoid the empty-paragraph echo and the splice crash on collab undo. When a top-level paragraph collision empties the root, re-add the paragraph from outside the COLLABORATION/HISTORIC update so it syncs to peers (facebook#8651). - Don't crash reconciling concurrent edits against removed nodes. When an event references a node that was concurrently removed (getNode() is null) or a shared type that was concurrently deleted (no __type), skip it gracefully instead of throwing an invariant ("could not find element node", "Expected shared type to include type attribute", ...) and corrupting the session. lexical: - Clear update tags after every commit, including no-op ones. Tags are added from the `tag` update option independent of dirtiness, and the 'update' listener fires for every commit. Previously a no-op commit left its tags in editor._updateTags, where they leaked into the next update. In collaboration this silently dropped local edits: a remote sync that reconciled to nothing (e.g. it targeted a concurrently-removed node) left its COLLABORATION tag on the peer's next local edit, which syncLexicalUpdateToYjs then skipped -- so the edit never reached Yjs and the peer's live editor diverged from its own persisted document. - This exposed a latent setEditorState bug: it cached _updateTags before an internal force-commit (which now resets it) and added the tag to the stale Set, dropping the tag from the applied state's commit and breaking history's undo classification. Read _updateTags fresh after the force-commit instead. Tests (deterministic, exported-API repros; fail without the fixes): - lexical-react: empty-paragraph echo repro; concurrent-reconcile crash repros; and a dropped-local-edit-after-no-op-remote-sync desync repro, all wiring peers over a mock async @lexical/yjs transport. - lexical: a no-op update must not leak its tag into the next update, and setEditorState must keep its tag when committed from inside an update. https://claude.ai/code/session_017fxm6RcQegBrGL2kXDJBo8
…urrent-edit crashes, and update-tag desync (facebook#8651) Hardens collaborative editing against three related failure modes, found while building deterministic repros for facebook#8651 and then fuzzing the binding. lexical-yjs: - Avoid the empty-paragraph echo and the splice crash on collab undo. When a top-level paragraph collision empties the root, re-add the paragraph from outside the COLLABORATION/HISTORIC update so it syncs to peers (facebook#8651). - Don't crash reconciling concurrent edits against removed nodes. When an event references a node that was concurrently removed (getNode() is null) or a shared type that was concurrently deleted (no __type), skip it gracefully instead of throwing an invariant ("could not find element node", "Expected shared type to include type attribute", ...) and corrupting the session. lexical: - Clear update tags after every commit, including no-op ones. Tags are added from the `tag` update option independent of dirtiness, and the 'update' listener fires for every commit. Previously a no-op commit left its tags in editor._updateTags, where they leaked into the next update. In collaboration this silently dropped local edits: a remote sync that reconciled to nothing (e.g. it targeted a concurrently-removed node) left its COLLABORATION tag on the peer's next local edit, which syncLexicalUpdateToYjs then skipped -- so the edit never reached Yjs and the peer's live editor diverged from its own persisted document. - This exposed a latent setEditorState bug: it cached _updateTags before an internal force-commit (which now resets it) and added the tag to the stale Set, dropping the tag from the applied state's commit and breaking history's undo classification. Read _updateTags fresh after the force-commit instead. Tests (deterministic, exported-API repros; fail without the fixes): - lexical-react: empty-paragraph echo repro; concurrent-reconcile crash repros; and a dropped-local-edit-after-no-op-remote-sync desync repro, all wiring peers over a mock async @lexical/yjs transport. - lexical: a no-op update must not leak its tag into the next update, and setEditorState must keep its tag when committed from inside an update. https://claude.ai/code/session_017fxm6RcQegBrGL2kXDJBo8
Description
Follow-up to #6614 (and the already-merged #6670). Two remaining ways an undo in a
collaborative session corrupts the shared document:
Empty-paragraph echo.
$ensureEditorNotEmpty()runs in the synconUpdateinside its own untagged
editor.update(). During an undo the Lexical root can betransiently empty while the Yjs doc still has content; the untagged insert is then
treated as a local edit and echoed back to Yjs, duplicating the paragraph for other
clients. Fix: only re-add the recovery paragraph when the shared (Yjs) root is
genuinely empty —
binding.root.isEmpty()(V1) /binding.root.length === 0(V2).Splice crash. Reconciling an undo against a concurrent remote edit can reach
CollabElementNode.splicewith no node atindexand nothing to insert, whichthrows
invariant(Error #94: splice: could not find collab element node) andcorrupts the session. Fix: treat that state as a no-op recovery instead of crashing.
Seeded, delta-minimized fuzzing of the binding surfaced two further corruptions, also
fixed here:
Concurrent-reconcile crashes. Applying an event that references a node
concurrently removed in Lexical (
getNode()isnull) or a shared type concurrentlydeleted (no
__type) threw bindinginvariants (could not find element node,Expected shared type to include type attribute, …) and desynced the session. Fix:skip those as no-op recoveries in
Collab{Element,Text,Decorator}Node.Dropped local edit → live/persisted desync (lexical core).
$commitPendingUpdatesreset
editor._updateTagsonly on dirty commits, but tags come from thetagoptionregardless and the
updatelistener fires on every commit. A remote sync thatreconciled to nothing (e.g. a concurrently-removed target) thus leaked its
COLLABORATIONtag into the peer's next local edit, whichsyncLexicalUpdateToYjsskips — so that edit never reached Yjs and the peer's live editor diverged from its own
persisted document. Fix: always reset
_updateTagsonce listeners have observed them.This also exposed a latent
setEditorStatebug (it cached_updateTagsacross aninternal force-commit and tagged the stale Set, breaking history's undo classification),
now reading
_updateTagsfresh after the commit.Test plan
prettier,eslint,tscpass locally.Collaboration.spec.mjs, incl. the [lexical-yjs] Bug Fix: clean up dangling text after undo in collaboration #6670 undo/refreshtests) continues to pass.
the collaborator's text is preserved, no duplicate paragraphs, no
Error #94.@lexical/yjsAPIs over amock async transport — concurrent-reconcile crash repros and the dropped-local-edit
Yjs-desync repro fail without these changes — plus core unit tests for the
_updateTagsno-op leak andsetEditorStatetag preservation.green; the remaining macOS/webkit collab job is unaffected by this JS-only change.
The empty-paragraph echo itself stays a timing-dependent, mid-reconcile cross-client
transient that the synchronous e2e harness collapses, so it has no standalone e2e test.
Driving the exported sync functions over a mock async transport preserves that transient,
which is what makes the crash and desync defects above reproducible deterministically at
the unit level.