[lexical-markdown] Bug Fix: code spans should bind tighter than text-match transformers #8688
…match transformers

Inline code (`code`) has the highest inline parsing precedence in Markdown and must never be partially consumed by a text-match transformer. When a code-format match and a text match (e.g. the playground equation `$...$` or a link) partially overlapped, the importer incorrectly applied the text match. For example, `` `$a` `$b` `` parsed into a single equation instead of two code spans.

`findOutermostTextFormatTransformer` now tags its result with `isCodeSpan` (true by transformer identity when the chosen match is the code span), so the reconciliation in `importTextTransformers` can deterministically prefer the code format whenever it conflicts with a text match, unless the text match fully wraps the code span (in which case the backticks are part of the match's raw content).

Fixes facebook#8687

https://claude.ai/code/session_01KMePdiz855Y9DYP7BPeDcg
potatowagon left a comment
Reviewed by Navi (Tater Thoughts Bobblehead) on behalf of @potatowagon.
LGTM ✅
What this does: Makes code spans (`` `...` ``) bind tighter than text-match transformers (e.g. the equation `$...$`) during Markdown import. Previously, `` `$a` `$b` `` could be incorrectly consumed by the equation text-match transformer spanning across the two code spans.
What I checked:
- Logic correctness: The fix tags `isCodeSpan` in the result of `findOutermostTextFormatTransformer` by identity-comparing against `codeTransformer`. In `importTextTransformers`, when both a text-format match and a text match are found and the format is a code span, it checks whether the text match fully wraps the code span (in which case the text match wins) or not (the code span wins). This implements the correct precedence per the CommonMark spec, where code spans bind tighter.
- Edge cases: The wrapping condition (`startIndex <= ... && endIndex >= ...`) correctly handles the case where a text match legitimately contains code span syntax as raw content. The else branch (code span wins) properly nullifies `foundTextMatch` so processing continues normally.
- Test coverage: The new test case `` `$a` `$b` `` directly exercises the bug from #8687. A round-trip test validates both import and export.
- No regressions: The existing test for `` `$$hello` `` still passes (a code span containing equation-like text). No API changes, no new exports. The `isCodeSpan` field is internal to the markdown import pipeline.
- www compat: No removed or renamed exports, no changed function signatures. Safe for Meta internal consumers.
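The identity comparison mentioned under "Logic correctness" can be pictured with a minimal sketch. Everything here is hypothetical and for illustration only: the `Transformer` and `Candidate` shapes, the `tagCodeSpan` helper, and the stand-in `codeTransformer` object are assumptions, not the actual code in this PR.

```typescript
// Hypothetical, simplified shapes; the real lexical-markdown types differ.
interface Transformer {
  readonly tag: string;
}

interface Candidate {
  transformer: Transformer;
  startIndex: number;
  endIndex: number;
}

interface TaggedCandidate extends Candidate {
  isCodeSpan: boolean;
}

// Stand-in for the code-span transformer instance (assumption).
const codeTransformer: Transformer = {tag: '`'};

// Tag the chosen match by object identity, not by comparing tag strings,
// so a different transformer that happens to use the same tag is not
// mistaken for the code span.
function tagCodeSpan(
  chosen: Candidate | null,
  code: Transformer,
): TaggedCandidate | null {
  if (chosen === null) {
    return null;
  }
  return {...chosen, isCodeSpan: chosen.transformer === code};
}

// A lookalike transformer with the same tag but a different identity.
const lookalike: Transformer = {tag: '`'};

const tagged = tagCodeSpan(
  {transformer: codeTransformer, startIndex: 0, endIndex: 4},
  codeTransformer,
);
const notTagged = tagCodeSpan(
  {transformer: lookalike, startIndex: 0, endIndex: 4},
  codeTransformer,
);
console.log(tagged?.isCodeSpan, notTagged?.isCodeSpan);
```

Comparing by identity keeps the flag tied to one specific transformer object, which is presumably why the review calls the field internal to the import pipeline.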
CI: Core tests all green (unit 22.x+24.x, browser all platforms, integrity, integration, e2e canary chromium). Some platform e2e tests still pending but unrelated to this change.
Ready to approve.
Description
Inline code (`code`) has the highest inline parsing precedence in Markdown and must never be partially consumed by a text-match transformer. When a code-format match and a text match (e.g. the playground equation `$...$` or a link) partially overlapped, the importer incorrectly applied the text match. For example, `` `$a` `$b` `` parsed into a single equation instead of two code spans.

Prefer the code format whenever it conflicts with a text match, unless the text match fully wraps the code span (in which case the backticks are part of the match's raw content).
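As a rough illustration of the rule above, here is a minimal sketch of the conflict check. The `MatchRange` type and the `fullyWraps` and `codeSpanWins` helpers are hypothetical names, the end offsets are assumed to be exclusive, and the example offsets were worked out by hand rather than taken from the importer.

```typescript
// Hypothetical range type: startIndex inclusive, endIndex exclusive (assumption).
interface MatchRange {
  startIndex: number;
  endIndex: number;
}

// True when `outer` covers every character of `inner`.
function fullyWraps(outer: MatchRange, inner: MatchRange): boolean {
  return (
    outer.startIndex <= inner.startIndex && outer.endIndex >= inner.endIndex
  );
}

// Assumes both a code-span match and a text match were found in the same text.
// The code span wins unless the text match fully wraps it.
function codeSpanWins(codeSpan: MatchRange, textMatch: MatchRange): boolean {
  return !fullyWraps(textMatch, codeSpan);
}

// Input "`$a` `$b`": the first code span covers offsets [0, 4); an equation
// match from the first `$` through the second `$` would cover [1, 7).
const partialOverlap = codeSpanWins(
  {startIndex: 0, endIndex: 4},
  {startIndex: 1, endIndex: 7},
);

// Input "$`a`$": the code span covers [1, 4); the equation covers [0, 5).
const wrapped = codeSpanWins(
  {startIndex: 1, endIndex: 4},
  {startIndex: 0, endIndex: 5},
);

console.log(partialOverlap); // true: the code span is kept
console.log(wrapped); // false: the backticks are raw equation content
```

In the first case the equation starts inside the code span, so it cannot wrap it and is dropped; in the second the equation encloses the whole span, so the text match is applied.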
Closes #8687
Test plan
New unit tests