chore: Update tests in UnicodeHandling #507

Merged
stephenamar-db merged 1 commit into databricks:master from He-Pin:typo on Sep 23, 2025

@He-Pin He-Pin commented Sep 18, 2025

Motivation:
refs: #508

@JoshRosen described how go-jsonnet and C++ jsonnet behave here, compared with sjsonnet:

    // Unpaired surrogate handling - sjsonnet-specific behavior
    //
    // Note: This is an intentional divergence from go-jsonnet and C++ jsonnet:
    // - go/C++ reject unpaired surrogates in escape sequences at parse time
    // - go-jsonnet's std.char() replaces surrogate codepoints with U+FFFD
    // - sjsonnet preserves unpaired surrogates throughout
    //
    // This permissive behavior is maintained for backwards compatibility.

But my recent PR actually rejects invalid Unicode.

@JoshRosen @stephenamar-db Should we do this before 1.0.0? I think so, because invalid Unicode will cause unexpected behavior anyway.

@He-Pin He-Pin changed the title chore: Fix typo Sep 18, 2025
// sjsonnet previously parsed these successfully (go/C++ would reject);
// the new behavior rejects these too
evalErr("\"\\uD800\"").contains("Expected") // High surrogate alone
evalErr("\"\\uDC00\"").contains("Expected") // Low surrogate alone

@He-Pin He-Pin Sep 21, 2025

@JoshRosen with the new behavior, these inputs now fail.

On jsonnet.org, the input

"\uD800"

is rejected with:

Error: example1.jsonnet:1:1-9 example1.jsonnet:1:1-9 Truncated unicode surrogate pair escape sequence in string literal.
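
For readers unfamiliar with the term, the "unpaired surrogate" condition discussed in this thread can be sketched as a small standalone check. This is only an illustration, not sjsonnet's parser code: the object `SurrogateCheck` and the method `findUnpairedSurrogate` are hypothetical names, and the sketch works on an already-decoded string rather than on escape sequences in source text.

```scala
// Hypothetical helper, not part of sjsonnet: returns the index of the first
// unpaired UTF-16 surrogate in a string, or None if every surrogate is paired.
object SurrogateCheck {
  def findUnpairedSurrogate(s: String): Option[Int] = {
    var i = 0
    while (i < s.length) {
      val c = s.charAt(i)
      if (Character.isHighSurrogate(c)) {
        // A high surrogate must be followed immediately by a low surrogate.
        if (i + 1 < s.length && Character.isLowSurrogate(s.charAt(i + 1))) i += 2
        else return Some(i)
      } else if (Character.isLowSurrogate(c)) {
        // A low surrogate not consumed as the second half of a pair is unpaired.
        return Some(i)
      } else {
        i += 1
      }
    }
    None
  }
}
```

Under these assumptions, `SurrogateCheck.findUnpairedSurrogate("\uD800")` and `SurrogateCheck.findUnpairedSurrogate("\uDC00")` both return `Some(0)`, while a well-formed pair such as `"\uD83D\uDE00"` returns `None`. A parser that rejects such input would report an error at the returned index instead of preserving the character.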

He-Pin commented Sep 23, 2025

@stephenamar-db I think this should be merged so that the other PRs' tests turn green.

@stephenamar-db stephenamar-db merged commit bb764a8 into databricks:master Sep 23, 2025
6 checks passed
@He-Pin He-Pin deleted the typo branch September 23, 2025 04:51