[ES-449270] Fix duplicate key error in object comprehension by ckolb-db · Pull Request #156 · databricks/sjsonnet

ckolb-db · 2022-11-01T10:21:46Z

Fixes #121

Please advise on checking for size twice vs calling contains once per key, I haven't benchmarked this.

This might also break some existing files which rely on this behavior without realizing it's out of spec, either with duplicate list entries getting reduced or more complex replacements where only the last entry matters. Not sure if we're ok with that or what other options we have. Breaking these probably beats allowing duplicate entries in the future.

iuliand-db

I wonder how do these duplicate entries happen? Since the fields are keys in a map, aren't they unique by construction? Even the error is on the map not increasing in size. I wonder if the current tests would fail without the fix, and if yes, then where do we get the two copies from in the materialized json?

ckolb-db · 2022-11-01T12:44:33Z

It's not an object as input, but rather a list of arbitrary values/objects from which the object entry can be constructed yet again arbitrarily. The first uniqueness check in the system is the one that was missing here.

The test would fail so far as it would not throw an error and rather return an object with only the last mention of the key in it.

This surfaced again due to an issue in our teams.jsonnet file where I think a team name was duplicated in two unrelated entries.

I'm checking the map size rather than map.contains before inserting as an optimization, since contains is only armortized O(1) and contains the string length comparison.

…#470) This PR fixes a bug in the detection of duplicate fields: Official `jsonnet` and `go-jsonnet` both detect and fail on duplicate field names even when the field names are dynamically computed via [field name expression](https://jsonnet.org/ref/language.html#field-name-expressions): ```jsonnet { k: 1, ["k"]: 2 } ``` produces ``` Error: RUNTIME ERROR: Duplicate field name: "k" example1.jsonnet:1:1-19 $ During evaluation ``` However, sjsonnet currently accepts this and outputs ``` {"k" : 2} ``` with a last-assignment-wins policy. sjsonnet already correctly detects _literal_ field name duplication [at parse time](https://github.com/databricks/sjsonnet/blob/0.5.4/sjsonnet/src/sjsonnet/Parser.scala#L401-L413) (e.g. `{k: 1, k: 2}`) and detects field name duplication in object comprehensions at both [parse](https://github.com/databricks/sjsonnet/blob/0.5.4/sjsonnet/src/sjsonnet/Parser.scala#L472-L497) and [evaluation](https://github.com/databricks/sjsonnet/blob/0.5.4/sjsonnet/src/sjsonnet/Evaluator.scala#L771-L788) time (see #100 and #156). This PR adds duplicate field name detection at normal object evaluation time. The core check is simple and cheap: when adding a member to the object's fields hashmap, check the `put()` return value and fail if we detect that a duplicate key has been inserted. One subtlety to handle involves sjsonnet's static object optimizations: it's possible that we might detect duplicates during static optimization because a dynamic field expression might be constant-folded / statically optimized into a fixed literal. Even though we can detect this at optimization time, we must defer the throwing of an error until the object is actually evaluated or accessed during program execution: this is necessary in order to faithfully preserve lazy evaluation semantics. I chose to handle this by rendering such objects ineligible for static optimization: they will remain as `MemberList`s. I added new regression tests in EvaluatorSuite to cover this bug and confirmed that the expected behavior matches the jsonnet and go-jsonnet implementations.

Fix duplicate key in comprehension

6491441

ckolb-db requested a review from iuliand-db November 1, 2022 10:22

iuliand-db reviewed Nov 1, 2022

View reviewed changes

iuliand-db approved these changes Nov 1, 2022

View reviewed changes

lihaoyi-databricks merged commit 1b4a5c0 into databricks:master May 30, 2023

JoshRosen mentioned this pull request Aug 14, 2025

Fix detection of duplicate fields with dynamic field name expressions #470

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ES-449270] Fix duplicate key error in object comprehension#156

[ES-449270] Fix duplicate key error in object comprehension#156
lihaoyi-databricks merged 1 commit intodatabricks:masterfrom
ckolb-db:ES-449270

ckolb-db commented Nov 1, 2022

iuliand-db left a comment

ckolb-db commented Nov 1, 2022 •

edited

Loading

Labels

3 participants

Conversation

ckolb-db commented Nov 1, 2022

iuliand-db left a comment

Choose a reason for hiding this comment

ckolb-db commented Nov 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Labels

3 participants

ckolb-db commented Nov 1, 2022 •

edited

Loading