gatesn commented Jan 29, 2026

This will be used prior to serialization to remove any unwanted arrays from the tree.

Should it actually just be fused with serialization?

Signed-off-by: Nicholas Gates <nick@nickgates.com>
gatesn added the feature and changelog/feature labels and removed the feature label on Jan 29, 2026

codspeed-hq bot commented Jan 29, 2026

Merging this PR will degrade performance by 30.46%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 2 improved benchmarks
❌ 3 regressed benchmarks
✅ 1174 untouched benchmarks
⏩ 1323 skipped benchmarks (footnote 1)

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode        Benchmark                                BASE       HEAD       Efficiency
WallTime    u16_FoR[10M]                             8 µs       9.6 µs     -16.37%
WallTime    1M_50pct[500000]                         57.1 µs    82.1 µs    -30.46%
WallTime    1M_10pct[100000]                         21 µs      29 µs      -27.44%
WallTime    u64_FoR[10M]                             388.1 µs   351.3 µs   +10.47%
Simulation  chunked_bool_into_canonical[(1000, 10)]  64.1 µs    43.2 µs    +48.57%

Comparing ngates/normalize (9b78ebe) with develop (1dd2d66)

Footnotes

  1. 1323 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

///
/// This operation performs a recursive traversal of the array. Any non-allowed encoding is
/// normalized per the configured operation.
pub fn normalize(&self, options: &mut NormalizeOptions) -> VortexResult<ArrayRef> {

When would we use this?

joseph-isaacs commented Jan 30, 2026

I am concerned that this is more of a band-aid than a fix. It will make it very easy to compress an array (including an accidental filter) and then undo it just before we serialise. I think we actually want to throw an error instead and stop the compressor from adding the unwanted array in the first place.
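
The two approaches debated in this thread, rewriting unwanted arrays just before serialisation versus failing as soon as one is found, can be sketched on a toy array tree. Everything below is hypothetical: `Encoding`, `Node`, `Policy` and this `normalize` function are illustrative stand-ins, not the Vortex API or this PR's implementation, and the "rewrite" step simply assumes a disallowed node can be replaced by a flat primitive leaf.

```rust
use std::collections::HashSet;

/// Hypothetical encoding tags; these are not real Vortex encoding IDs.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Encoding {
    Primitive,
    Dict,
    Filter,
}

/// A toy array tree: each node has an encoding and zero or more children.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    encoding: Encoding,
    children: Vec<Node>,
}

/// What to do when a node's encoding is not in the allowed set.
#[derive(Debug, Clone, Copy)]
enum Policy {
    /// Replace the offending node with a flat Primitive leaf
    /// (an assumed stand-in for executing or canonicalizing it).
    Rewrite,
    /// Stop and report the offending encoding to the caller.
    Error,
}

/// Recursively walk the tree, applying `policy` to any node whose
/// encoding is not in `allowed`.
fn normalize(
    node: &Node,
    allowed: &HashSet<Encoding>,
    policy: Policy,
) -> Result<Node, String> {
    if !allowed.contains(&node.encoding) {
        return match policy {
            Policy::Rewrite => Ok(Node {
                encoding: Encoding::Primitive,
                children: vec![],
            }),
            Policy::Error => {
                Err(format!("encoding {:?} is not allowed", node.encoding))
            }
        };
    }
    // The node itself is allowed, so recurse into its children.
    // The first child error aborts the whole traversal.
    let children = node
        .children
        .iter()
        .map(|child| normalize(child, allowed, policy))
        .collect::<Result<Vec<_>, _>>()?;
    Ok(Node {
        encoding: node.encoding,
        children,
    })
}
```

With `Policy::Rewrite`, a `Filter` node nested under a `Dict` is silently replaced, which is the behaviour the "band-aid" concern is about. With `Policy::Error`, the same tree yields an `Err` naming the `Filter` encoding, so the caller has to stop producing that array rather than rely on a later clean-up pass.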

Labels

changelog/feature A new feature

3 participants