Skip to content

graph, store: Make sure vid batching works with large vids #5970

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 25, 2025

Conversation

lutter
Copy link
Collaborator

@lutter lutter commented Apr 23, 2025

Changing to the new vid scheme of block_num << 32 + sequence_num revealed some numerical problems in the batching code.

@lutter lutter requested a review from zorancv April 23, 2025 21:33
Copy link
Contributor

@zorancv zorancv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clear improvement over the current situation and is equivalent to it from algebraic point of view, so I approved it with one remark for consideration in the future. Nice that you added a regression test case!

// as f64) as i64 < points[j]`
//
// We therefore try to only convert differences between points to f64
// which are much smaller.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the assumption is that differences of the subsequent points is never more than 2'000'000 blocks, as we have only 21 bits of the mantissa available (53 of f64 - 32 of the shift)? If that's not absolutely guaranteed I would suggest to convert all the calculations to work with i128. The bins_size could be represented as a ratio of two integers. The only drawback are somewhat slow divisions...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really good point. I'll leave this as-is for now, but if I ever need to touch it, I'll adopt your suggestion.

Changing to the new vid scheme of `block_num << 32 + sequence_num` revealed
some numerical problems in the batching code.
@lutter lutter force-pushed the lutter/ogive-roundoff branch from 1a634c4 to 72834fd Compare April 25, 2025 22:15
@lutter lutter merged commit 72834fd into master Apr 25, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants