Improve locking iterator performance#1
Draft
dutow wants to merge 1 commit into
Draft
Conversation
hermanlee
reviewed
Apr 6, 2022
| bool m_valid; | ||
|
|
||
| ulonglong *m_lock_count; | ||
| ulonglong m_self_lock_count; |
There was a problem hiding this comment.
This variable needs to be initialized to 0. Otherwise, it causes random failures when m_self_lock_count > max_lock_count.
Author
There was a problem hiding this comment.
You are right, I missed that when I extracted this patch from a larger changeset and then I only tested it in a debug build after that. I'll fix it.
Author
|
@hermanlee what are your thought on the two possible improvements I mentioned earlier? (I also tested it with TokuDB, and that seems to work just like this patch: if there's some more complex query based on a secondary key, it simply locks the entire table) |
Issue: locking iterator is several times slower than the default iterator in queries based on secondary keys. For example: select COUNT(*) from sbtest1 where k % 10 = 1 for update; This query ends up iterating over the entire table, and in the end locks the entire table, but does so by lots of small locks and seeks every time. This diff shows that by simply locking the entire table instead we can improve the performance of the iterator by 50% - it's still slower than the default implementation, but now it's much closer to it. This modification of course results in overlocking in queries such as select COUNT(*) from sbtest1 where k % 10 = 1 for update limit 3000; where instead of locking only a few thousands records, with this change we lock the entire table instead. This change is in this form a proof of concept, for a real patch I think we could go in two directions: * instead of a hard coded value, make this an adjustable session variable * or instead of simply stopping after a threshold, continue locking in linear or exponential batches
25ea94a to
2368e93
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Issue: locking iterator is several times slower than the default
iterator in queries based on secondary keys.
For example:
select COUNT(*) from sbtest1 where k % 10 = 1 for update;
This query ends up iterating over the entire table, and in the end locks
the entire table, but does so by lots of small locks and seeks every
time.
This diff shows that by simply locking the entire table instead we can
improve the performance of the iterator by 50% - it's still slower than
the default implementation, but now it's much closer to it.
This modification of course results in overlocking in queries such as
select COUNT(*) from sbtest1 where k % 10 = 1 for update limit 3000;
where instead of locking only a few thousands records, with this change
we lock the entire table instead.
This change is in this form a proof of concept, for a real patch I
think we could go in two directions:
variable
linear or exponential batches