Resume remote compaction aborted due to primary restart #12177
Open
hx235 wants to merge 1 commit into
Contributor
|
@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
hx235
commented
Dec 25, 2023
| @@ -0,0 +1 @@ | |||
| Provide an experimental option `Options::resume_compaction` to resume unfinished compactions left from the last db session. Right now only unfinished remote compactions due to primary db restart or failed remote compaction are supported. This options is turned on by default and has no effect to users with no remote compaction (i.e, `Options::compaction_service == nullptr`) or disable auto compaction (i.e, `Options::disable_auto_compactions = true`) | |||
minor TODO: "... this option"
hx235
commented
Dec 25, 2023
| metadata.clear(); | ||
| db_->GetLiveFilesMetaData(&metadata); | ||
| if (compaction_unfinished_ && resume_compaction) { | ||
| ASSERT_LT(metadata.size(), prev_reopen_live_file_num); |
minor TODO: assert that the sync point is called, even though manually tracing through the debugger shows it is called.
Context:
If the primary db is restarted after requesting a remote compaction but before installing its result, the same compaction is scheduled and requested again as if it were a new compaction. The compaction progress already made at the remote site is therefore wasted.
Summary:
This PR allows the restarted primary db to wait for the remote compaction to return from the remote site instead of rescheduling an identical new one. At a high level, we persist the essential compaction information in the manifest so that the corresponding remote compaction can be waited on. Upon restart, we can then reconstruct the in-memory state needed to wait for the remote compaction and to prevent new compactions scheduled after the restart from conflicting with it.
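To make the idea above concrete, here is a minimal, self-contained sketch of the persist-then-recover flow. It is a toy model, not the code in this PR: `PendingRemoteCompaction`, `ToyManifest`, `ToyDbState`, and `Recover` are hypothetical names invented for illustration, and the real manifest format and recovery path in RocksDB may look quite different. Only the `resume_compaction` flag name is taken from the PR.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical record written when a remote compaction is requested.
struct PendingRemoteCompaction {
  std::string job_id;                 // assumed handle used to wait on the remote site
  std::vector<uint64_t> input_files;  // file numbers consumed by the compaction
  int output_level;
};

// Toy stand-in for the manifest: the only state that survives a restart.
struct ToyManifest {
  std::map<std::string, PendingRemoteCompaction> pending;

  void RecordRequested(const PendingRemoteCompaction& c) {
    pending[c.job_id] = c;
  }
  // Once the result is installed, there is nothing left to resume.
  void RecordInstalled(const std::string& job_id) { pending.erase(job_id); }
};

// In-memory state that is lost on restart and must be rebuilt on open.
struct ToyDbState {
  std::set<uint64_t> files_being_compacted;
  std::vector<std::string> jobs_to_wait_for;

  // A new compaction conflicts if it touches a file that a resumed
  // remote compaction still owns.
  bool CanPick(const std::vector<uint64_t>& candidate_inputs) const {
    for (uint64_t f : candidate_inputs) {
      if (files_being_compacted.count(f) > 0) return false;
    }
    return true;
  }
};

// Rebuild the in-memory state from the manifest after a restart.
ToyDbState Recover(const ToyManifest& manifest, bool resume_compaction) {
  ToyDbState state;
  if (!resume_compaction) return state;  // old behavior: start from scratch
  for (const auto& entry : manifest.pending) {
    const PendingRemoteCompaction& c = entry.second;
    state.jobs_to_wait_for.push_back(c.job_id);
    state.files_being_compacted.insert(c.input_files.begin(),
                                       c.input_files.end());
  }
  return state;
}
```

In this sketch, a restart with `resume_compaction` enabled yields a state that both remembers which remote jobs to wait for and rejects new compactions over the same input files; with the flag disabled, the pending record is ignored and the same files can be picked again.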
Test:
- `TEST_F(CompactionServiceResumableCompactionTest, ResumableCompaction)`
- Add `Options::resume_compaction` to crash test to ensure it has no impact on existing features when remote compaction is not used.

Limitations: