Skip to content

Resume remote compaction aborted due to primary restart#12177

Open
hx235 wants to merge 1 commit into
facebook:mainfrom
hx235:resume_remote_compaction
Open

Resume remote compaction aborted due to primary restart#12177
hx235 wants to merge 1 commit into
facebook:mainfrom
hx235:resume_remote_compaction

Conversation

@hx235

@hx235 hx235 commented Dec 23, 2023

Copy link
Copy Markdown
Contributor

Context:
If the primary db is restarted after requesting a remote compaction but before installing the compaction, the same compaction will be scheduled and requested like a new compaction again. Therefore, the compaction progress made in the remote site will be wasted.

Summary:
This PR allows the restarted primary db wait for the remote compaction to return from the remote site instead of rescheduling a same new one. At the high level, we persist essential compaction information in the manifest to wait for the corresponding remote compaction. So upon restart, we can reconstruct the memory state to wait for the remote compaction and prevent compaction conflict from other new compaction after restart.

Test:

  • New UT TEST_F(CompactionServiceResumableCompactionTest, ResumableCompaction)
  • Add Options::resume_compaction to crash test to ensure it has no impact on existing feature when remote compaction is not used.
    • Crash test currently does not use remote compaction.
  • Run stress test patch with dummy remote service to test upgrade/downgrade compatibly on manifest

Limitations:

  • Failed remote compaction will also be resumed on next db open in addition to the unfinished ones due to primary db restart (noted in API)
  • Failed resumed remote compaction will not fall back to local in any case since we now only persist minimum compaction information which can be less than local compaction needs.
  • Compaction are only resumed once. If the resume fail, we will not resume it again and the failure is like any other compaction failure.
  • If a second reopen happens quickly after the first reopen but before the first reopen is able to finish resuming the compaction, then the second reopen may not be able to resume the compaction. That's because the compaction resumed in the 1st db open is not persisted to manifest and the manifest with that info can be deleted before the 2nd DB reopen.
@hx235 hx235 changed the title Resume remote compaction aborted due to primary restart or failure Dec 23, 2023
@hx235 hx235 force-pushed the resume_remote_compaction branch 2 times, most recently from c82c2cc to 7b881cd Compare December 23, 2023 05:40
@facebook-github-bot

Copy link
Copy Markdown
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 hx235 force-pushed the resume_remote_compaction branch from 7b881cd to e669318 Compare December 23, 2023 18:21
@facebook-github-bot

Copy link
Copy Markdown
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot

Copy link
Copy Markdown
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@hx235 hx235 force-pushed the resume_remote_compaction branch from e669318 to fde50d9 Compare December 23, 2023 20:06
@facebook-github-bot

Copy link
Copy Markdown
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot

Copy link
Copy Markdown
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@@ -0,0 +1 @@
Provide an experimental option `Options::resume_compaction` to resume unfinished compactions left from the last db session. Right now only unfinished remote compactions due to primary db restart or failed remote compaction are supported. This options is turned on by default and has no effect to users with no remote compaction (i.e, `Options::compaction_service == nullptr`) or disable auto compaction (i.e, `Options::disable_auto_compactions = true`)

@hx235 hx235 Dec 25, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor TODO: "... this option"

metadata.clear();
db_->GetLiveFilesMetaData(&metadata);
if (compaction_unfinished_ && resume_compaction) {
ASSERT_LT(metadata.size(), prev_reopen_live_file_num);

@hx235 hx235 Dec 25, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor TODO: assert sync point is called even manually tracing through debugger shows it is called.

@hx235 hx235 requested a review from ajkr December 25, 2023 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

2 participants