Skip to content

Augment cf_paths: add path placement strategy and dynamic path choosing logic#5120

Open
tiden0614 wants to merge 52 commits into
facebook:mainfrom
tiden0614:cf_path_id_picker-rebase1
Open

Augment cf_paths: add path placement strategy and dynamic path choosing logic#5120
tiden0614 wants to merge 52 commits into
facebook:mainfrom
tiden0614:cf_path_id_picker-rebase1

Conversation

@tiden0614

@tiden0614 tiden0614 commented Mar 27, 2019

Copy link
Copy Markdown

Notes

We are prototyping to use rocksdb as our next-gen storage engine. One problem that we have encountered is that rocksdb doesn't provide a good way to utilize multiple disk mounts evenly on our hosts. We didn't really like the raid idea, which was mentioned in the rocksdb FAQ: our current prod software is running on these hosts with multiple equal-sized disk mounts; using a raid would require a risky disk setup before the data migration.

This change introduces a dynamic path choose class (DbPathSupplier) when creating output files. It is built on top the already-existing feature of cf_paths. The original behavior of using paths (gradually moving cold data to towards the end of paths list) is kept as the default path placement strategy. We are introducing a new random placement strategy to allow us distribute sst files evenly on our disks.

Major changes

  • Added a set of classes to (DbPathSupplier) to provide dynamic db_path picking logic to replace the original fixed path_id logic
  • Added an db_path_supplier_factory option in DBOptions
  • Modified the FileDescriptor struct: removed 2 higher bits from the packed_number_and_path_id that used to represent file_num. This allows us to increase the limit of path_id from 3 to 15
  • Modified the flush: instead of always using path 0, now it dynamically chooses a path to create the output file
  • Modified the compaction: instead of using the pre-chosen path for all output files in a compaction, dynamically choose a path (using DbPathSupplier) for every single output file.
  • Modified the java interface to include the db_path_supplier_factory options
  • Modified the java interface to include cf_paths options, which were not added in previous changes
  • Removed the limitation of 4 path_ids in manifest serialization

Testing

  • All unit tests passed
  • We have been doing some basic load tests with these changes in our test environment (involving a small fleet consisting of dozens of hosts) for a few weeks and haven't discovered major issues.
@tiden0614 tiden0614 changed the title Augment cf_paths: adding path use strategy and dynamic path choosing logic Mar 27, 2019
@tiden0614 tiden0614 marked this pull request as ready for review March 27, 2019 23:45
@tiden0614

Copy link
Copy Markdown
Author

Apologies first for the large bundle of changes. We've been buffering quite a few tweaks for a while. I think it will get better for our future contributions.

@tiden0614 tiden0614 force-pushed the cf_path_id_picker-rebase1 branch from 2c87a0d to 421bc82 Compare March 28, 2019 00:37
@siying

siying commented Apr 11, 2019

Copy link
Copy Markdown
Contributor

Thank you for your contribution! While we still haven't got bandwidth to look into this giant PR, it's on our list.

@tiden0614

Copy link
Copy Markdown
Author

@siying Thanks! This is an important feature for us so we'd really like it to get merged.

I noticed that I've broken 2 universal compaction tests with the change and I'm working on fixing those (I changed the way universal compaction calculating the size of compaction files in subtle ways).

@tiden0614 tiden0614 force-pushed the cf_path_id_picker-rebase1 branch 5 times, most recently from 2edb911 to 45a6372 Compare August 14, 2019 02:33
@tiden0614 tiden0614 changed the title Augment cf_paths: add path use strategy and dynamic path choosing logic Aug 14, 2019
@tiden0614

tiden0614 commented Aug 15, 2019

Copy link
Copy Markdown
Author

@siying Hey I have updated the PR to a relatively good state to be reviewed. The previous broken test cases have been fixed. Sorry for it taking me so long to update the PR.

I would appreciate it if you guys can allocate some bandwidth to take a look at this change.

@tiden0614

Copy link
Copy Markdown
Author

ping @siying

This PR hasn't been updated until recently. I'm pinging the team to make sure it has visibility. Thanks!

@siying siying self-requested a review September 6, 2019 00:48
@siying siying self-assigned this Sep 10, 2019

@siying siying left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this complicated PR. It's great! The basic approach looks good to me. I left some comments.

Comment thread db/compaction/compaction_job.cc Outdated
Comment thread db/compaction/compaction_job.cc Outdated
Comment thread db/compaction/compaction.cc Outdated
Comment thread db/db_path_supplier.h Outdated
Comment thread db/db_path_supplier.h Outdated
Comment thread db/db_path_supplier.h Outdated
Comment thread db/db_path_supplier.h Outdated
Comment thread db/flush_job.h Outdated
Comment thread db/version_edit.h Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to waste this 8 bytes? Can you explain why it is absolutely needed?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation for isolating the path_id to its own integer is that: the current way of packing path id and file number only gives a limit of 4 for path id.

I didn't realize that FileDescriptor can leave a big memory footprint, so took the simple approach of giving path id its own integer to avoid the casting.

Glad you asked. I made a change to give 2 more bits to path id so that the upper limit now goes up to 16, which should be sufficient for majority cases. However, it does shrink the upper limit of file numbers quite a bit. It is still a very large upper limit so I'm assuming that it should be fine.

Let me know if this works.

Comment thread include/rocksdb/options.h Outdated
@tiden0614 tiden0614 force-pushed the cf_path_id_picker-rebase1 branch 3 times, most recently from 03e8bd1 to e1a4b30 Compare September 25, 2019 06:08
@tiden0614

Copy link
Copy Markdown
Author

@siying this pr has been updated to a ready state to be reviewed. Thanks for allocating bandwidth on this change.

I addressed comments from last revision.

@tiden0614 tiden0614 requested a review from siying October 7, 2019 01:27
@tiden0614

Copy link
Copy Markdown
Author

ping

@tiden0614

Copy link
Copy Markdown
Author

@siying appreciate a lot if the team can allocate some cycles review this change. We would like this feature to be merged into rocksdb so that we can continue to use it in a larger scale.

It would be nice to provide an update, even if it's just an estimation of when we can get some traction makes us happy.

@tiden0614

Copy link
Copy Markdown
Author

ping

@tiden0614

Copy link
Copy Markdown
Author

@siying ping

@tiden0614

Copy link
Copy Markdown
Author

ping

1 similar comment
@tiden0614

Copy link
Copy Markdown
Author

ping

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

3 participants