Skip to content

[Draft] Add the use_secondary_cache to MutableCFOptions#9025

Open
zhichao-cao wants to merge 4 commits into
facebook:mainfrom
zhichao-cao:secondary_option
Open

[Draft] Add the use_secondary_cache to MutableCFOptions#9025
zhichao-cao wants to merge 4 commits into
facebook:mainfrom
zhichao-cao:secondary_option

Conversation

@zhichao-cao

Copy link
Copy Markdown
Contributor

Currently, if Secondary Cache is provided to the lru cache, it is used by default. We introduce the "use_secondary_cache" as an MutableCFOptions.By default, it is true. We can set the option when DB is running using SetOptions() for a certain column family. The option is passed to BlockBasedTableReader to decide if we want to use the new cache API (using secondary cache) or the legacy (no secondary cache).

test plan: add test to the lru_cache_test

@mrambacher mrambacher left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change is at the wrong level. It feels like it belongs inside the Cache but at least inside the BlockBasedTableOptions. I am not sure why if there are multiple caches (block and compressed) why you would control them through a single flag (rather than each having a flag). I also am not sure why I would use the secondary cache for one database but not another using the same cache.

Comment thread cache/lru_cache.cc Outdated
Comment thread cache/lru_cache_test.cc Outdated
Comment thread table/block_based/block_based_table_reader.cc Outdated
Comment thread include/rocksdb/options.h Outdated
Comment thread cache/lru_cache_test.cc Outdated
@zhichao-cao

Copy link
Copy Markdown
Contributor Author

I think this change is at the wrong level. It feels like it belongs inside the Cache but at least inside the BlockBasedTableOptions. I am not sure why if there are multiple caches (block and compressed) why you would control them through a single flag (rather than each having a flag). I also am not sure why I would use the secondary cache for one database but not another using the same cache.

The requirement from production is, they want to control enable or disable secondary cache per DB instance when it is running

@mrambacher

Copy link
Copy Markdown
Contributor

I think this change is at the wrong level. It feels like it belongs inside the Cache but at least inside the BlockBasedTableOptions. I am not sure why if there are multiple caches (block and compressed) why you would control them through a single flag (rather than each having a flag). I also am not sure why I would use the secondary cache for one database but not another using the same cache.

The requirement from production is, they want to control enable or disable secondary cache per DB instance when it is running

The current solution does not allow the cache to be controlled on a DB Instance -- it is on a per ColumnFamily.

It also does not seem like it should be a Mutable option -- what would you expect to happen if you disabled this during a run? Would you expect the keys associated with this table factory to be scrubbed from the SecondaryCache (they would theoretically never be accessed).

Comment thread options/cf_options.cc Outdated
Comment thread HISTORY.md Outdated
Comment thread cache/lru_cache.cc Outdated
Comment thread include/rocksdb/options.h Outdated
Comment thread table/block_based/block_based_table_reader.cc Outdated
Comment thread cache/lru_cache_test.cc Outdated
Comment thread include/rocksdb/options.h Outdated
@zhichao-cao

Copy link
Copy Markdown
Contributor Author

I think this change is at the wrong level. It feels like it belongs inside the Cache but at least inside the BlockBasedTableOptions. I am not sure why if there are multiple caches (block and compressed) why you would control them through a single flag (rather than each having a flag). I also am not sure why I would use the secondary cache for one database but not another using the same cache.

The requirement from production is, they want to control enable or disable secondary cache per DB instance when it is running

The current solution does not allow the cache to be controlled on a DB Instance -- it is on a per ColumnFamily.

It also does not seem like it should be a Mutable option -- what would you expect to happen if you disabled this during a run? Would you expect the keys associated with this table factory to be scrubbed from the SecondaryCache (they would theoretically never be accessed).

Based on the code, it is better to put in the MutableCFOptions for the table, because, tables are managed by table cache, and it is referenced by CFD.

The production requires it to be able to changed when it is running, so, must be mutable. Otherwise, we can just put as a ImmutableDBOptions. If we disable it, this CF block reads just does not lookup and insert the secondary cache. The stored blocks in secondary cache will never be accessed and gradually removed.

Comment thread table/block_based/block_based_table_reader.cc Outdated
Comment thread table/block_based/block_based_table_reader.cc Outdated
@facebook-github-bot

Copy link
Copy Markdown
Contributor

@zhichao-cao has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

Copy link
Copy Markdown
Contributor

@zhichao-cao has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot

Copy link
Copy Markdown
Contributor

@zhichao-cao has updated the pull request. You must reimport the pull request before landing.

@zhichao-cao zhichao-cao reopened this Oct 17, 2021
@facebook-github-bot

Copy link
Copy Markdown
Contributor

@zhichao-cao has updated the pull request. You must reimport the pull request before landing.

@mrambacher mrambacher left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using OptionTypeInfo::Enum will simplify this code.

There should also be a check added to db_test that verifies the cache tier can be changed via SetOptions.

And if the changes to BlockBasedTableReader are required now (to do different things based on this flag), can we at least add an issue to add a CacheOptions to the cache methods and a "TODO" comment here, so we can look to fix it in the next release?

Comment thread include/rocksdb/options.h
Comment thread include/rocksdb/advanced_options.h Outdated
Comment thread include/rocksdb/advanced_options.h Outdated
Comment thread include/rocksdb/options.h Outdated
Comment thread include/rocksdb/options.h Outdated
Comment thread include/rocksdb/utilities/options_type.h Outdated
Comment thread options/options_helper.cc Outdated
Comment thread include/rocksdb/convenience.h Outdated
Comment thread options/cf_options.cc Outdated
Comment thread options/cf_options.h Outdated
@zhichao-cao zhichao-cao reopened this Oct 17, 2021
@facebook-github-bot

Copy link
Copy Markdown
Contributor

@zhichao-cao has updated the pull request. You must reimport the pull request before landing.

Comment thread options/cf_options.h Outdated

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anand1976 There is a race issue when someone is calling SetOptions() and RocksDB try to get the value of lowest_used_cache_tier from this option. I added the mutex. What's your suggestion for this. The race is caught by TSAN and some tests

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to remove the table from the table cache if the settings have changed? This way there would not need to be locking of the value (which is also not completely correct as it is not locked during SetOptions) and the table reader code can -- upon initialization -- copy the "cache tier" into the Rep structure.

@facebook-github-bot

Copy link
Copy Markdown
Contributor

@zhichao-cao has updated the pull request. You must reimport the pull request before landing.

@zhichao-cao zhichao-cao changed the title Add the use_secondary_cache to MutableCFOptions Oct 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

4 participants