Skip to content

[RFC] Refactor block cache tracing APIs#9326

Open
anand1976 wants to merge 2 commits into
facebook:mainfrom
anand1976:tracing_refactor
Open

[RFC] Refactor block cache tracing APIs#9326
anand1976 wants to merge 2 commits into
facebook:mainfrom
anand1976:tracing_refactor

Conversation

@anand1976

Copy link
Copy Markdown
Contributor

Refactor the classes, APIs and data structures for block cache tracing to allow a user provided trace writer to be used. Currently, only a TraceWriter is supported, with a default built-in implementation of FileTraceWriter. The TraceWriter, however, takes a flat trace record and is thus only suitable for file tracing. This PR introduces an abstract BlockCacheTraceWriter class that takes a structured BlockCacheTraceRecord. The BlockCacheTraceWriter implementation can then format and log the record in whatever way it sees fit. The default BlockCacheTraceWriterImpl does file tracing using a user provided TraceWriter.

This same philosophy can be applied to KV and IO tracing as well.

Comment thread include/rocksdb/block_cache_trace_writer.h Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need "WriteFooter"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could call it on EndTrace() to keep it symmetrical.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use bool?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. The original definition of BlockCacheTraceRecord in block_cache_tracer.h used it and I just moved the definitions. There is a case for using an enum instead of bool if the constants have descriptive names, as it improves readability. But in this case, the constants are just kTrue and kFalse, which seems rather pointless. It might make more sense if we use something like enum CacheLookupResult : char { kCacheHit = 1, kCacheMiss = 0};.

Comment on lines 111 to 150

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make the BlockCacheTraceWriter a Customizable class and use that infrastructure? That will make it simpler if you ever intend to add alternative implementations.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the very least, the class should have a "virtual const char *Name() const = 0" to allow alternate implementations later.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to make it a Customizable class.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment stating what this method does?

Comment thread table/table_test.cc Outdated
Comment on lines 1162 to 1157

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be an ASSERT_OK?

@anand1976 anand1976 left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zhichao-cao and @mrambacher for the review!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. The original definition of BlockCacheTraceRecord in block_cache_tracer.h used it and I just moved the definitions. There is a case for using an enum instead of bool if the constants have descriptive names, as it improves readability. But in this case, the constants are just kTrue and kFalse, which seems rather pointless. It might make more sense if we use something like enum CacheLookupResult : char { kCacheHit = 1, kCacheMiss = 0};.

Comment thread include/rocksdb/block_cache_trace_writer.h Outdated

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to make it a Customizable class.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we could call it on EndTrace() to keep it symmetrical.

@mdcallag

Copy link
Copy Markdown
Contributor

Block cache tracing caused the largest regression early in the 6.x cycle, it was eventually fixed (or mostly fixed). Do you have results to confirm this doesn't add CPU overhead when disabled? db_bench --benchmarks=fillseq,readrandom should be sufficient.

@riversand963

Copy link
Copy Markdown
Contributor

Block cache tracing caused the largest regression early in the 6.x cycle, it was eventually fixed (or mostly fixed). Do you have results to confirm this doesn't add CPU overhead when disabled? db_bench --benchmarks=fillseq,readrandom should be sufficient.

Thanks @mdcallag for the input. Just to catch up, are you referring to a8975b6 as the "eventual fix"?

@anand1976 anand1976 added the WIP Work in progress label Jan 14, 2022
@anand1976

Copy link
Copy Markdown
Contributor Author

Block cache tracing caused the largest regression early in the 6.x cycle, it was eventually fixed (or mostly fixed). Do you have results to confirm this doesn't add CPU overhead when disabled? db_bench --benchmarks=fillseq,readrandom should be sufficient.

@mdcallag I haven't run any benchmarks yet, but will do and ensure there's no regression. This was a draft in order to get some early feedback


namespace ROCKSDB_NAMESPACE {
// A record for block cache lookups/inserts. This is passed by the table
// reader to the BlockCacheTraceWriter for every block cache op.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we indicate that the class/interfaces/structs in this file is experimental and may be changed in the future?

@zhichao-cao

Copy link
Copy Markdown
Contributor

So, this change will not continue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed WIP Work in progress

6 participants