Add DB::KeyExists() public API to skip blob file reads for existence checks #14644

Open
m-nagarajan wants to merge 1 commit into facebook:main from m-nagarajan:keyexists-blob-skip

m-nagarajan commented Apr 21, 2026

Summary

The Java keyExists() was added as a JNI convenience that calls KeyMayExist followed by Get. For BlobDB stores, the Get step resolves blob indexes and reads potentially large (10KB–1MB+) blob values that are immediately discarded. This PR eliminates that waste by introducing a C++ DB::KeyExists() that both C++ and Java callers can use.

Details

Add a new virtual method DB::KeyExists() that checks key existence without resolving blob indexes. The existing Java keyExists() JNI method internally calls Get(), which reads the full blob value only to discard it. The new API calls GetImpl with a non-null is_blob_index pointer, so Version::Get reports the blob index back without resolving it and skips blob file I/O entirely.

  • Add DB::KeyExists(ReadOptions, ColumnFamilyHandle*, Slice) returning Status::OK() or Status::NotFound()
  • Add io_activity validation matching DB::Get() pattern
  • Add StackableDB::KeyExists() override to prevent undefined behavior on TransactionDB/DBWithTTL
  • Propagate is_blob_index in DBImplReadOnly::GetImpl and DBImplSecondary::GetImpl (all mem->Get, imm->Get, and Version::Get call sites)
  • Simplify JNI key_exists_helper to delegate to DB::KeyExists() with a KeyMayExist bloom filter pre-check for fast rejection of absent keys
  • Add C++ tests: blob-skip verification via BLOB_DB_BLOB_FILE_BYTES_READ statistics, basic existence, column families, merge operands, invalid io_activity, read-only DB
  • Add Java tests: BlobDB correctness, blob-skip verification via PerfContext, column families, ReadOptions variant, memtable path, DirectByteBuffer
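
To illustrate the blob-skip idea in isolation, here is a minimal toy model. It is not RocksDB code: `ToyBlobStore`, its two maps, and the `blob_bytes_read` counter are hypothetical stand-ins, and the real GetImpl/Version::Get signatures are more involved. The sketch only shows how a non-null `is_blob_index` out-parameter can let a lookup stop at the blob reference instead of reading the value.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy model only: ToyBlobStore is a hypothetical type, not a RocksDB class.
struct ToyBlobStore {
  std::map<std::string, std::string> index;  // key -> small blob reference
  std::map<std::string, std::string> blobs;  // blob reference -> large value
  std::uint64_t blob_bytes_read = 0;         // stands in for a blob-read statistic

  void Put(const std::string& key, const std::string& value) {
    const std::string ref = "blob:" + key;
    index[key] = ref;
    blobs[ref] = value;
  }

  // Shared lookup. When is_blob_index is non-null, the caller accepts an
  // unresolved blob reference, so the blob itself is never read.
  bool GetImpl(const std::string& key, std::string* value, bool* is_blob_index) {
    const auto it = index.find(key);
    if (it == index.end()) return false;
    if (is_blob_index != nullptr) {
      *is_blob_index = true;
      if (value != nullptr) *value = it->second;  // hand back the reference only
      return true;
    }
    const std::string& blob = blobs.at(it->second);  // the simulated blob read
    blob_bytes_read += blob.size();
    if (value != nullptr) *value = blob;
    return true;
  }

  // Full read: resolves the blob reference.
  bool Get(const std::string& key, std::string* value) {
    return GetImpl(key, value, nullptr);
  }

  // Existence check: passes a non-null is_blob_index to skip the blob read.
  bool KeyExists(const std::string& key) {
    bool is_blob_index = false;
    return GetImpl(key, nullptr, &is_blob_index);
  }
};
```

In this model, calling `KeyExists` leaves `blob_bytes_read` unchanged while `Get` on the same key increases it by the value size, which mirrors the zero-reads versus control comparison the new tests are described as making.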

Related: #12921 reports keyExists false negatives from KeyMayExist. The C++ DB::KeyExists() avoids this by calling GetImpl directly without KeyMayExist. The JNI layer retains KeyMayExist as a fast-path (accepted trade-off for backward compatibility).
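
The trade-off can be sketched with two hypothetical helpers. `KeyPredicate`, `ExistsExact`, and `ExistsWithPrecheck` are illustrative names, not RocksDB or JNI functions, and the sketch says nothing about why a pre-check might answer "no" for a present key. It only shows that a pre-check placed in front of an exact lookup passes any such "no" straight through to the caller.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical callback type: answers a yes/no question about a key.
using KeyPredicate = std::function<bool(const std::string&)>;

// Exact-only path (the shape described for the C++ DB::KeyExists()):
// the answer comes solely from the exact lookup.
bool ExistsExact(const KeyPredicate& exact, const std::string& key) {
  return exact(key);
}

// Pre-check path (the shape described for the JNI helper): a "no" from the
// pre-check is returned immediately and the exact lookup never runs.
bool ExistsWithPrecheck(const KeyPredicate& may_exist,
                        const KeyPredicate& exact,
                        const std::string& key) {
  if (!may_exist(key)) return false;  // fast rejection; trusted as-is
  return exact(key);
}
```

With a pre-check that never wrongly rejects, both helpers agree. If the pre-check wrongly rejects a present key, only `ExistsWithPrecheck` returns false for it.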

Testing

  • db_blob_basic_test: 6 new KeyExists* test cases
  • KeyExistsBlobTest.java: 6 new Java test cases including PerfContext-based blob-read verification
  • Both test suites verify that keyExists produces zero blob reads while get() on the same key produces non-zero blob reads (control)
…checks

Summary:
Add a new virtual method DB::KeyExists() that checks key existence without
reading blob file values. For BlobDB stores, the existing keyExists() JNI
method internally calls Get() which resolves blob indexes and reads the full
blob value only to discard it. The new API uses GetImpl with a non-null
is_blob_index pointer, causing Version::Get to report blob indexes back
without resolving them — skipping blob file I/O entirely.

Changes:
- Add DB::KeyExists() with io_activity validation and blob-skip via is_blob_index
- Add StackableDB::KeyExists() override to prevent UB on TransactionDB/DBWithTTL
- Propagate is_blob_index in DBImplReadOnly::GetImpl and DBImplSecondary::GetImpl
- Simplify JNI key_exists_helper to call DB::KeyExists with KeyMayExist pre-check
- Add C++ tests (blob-skip, basic, column family, merge, io_activity, read-only DB)
- Add Java BlobDB tests with PerfContext blob-read verification