Error during compaction that produces empty SST file & writes blobs #14897

Description

@kmorkos

I recently upgraded from v10.9.1 to v11.1.1 and started seeing errors like the following during compaction:

2026/06/30-00:35:21.541704 173606 [ERROR] [/db_impl/db_impl_compaction_flush.cc:3729] Waiting after background compaction error: IO error: No such file or directory: while stat a file for size: /tmp/compact_bug_repro/000889.sst: No such file or directory, Accumulated background error counts: 1

The errors always follow an "Empty SST file not kept" status like the following:

2026/06/30-00:35:21.537015 173606 EVENT_LOG_v1 {"time_micros": 1782794121536847, "cf_name": "default", "job": 348, "event": "table_file_creation", "file_number": 0, "file_size": 0, "file_checksum": "", "file_checksum_func_name": "Unknown", "smallest_seqno": 72057594037927935, "largest_seqno": 0, "table_properties": {"data_size": 0, "index_size": 13, "index_partitions": 0, "top_level_index_size": 0, "index_key_is_user_key": 1, "index_value_is_delta_encoded": 1, "filter_size": 0, "raw_key_size": 0, "raw_average_key_size": 0, "raw_value_size": 0, "raw_average_value_size": 0, "num_data_blocks": 0, "num_entries": 0, "num_filter_entries": 0, "num_deletions": 0, "num_merge_operands": 0, "num_range_deletions": 0, "format_version": 7, "fixed_key_len": 0, "filter_policy": "", "column_family_name": "default", "column_family_id": 0, "comparator": "leveldb.BytewiseComparator", "user_defined_timestamps_persisted": 1, "key_largest_seqno": 0, "key_smallest_seqno": 18446744073709551615, "merge_operator": "nullptr", "prefix_extractor_name": "nullptr", "property_collectors": "[]", "compression": ";;", "compression_options": "window_bits=-14; level=32767; strategy=0; max_dict_bytes=0; zstd_max_train_bytes=0; enabled=0; max_dict_buffer_bytes=0; use_zstd_dict_trainer=1; max_compressed_bytes_per_kb=896; checksum=0; ", "creation_time": 1782794121, "oldest_key_time": 1782794121, "newest_key_time": 0, "file_creation_time": 1782794121, "slow_compression_estimated_data_size": 0, "fast_compression_estimated_data_size": 0, "db_id": "b1ab3e4a-23eb-4ec4-af97-f827682c014c", "db_session_id": "ET603AQHW1DIR96IXHZ1", "orig_file_number": 889, "seqno_to_time_mapping": "N/A"}, "status": "Operation aborted: Empty SST file not kept"}
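As a side note on reading the event above: the sequence number fields hold what look like "never updated" sentinels rather than real values. The sketch below only checks the arithmetic: `smallest_seqno` equals 2^56 - 1 (which I believe matches RocksDB's maximum sequence number) and `key_smallest_seqno` equals 2^64 - 1. The constant names and the `IsEmptySeqnoRange` helper are hypothetical, not RocksDB APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// 2^56 - 1: the value logged as "smallest_seqno" in the event above.
constexpr std::uint64_t kLoggedSmallestSeqno = (std::uint64_t{1} << 56) - 1;

// 2^64 - 1: the value logged as "key_smallest_seqno" in the event above.
constexpr std::uint64_t kLoggedKeySmallestSeqno =
    std::numeric_limits<std::uint64_t>::max();

static_assert(kLoggedSmallestSeqno == 72057594037927935ULL,
              "smallest_seqno in the log is 2^56 - 1");
static_assert(kLoggedKeySmallestSeqno == 18446744073709551615ULL,
              "key_smallest_seqno in the log is 2^64 - 1");

// Hypothetical helper: a range whose smallest bound exceeds its largest bound
// cannot describe any real key, so it suggests no key was ever recorded.
constexpr bool IsEmptySeqnoRange(std::uint64_t smallest,
                                 std::uint64_t largest) {
  return smallest > largest;
}
```

Applied to the logged pair (`smallest_seqno` = 72057594037927935, `largest_seqno` = 0), the helper returns true, which is consistent with `num_entries` being 0 and the file being dropped as empty.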

Claude theorized that commit 656b734 introduced the regression. Specifically, the following line now passes a vector containing the paths of all output file types, not just blob files, as the blob_file_paths parameter of the BlobFileBuilder ctor:

sub_compact->Current().GetOutputFilePathsPtr(),

The theory is that there now exists the following race condition:

  1. Open a blob file, and push that to the tail of the vector
  2. Open an SST file, and push that to the tail of the vector
  3. Decide that the SST file from [2] should not be written (the "Operation aborted..." outcome)
  4. Close the blob file from [1], which assumes it's the last file in the vector and tries to open that file to get the metadata <-- this is where the failure happens
    a) It's worth noting that even if the SST file was written, the metadata retrieved here would report the incorrect size IIUC.
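The four steps above can be sketched as a tiny standalone model. This is not RocksDB code: `OutputPaths`, `BlobBuilderModel`, and the blob file name `000888.blob` are hypothetical, and the model assumes (per the theory) that the dropped SST's path stays in the shared vector after the file itself is removed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the shared vector of output file paths.
struct OutputPaths {
  std::vector<std::string> paths;  // holds blob AND SST paths
};

// Hypothetical model of a blob builder that pushes its path when it opens a
// file and later needs that path again when it closes the file.
struct BlobBuilderModel {
  OutputPaths* shared;
  std::size_t my_index = 0;

  void Open(const std::string& path) {
    shared->paths.push_back(path);
    my_index = shared->paths.size() - 1;
  }

  // Suspected behavior: assume our file is still the tail of the vector.
  const std::string& PathAssumedAtClose() const {
    return shared->paths.back();
  }

  // One possible alternative: use the position remembered at open time.
  const std::string& PathRememberedAtOpen() const {
    return shared->paths[my_index];
  }
};

// Runs steps 1-3 and returns the path that step 4 would try to stat.
inline std::string SimulateClose(bool assume_tail) {
  OutputPaths out;
  BlobBuilderModel blob{&out};
  blob.Open("000888.blob");           // step 1: blob path pushed
  out.paths.push_back("000889.sst");  // step 2: SST path pushed
  // step 3: the empty SST is not kept on disk; its path is assumed to remain
  return assume_tail ? blob.PathAssumedAtClose()     // step 4 as theorized
                     : blob.PathRememberedAtOpen();  // alternative lookup
}
```

In this model `SimulateClose(true)` yields `000889.sst`, the file that no longer exists, which would match the "No such file or directory" error in the log, while `SimulateClose(false)` yields the blob path. Whether the real fix should track an index or keep blob paths in a separate vector is a design question for the maintainers.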

The following consistently reproduces the failure in ~5 seconds:

// OPTIONS

[Version]
  rocksdb_version=11.1.1
  options_file_version=1.1

[DBOptions]
  create_if_missing=true
  max_subcompactions=2
  

[CFOptions "default"]
  blob_garbage_collection_age_cutoff=1.000000
  target_file_size_base=1024
  enable_blob_files=true
  max_bytes_for_level_base=2048
  max_bytes_for_level_multiplier=2.000000
  enable_blob_garbage_collection=true
  level_compaction_dynamic_level_bytes=false
  
[TableOptions/BlockBasedTable "default"]
  block_size=16384
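
// C++ repro

#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/utilities/options_util.h"
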
std::string key(uint8_t header1, uint8_t header2, uint64_t body) {
  auto bytes = std::to_string(header1) + std::to_string(header2);
  bytes.resize(2 + sizeof(body));

  for (size_t i = 0; i < sizeof(body); ++i) {
    size_t shift_amount = (sizeof(body) - 1 - i) * 8;
    bytes[i + 2] = static_cast<char>((body >> shift_amount) & 0xFF);
  }

  return bytes;
}

int main() {
  rocksdb::Options options;
  std::vector<rocksdb::ColumnFamilyDescriptor> all_cfs;
  rocksdb::Status s = rocksdb::LoadOptionsFromFile(
      {}, "/path/to/OPTIONS", &options, &all_cfs);
  assert(s.ok());

  std::vector<rocksdb::ColumnFamilyHandle*> cf_handles;
  std::unique_ptr<rocksdb::DB> db;
  s = rocksdb::DB::Open(options, "/path/to/db", all_cfs, &cf_handles, &db);
  assert(s.ok());

  rocksdb::CompactRangeOptions compact_options;
  compact_options.bottommost_level_compaction =
      rocksdb::BottommostLevelCompaction::kForce;

  std::string largeVal(4096, '0');

  for (uint8_t i = 1; i <= 3; i++) {
    s = db->Put(rocksdb::WriteOptions(), key(i, 0, 0), largeVal);
    assert(s.ok());
    s = db->Put(rocksdb::WriteOptions(), key(i, 2, 0), largeVal);
    assert(s.ok());
  }

  s = db->Flush(rocksdb::FlushOptions());
  assert(s.ok());

  s = db->CompactRange(compact_options, db->DefaultColumnFamily(), nullptr,
                       nullptr);
  assert(s.ok());

  for (uint64_t i = 1;; i++) {
    s = db->Put(rocksdb::WriteOptions(), key(1, 1, i), largeVal);
    assert(s.ok());

    s = db->Put(rocksdb::WriteOptions(), key(2, 1, i), largeVal);
    assert(s.ok());

    s = db->Put(rocksdb::WriteOptions(), key(3, 1, i), largeVal);
    assert(s.ok());

    if (i % 10 == 0) {
      s = db->Flush(rocksdb::FlushOptions());
      assert(s.ok());

      s = db->DeleteRange(rocksdb::WriteOptions(), db->DefaultColumnFamily(),
                          key(2, 0, 0), key(2, UINT8_MAX, 0));
      assert(s.ok());

      s = db->DeleteRange(rocksdb::WriteOptions(), db->DefaultColumnFamily(),
                          key(3, 0, 0), key(3, UINT8_MAX, 0));
      assert(s.ok());

      s = db->Flush(rocksdb::FlushOptions());
      assert(s.ok());
    }

    if (i % 100 == 0) {
      s = db->CompactRange(compact_options, db->DefaultColumnFamily(), nullptr,
                           nullptr);
      assert(s.ok());
    }
  }

  db.reset();

  return 0;
}

This consistently fails on 656b734 in ~5s and runs for several minutes on 21a8b5f until I interrupt it, so I think this is almost certainly the root cause.

CC @xingbowang as the original author and @anand1976 as the reviewer
