Skip to content

Commit 43d60bf

Browse files
jaykoreanmeta-codesync[bot]
authored andcommitted
Revert reverse-comparator handling in range tree lock manager (#14887)
Summary: Pull Request resolved: #14887 This reverts two changes to the range-lock endpoint comparator and restores the prior, direction-agnostic behavior: - #14831 ("Fix range lock manager crash with reverse comparator"), which made RangeTreeLockManager::CompareDbtEndpoints reverse-aware, and - #14880 (a follow-up that only corrected the equal-length / point-range branch). Why: #14831 taught the endpoint comparator about reverse-ordered column families by flipping the suffix tie-break based on comparator direction. However, callers that use range locking on a reverse-ordered CF already flip the start/end endpoints themselves before calling lock_range() -- for example, MyRocks does this in ha_rocksdb::set_range_lock() (see its "RangeFlagsShouldBeFlippedForRevCF" comment). The two corrections stack and produce left > right again, which: (a) trips paranoid_invariant(compare(left, right) <= 0) in toku::locktree::try_acquire_lock() and aborts on equality/point locks, and (b) for range scans, mis-orders the endpoints and produces an incorrect lock range, leading to silent correctness issues under concurrency. #14880 only addressed (a) (the equal-length / point-range branch), not (b), so it was an incomplete fix at the wrong layer. Decision: keep CompareDbtEndpoints direction-agnostic and make it an explicit caller requirement to pass a flipped (swapped start/end, with negated infimum/supremum flags) range for range locking to work on a reverse-ordered column family. A function-level comment documenting this contract is added so reverse-awareness is not reintroduced here. Removes the RangeLockWithReverseComparator unit test added by #14831 (it asserts the reverted direction-flipped ordering). Keeps a regression test that a point range lock [infimum(K), supremum(K)] works on a reverse-comparator column family. Reviewed By: xingbowang Differential Revision: D109713989 fbshipit-source-id: 4ea0999c92645e1adc9cd1114d924cad142f2001
1 parent 8c0790b commit 43d60bf

4 files changed

Lines changed: 29 additions & 73 deletions

File tree

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
Reverted PR14831 that made range_lock_manager aware of reverse-order CF

‎utilities/transactions/lock/range/range_locking_test.cc‎

Lines changed: 0 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -102,40 +102,6 @@ TEST_F(RangeLockingTest, BasicRangeLocking) {
102102
delete txn1;
103103
}
104104

105-
// CompareDbtEndpoints must honor ReverseBytewiseComparator direction
106-
// in length-disparity case.
107-
TEST_F(RangeLockingTest, RangeLockWithReverseComparator) {
108-
delete db;
109-
db = nullptr;
110-
ASSERT_OK(DestroyDB(dbname, options));
111-
112-
options.comparator = ReverseBytewiseComparator();
113-
range_lock_mgr.reset(NewRangeLockManager(nullptr));
114-
txn_db_options.lock_mgr_handle = range_lock_mgr;
115-
116-
ASSERT_OK(TransactionDB::Open(options, txn_db_options, dbname, &db));
117-
118-
WriteOptions write_options;
119-
TransactionOptions txn_options;
120-
txn_options.lock_timeout = 50;
121-
auto cf = db->DefaultColumnFamily();
122-
123-
Transaction* txn0 = db->BeginTransaction(write_options, txn_options);
124-
Transaction* txn1 = db->BeginTransaction(write_options, txn_options);
125-
126-
ASSERT_OK(txn0->GetRangeLock(cf, Endpoint("aa"), Endpoint("a")));
127-
128-
auto s = txn1->GetRangeLock(cf, Endpoint("ab"), Endpoint("a"));
129-
ASSERT_TRUE(s.IsTimedOut());
130-
131-
ASSERT_OK(txn1->GetRangeLock(cf, Endpoint("b"), Endpoint("b")));
132-
133-
txn0->Rollback();
134-
txn1->Rollback();
135-
delete txn0;
136-
delete txn1;
137-
}
138-
139105
// A point-equality range lock passes [infimum(K), supremum(K)]: the same user
140106
// key with the infimum endpoint as the left bound and the supremum endpoint as
141107
// the right bound. The infimum endpoint must sort before the supremum endpoint

‎utilities/transactions/lock/range/range_tree/range_tree_lock_manager.cc‎

Lines changed: 27 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -39,18 +39,6 @@ RangeLockManagerHandle* NewRangeLockManager(
3939
static const char SUFFIX_INFIMUM = 0x0;
4040
static const char SUFFIX_SUPREMUM = 0x1;
4141

42-
struct RangeTreeCmpContext {
43-
const Comparator* cmp;
44-
bool is_reverse;
45-
};
46-
47-
namespace {
48-
[[nodiscard]] bool IsReverseComparator(const Comparator* cmp) {
49-
return cmp == ReverseBytewiseComparator() ||
50-
cmp == ReverseBytewiseComparatorWithU64Ts();
51-
}
52-
} // namespace
53-
5442
// Convert Endpoint into an internal format used for storing it in locktree
5543
// (DBT structure is used for passing endpoints to locktree and getting back)
5644
void serialize_endpoint(const Endpoint& endp, std::string* buf) {
@@ -213,6 +201,20 @@ void RangeTreeLockManager::UnLock(PessimisticTransaction* txn,
213201
((RangeTreeLockTracker*)range_tracker)->ReleaseLocks(this, txn, all_keys);
214202
}
215203

204+
// Comparator for range-lock endpoints. This is intentionally
205+
// direction-AGNOSTIC: it orders the user-key bytes with the column family's own
206+
// comparator, then breaks ties on the endpoint-type suffix byte (infimum before
207+
// supremum), which is a positional marker independent of the CF's sort
208+
// direction.
209+
//
210+
// CALLER REQUIREMENT: it is the caller's responsibility to pass in a flipped
211+
// (swapped start/end, with negated infimum/supremum flags) range for RangeLock
212+
// to work on a REVERSE-ORDERED column family. The locktree requires every range
213+
// to satisfy left <= right, and this comparator does NOT compensate for reverse
214+
// ordering on the caller's behalf. MyRocks does this flip today in
215+
// ha_rocksdb::set_range_lock() (see its "RangeFlagsShouldBeFlippedForRevCF"
216+
// comment). Do not re-add reverse-awareness here (that double-corrects callers
217+
// like MyRocks that already flip).
216218
int RangeTreeLockManager::CompareDbtEndpoints(void* arg, const DBT* a_key,
217219
const DBT* b_key) {
218220
const char* a = (const char*)a_key->data;
@@ -225,28 +227,22 @@ int RangeTreeLockManager::CompareDbtEndpoints(void* arg, const DBT* a_key,
225227

226228
// Compare the values. The first byte encodes the endpoint type, its value
227229
// is either SUFFIX_INFIMUM or SUFFIX_SUPREMUM.
228-
const auto* ctx = static_cast<const RangeTreeCmpContext*>(arg);
229-
const auto* cmp = ctx->cmp;
230-
const bool is_reverse = ctx->is_reverse;
230+
Comparator* cmp = (Comparator*)arg;
231231
int res = cmp->CompareWithoutTimestamp(
232232
Slice(a + 1, min_len - 1), /*a_has_ts=*/false, Slice(b + 1, min_len - 1),
233233
/*b_has_ts=*/false);
234234
if (!res) {
235+
// The user keys compared equal above, so order by the endpoint-type suffix
236+
// byte (a positional boundary marker: infimum before the key, supremum
237+
// after) -- direction-independent; see the function-level comment.
235238
if (b_len > min_len) {
236-
int r = a[0] == SUFFIX_INFIMUM ? -1 : 1;
237-
return is_reverse ? -r : r;
239+
// a's user key is a prefix of b's; a is shorter.
240+
return a[0] == SUFFIX_INFIMUM ? -1 : 1;
238241
} else if (a_len > min_len) {
239-
int r = b[0] == SUFFIX_INFIMUM ? 1 : -1;
240-
return is_reverse ? -r : r;
242+
// b's user key is a prefix of a's; b is shorter.
243+
return b[0] == SUFFIX_INFIMUM ? 1 : -1;
241244
} else {
242-
// Same user key, differing only in the endpoint-type suffix byte. The
243-
// suffix (SUFFIX_INFIMUM vs SUFFIX_SUPREMUM) is a positional boundary
244-
// marker for that key, not key content, so the infimum endpoint always
245-
// sorts before the supremum endpoint regardless of the column family's
246-
// sort direction. This must NOT be flipped for a reverse comparator:
247-
// doing so makes m_cmp(infimum(K), supremum(K)) > 0 and trips the
248-
// locktree's "left <= right" invariant for a [infimum(K), supremum(K)]
249-
// point range (e.g. a MyRocks equality lookup on a reverse 'rev:' CF).
245+
// Same user key: infimum sorts before supremum.
250246
if (a[0] < b[0]) {
251247
return -1;
252248
} else if (a[0] > b[0]) {
@@ -373,11 +369,10 @@ RangeLockManagerHandle::Counters RangeTreeLockManager::GetStatus() {
373369
}
374370

375371
std::shared_ptr<toku::locktree> RangeTreeLockManager::MakeLockTreePtr(
376-
toku::locktree* lt, std::unique_ptr<RangeTreeCmpContext> ctx) {
372+
toku::locktree* lt) {
377373
toku::locktree_manager* ltm = &ltm_;
378374
return std::shared_ptr<toku::locktree>(
379-
lt,
380-
[ltm, ctx = std::move(ctx)](toku::locktree* p) { ltm->release_lt(p); });
375+
lt, [ltm](toku::locktree* p) { ltm->release_lt(p); });
381376
}
382377

383378
void RangeTreeLockManager::AddColumnFamily(const ColumnFamilyHandle* cfh) {
@@ -386,18 +381,15 @@ void RangeTreeLockManager::AddColumnFamily(const ColumnFamilyHandle* cfh) {
386381
InstrumentedMutexLock l(&ltree_map_mutex_);
387382
if (ltree_map_.find(column_family_id) == ltree_map_.end()) {
388383
DICTIONARY_ID dict_id = {.dictid = column_family_id};
389-
auto ctx = std::make_unique<RangeTreeCmpContext>(RangeTreeCmpContext{
390-
cfh->GetComparator(), IsReverseComparator(cfh->GetComparator())});
391384
toku::comparator cmp;
392-
cmp.create(CompareDbtEndpoints, ctx.get());
385+
cmp.create(CompareDbtEndpoints, (void*)cfh->GetComparator());
393386
toku::locktree* ltree =
394387
ltm_.get_lt(dict_id, cmp,
395388
/* on_create_extra*/ static_cast<void*>(this));
396389
// This is ok to because get_lt has copied the comparator:
397390
cmp.destroy();
398391

399-
ltree_map_.insert(
400-
{column_family_id, MakeLockTreePtr(ltree, std::move(ctx))});
392+
ltree_map_.insert({column_family_id, MakeLockTreePtr(ltree)});
401393
}
402394
}
403395

‎utilities/transactions/lock/range/range_tree/range_tree_lock_manager.h‎

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -20,8 +20,6 @@ namespace ROCKSDB_NAMESPACE {
2020

2121
typedef DeadlockInfoBufferTempl<RangeDeadlockPath> RangeDeadlockInfoBuffer;
2222

23-
struct RangeTreeCmpContext;
24-
2523
// A Range Lock Manager that uses PerconaFT's locktree library
2624
class RangeTreeLockManager : public RangeLockManagerBase,
2725
public RangeLockManagerHandle {
@@ -118,8 +116,7 @@ class RangeTreeLockManager : public RangeLockManagerBase,
118116

119117
RangeDeadlockInfoBuffer dlock_buffer_;
120118

121-
std::shared_ptr<toku::locktree> MakeLockTreePtr(
122-
toku::locktree* lt, std::unique_ptr<RangeTreeCmpContext> ctx = nullptr);
119+
std::shared_ptr<toku::locktree> MakeLockTreePtr(toku::locktree* lt);
123120
static int CompareDbtEndpoints(void* arg, const DBT* a_key, const DBT* b_key);
124121

125122
// Callbacks

0 commit comments

Comments
 (0)