Skip to content

Conversation

@laurynas-biveinis
Copy link
Contributor

  • Introduce Rdb_transaction_impl::m_rdb_tx_mutex to protect writes to
    m_rocksdb_tx[USER_TABLE] from the query thread and reads from other threads.
  • To protect query thread writes, add critical sections to begin_rdb_tx and
    release_tx methods.
  • Make get_rdb_trx lock this mutex before returning the RocksDB transaction
    pointer and introduce unlock_rdb_trx method to unlock this mutex.
  • Call the above methods in Rdb_trx_info_aggregator::process_tran. Move
    unrelated code out of this critical section.
@facebook-github-bot
Copy link

@luqun has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@luqun
Copy link
Contributor

luqun commented Jan 21, 2025

could you add a MTR for this if possible?

@laurynas-biveinis
Copy link
Contributor Author

Patch:

diff --git a/storage/rocksdb/ha_rocksdb.cc b/storage/rocksdb/ha_rocksdb.cc
index 0727fc66a87..b091e141292 100644
--- a/storage/rocksdb/ha_rocksdb.cc
+++ b/storage/rocksdb/ha_rocksdb.cc
@@ -5407,6 +5407,9 @@ class Rdb_transaction_impl : public Rdb_transaction {
     assert(get_statement_snapshot_type() == snapshot_type::NONE);
     assert_snapshot_invariants();
     assert(m_rocksdb_reuse_tx[TABLE_TYPE::USER_TABLE] == nullptr);
+
+    DEBUG_SYNC(m_thd, "myrocks_release_tx");
+
     auto tx = std::move(m_rocksdb_tx[TABLE_TYPE::USER_TABLE]);
     if (rocksdb_write_batch_mem_free_threshold > 0 &&
         wb_size > rocksdb_write_batch_mem_free_threshold) {
@@ -7740,6 +7743,9 @@ class Rdb_trx_info_aggregator : public Rdb_tx_list_walker {
       const int is_replication = (thd->rli_slave != nullptr);
       uint32_t waiting_cf_id;
       std::string waiting_key;
+
+      DEBUG_SYNC(current_thd, "myrocks_in_rocksdb_trx_with_trx");
+
       rdb_trx->GetWaitingTxns(&waiting_cf_id, &waiting_key),
 
           m_trx_info->push_back(

Test:

--source include/have_debug_sync.inc

--source include/count_sessions.inc

CREATE TABLE t1(a INT PRIMARY KEY) ENGINE=ROCKSDB;

SET @saved_rdb_wb_mem_free_threshold = @@global.rocksdb_write_batch_mem_free_threshold;
SET GLOBAL rocksdb_write_batch_mem_free_threshold = 1;

BEGIN;
INSERT INTO t1 VALUES (1);

SET DEBUG_SYNC = "myrocks_release_tx SIGNAL ready1 WAIT_FOR continue1";
send COMMIT;

--connect(con1,localhost,root)
SET DEBUG_SYNC = "now WAIT_FOR ready1";
SET DEBUG_SYNC = "myrocks_in_rocksdb_trx_with_trx SIGNAL ready2 WAIT_FOR continue2";
send SELECT * FROM INFORMATION_SCHEMA.ROCKSDB_TRX;

--connect(con2,localhost,root)

SET DEBUG_SYNC = "now WAIT_FOR ready2";
SET DEBUG_SYNC = "now SIGNAL continue1";

--disconnect con2

--connection default
reap;

SET DEBUG_SYNC = "now SIGNAL continue2";

--connection con1
reap;

--disconnect con1
--connection default

SET GLOBAL rocksdb_write_batch_mem_free_threshold = @saved_rdb_wb_mem_free_threshold;
DROP TABLE t1;

--source include/wait_until_count_sessions.inc

Output:

=================================================================
==21725==ERROR: AddressSanitizer: heap-use-after-free on address 0x617000225c00 at pc 0x000107ffd510 bp 0x00017ab2c650 sp 0x00017ab2c648
READ of size 8 at 0x617000225c00 thread T83
    #0 0x107ffd50c in myrocks::Rdb_trx_info_aggregator::process_tran(myrocks::Rdb_transaction const*) ha_rocksdb.cc:7749
    #1 0x107f78ebc in myrocks::Rdb_transaction_list::for_each(myrocks::Rdb_tx_list_walker&) ha_rocksdb.cc:6590
    #2 0x107f79d58 in myrocks::rdb_get_all_trx_info() ha_rocksdb.cc:7769
    #3 0x10829dc20 in myrocks::rdb_i_s_trx_info_fill_table(THD*, Table_ref*, Item*) rdb_i_s.cc:2317
    #4 0x104245590 in do_fill_information_schema_table(THD*, Table_ref*, Item*) sql_show.cc:5784
    #5 0x103538d38 in MaterializeInformationSchemaTableIterator::Init() composite_iterators.cc:2232
    #6 0x1043ce944 in Query_expression::ExecuteIteratorQuery(THD*) sql_union.cc:1763
    #7 0x1043cf49c in Query_expression::execute(THD*) sql_union.cc:1825
    #8 0x1041e7fc8 in Sql_cmd_dml::execute_inner(THD*) sql_select.cc:868
    #9 0x1041e603c in Sql_cmd_dml::execute(THD*) sql_select.cc:616
    #10 0x104062d7c in mysql_execute_command(THD*, bool, unsigned long long*) sql_parse.cc:5542
    #11 0x10405a3b8 in dispatch_sql_command(THD*, Parser_state*, unsigned long long*) sql_parse.cc:6350
    #12 0x10404f76c in dispatch_command(THD*, COM_DATA const*, enum_server_command) sql_parse.cc:2566
    #13 0x104055708 in do_command(THD*) sql_parse.cc:1746
    #14 0x1046dc2fc in handle_connection(void*) connection_handler_per_thread.cc:307
    #15 0x10860b73c in pfs_spawn_thread(void*) pfs.cc:3022
    #16 0x122d2d858 in asan_thread_start(void*)+0x40 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x51858)
    #17 0x1925782e0 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x72e0)
    #18 0x1925730f8 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x20f8)

0x617000225c00 is located 0 bytes inside of 680-byte region [0x617000225c00,0x617000225ea8)
freed by thread T82 here:
    #0 0x122d402d4 in _ZdlPv+0x74 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x642d4)
    #1 0x1080274d8 in myrocks::Rdb_transaction_impl::release_tx(unsigned long) ha_rocksdb.cc:5416
    #2 0x10801ef1c in myrocks::Rdb_transaction_impl::commit_no_binlog(myrocks::TABLE_TYPE) ha_rocksdb.cc:5529
    #3 0x107fafce0 in myrocks::Rdb_transaction::commit() ha_rocksdb.cc:4176
    #4 0x1080ae8ec in myrocks::rocksdb_commit(handlerton*, THD*, bool) ha_rocksdb.cc:7371
    #5 0x102f6fcd4 in ha_commit_low(THD*, bool, bool) handler.cc:1886
    #6 0x104537598 in trx_coordinator::commit_in_engines(THD*, bool, bool) tc_log.cc:146
    #7 0x105fb96ac in MYSQL_BIN_LOG::finish_transaction_in_engines(THD*, bool, bool) binlog.cc:15984
    #8 0x105fb899c in MYSQL_BIN_LOG::process_commit_stage_queue(THD*, THD*) binlog.cc:11387
    #9 0x105f7b350 in MYSQL_BIN_LOG::ordered_commit(THD*, bool, bool) binlog.cc:12412
    #10 0x105f686e8 in MYSQL_BIN_LOG::commit(THD*, bool) binlog.cc:11051
    #11 0x102f6c7b8 in ha_commit_trans(THD*, bool, bool) handler.cc:1729
    #12 0x104550758 in trans_commit(THD*, bool) transaction.cc:268
    #13 0x10406b164 in mysql_execute_command(THD*, bool, unsigned long long*) sql_parse.cc:5102
    #14 0x10405a3b8 in dispatch_sql_command(THD*, Parser_state*, unsigned long long*) sql_parse.cc:6350
    #15 0x10404f76c in dispatch_command(THD*, COM_DATA const*, enum_server_command) sql_parse.cc:2566
    #16 0x104055708 in do_command(THD*) sql_parse.cc:1746
    #17 0x1046dc2fc in handle_connection(void*) connection_handler_per_thread.cc:307
    #18 0x10860b73c in pfs_spawn_thread(void*) pfs.cc:3022
    #19 0x122d2d858 in asan_thread_start(void*)+0x40 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x51858)
    #20 0x1925782e0 in _pthread_start+0x84 (libsystem_pthread.dylib:arm64e+0x72e0)
    #21 0x1925730f8 in thread_start+0x4 (libsystem_pthread.dylib:arm64e+0x20f8)

@laurynas-biveinis
Copy link
Contributor Author

The testcase does not port directly to the fix branch because one of the debug sync points ends up in the fixing mutex critical section

- Introduce Rdb_transaction_impl::m_rdb_tx_mutex to protect writes to
  m_rocksdb_tx[USER_TABLE] from the query thread and reads from other threads.
- To protect query thread writes, add critical sections to begin_rdb_tx and
  release_tx methods.
- Make get_rdb_trx lock this mutex before returning the RocksDB transaction
  pointer and introduce unlock_rdb_trx method to unlock this mutex.
- Call the above methods in Rdb_trx_info_aggregator::process_tran. Move
  unrelated code out of this critical section.
@facebook-github-bot
Copy link

@laurynas-biveinis has updated the pull request. You must reimport the pull request before landing.

@laurynas-biveinis
Copy link
Contributor Author

@luqun, ready for review again

@facebook-github-bot
Copy link

@luqun has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link

This pull request has been merged in c6e4b9f.

@laurynas-biveinis laurynas-biveinis deleted the rdb-trx-concurrency branch January 30, 2025 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

3 participants