@laurynas-biveinis
Contributor

Almost all accesses to Rdb_transaction objects come from the owning
query-executing thread, but some fields are read from other threads, for
example, to execute SHOW ENGINE ROCKSDB STATUS. This causes intermittent crashes,
e.g. when a wild snapshot pointer is dereferenced to read its timestamp.

Fix by moving all shared fields to the beginning of the class, documenting their
protection, and making them atomic as needed. For atomic fields, use only relaxed
memory order accesses, which should result in the same compiled code as before.
Where a value used to be read through a pointer, cache it in an atomic field
instead and don't dereference the pointer from other threads:
Rdb_transaction::m_snapshot_ts instead of
m_read_opts[USER_TABLE].snapshot->GetUnixTime() and m_num_ongoing_bulk_load
instead of m_bulk_load_ctx->num_bulk_load().
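
The caching approach described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: `Snapshot`, `Transaction`, `acquire_snapshot` and `release_snapshot` are hypothetical stand-ins, and only the field name `m_snapshot_ts` and the accessor name `get_snapshot_ts` are borrowed from the description and discussion. It assumes that a timestamp of 0 can stand for "no snapshot".

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a storage-engine snapshot (not the RocksDB type).
struct Snapshot {
  std::int64_t unix_time;
  std::int64_t GetUnixTime() const { return unix_time; }
};

class Transaction {
 public:
  ~Transaction() { release_snapshot(); }

  // Owner thread only: take a snapshot and publish its timestamp.
  void acquire_snapshot(std::int64_t now) {
    release_snapshot();
    m_snapshot = new Snapshot{now};
    // Cache the value so that other threads never need the pointer.
    m_snapshot_ts.store(m_snapshot->GetUnixTime(), std::memory_order_relaxed);
  }

  // Owner thread only: clear the cached value, then free the snapshot.
  void release_snapshot() {
    m_snapshot_ts.store(0, std::memory_order_relaxed);
    delete m_snapshot;
    m_snapshot = nullptr;
  }

  // Any thread: reads only the atomic field, never dereferences m_snapshot.
  std::int64_t get_snapshot_ts() const {
    return m_snapshot_ts.load(std::memory_order_relaxed);
  }

 private:
  // Shared field: written by the owner thread, read by any thread.
  std::atomic<std::int64_t> m_snapshot_ts{0};
  // Private to the owner thread; other threads must not touch it.
  Snapshot *m_snapshot = nullptr;
};
```

In this sketch a status-reporting thread may observe a slightly stale timestamp, but it cannot read freed memory, because the reader path touches nothing except the atomic field. Relaxed ordering is enough here because the reader uses the timestamp only as a value and does not rely on it to synchronize access to any other data.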

@facebook-github-bot

@luqun has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

statement_snapshot_type.store(snapshot_type::NONE,
std::memory_order_relaxed);
rdb->ReleaseSnapshot(m_read_opts[table_type].snapshot);
m_read_opts[table_type].snapshot = nullptr;
Contributor


Lines 6231 and 6232 may also be unsafe with respect to other threads (a use-after-free issue).

Suppose the current thread has finished executing "rdb->ReleaseSnapshot()" but has not yet executed "m_read_opts[table_type].snapshot = nullptr", while another thread is calling tx->get_snapshot_ts() (https://github.com/facebook/mysql-5.6/blob/fb-mysql-8.0.32/storage/rocksdb/ha_rocksdb.cc#L3987). The snapshot has already been released, but m_read_opts[USER_TABLE].snapshot may still point to the old memory.

maybe change to

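      // Save the pointer first; otherwise nullptr would be passed to ReleaseSnapshot().
      const auto *snapshot = m_read_opts[table_type].snapshot;
      m_read_opts[table_type].snapshot = nullptr;
      rdb->ReleaseSnapshot(snapshot);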
Contributor Author


get_snapshot_ts does not race here, because this PR introduces a caching field for exactly this scenario.

Having said that, m_read_opts[table_type].snapshot is indeed accessed incorrectly in other contexts (see my e-mail) and will be fixed one way or another in a follow-up.

@facebook-github-bot

This pull request has been merged in 7b2d125.

@laurynas-biveinis laurynas-biveinis deleted the trx-obj-concurrency branch January 16, 2025 08:53

3 participants