[rocksandra] support cassandra partition deletion #3874
Open
wpc wants to merge 1 commit into
facebook-github-bot
left a comment
Contributor
@wpc has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Contributor
@wpc has updated the pull request.
* add a merge operator for partition meta data (currently partition deletion info only)
* read partition deletion info in the cassandra compaction filter and drop rows if their partition has been deleted
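For illustration, here is a minimal sketch of how merge operands for partition deletion info could be folded into a single value. It assumes the deletion with the largest marked_for_delete_at supersedes the others, as with Cassandra deletion times. The struct, the "live" sentinel, and the function names are hypothetical and not taken from this PR; an actual operator would implement rocksdb's MergeOperator interface over serialized values.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical in-memory form of the partition deletion info; the field
// names follow Cassandra's deletion time, not necessarily this PR's classes.
struct PartitionDeletion {
  int32_t local_deletion_time;   // seconds since epoch
  int64_t marked_for_delete_at;  // write timestamp of the deletion
};

// Sentinel meaning "partition not deleted" (an assumption of this sketch).
inline PartitionDeletion LiveDeletion() {
  return {std::numeric_limits<int32_t>::max(),
          std::numeric_limits<int64_t>::min()};
}

// Fold all merge operands into one value: the deletion with the largest
// marked_for_delete_at supersedes the others.
inline PartitionDeletion MergeDeletions(
    const std::vector<PartitionDeletion>& operands) {
  PartitionDeletion result = LiveDeletion();
  for (const PartitionDeletion& pd : operands) {
    if (pd.marked_for_delete_at > result.marked_for_delete_at) {
      result = pd;
    }
  }
  return result;
}
```

With this rule the result does not depend on the order in which operands arrive, which is what a merge operator needs since operands can be combined in different groupings during compaction.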
Contributor
@wpc has updated the pull request.
Contributor
Author
Updated the PR to make sure the iterator on the partition meta cf is deleted after use.
DikangGu
pushed a commit
to Instagram/cassandra
that referenced
this pull request
Sep 2, 2018
Summary:
For supporting partition level deletion we create a partition meta cf in each rocksdb instance and store partition deletion info in it. On the rocksdb side, the compaction filter reads partition deletion info from this cf and drops data based on marked_for_delete_at. (facebook/rocksdb#3874)

Streaming for partition meta data will be in a separate diff.

Test Plan:

Functional
========
partition dump after deletion
```
--- metadata:
0x816099270c387b3c989d7ecf13053ec1 0x5b0254a800056cb0504844ab
--- rows:
0x816099270c387b3c989d7ecf13053ec180000000238e8b0e 0x7fffffff8000000000000000000000056b954457139f00000010be10e78051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec18000000046a34f04 0x7fffffff8000000000000000000000056b955bb808da000000109a01d60051a711e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000c1b912cd 0x7fffffff8000000000000000000000056b952ddfd27100000010e51ae98051a511e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000db4233f7 0x7fffffff8000000000000000000000056b9533f4a2a3000000101a273c0051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000f7465f43 0x7fffffff8000000000000000000000056b954625e8fa00000010d386118051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec18000000116af6e77 0x7fffffff8000000000000000000000056b959d045dbf000000103e85178051aa11e88080808080808080
```
dump after full compaction finish
```
--- metadata:
0x816099270c387b3c989d7ecf13053ec1 0x5b0254a800056cb0504844ab
--- rows:
```

Performance
==========
No obvious CPU/IO regression
https://fburl.com/ods/e9m1tdr2
https://fburl.com/ods/o5j049hb

Reviewers: svemuri, dikang, sdev, #ig-cassandra
Reviewed By: dikang
Subscribers: fdeliege, trunkagent
Differential Revision: https://phabricator.intern.facebook.com/D8063994
Signature: 8063994:1527988646:7d236751d82d4fee40e5b0ca3dd1da94d8e97e57
wpc
added a commit
to wpc/cassandra
that referenced
this pull request
Jan 29, 2019
Summary:
For supporting partition level deletion we create a partition meta cf in each rocksdb instance and store partition deletion info in it. On the rocksdb side, the compaction filter reads partition deletion info from this cf and drops data based on marked_for_delete_at. (facebook/rocksdb#3874)

Streaming for partition meta data will be in a separate diff.

Test Plan:

Functional
========
testing it in storyarchive cluster

partition dump after deletion
```
[23:32:32 root@priv_prn/instagram/cassandra-data-storyarchiverocks/25 /var/log/cassandra]$ nodetool dumppartition storyarchive reel_media_viewer_by_ts_perm_compact_001 1773713256096284353
--- metadata:
0x816099270c387b3c989d7ecf13053ec1 0x5b0254a800056cb0504844ab
--- rows:
0x816099270c387b3c989d7ecf13053ec180000000238e8b0e 0x7fffffff8000000000000000000000056b954457139f00000010be10e78051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec18000000046a34f04 0x7fffffff8000000000000000000000056b955bb808da000000109a01d60051a711e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000c1b912cd 0x7fffffff8000000000000000000000056b952ddfd27100000010e51ae98051a511e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000db4233f7 0x7fffffff8000000000000000000000056b9533f4a2a3000000101a273c0051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec180000000f7465f43 0x7fffffff8000000000000000000000056b954625e8fa00000010d386118051a611e88080808080808080
0x816099270c387b3c989d7ecf13053ec18000000116af6e77 0x7fffffff8000000000000000000000056b959d045dbf000000103e85178051aa11e88080808080808080
```
dump after full compaction finish
```
[01:19:46 root@priv_prn/instagram/cassandra-data-storyarchiverocks/25 /var/log/cassandra]$ nodetool dumppartition storyarchive reel_media_viewer_by_ts_perm_compact_001 1773713256096284353
--- metadata:
0x816099270c387b3c989d7ecf13053ec1 0x5b0254a800056cb0504844ab
--- rows:
```

Performance
==========
Tested on priv_ftw/instagram/cassandra-data-feedviewstaterocks/15 (high compaction, no deletion), deploy at 5/31 2:54pm
No obvious CPU/IO regression
https://fburl.com/ods/e9m1tdr2
https://fburl.com/ods/o5j049hb

Reviewers: svemuri, dikang, sdev, #ig-cassandra
Reviewed By: dikang
Subscribers: fdeliege, trunkagent
Differential Revision: https://phabricator.intern.facebook.com/D8063994
Signature: 8063994:1527988646:7d236751d82d4fee40e5b0ca3dd1da94d8e97e57
Contributor
@wpc Do we still need it?
Contributor
Yes, this is still needed.
Contributor
@wpc @cooldoger the change LGTM! I haven't gone into the details of the cassandra-specific logic; hopefully someone on your team can review that. Please rebase and make sure all tests pass, and I'd be happy to land it.
To support partition deletion in Rocksandra, we create a separate partition meta cf in each database and pass the db and cf handle into the compaction filter. The compaction filter is in charge of dropping deleted data based on the deletion info it reads from the partition meta cf. This PR is the first step and only releases the disk space. The next step would change the cassandra merge operator to convert rows of deleted partitions into tombstones.
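A rough sketch of the drop decision described above, with the partition meta cf modeled as an in-memory map rather than a real column family. It assumes that a row key starts with its partition key, that the partition key length is known to the filter, and that a row is dropped when its timestamp is less than or equal to the partition's marked_for_delete_at. `PartitionMeta`, `ShouldDropRow`, and `partition_key_len` are hypothetical names, not this PR's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Stand-in for the partition meta cf:
// partition key -> marked_for_delete_at of the partition deletion.
using PartitionMeta = std::map<std::string, int64_t>;

// Decide whether the compaction filter should drop a row.
// Assumption: the first partition_key_len bytes of row_key are the
// partition key used to look up deletion info.
inline bool ShouldDropRow(const PartitionMeta& meta,
                          const std::string& row_key,
                          std::size_t partition_key_len,
                          int64_t row_timestamp) {
  assert(partition_key_len > 0);
  if (row_key.size() < partition_key_len) {
    return false;  // Key too short to carry a partition key; keep the row.
  }
  auto it = meta.find(row_key.substr(0, partition_key_len));
  if (it == meta.end()) {
    return false;  // No deletion recorded for this partition.
  }
  // Data written at or before the deletion timestamp is covered by it;
  // rows written afterwards survive compaction.
  return row_timestamp <= it->second;
}
```

For example, with a deletion recorded at timestamp 100 for partition "pk01", a row "pk01rowA" written at 50 would be dropped during compaction, while the same row written at 101 would be kept.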
make format for cassandra related test files