Add ColumnDB projected blob read prototype #14847

Draft: xingbowang wants to merge 1 commit into facebook:main from xingbowang:export-D108213380
@xingbowang

Summary:
Add a `ColumnDB` stackable DB prototype in `fbcode/internal_repo_rocksdb/repo` for projected reads from blob-backed wide-column entities. `ColumnDB` reads the inline schema column, invokes a caller-provided translate callback to map requested columns to blob byte ranges, coalesces adjacent ranges, and calls `DBImpl::MultiGetBlobRanges` to issue partial blob-file reads.
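
The range-coalescing step mentioned above could look roughly like the following. This is a minimal sketch, not the code in this PR: `ByteRange`, `CoalesceRanges`, and the `max_gap` parameter are hypothetical names introduced for illustration, and the default `max_gap = 0` merges only overlapping or directly adjacent ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration; not the PR's actual representation.
struct ByteRange {
  uint64_t offset;  // start offset within the blob value
  uint64_t length;  // number of bytes to read
};

// Merge ranges that overlap, touch, or sit within `max_gap` bytes of each
// other, so that each merged range can be served by one partial read.
// `max_gap` is an assumption of this sketch, not a documented knob.
std::vector<ByteRange> CoalesceRanges(std::vector<ByteRange> ranges,
                                      uint64_t max_gap = 0) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange& a, const ByteRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ByteRange> merged;
  for (const ByteRange& r : ranges) {
    if (r.length == 0) continue;  // nothing to read for an empty range
    assert(r.offset + r.length >= r.offset && "range end overflows");
    if (!merged.empty()) {
      ByteRange& last = merged.back();
      const uint64_t last_end = last.offset + last.length;
      if (r.offset <= last_end + max_gap) {
        // Extend the previous range to cover this one.
        const uint64_t new_end = std::max(last_end, r.offset + r.length);
        last.length = new_end - last.offset;
        continue;
      }
    }
    merged.push_back(r);
  }
  return merged;
}
```

With this sketch, ranges `{0, 10}` and `{10, 5}` collapse into a single `{0, 15}` read, while a distant range such as `{100, 4}` stays separate. A non-zero `max_gap` trades a few wasted bytes for fewer reads.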

This also adds lazy V2 wide-column blob-index exposure for callers that explicitly request raw blob indexes, partial blob-range read plumbing through `DBImpl` and `Version`, external blob-file support in `SstFileWriter` and external-file ingestion, and `db_bench` workloads plus `tools/run_column_db_bench.sh` for full-read versus projected-read comparison.

Benchmark result from the prior direct-IO host run:

| value_size | projected | stride | full_read_ops | column_read_ops | read_speedup | full_read_cpu_s | column_read_cpu_s | read_cpu_s_ratio | full_read_fs_inputs | column_read_fs_inputs | read_fs_input_ratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 102400 | 5 | 1 | 32401 | 44437 | 1.371 | 40.660 | 25.480 | 0.627 | 172803016 | 26247264 | 0.152 |
| 102400 | 5 | 97 | 32667 | 31848 | 0.975 | 40.670 | 39.480 | 0.971 | 172803016 | 51852160 | 0.300 |
| 102400 | 10 | 1 | 32659 | 44076 | 1.350 | 40.490 | 25.960 | 0.641 | 172803016 | 27046424 | 0.157 |
| 102400 | 10 | 97 | 32297 | 27431 | 0.849 | 42.310 | 60.310 | 1.425 | 172803016 | 84649256 | 0.490 |
| 1048576 | 5 | 1 | 7802 | 41900 | 5.370 | 225.720 | 28.860 | 0.128 | 1651236952 | 33620536 | 0.020 |
| 1048576 | 5 | 97 | 7793 | 37065 | 4.756 | 226.570 | 39.910 | 0.176 | 1651236952 | 59209224 | 0.036 |
| 1048576 | 10 | 1 | 7795 | 41071 | 5.269 | 224.660 | 28.500 | 0.127 | 1651236952 | 41813008 | 0.025 |
| 1048576 | 10 | 97 | 7806 | 31307 | 4.011 | 223.940 | 62.210 | 0.278 | 1651236952 | 99409896 | 0.060 |
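
The derived columns in the table are consistent with simple column-read over full-read ratios: for the first row, 44437 / 32401 ≈ 1.371, 25.480 / 40.660 ≈ 0.627, and 26247264 / 172803016 ≈ 0.152. A small sketch of that arithmetic follows. The `RunStats`, `Ratios`, and `Compare` names are hypothetical, and the ratio definitions are inferred from the numbers rather than taken from `tools/run_column_db_bench.sh`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holders for one benchmark run and the derived ratios.
struct RunStats {
  double ops;          // read operations reported for the run
  double cpu_seconds;  // CPU seconds consumed by the run
  uint64_t fs_inputs;  // file-system input count reported for the run
};

struct Ratios {
  double read_speedup;    // > 1 means projected reads did more ops
  double cpu_ratio;       // < 1 means projected reads used less CPU
  double fs_input_ratio;  // < 1 means projected reads did less input I/O
};

// Each ratio is the projected (column) run divided by the full-read run.
// This definition is inferred from the table values, not from the script.
Ratios Compare(const RunStats& full, const RunStats& column) {
  assert(full.ops > 0 && full.cpu_seconds > 0 && full.fs_inputs > 0);
  return {column.ops / full.ops,
          column.cpu_seconds / full.cpu_seconds,
          static_cast<double>(column.fs_inputs) /
              static_cast<double>(full.fs_inputs)};
}
```

Read this way, the `value_size = 102400`, `projected = 10`, `stride = 97` row is the one case where projected reads are both slower (0.849) and more CPU-hungry (1.425) than full reads, even though they still issue less than half the file-system input.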

Differential Revision: D108213380

meta-cla Bot added the CLA Signed label Jun 11, 2026

meta-codesync Bot commented Jun 11, 2026

@xingbowang has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108213380.

@github-actions

⚠️ clang-tidy: 1 warning(s) on changed lines

Completed in 425.4s.

Summary by check

| Check | Count |
| --- | --- |
| performance-no-automatic-move | 1 |
| Total | 1 |

Details

table/sst_file_writer.cc (1 warning(s))
table/sst_file_writer.cc:277:14: warning: constness of 's' prevents automatic move [performance-no-automatic-move]
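
For context on what `performance-no-automatic-move` reports: the code at `table/sst_file_writer.cc:277` is not shown in this conversation, so the following is only a generic illustration with hypothetical types (`Payload`, `Wrapper`). When a `const` local is returned through a converting constructor, the compiler cannot treat it as an rvalue and falls back to a copy.

```cpp
#include <string>
#include <utility>

// Hypothetical payload type that counts how it gets transferred.
struct Payload {
  std::string data;
  static int copies;
  static int moves;
  explicit Payload(std::string d) : data(std::move(d)) {}
  Payload(const Payload& o) : data(o.data) { ++copies; }
  Payload(Payload&& o) noexcept : data(std::move(o.data)) { ++moves; }
};
int Payload::copies = 0;
int Payload::moves = 0;

// Hypothetical result type constructible from a Payload by copy or by move.
struct Wrapper {
  Payload payload;
  Wrapper(const Payload& p) : payload(p) {}
  Wrapper(Payload&& p) : payload(std::move(p)) {}
};

Wrapper ReturnConstLocal() {
  const Payload p("value");
  return p;  // const local: only Wrapper(const Payload&) is viable, so it copies
}

Wrapper ReturnMutableLocal() {
  Payload p("value");
  return p;  // non-const local: treated as an rvalue, so Wrapper(Payload&&) moves
}
```

The usual fix for this kind of warning is to drop the `const` qualifier from the returned local, assuming nothing else relies on it being immutable.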

xingbowang marked this pull request as draft June 11, 2026 14:20