Skip to content

Conversation

@iceboundrock
Copy link

@iceboundrock iceboundrock commented Jan 29, 2026

I tested it locally

cargo clippy --all-targets --all-features
    Blocking waiting for file lock on build directory
   Compiling pyo3-build-config v0.27.2
   Compiling vortex-cuda v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-cuda)
   Compiling pyo3-ffi v0.27.2
   Compiling pyo3-macros-backend v0.27.2
   Compiling pyo3 v0.27.2
   Compiling pyo3-macros v0.27.2
    Checking vortex-error v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-error)
    Checking pyo3-bytes v0.5.0
    Checking pyo3-log v0.13.2
    Checking vortex-buffer v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-buffer)
    Checking vortex-utils v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-utils)
    Checking vortex-session v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-session)
    Checking vortex-metrics v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-metrics)
    Checking vortex-flatbuffers v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-flatbuffers)
    Checking vortex-mask v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-mask)
    Checking vortex-io v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-io)
    Checking vortex-dtype v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-dtype)
    Checking vortex-vector v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-vector)
    Checking vortex-scalar v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-scalar)
    Checking vortex-compute v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-compute)
    Checking vortex-array v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-array)
    Checking vortex-fastlanes v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/fastlanes)
    Checking vortex-zigzag v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/zigzag)
    Checking vortex-decimal-byte-parts v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/decimal-byte-parts)
    Checking vortex-runend v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/runend)
    Checking vortex-zstd v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/zstd)
    Checking vortex-fsst v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/fsst)
    Checking vortex-datetime-parts v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/datetime-parts)
    Checking vortex-sparse v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/sparse)
    Checking vortex-pco v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/pco)
    Checking vortex-bytebool v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/bytebool)
    Checking vortex-ipc v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-ipc)
    Checking vortex-sequence v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/sequence)
    Checking vortex-alp v0.1.0 (/Users/ruoshi/code/github/vortex/encodings/alp)
    Checking vortex-btrblocks v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-btrblocks)
    Checking vortex-layout v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-layout)
    Checking vortex-scan v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-scan)
    Checking vortex-file v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-file)
    Checking vortex v0.1.0 (/Users/ruoshi/code/github/vortex/vortex)
    Checking vortex-bench v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-bench)
    Checking vortex-datafusion v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-datafusion)
    Checking vortex-fuzz v0.1.0 (/Users/ruoshi/code/github/vortex/fuzz)
    Checking vortex-duckdb v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-duckdb)
    Checking vortex-ffi v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-ffi)
    Checking vortex-cxx v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-cxx)
    Checking vortex-python v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-python)
    Checking vortex-jni v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-jni)
    Checking vortex-tui v0.1.0 (/Users/ruoshi/code/github/vortex/vortex-tui)
    Checking lance-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/lance-bench)
    Checking duckdb-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/duckdb-bench)
    Checking datafusion-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/datafusion-bench)
    Checking compress-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/compress-bench)
    Checking random-access-bench v0.1.0 (/Users/ruoshi/code/github/vortex/benchmarks/random-access-bench)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 38.28s
cargo test -p vortex-duckdb      
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.54s
     Running unittests src/lib.rs (target/debug/deps/vortex_duckdb-59d33a3f980a2476)

running 178 tests
test convert::dtype::tests::test_binary_type ... ok
test convert::dtype::tests::test_primitive_types::case_01 ... ok
test convert::dtype::tests::test_bool_type ... ok
test convert::dtype::tests::test_null_type ... ok
test convert::dtype::tests::test_primitive_types::case_02 ... ok
test convert::dtype::tests::test_list_type ... ok
test convert::dtype::tests::test_decimal_type ... ok
test convert::dtype::tests::test_date_extension_type ... ok
test convert::dtype::tests::test_f16_unsupported ... ok
test convert::dtype::tests::test_primitive_types::case_03 ... ok
test convert::dtype::tests::test_primitive_types::case_04 ... ok
test convert::dtype::tests::test_primitive_types::case_05 ... ok
test convert::dtype::tests::test_empty_struct ... ok
test convert::dtype::tests::test_primitive_types::case_06 ... ok
test convert::dtype::tests::test_primitive_types::case_07 ... ok
test convert::dtype::tests::test_primitive_types::case_08 ... ok
test convert::dtype::tests::test_primitive_types::case_09 ... ok
test convert::dtype::tests::test_primitive_types::case_10 ... ok
test convert::dtype::tests::test_string_type ... ok
test convert::dtype::tests::test_temporal_extension_invalid_time_units ... ok
test convert::dtype::tests::test_struct_with_invalid_field_name ... ok
test convert::dtype::tests::test_struct_type ... ok
test convert::dtype::tests::test_time_extension_type ... ok
test convert::dtype::tests::test_unsupported_extension_type ... ok
test convert::dtype::tests::test_timestamp_with_timezone ... ok
test convert::dtype::tests::test_timestamp_extension_types ... ok
test convert::scalar::tests::test_timestamp_roundtrip ... ok
test convert::scalar::tests::test_scalar_round_trip ... ok
test convert::vector::tests::test_integer_vector_conversion ... ok
test convert::vector::tests::test_timestamp_extreme_values ... ok
test convert::vector::tests::test_empty_struct ... ok
test convert::vector::tests::test_fixed_sized_list ... ok
test convert::vector::tests::test_struct ... ok
test convert::vector::tests::test_timestamp_single_value ... ok
test convert::vector::tests::test_timestamp_seconds_vector_conversion ... ok
test convert::vector::tests::test_timestamp_milliseconds_vector_conversion ... ok
test convert::vector::tests::test_boolean_vector_conversion ... ok
test convert::vector::tests::test_list ... ok
test convert::vector::tests::test_timestamp_vector_conversion ... ok
test convert::vector::tests::test_list_out_of_order ... ok
test convert::vector::tests::test_timestamp_with_nulls_conversion ... ok
test convert::vector::tests::test_list_with_trailing_null ... ok
test convert::vector::tests::test_list_null_garbage_data ... ok
test convert::vector::tests::test_vector_with_nulls ... ok
test duckdb::config::tests::test_config_count ... ok
test duckdb::config::tests::test_config_get_flag ... ok
test duckdb::config::tests::test_config_invalid_key ... ok
test duckdb::config::tests::test_config_creation ... ok
test duckdb::config::tests::test_config_invalid_value ... ok
test duckdb::config::tests::test_config_get_all ... ok
test duckdb::config::tests::test_config_set_and_get ... ok
test duckdb::config::tests::test_config_list_available_options ... ok
test duckdb::config::tests::test_config_update_value ... ok
test duckdb::connection::tests::test_query_and_get_row_count_create_table ... ok
test duckdb::connection::tests::test_execute_success ... ok
test duckdb::connection::tests::test_query_and_get_row_count_select ... ok
test duckdb::config::tests::test_config_persistence_through_database ... ok
test duckdb::connection::tests::test_execute_with_null_bytes ... ok
test duckdb::connection::tests::test_query_and_get_row_count_insert ... ok
test duckdb::connection::tests::test_object_cache_get_nonexistent ... ok
test duckdb::connection::tests::test_connection_creation ... ok
test duckdb::connection::tests::test_query_and_get_row_count_update ... ok
test duckdb::logical_type::tests::test_clone_array_logical_type ... ok
test duckdb::logical_type::tests::test_clone_decimal_logical_type ... ok
test duckdb::logical_type::tests::test_clone_list_logical_type ... ok
test duckdb::logical_type::tests::test_clone_logical_type ... ok
test duckdb::logical_type::tests::test_clone_map_logical_type ... ok
test duckdb::logical_type::tests::test_clone_struct_logical_type ... ok
test duckdb::logical_type::tests::test_clone_union_logical_type ... ok
test duckdb::value::tests::test_huge_int_from_parts ... ok
test duckdb::vector::tests::test_create_validity_all_nulls ... ok
test duckdb::vector::tests::test_create_validity_all_valid ... ok
test duckdb::vector::tests::test_create_validity_empty ... ok
test duckdb::vector::tests::test_create_validity_single_element ... ok
test duckdb::vector::tests::test_create_validity_single_element_valid ... ok
test duckdb::vector::tests::test_create_validity_with_nulls ... ok
test duckdb::vector::tests::test_dictionary ... ok
test duckdb::vector::tests::test_row_is_null_all_nulls ... ok
test duckdb::connection::tests::test_null_handling ... ok
test duckdb::vector::tests::test_row_is_null_all_valid ... ok
test duckdb::vector::tests::test_row_is_null_byte_boundaries ... ok
test duckdb::vector::tests::test_row_is_null_single_element ... ok
test duckdb::vector::tests::test_row_is_null_single_element_valid ... ok
test duckdb::vector::tests::test_row_is_null_with_nulls ... ok
test duckdb::connection::tests::test_object_cache_put_get ... ok
test duckdb::connection::tests::test_query_and_get_row_count_delete ... ok
test duckdb::connection::tests::test_query_multiple_columns ... ok
test duckdb::connection::tests::test_query_column_types ... ok
test duckdb::connection::tests::test_type_conversion ... ok
test duckdb::database::tests::test_file_database_with_config ... ok
test duckdb::database::tests::test_database_with_config ... ok
test duckdb::connection::tests::test_query_single_value ... ok
test duckdb::connection::tests::test_query_multiple_rows ... ok
test duckdb::file_system::tests::test_writer_roundtrip_local ... ok
test e2e_test::object_cache_test::test_table_function_with_object_cache ... ok
test e2e_test::vortex_scan_test::test_issue_5927_not_in_does_not_panic ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_integers ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_floats ... ok
test e2e_test::vortex_scan_test::test_vortex_multi_column ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_list_of_ints ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_constant ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_booleans ... ok
test e2e_test::vortex_scan_test::test_scan_function_registration ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_integers_between ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_fixed_size_list_utf8 ... ok
test exporter::all_invalid::tests::all_null_array ... ok
test exporter::bool::tests::test_bool ... ok
test exporter::bool::tests::test_bool_all_invalid ... ok
test exporter::bool::tests::test_bool_long ... ok
test exporter::bool::tests::test_bool_nullable ... ok
test exporter::decimal::tests::test_decimal_zero_copy_exporter ... ok
test exporter::decimal::tests::test_decimal_zero_copy_exporter_subset ... ok
test exporter::decimal::tests::test_decimal_zero_copy_exporter_with_nulls ... ok
test exporter::dict::tests::test_constant_dict ... ok
test exporter::dict::tests::test_constant_dict_null ... ok
test exporter::dict::tests::test_export_empty_dict ... ignored, TODO(connor)[4809]: Exporters do not correctly handle empty vectors
test exporter::dict::tests::test_nullable_dict ... ok
test exporter::fixed_size_list::tests::test_export_all_null_lists ... ok
test exporter::fixed_size_list::tests::test_export_alternating_null_pattern ... ok
test exporter::fixed_size_list::tests::test_export_empty_fixed_size_list ... ok
test exporter::fixed_size_list::tests::test_export_fixed_size_list_with_nulls ... ok
test exporter::fixed_size_list::tests::test_export_list_size_one ... ok
test exporter::fixed_size_list::tests::test_export_nested_fixed_size_list ... ok
test exporter::fixed_size_list::tests::test_export_nested_fixed_size_list_with_nulls ... ok
test exporter::fixed_size_list::tests::test_export_non_empty_fixed_size_list ... ok
test exporter::fixed_size_list::tests::test_export_partial_range ... ok
test exporter::list::tests::test_export_empty_list ... ignored, TODO(connor)[4809]: Exporters do not correctly handle empty vectors
test exporter::list::tests::test_export_non_empty_list_of_strings ... ok
test exporter::list_view::tests::test_export_empty_list ... ignored, TODO(connor)[4809]: Exporters do not correctly handle empty vectors
test exporter::list_view::tests::test_export_non_empty_list_of_strings ... ok
test exporter::list_view::tests::test_export_non_empty_list_with_preceding_and_trailing_garbage ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_integers_in_list ... ok
test exporter::primitive::tests::test_primitive_exporter ... ok
test exporter::sequence::tests::test_sequence ... ok
test exporter::struct_::tests::struct_export_non_flat_vectors ... ok
test exporter::struct_::tests::test_struct_exporter ... ok
test exporter::struct_::tests::test_struct_exporter_with_nulls ... ok
test exporter::temporal::tests::test_timestamp_s ... ok
test exporter::temporal::tests::test_timestamp_time_us ... ok
test exporter::primitive::tests::test_long_primitive_exporter ... ok
test exporter::temporal::tests::test_timestamp_us ... ok
test exporter::tests::test_copy_from_slice_64_to_64 ... ok
test exporter::tests::test_copy_from_slice_248_to_128_middle_non_empty ... ok
test exporter::tests::test_copy_from_slice_64_to_empty ... ok
test exporter::tests::test_copy_from_slice_80_to_0 ... ok
test exporter::tests::test_copy_from_slice_80_to_64_case_1 ... ok
test exporter::tests::test_copy_from_slice_80_to_64_case_2 ... ok
test exporter::tests::test_copy_from_slice_80_to_64_case_3 ... ok
test exporter::tests::test_copy_from_slice_80_to_64_case_4 ... ok
test exporter::tests::test_copy_from_slice_empty_to_empty ... ok
test exporter::tests::test_set_validity_all_false ... ok
test exporter::tests::test_set_validity_all_true ... ok
test exporter::tests::test_set_validity_values_64bit_alignment ... ok
test exporter::tests::test_set_validity_values_all_false ... ok
test exporter::tests::test_set_validity_values_all_true ... ok
test exporter::tests::test_set_validity_values_mixed ... ok
test exporter::tests::test_set_validity_values_with_offset ... ok
test exporter::tests::test_set_validity_values_with_offset_and_smaller_len ... ok
test exporter::varbinview::tests::all_invalid_varbinview ... ok
test exporter::varbinview::tests::all_invalid_varbinview_section ... ok
test exporter::varbinview::tests::partial_invalid_varbinview_section ... ok
test e2e_test::vortex_scan_test::test_write_file ... ok
test utils::glob::tests::test_expand_local_disk_glob_no_matches ... ok
test utils::glob::tests::test_expand_local_disk_glob_relative_path ... ok
test utils::glob::tests::test_expand_local_disk_glob_single_file ... ok
test utils::glob::tests::test_expand_local_disk_glob_subdirectories ... ok
test utils::glob::tests::test_expand_local_disk_glob_wildcard ... ok
test e2e_test::vortex_scan_test::test_write_timestamps ... ok
test tests::test_vortex_max_threads_option_registration ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_list_of_utf8 ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_nested_fixed_size_list_utf8 ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_strings ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_multiple_files ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_strings_contains ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_ultra_deep_nesting ... ok
test e2e_test::vortex_scan_test::test_vortex_encodings_roundtrip ... ok
test utils::glob::tests::test_duckdb_glob_http_hook ... ok
test e2e_test::vortex_scan_test::test_vortex_scan_over_http ... ok

test result: ok. 175 passed; 0 failed; 3 ignored; 0 measured; 0 filtered out; finished in 0.30s

   Doc-tests vortex_duckdb

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
"http" | "https" | "s3" => {
let reader = open_duckdb_reader(client_ctx, &url);

// Fallback to the legacy object_store path for s3 if DuckDB fs isn't configured.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when will that happen? I would really love to get rid of object_store here, it has some underlying dependencies on tokio that can cause some weird issues

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP/HTTPS/S3 now route solely through DuckDB FS; legacy object_store fallback removed, and failures now bubble as errors if httpfs ext of DuckDB is not initialized.

@joseph-isaacs joseph-isaacs added the action/benchmark Trigger full benchmarks to run on this PR label Jan 29, 2026
@codspeed-hq
Copy link

codspeed-hq bot commented Jan 29, 2026

CodSpeed Performance Report

Merging this PR will degrade performance by 45.74%

Comparing Lychee-Technology:develop (4fffe00) with develop (68130ce)1

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

⚡ 7 improved benchmarks
❌ 12 regressed benchmarks
✅ 1142 untouched benchmarks
🆕 18 new benchmarks
⏩ 1323 skipped benchmarks2

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
🆕 WallTime 10M_50pct[5000000] N/A 278.7 µs N/A
🆕 WallTime 1M_90pct[1000000] N/A 35.2 µs N/A
🆕 WallTime 1M_50pct[500000] N/A 50.9 µs N/A
🆕 WallTime 10M_90pct[10000000] N/A 363 µs N/A
🆕 WallTime 10M_10pct[1000000] N/A 132.4 µs N/A
WallTime u16_FoR[10M] 8.3 µs 10.2 µs -18.59%
🆕 WallTime 10M_10pct[1000000] N/A 219.9 µs N/A
WallTime u8_FoR[10M] 6.9 µs 12.8 µs -45.74%
🆕 WallTime 1M_50pct[500000] N/A 24.5 µs N/A
🆕 WallTime 1M_90pct[1000000] N/A 57 µs N/A
🆕 WallTime 10M_50pct[5000000] N/A 278.7 µs N/A
🆕 WallTime 1M_90pct[1000000] N/A 57.6 µs N/A
🆕 WallTime 1M_50pct[500000] N/A 50.6 µs N/A
🆕 WallTime 1M_10pct[100000] N/A 23.9 µs N/A
🆕 WallTime 10M_90pct[10000000] N/A 363.4 µs N/A
🆕 WallTime 10M_10pct[1000000] N/A 219.8 µs N/A
🆕 WallTime 10M_50pct[5000000] N/A 158.2 µs N/A
🆕 WallTime 1M_10pct[100000] N/A 76.7 µs N/A
🆕 WallTime 10M_90pct[10000000] N/A 195.8 µs N/A
🆕 WallTime 1M_10pct[100000] N/A 45.6 µs N/A
... ... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Footnotes

  1. No successful run was found on develop (db2e2af) during the generation of this report, so 68130ce was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 1323 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@0ax1 0ax1 self-requested a review January 29, 2026 10:40
@0ax1 0ax1 added action/benchmark-sql Trigger SQL benchmarks to run on this PR changelog/feature A new feature and removed action/benchmark Trigger full benchmarks to run on this PR labels Jan 29, 2026
@joseph-isaacs joseph-isaacs added action/benchmark-sql Trigger SQL benchmarks to run on this PR and removed action/benchmark-sql Trigger SQL benchmarks to run on this PR labels Jan 29, 2026
Copy link
Contributor

@0ax1 0ax1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to look into this. There's a couple of issues around thread safety / locking, error handling and memory leaks.

If you used AI to assist in writing the code for your PR please mention this in your PR description (see: https://github.com/vortex-data/vortex/blob/develop/CONTRIBUTING.md).

SESSION.write_options().write(&mut file, array_stream).await
// Prefer DuckDB FS (httpfs/s3/etc.), fallback to local async fs if unavailable.
let ctx_raw = ctx_ptr.0 as cpp::duckdb_vx_client_context;
if let Ok(writer) = unsafe { duckdb_fs_create_writer(ctx_raw, &file_path) } {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably only use the DuckDB FS and not fallback to our prev logic implicitly but return an error if the new solution fails to init. We can handle this though in a follow up on our end.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all your comments, I will address them ASAP.

*error_out = duckdb_vx_error_create(message.data(), message.size());
}

duckdb_state HandleException(duckdb_vx_error *error_out) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This snippet does not handle an exception really, right? We take the error, call throw unconditionally ourselves and then return a DuckDBError.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added HandleException that rethrows the captured std::exception_ptr, assigns a duckdb_vx_error, and returns DuckDBError; all catch-all blocks now pass std::current_exception() to it.

if (!error_out) {
return;
}
*error_out = duckdb_vx_error_create(message.data(), message.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this error get freed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C++ allocates errors via duckdb_vx_error_create; Rust side consistently frees with duckdb_vx_error_free in fs_error.


using namespace duckdb;

namespace {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the anonymous namespace?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the anonymous namespace

handle->Truncate(0);
return reinterpret_cast<duckdb_vx_file_handle>(new FileHandleWrapper(std::move(handle)));
} catch (...) {
HandleException(error_out);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to forward the exception being passed to catch to HandleException?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

retained SetError for central null-guarded assignment; it now directly uses duckdb_vx_error_create, and all call sites route through it.


/// A VortexReadAt implementation backed by DuckDB's filesystem (e.g., httpfs/s3).
pub struct DuckDbFsReadAt {
handle: Arc<Mutex<FsFileHandle>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need a Mutex given that FsFileHandle is Send and Sync? If it is send and sync Arc<FsFileHandle> should be sufficient.

Copy link
Author

@iceboundrock iceboundrock Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle is now Arc<FsFileHandle> without additional locking

max_size: 8 * 1024 * 1024, // 8 MB
};

const DEFAULT_CONCURRENCY: usize = 64;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give more context, add a comment how you came up with this number?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set DEFAULT_CONCURRENCY = 64 as a conservative ceiling that keeps range reads parallel while staying under common per-host limits.

let mut size_out: cpp::idx_t = 0;
let status = unsafe {
cpp::duckdb_vx_fs_get_size(
handle.lock().as_ptr(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only lock the handle for the duration of retrieving the pointer but not for the duration of duckdb_vx_fs_get_size. But as questioned before, isn't the file handle thread safe (send + sync) anyway?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle is now Arc<FsFileHandle> without additional locking; Send/Sync markers retained.

let mut out_len: cpp::idx_t = 0;
let status = unsafe {
cpp::duckdb_vx_fs_read(
handle.lock().as_ptr(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, if we really need a lock, it would need to held for duration of duckdb_vx_fs_read.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

pub struct VortexCopyFunction;

#[derive(Clone, Copy)]
struct SendableClientCtx(usize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we encode the pointer as a usize?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced usize casting with NonNull<duckdb_vx_client_context_> (SendableClientCtx) and pass the raw pointer via as_ptr(). This avoids truncation/aliasing issues from integer casts and preserves provenance; all FFI touchpoints now use the typed wrapper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

action/benchmark-sql Trigger SQL benchmarks to run on this PR changelog/feature A new feature

5 participants