I have a dataset (originally a large CSV) that I filtered using duckdb and dbplyr.
Here is a small script that gets at my idea:
library(duckdb)
library(DBI)
library(dplyr)
library(dbplyr)
path_data_csv = "data.csv"
# Lazy table over the CSV file; nothing is read into R yet
data = duckdb::tbl_file(DBI::dbConnect(duckdb()), path_data_csv)
new_data = data |>
  filter(datasetKey == '12345678',
         kingdom %in% c('Animalia')) |>
  collect()  # pulls the filtered rows into memory
Then, since the data is in memory, I can export it using write.csv():
write.csv(x = new_data,
          file = 'newdata.csv',
          row.names = FALSE)
But I was wondering if there is a way to export it directly to CSV without 'collecting' the data first. Essentially, I want to export an object of class c("tbl_duckdb_connection", "tbl_dbi", "tbl_sql", "tbl_lazy", "tbl") to CSV without materializing it in R. Ideally the solution would not require dbExecute() or dbWriteTable() (which would be ways to do it).
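For reference, here is a minimal sketch of the dbExecute() route mentioned above (the one I would like to avoid): DuckDB can stream the query result straight to disk with COPY (...) TO, using the SQL that dbplyr generates for the lazy table. The output file name and the COPY options are only illustrative assumptions.

library(duckdb)
library(DBI)
library(dplyr)
library(dbplyr)

con  = DBI::dbConnect(duckdb::duckdb())
data = duckdb::tbl_file(con, "data.csv")

# Build the lazy query; nothing is pulled into R
lazy = data |>
  filter(datasetKey == '12345678',
         kingdom %in% c('Animalia'))

# Wrap the generated SQL in COPY (...) TO so DuckDB writes the CSV itself
sql = dbplyr::remote_query(lazy)
DBI::dbExecute(con, paste0(
  "COPY (", sql, ") TO 'newdata.csv' (HEADER, DELIMITER ',')"
))

DBI::dbDisconnect(con)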
From the duckdb CLI:

duckdb --csv -c "from 'beausoliel.csv' where datasetKey = '12345678' and kingdom = 'Animalia'"

(On Linux you may need different quoting.)

And compute_csv() does the job!
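On the compute_csv() suggestion: that function comes from the duckplyr package rather than dbplyr, so the pipeline starts from a duckplyr frame instead of a DBI connection. A minimal sketch, assuming duckplyr's read_csv_duckdb() and a compute_csv(x, path) signature:

library(duckplyr)
library(dplyr)

# Lazy duckplyr frame over the CSV; nothing is read into R yet
data = read_csv_duckdb("data.csv")

# filter() stays lazy; compute_csv() asks DuckDB to write the result
# directly to disk instead of materializing it in R
data |>
  filter(datasetKey == '12345678',
         kingdom %in% c('Animalia')) |>
  compute_csv(path = "newdata.csv")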