I want to efficiently extract values from a matrix (vals), using a single column number (val_col) and a matrix of row numbers (val_rows). Specifically, I want the results in a matrix with the same structure as val_rows.
# Matrix of values.
n_cols <- 3
n_rows <- 5
n_vals <- n_cols * n_rows
vals <- outer(
  seq_len(n_rows), seq_len(n_cols),
  paste, sep = "x"
)
vals
#> [,1] [,2] [,3]
#> [1,] "1x1" "1x2" "1x3"
#> [2,] "2x1" "2x2" "2x3"
#> [3,] "3x1" "3x2" "3x3"
#> [4,] "4x1" "4x2" "4x3"
#> [5,] "5x1" "5x2" "5x3"
# Column number.
val_col <- 2
# Matrix of row numbers.
n_rows_out <- 4
n_vals_out <- n_cols * n_rows_out
set.seed(111)
val_rows <- sample.int(n_rows, size = n_vals_out, replace = TRUE) |>
  matrix(ncol = n_cols, byrow = TRUE)
val_rows
#> [,1] [,2] [,3]
#> [1,] 3 4 3
#> [2,] 1 3 5
#> [3,] 3 4 2
#> [4,] 1 5 5
Now the result I want is a matrix like this:
result <- structure(
  c("3x2", "1x2", "3x2", "1x2", "4x2", "3x2", "4x2", "5x2", "3x2", "5x2", "2x2", "5x2"),
  dim = 4:3
)
result
#> [,1] [,2] [,3]
#> [1,] "3x2" "4x2" "3x2"
#> [2,] "1x2" "3x2" "5x2"
#> [3,] "3x2" "4x2" "2x2"
#> [4,] "1x2" "5x2" "5x2"
But when I simply extract with [ my results are "flattened" into a vector.
result <- vals[val_rows, val_col]
result
#> [1] "3x2" "1x2" "3x2" "1x2" "4x2" "3x2" "4x2" "5x2" "3x2" "5x2" "2x2" "5x2"
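As far as I can tell, the flattening happens because the two-index form of [ treats the row index as a plain vector and ignores its dim attribute. Here is a minimal sketch that checks this, with the example data rebuilt by hand so it runs on its own (the names flat_from_matrix and flat_from_vector are mine, purely for illustration):

```r
# Self-contained copy of the example data.
vals <- outer(seq_len(5), seq_len(3), paste, sep = "x")
val_rows <- matrix(
  c(3, 4, 3,
    1, 3, 5,
    3, 4, 2,
    1, 5, 5),
  ncol = 3, byrow = TRUE
)
val_col <- 2

# Index once with the matrix itself, and once with the same
# numbers stripped of their dim attribute (column-major order).
flat_from_matrix <- vals[val_rows, val_col]
flat_from_vector <- vals[as.vector(val_rows), val_col]

# If the dims of val_rows played any role, these would differ.
identical(flat_from_matrix, flat_from_vector)
is.null(dim(flat_from_matrix))
```

Both checks should come back TRUE, which is consistent with the values being read off val_rows column by column.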
Even when I specify drop = FALSE the matrix is not structured like val_rows.
result <- vals[val_rows, val_col, drop = FALSE]
result
#> [,1]
#> [1,] "3x2"
#> [2,] "1x2"
#> [3,] "3x2"
#> [4,] "1x2"
#> [5,] "4x2"
#> [6,] "3x2"
#> [7,] "4x2"
#> [8,] "5x2"
#> [9,] "3x2"
#> [10,] "5x2"
#> [11,] "2x2"
#> [12,] "5x2"
It seems I can simply set the dim attribute in a one-liner.
result <- vals[val_rows, val_col] |>
  `dim<-`(c(n_rows_out, n_cols))
result
#> [,1] [,2] [,3]
#> [1,] "3x2" "4x2" "3x2"
#> [2,] "1x2" "3x2" "5x2"
#> [3,] "3x2" "4x2" "2x2"
#> [4,] "1x2" "5x2" "5x2"
result <- vals[val_rows, val_col, drop = TRUE] |>
  `dim<-`(c(n_rows_out, n_cols))
result
#> [,1] [,2] [,3]
#> [1,] "3x2" "4x2" "3x2"
#> [2,] "1x2" "3x2" "5x2"
#> [3,] "3x2" "4x2" "2x2"
#> [4,] "1x2" "5x2" "5x2"
And obviously, I can feed the vector into matrix() to restructure it like val_rows.
result <- vals[val_rows, val_col] |>
  matrix(ncol = n_cols, byrow = FALSE)
result
#> [,1] [,2] [,3]
#> [1,] "3x2" "4x2" "3x2"
#> [2,] "1x2" "3x2" "5x2"
#> [3,] "3x2" "4x2" "2x2"
#> [4,] "1x2" "5x2" "5x2"
But if I recall, populating values via matrix() is inefficient when repeated at scale. And doesn't `dim<-`() receive the entire result by value, merely to modify its dim attribute?
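I have not measured this carefully, so the concern may be unfounded. A rough way to check it in base R is sketched below. The input sizes, the repetition count, and the helper names via_dim() and via_matrix() are arbitrary choices of mine, and system.time() is only a coarse instrument:

```r
# Larger, hypothetical inputs so that any difference is measurable.
set.seed(111)
n_rows <- 1000L
n_cols <- 50L
vals <- matrix(as.character(seq_len(n_rows * n_cols)), nrow = n_rows)
val_rows <- matrix(
  sample.int(n_rows, size = 2000L * n_cols, replace = TRUE),
  ncol = n_cols
)
val_col <- 2L

# Approach 1: extract, then set the dim attribute.
via_dim <- function() {
  `dim<-`(vals[val_rows, val_col], dim(val_rows))
}

# Approach 2: extract, then rebuild with matrix().
via_matrix <- function() {
  matrix(vals[val_rows, val_col], ncol = ncol(val_rows))
}

# Both approaches should agree before their timings are compared.
stopifnot(identical(via_dim(), via_matrix()))

# Repeat each approach and record the elapsed wall-clock seconds.
n_reps <- 100L
time_dim <- system.time(for (i in seq_len(n_reps)) via_dim())
time_matrix <- system.time(for (i in seq_len(n_reps)) via_matrix())

elapsed <- c(
  dim_replace = time_dim[["elapsed"]],
  matrix_call = time_matrix[["elapsed"]]
)
print(elapsed)
```

The timings will vary by machine and R version, so I would treat any gap as indicative only. A dedicated benchmarking package would give more reliable numbers.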
Anyway, these approaches redundantly "repair" the "damage" that should never have occurred in the first place. So what is the most efficient way to extract the values while retaining the structure of val_rows?
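For completeness, one more variant I am aware of avoids restating the dimensions altogether: copy val_rows, then overwrite all of its elements through an empty [], which keeps the existing attributes. This is only a sketch on a hand-built copy of the example data, and I make no claim that it is faster or lighter on memory than the approaches above:

```r
# Self-contained copy of the example data.
vals <- outer(seq_len(5), seq_len(3), paste, sep = "x")
val_rows <- matrix(
  c(3L, 4L, 3L,
    1L, 3L, 5L,
    3L, 4L, 2L,
    1L, 5L, 5L),
  ncol = 3, byrow = TRUE
)
val_col <- 2

# Start from val_rows so the desired dim attribute is already there.
result <- val_rows

# Assigning through an empty [] replaces every element but keeps the
# attributes; the integer matrix is coerced to character as it is filled.
result[] <- vals[val_rows, val_col]
result
```

The result is a character matrix with the same dimensions as val_rows. Note that the coercion from integer to character still has to allocate a new vector.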
vals[val_rows, val_col] |> 'dim<-'(rev(dim(vals))) |> t(). Not sure if it's really faster than matrix though.
`dim<-`() approach actually does give me what I want. I have updated my post with a sample output and some clarifications. I'm waiting to see if alternatives emerge that are more efficient for speed and/or memory, but I suspect Nadir's answer is best.
outer(seq_len(n_rows), seq_len(n_cols), paste, sep = "x")?
outer()! I simply didn't know it existed. 😂