You can do this by generating your own grid for tuning the hyperparameters where the hidden_units column is made up of vectors of doubles. The length of the vector is the number of hidden layers, and the values are the number of hidden units.
For example, this could look like:
grid_hidden_units <- tribble(
  ~hidden_units,
  c(8, 8),
  c(8, 8, 8),
  c(16, 16),
  c(16, 16, 16)
)
grid_penalty <- tibble(penalty = c(0.01, 0.02))
grid <- grid_hidden_units |>
  crossing(grid_penalty)
grid
# # A tibble: 8 × 2
# hidden_units penalty
# <list> <dbl>
# 1 <dbl [2]> 0.01
# 2 <dbl [2]> 0.02
# 3 <dbl [3]> 0.01
# 4 <dbl [3]> 0.02
# 5 <dbl [2]> 0.01
# 6 <dbl [2]> 0.02
# 7 <dbl [3]> 0.01
# 8 <dbl [3]> 0.02
You can select any values you want for tuning here; try vectors of different lengths and values.
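If you want many combinations, you could also build the hidden_units column programmatically rather than typing out each vector. A sketch of one way to do that, where the depth and width names are just illustrative:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Every combination of depth (number of hidden layers) and
# width (hidden units per layer), as a list column of vectors
grid_hidden_units <- crossing(depth = 2:3, width = c(8, 16)) |>
  mutate(hidden_units = map2(depth, width, ~ rep(.y, .x))) |>
  select(hidden_units)
```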
With this grid, we can easily tune the model with tune_grid! This might look like:
mlp_mod <- mlp(hidden_units = tune(), penalty = tune()) |>
  set_engine("brulee") |>
  set_mode("regression")

mlp_workflow <- workflow() |>
  add_recipe(Sac_recipe) |>
  add_model(mlp_mod)

mlp_tune <- tune_grid(
  mlp_workflow,
  resamples = vfold_cv(Sac_train, v = 2),
  grid = grid
)
Then, you can inspect the results with collect_metrics(mlp_tune), or by looking at mlp_tune$.metrics directly. It is also easy to extract the best hyperparameter combination:
mlp_best <- mlp_tune |> select_best(metric = "rmse")
mlp_best
# # A tibble: 1 × 3
# hidden_units penalty .config
# <list> <dbl> <chr>
# 1 <dbl [3]> 0.01 Preprocessor1_Model3
mlp_best$hidden_units
# [[1]]
# [1] 8 8 8
In this example, the best hyperparameter combination includes 3 hidden layers, 8 hidden units in each hidden layer, and a penalty of 0.01. This successfully tunes both the number of hidden layers and the number of hidden units!
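Before committing to a single candidate, it can also be worth ranking all of them; tune's show_best() summarizes the top configurations:

```r
# Rank the candidate configurations by resampled RMSE
mlp_tune |> show_best(metric = "rmse", n = 5)
```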
In order to update the workflow, I found it easiest to pass the hyperparameters as a list - brulee doesn't handle a tibble where one column is a list, which is exactly what we get when tuning multiple hidden layers. So you can do:
# Turn best hyperparameters into a list
mlp_best_list <- mlp_best |> as.list()
mlp_best_list$hidden_units <- mlp_best_list$hidden_units |> unlist()

# Update the workflow with the best hyperparameters
mlp_workflow_final <- mlp_workflow |>
  finalize_workflow(mlp_best_list)
Now you can fit your model and make predictions as normal.
# Fit the model
mlp_fit <- fit(mlp_workflow_final, data = Sac_train)
# Make predictions
mlp_preds <- predict(mlp_fit, Sac_test)
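If you then want to gauge performance on the test set, a quick sketch with yardstick might look like the following, where price stands in for whatever your outcome column is actually called:

```r
library(dplyr)
library(yardstick)

# Bind predictions to the test set and compute test-set RMSE;
# replace "price" with the name of your outcome column
mlp_preds |>
  bind_cols(Sac_test) |>
  rmse(truth = price, estimate = .pred)
```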
With brulee, it is possible to have multiple hidden layers in the mlp. The reference is brulee.tidymodels.org/reference/brulee_mlp.html. Thanks!