You can do this by generating your own grid for tuning the hyperparameters where the hidden_units column is made up of vectors of doubles. The length of the vector is the number of hidden layers, and the values are the number of hidden units.
For example, this could look like:
grid_hidden_units <- tribble(
  ~hidden_units,
  c(8, 8),
  c(8, 8, 8),
  c(16, 16),
  c(16, 16, 16)
)
grid_penalty <- tibble(penalty = c(0.01, 0.02))
grid <- grid_hidden_units |>
  crossing(grid_penalty)
grid
# # A tibble: 8 × 2
# hidden_units penalty
# <list> <dbl>
# 1 <dbl [2]> 0.01
# 2 <dbl [2]> 0.02
# 3 <dbl [3]> 0.01
# 4 <dbl [3]> 0.02
# 5 <dbl [2]> 0.01
# 6 <dbl [2]> 0.02
# 7 <dbl [3]> 0.01
# 8 <dbl [3]> 0.02
You can select any values you want for tuning here; try vectors of different lengths and values.
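If you want many combinations, you could also build the hidden_units column programmatically rather than typing out each vector. A sketch of one way to do that, where the depth and width names are just illustrative:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Every combination of depth (number of hidden layers) and
# width (hidden units per layer), as a list column of vectors
grid_hidden_units <- crossing(depth = 2:3, width = c(8, 16)) |>
  mutate(hidden_units = map2(depth, width, ~ rep(.y, .x))) |>
  select(hidden_units)
```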
With this grid, we can easily tune the model with tune_grid! This might look like:
mlp_mod <- mlp(hidden_units = tune(), penalty = tune()) |>
  set_engine("brulee") |>
  set_mode("regression")

mlp_workflow <- workflow() |>
  add_recipe(Sac_recipe) |>
  add_model(mlp_mod)

mlp_tune <- tune_grid(
  mlp_workflow,
  resamples = vfold_cv(Sac_train, v = 2),
  grid = grid
)
Then, you can inspect the results with collect_metrics(mlp_tune), or by looking at mlp_tune$.metrics directly. It is also easy to extract the best hyperparameter combination:
mlp_best <- mlp_tune |> select_best(metric = "rmse")
mlp_best
# # A tibble: 1 × 3
# hidden_units penalty .config
# <list> <dbl> <chr>
# 1 <dbl [3]> 0.01 Preprocessor1_Model3
mlp_best$hidden_units
# [[1]]
# [1] 8 8 8
In this example, the best hyperparameter combination includes 3 hidden layers, 8 hidden units in each hidden layer, and a penalty of 0.01. This successfully tunes both the number of hidden layers and the number of hidden units!
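Before committing to a single candidate, it can also be worth ranking all of them; tune's show_best() summarizes the top configurations:

```r
# Rank the candidate configurations by resampled RMSE
mlp_tune |> show_best(metric = "rmse", n = 5)
```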
In order to update the workflow, I found it easiest to pass the hyperparameters as a list - brulee doesn't handle a tibble where one column is a list, which is exactly what we get when tuning multiple hidden layers. So you can do:
# Turn best hyperparameters into a list
mlp_best_list <- mlp_best |> as.list()
mlp_best_list$hidden_units <- mlp_best_list$hidden_units |> unlist()

# Update the workflow with the best hyperparameters
mlp_workflow_final <- mlp_workflow |>
  finalize_workflow(mlp_best_list)
Now you can fit your model and make predictions as normal.
# Fit the model
mlp_fit <- fit(mlp_workflow_final, data = Sac_train)
# Make predictions
mlp_preds <- predict(mlp_fit, Sac_test)
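If you then want to gauge performance on the test set, a quick sketch with yardstick might look like the following, where price stands in for whatever your outcome column is actually called:

```r
library(dplyr)
library(yardstick)

# Bind predictions to the test set and compute test-set RMSE;
# replace "price" with the name of your outcome column
mlp_preds |>
  bind_cols(Sac_test) |>
  rmse(truth = price, estimate = .pred)
```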
With brulee, it is possible to have multiple hidden layers in the mlp. The reference is brulee.tidymodels.org/reference/brulee_mlp.html. Thanks!