I am a little bit lost in tidymodels. I have some data from topic modeling:
- prevalent_topic: factor variable with most prevalent topic, ranging from "Topic_1" to "Topic_5"
- value1 and value2: two numeric variables used as predictors
I want to predict/classify the prevalent_topic based on value1 and value2:
prevalent_topic ~ value1 + value2
I started with multiclass classification using glmnet and nnet with tidymodels. Now I want to try "one-vs-rest" binary classification, and I have created a recipe to begin with:
dfFT_rec <- recipe(~ value1 + value2, data = dfFT_train) %>%
  step_dummy(prevalent_topic, one_hot = TRUE) %>%
  step_normalize(c(value1, value2))
The second step creates dummy variables that I would like to use as outcomes, e.g. "prevalent_topic_Topic_1", "prevalent_topic_Topic_2", ...
I tried to update the recipe's formula to use "prevalent_topic_Topic_1 ~ value1 + value2" but that did not work. I also tried to fit a workflow to my data without specifying the outcome but only got an error: "logistic_reg() was unable to find an outcome."
Is this possible at all? Or is there a different way to turn an outcome factor variable into dummy-coded outcome variables?
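
For reference, here is a minimal sketch of one possible workaround. It assumes the binary outcome can be built with dplyr before the recipe rather than inside it, so the recipe formula can name it directly. The simulated dfFT_train and the names is_topic_1, dfFT_train_t1, rec_t1, wf_t1, fit_t1 and probs_t1 are made up for illustration, and the "glm" engine is an arbitrary choice. This may not be the intended tidymodels way to do one-vs-rest.

```r
library(tidymodels)

# Simulated stand-in for dfFT_train (values are made up, not real data)
set.seed(123)
dfFT_train <- tibble(
  prevalent_topic = factor(sample(paste0("Topic_", 1:5), 200, replace = TRUE)),
  value1 = rnorm(200),
  value2 = rnorm(200)
)

# Build the "Topic_1 vs. rest" outcome outside the recipe
dfFT_train_t1 <- dfFT_train %>%
  mutate(
    is_topic_1 = factor(
      if_else(prevalent_topic == "Topic_1", "yes", "no"),
      levels = c("yes", "no")
    )
  ) %>%
  select(-prevalent_topic)

# The outcome now exists in the data, so it can appear in the formula
rec_t1 <- recipe(is_topic_1 ~ value1 + value2, data = dfFT_train_t1) %>%
  step_normalize(all_numeric_predictors())

wf_t1 <- workflow() %>%
  add_recipe(rec_t1) %>%
  add_model(logistic_reg() %>% set_engine("glm"))

fit_t1 <- fit(wf_t1, data = dfFT_train_t1)

# Class probabilities for "yes" (Topic_1) and "no" (any other topic)
probs_t1 <- predict(fit_t1, new_data = dfFT_train_t1, type = "prob")
```

Repeating this for each of the five topics (for example in a loop over levels(dfFT_train$prevalent_topic)) would give one binary model per topic, but it still leaves open whether the same thing can be done inside a single recipe.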