$\begingroup$

This is my script:

    library(xgboost)
    library(tidyverse)
    library(caret)
    library(readxl)

    library(data.table)
    library(mlr)

    data <- iris
    righe_train <- sample(nrow(data),nrow(data)*0.8)
    train <- data[righe_train,]
    test <- data[-righe_train,]

    setDT(train)
    setDT(test)

    labels <- train$Species
    ts_label <- test$Species
    new_tr <- model.matrix(~.+0,data = train[,-c("Species"),with=F])
    new_ts <- model.matrix(~.+0,data = test[,-c("Species"),with=F])

    #convert factor to numeric
    labels <- as.numeric(labels)-1
    ts_label <- as.numeric(ts_label)-1


    #preparing matrix
    dtrain <- xgb.DMatrix(data = new_tr,label = labels)
    dtest <- xgb.DMatrix(data = new_ts,label=ts_label)

    #default parameters
    params <- list(booster = "gbtree",
                     objective = "binary:logistic",
                     eta=0.3,
                     gamma=0,
                     max_depth=6,
                     min_child_weight=1,
                     subsample=1,
                     colsample_bytree=1)

    xgbcv <- xgb.cv( params = params,
                     data = dtrain,
                     nrounds = 100,
                     nfold = 5,
                     showsd = T,
                     stratified = T,
                     print_every_n = 10,
                     early_stopping_round = 20,
                     maximize = F)
    ##best iteration = 79

    min(xgbcv$test.error.mean)


    #first default - model training
    xgb1 <- xgb.train (params = params, data = dtrain, nrounds = 79, watchlist = list(val=dtest,train=dtrain), print.every.n = 10, early.stop.round = 10, maximize = F , eval_metric = "error")
    #model prediction
    xgbpred <- predict (xgb1,dtest)
    xgbpred <- ifelse (xgbpred > 0.5,1,0)

    #confusion matrix
    library(caret)
    confusionMatrix (xgbpred, ts_label)
    #Accuracy - 86.54%

    #view variable importance plot
    mat <- xgb.importance (feature_names = colnames(new_tr),model = xgb1)
    xgb.plot.importance (importance_matrix = mat[1:20])

But when I run the xgb.cv call, I get this error:

    Error in xgb.iter.update(fd$bst, fd$dtrain, iteration - 1, obj) :
    [15:21:18] amalgamation/../src/objective/regression_obj.cu:103: label must be in [0,1] for logistic regression

Why? How can I fix it?

$\endgroup$
  • The iris data has three target classes, I guess. Tabulate the target to check. You are using a binary logistic objective, which accepts only two classes, just as the error says. (Commented Nov 4, 2021 at 14:36)
  • @Peter Yes, you're right! What can I use instead of binary:logistic? (Commented Nov 4, 2021 at 14:41)
  • See my full answer below. (Commented Nov 4, 2021 at 14:50)

1 Answer

$\begingroup$

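The iris data has three target classes (the Species column). The objective function in params is set to objective = "binary:logistic", which only accepts two classes (a binary target) with labels in [0, 1]. After the conversion in your script, the labels are 0, 1 and 2, so the label 2 triggers the error.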
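A quick way to confirm the number of classes, as a minimal base-R sketch (the variable names class_counts, n_classes and label_range are only illustrative):

```r
# Count the rows per class in the target column
class_counts <- table(iris$Species)
print(class_counts)

# Zero-based numeric labels, built the same way as in the question's script
labels <- as.numeric(iris$Species) - 1
n_classes <- length(unique(labels))

# The labels span 0..2, which falls outside the [0, 1] range binary:logistic expects
label_range <- range(labels)
print(label_range)
```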

If you have more than two classes, you need a multiclass objective function, e.g. multi:softmax or multi:softprob, with num_class set to the number of classes (3 for iris).

As stated in the docs:

“binary:logistic” – logistic regression for binary classification, output probability

[...]

“multi:softmax” – set XGBoost to do multiclass classification using the softmax objective, you also need to set num_class (number of classes)

“multi:softprob” – same as softmax, but output a vector of ndata * nclass, which can be further reshaped to ndata, nclass matrix. The result contains predicted probability of each data point belonging to each class.
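The reshaping mentioned for multi:softprob can be sketched in base R. The flat vector below is made-up data standing in for a prediction, and it assumes the probabilities of each data point are stored consecutively. Some newer versions of the R package may already return a matrix, so check the dimensions of your own predict() output first.

```r
num_class <- 3

# Hypothetical flat output for 2 data points (illustrative numbers, not real predictions):
# the 3 class probabilities of each point are stored next to each other
flat_probs <- c(0.7, 0.2, 0.1,
                0.1, 0.3, 0.6)

# One row per data point, one column per class
prob_matrix <- matrix(flat_probs, ncol = num_class, byrow = TRUE)

# Most likely class per row, shifted to the zero-based labels used for training
pred_class <- max.col(prob_matrix, ties.method = "first") - 1
print(pred_class)
```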

Also note the additional parameter options available for xgboost.
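Putting this together, here is a trimmed-down sketch of the script with a multiclass objective. The seed, the variable names (train_rows, x_train, and so on) and nrounds = 20 are my own choices for illustration, not tuned values. With multi:softmax the prediction is already the class index, so the ifelse(xgbpred > 0.5, 1, 0) step from the original script is no longer needed.

```r
library(xgboost)

set.seed(42)  # assumption: fixed seed so the split is reproducible
data <- iris
train_rows <- sample(nrow(data), floor(nrow(data) * 0.8))
train <- data[train_rows, ]
test  <- data[-train_rows, ]

# Features as numeric matrices; labels as zero-based values 0, 1, 2
x_train <- as.matrix(train[, 1:4])
x_test  <- as.matrix(test[, 1:4])
y_train <- as.numeric(train$Species) - 1
y_test  <- as.numeric(test$Species) - 1

dtrain <- xgb.DMatrix(data = x_train, label = y_train)
dtest  <- xgb.DMatrix(data = x_test, label = y_test)

# Multiclass objective instead of binary:logistic; num_class is required
params <- list(booster = "gbtree",
               objective = "multi:softmax",
               num_class = 3,
               eta = 0.3,
               max_depth = 6)

model <- xgb.train(params = params, data = dtrain, nrounds = 20)

# multi:softmax returns the predicted class index for each test row
pred <- predict(model, dtest)
accuracy <- mean(pred == y_test)
print(accuracy)
```

The same params list can be passed to xgb.cv. For cross-validation you would probably also want a multiclass metric such as mlogloss or merror rather than the binary error metric.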

$\endgroup$
