XGBoost (R Interface) throwing "label set cannot be empty" error

Question

I am trying to use the XGBoost model to perform a multi-class classification over 40 classes.

The code is as follows:

xgb_params = list(colsample_bytree = 0.7,
                  subsample = 0.7,
                  eta = 0.05,
                  objective = 'multi:softmax',
                  max_depth = 5,
                  min_child_weight = 1,
                  eval_metric = "mlogloss",
                  num_class = categoryclassnos,
                  nthread = 4)

fit.xgb = xgb.train(params = xgb_params,
                    data = dtrain,
                    nrounds = 500,
                    watchlist = list(train = dtrain, test=dtest),
                    print_every_n = 50)

However, I am getting the following error:

Check failed: (info.labels.size()) != (0) label set cannot be empty

I have reproduced the dataset and the R script here.

Any help or pointers would be much appreciated.

In your code, what does trainlabelsfactored <- as.integer(train$primarydeptt) - 1 return?
– grldsndrs, Commented Jun 1, 2017 at 19:02
This returns integers in the range [0, max], derived from the factor levels. XGBoost requires labels in this range, starting from 0.
– Chints, Commented Jun 2, 2017 at 1:45
Continuing to explore this: as an experiment, I added a label value when creating dtest, and the error went away. I still don't understand the logic, since a test data set is unlikely to have labels available at prediction time.
– Chints, Commented Jun 2, 2017 at 9:53
When creating each xgb.DMatrix (dtrain, dtest), specify the label parameter.
– bradS, Commented Jun 27, 2018 at 8:18

citynorman · Accepted Answer · 2017-06-13 03:37:04Z

3

In Python, I needed to set the label parameter when creating the DMatrix: dtrain = xgb.DMatrix(X_train, label=y_train, feature_names=cfg_col_X)

PS I was going to comment but missing rep points.


Gaius · Accepted Answer · 2018-04-27 09:54:52Z

1

To train a model you need two things: your training data (a matrix of variables and observations) and your labels (the known class of each observation in the training data). Perhaps the last column in your data is actually the label? If so, you will need to split your training data by column, so that the data parameter receives all columns but the last and the label parameter receives just the last column.

Once you have the model, you can use it to predict on data that doesn’t have labels and the output will be the labels.
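The column split described above can be sketched as follows. This assumes a hypothetical table whose last column holds the class name; the table contents and variable names are invented for illustration, and only NumPy is used.

```python
import numpy as np

# Hypothetical table: two feature columns followed by a class-name column.
table = np.array([
    [5.1, 3.5, "toys"],
    [4.9, 3.0, "books"],
    [6.2, 2.9, "garden"],
    [5.9, 3.2, "books"],
], dtype=object)

# Column split: everything but the last column is data, the last column is the label.
X = table[:, :-1].astype(float)
raw_labels = table[:, -1].astype(str)

# Map class names to integers 0..K-1, the 0-based encoding XGBoost expects
# for multi-class labels (the same idea as as.integer(factor) - 1 in R).
classes, y = np.unique(raw_labels, return_inverse=True)
```

X and y can then be passed to xgb.DMatrix(X, label=y) for training, and classes[pred.astype(int)] turns predicted indices back into class names.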
