Newest 'scikit-learn' Questions

0 votes

0 answers

17 views

HalvingGridSearchCV cannot fit multi label DecisionTreeClassifier

I'm trying to use HalvingGridSearchCV to find the best DecisionTree model. My model performs a multi-label prediction on a single example; it is trained on a batch of data of size (n_samples x ...

Gabriele Benanti

1

asked 14 hours ago

-3 votes

0 answers

19 views

When should I use Random Forest instead of XGBoost, and vice versa?

I’ve been using both Random Forest and XGBoost for classification tasks. In most cases, I notice that XGBoost gives slightly better accuracy. However, I’m unsure about the specific scenarios where one ...

سديم عبداللطيف الحايك

1

asked 15 hours ago

0 votes

2 answers

33 views

Python Sklearn.Model_Selection giving error numpy.dtype size changed

I have the following train/test split code: from sklearn.model_selection import train_test_split train_df, test_df = train_test_split(new_cleaned_df, test_size=0.05, random_state=42, shuffle=True) train_df....

Moh. Aflah Azzaky

9

asked Apr 22 at 15:55

1 vote

1 answer

48 views

Linear regression prediction does not display properly

I want to fit 2 different linear regressions for 2 different plots on the same figure. I have a problem with y1_pred: the fitted line does not span the full range of the y axis covered by the scatter points. model1 = ...

Gonzalo Martinez

11

asked Apr 22 at 14:05

0 votes

1 answer

44 views

How to preprocess date in Isolation Forest sklearn [closed]

I am using sklearn's IsolationForest model to detect anomalies on a time-series dataset. One of the features is date with the format MM-YYYY, the other features are numeric values. What is the best ...

Mar

21

asked Apr 21 at 17:03

-2 votes

1 answer

87 views

Why does my RandomForestClassifier overfit despite using cross-validation? [closed]

I'm working on a binary classification problem using RandomForestClassifier from scikit-learn. My dataset has ~10,000 rows and ~20 numerical features. I used train_test_split and cross_val_score, but ...

Eshaan Saha

1

asked Apr 20 at 19:53

0 votes

0 answers

25 views

Keras SKLearnClassifier wrapper can't fit MNIST data

I'm trying to use the SKLearnClassifier Keras wrapper to do some grid searching and cross validation using the sklearn library but I'm unable to get the model to work properly. def build_model(X, y, ...

Jesus Diaz Rivero

337

asked Apr 20 at 18:31

-2 votes

0 answers

22 views

Which library is more reliable for LDA and perplexity: gensim or scikit-learn? [closed]

When calculating the perplexity of an LDA model for N topics using train-test split with KFold, I noticed that in Gensim, the perplexity consistently increases as the number of topics grows—resulting ...

O Basile

1

asked Apr 17 at 18:53

2 votes

1 answer

35 views

How to fit scaler for different subsets of rows depending on group variable and include it in a Pipeline?

I have a data set like the following and want to scale the data using any of the scalers in sklearn.preprocessing. Is there an easy way to fit this scaler not over the whole data set, but per group? ...

ascripter

6,265

asked Apr 16 at 14:58

0 votes

1 answer

43 views

Confirm understanding of decision_function in Isolation Forest

I am looking to better understand sklearn's IsolationForest decision_function. My understanding from this previous Stack Overflow post, What is the difference between decision function and ...

Mar

21

asked Apr 15 at 21:07

1 vote

1 answer

38 views

Why does RandomForestClassifier in scikit-learn predict even on all-NaN input?

I am training a random forest classifier in Python with sklearn; see the code below: from sklearn.ensemble import RandomForestClassifier rf = RandomForestClassifier(random_state=42) rf.fit(X = df.drop("...

lsr729

844

asked Apr 15 at 19:53

1 vote

2 answers

38 views

reg.predict is telling me I am not providing an array

It seems I have an issue with an array that I thought I coded correctly. When I ask for reg.score or reg.coef_ the code works great, but when I try to predict, it throws an error saying it is ...

Hoot

11

asked Apr 12 at 17:52

0 votes

0 answers

19 views

Get analytical equation of RF regressor model [duplicate]

I have the following dataset: X1 X2 X3 y 0 0.548814 0.715189 0.602763 0.264556 1 0.544883 0.423655 0.645894 0.774234 2 0.437587 0.891773 0.963663 0.456150 3 ...

quant

4,492

asked Apr 7 at 9:27

2 votes

0 answers

56 views

Different Feature Selection Results Between Local (Ubuntu VM) and Databricks Using sklearn's SequentialFeatureSelector

I am migrating from running my machine learning pipeline in VS Code with Ubuntu on a VM into Databricks. When I test the same dataset using the same code, I get different selected features from ...

Mattie

31

asked Mar 28 at 11:22

0 votes

0 answers

51 views

TabPFN feature selection raises KeyError(f"None of [{key}] are in the [{axis_name}]")

I trained a TabPFN model and then tried applying a sequential feature selector to it to select the important features. I keep getting this error: KeyError(f"None of [{key}] are in the [{axis_name}...

Adam

156

asked Mar 23 at 22:59
