Newest 'machine-learning+scikit-learn' Questions

0 votes

1 answer

44 views

How to preprocess date in Isolation Forest sklearn [closed]

I am using sklearn's IsolationForest model to detect anomalies on a time-series dataset. One of the features is date with the format MM-YYYY, the other features are numeric values. What is the best ...

Mar

21

asked Apr 21 at 17:03
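
One common way to approach the date question above, offered only as a sketch: parse the MM-YYYY strings, then give the model a numeric year plus a cyclic (sine/cosine) encoding of the month so that December and January end up close together. The column names "date" and "value" and the sample rows are hypothetical and do not come from the original question.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical data: column names and values are illustrative only.
df = pd.DataFrame({
    "date": ["01-2023", "02-2023", "03-2023", "12-2023", "01-2024", "02-2024"],
    "value": [10.0, 11.0, 10.5, 50.0, 10.2, 10.8],
})

# Parse "MM-YYYY" strings into timestamps (day defaults to the 1st).
parsed = pd.to_datetime(df["date"], format="%m-%Y")

# Numeric features: the year as-is, the month mapped onto a circle.
features = pd.DataFrame({
    "year": parsed.dt.year,
    "month_sin": np.sin(2 * np.pi * parsed.dt.month / 12),
    "month_cos": np.cos(2 * np.pi * parsed.dt.month / 12),
    "value": df["value"],
})

model = IsolationForest(random_state=0).fit(features)
labels = model.predict(features)  # 1 = inlier, -1 = outlier
```

Whether the date should be a feature at all depends on the data; for a strongly trending series, detrending the numeric columns first may matter more than how the date is encoded.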

-2 votes

1 answer

87 views

Why does my RandomForestClassifier overfit despite using cross-validation? [closed]

I'm working on a binary classification problem using RandomForestClassifier from scikit-learn. My dataset has ~10,000 rows and ~20 numerical features. I used train_test_split and cross_val_score, but ...

Eshaan Saha

1

asked Apr 20 at 19:53

0 votes

1 answer

43 views

Confirm understanding of decision_function in Isolation Forest

I am looking to better understand sklearn IsolationForest decision_function. My understanding from this previous stack overflow post, What is the difference between decision function and ...

Mar

21

asked Apr 15 at 21:07
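
For readers with the same question, a small check under stated assumptions: in current scikit-learn releases, decision_function is score_samples shifted by the fitted offset_, and predict returns -1 wherever decision_function is negative. The synthetic data below is hypothetical.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical data: a Gaussian cloud with a few shifted points.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
X[:5] += 6

clf = IsolationForest(random_state=0).fit(X)

scores = clf.score_samples(X)        # lower = more anomalous
decision = clf.decision_function(X)  # negative = flagged as outlier

# decision_function is score_samples minus the fitted offset_.
same_offset = np.allclose(decision, scores - clf.offset_)

# predict thresholds decision_function at zero.
labels_match = np.array_equal(clf.predict(X), np.where(decision < 0, -1, 1))
```

With the default contamination="auto", offset_ is fixed at -0.5; with a numeric contamination it is derived from a percentile of the training scores.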

1 vote

1 answer

38 views

Why does RandomForestClassifier in scikit-learn predict even on all-NaN input?

I am training a random forest classifier in python sklearn, see code below- from sklearn.ensemble import RandomForestClassifier rf = RandomForestClassifier(random_state=42) rf.fit(X = df.drop("...

lsr729

844

asked Apr 15 at 19:53

0 votes

0 answers

19 views

Get analytical equation of RF regressor model [duplicate]

I have the following dataset: X1 X2 X3 y 0 0.548814 0.715189 0.602763 0.264556 1 0.544883 0.423655 0.645894 0.774234 2 0.437587 0.891773 0.963663 0.456150 3 ...

quant

4,492

asked Apr 7 at 9:27

2 votes

0 answers

56 views

Different Feature Selection Results Between Local (Ubuntu VM) and Databricks Using sklearn's SequentialFeatureSelector

I am migrating from running my machine learning pipeline in VS Code with Ubuntu on a VM into Databricks. When I test the same dataset using the same code, I get different selected features from ...

Mattie

31

asked Mar 28 at 11:22

0 votes

0 answers

51 views

TabPFN feature selection raises KeyError(f"None of [{key}] are in the [{axis_name}]")

I trained a TabPFN model and then tried applying a sequential feature selector to it to select the important features. I've been getting this error: KeyError(f"None of [{key}] are in the [{axis_name}...

Adam

156

asked Mar 23 at 22:59

2 votes

1 answer

109 views

Why does SequentialFeatureSelector return at most "n_features_in_ - 1" predictors?

I have a training dataset with six features and I am using SequentialFeatureSelector to find an "optimal" subset of the features for a linear regression model. The following code returns ...

CodingLikeAFox

23

asked Mar 23 at 12:16

0 votes

1 answer

152 views

Length of features is not equal to the length of SHAP Values

I'm running a random forest model and, to get some feature importances, I'm trying to run a SHAP analysis. The problem is that every time I try to plot the SHAP values, I keep getting this error: ...

Starterkit07

1

asked Mar 17 at 19:16

1 vote

2 answers

206 views

Pipeline FutureWarning: This Pipeline instance is not fitted yet [closed]

I am working on a fairly simple machine learning problem in the form of a practicum. I am using the following code to preprocess the data: from preprocess.date_converter import DateConverter from ...

Santiago

21

asked Feb 28 at 15:25

1 vote

1 answer

32 views

Why is VotingClassifier performance with voting set to "hard" different with different weights?

I wanted to test VotingClassifier from sklearn and compare performance with different parameters. I used a param grid and then noticed something I could not explain. I prepared three classifiers gnb = ...

Krzysztof

11

asked Feb 28 at 2:14
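
A sketch of why weights can change the outcome even with hard voting: the scikit-learn documentation describes weights as weighting the occurrences of predicted class labels, which amounts to a weighted vote count per class. The helper weighted_hard_vote below is a hypothetical stand-in written with NumPy, not the library's own code.

```python
import numpy as np

def weighted_hard_vote(predictions, weights=None):
    """Return the class with the largest (weighted) vote count.

    predictions: integer class labels, one per classifier, for one sample.
    weights: optional per-classifier weights; None means one vote each.
    """
    return int(np.argmax(np.bincount(predictions, weights=weights)))

# Three classifiers predict classes 0, 1 and 1 for the same sample.
preds = np.array([0, 1, 1])

unweighted = weighted_hard_vote(preds)           # class 1 wins, 2 votes to 1
weighted = weighted_hard_vote(preds, [3, 1, 1])  # class 0 wins, weight 3 to 2
```

So a heavily weighted classifier can overrule a majority of the others, which is enough to move accuracy between weight settings.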

0 votes

1 answer

24 views

In OrdinalEncoder, what does handle_unknown="use_encoded_value" do?

I've done my research on this, but I'm not satisfied with the answers I found in the documentation and from Gemini. What does use_encoded_value mean? Do I have to pass an argument to act as an ...

Remian-Feral

7

asked Feb 27 at 5:09
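
A minimal illustration, assuming a reasonably recent scikit-learn: with handle_unknown="use_encoded_value", the unknown_value argument must also be supplied, and any category not seen during fit is encoded as that value instead of raising an error. The category names are hypothetical.

```python
from sklearn.preprocessing import OrdinalEncoder

# unknown_value is required whenever handle_unknown="use_encoded_value".
enc = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)

# Categories are sorted at fit time: high -> 0, low -> 1, medium -> 2.
enc.fit([["low"], ["medium"], ["high"]])

# "extreme" was never seen during fit, so it maps to unknown_value.
encoded = enc.transform([["medium"], ["extreme"]])
```

The chosen unknown_value must not collide with a code already used for a fitted category, so -1 or np.nan are the usual choices.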

0 votes

2 answers

100 views

How to train sklearn model in different Dataframes?

I have an ML model made with "knn" in scikit-learn and noticed that the more data I have, the more precise its predictions get. The problem is, I have lots of DataFrames ...

Guilherme Diniz Queiroz De Car

23

asked Feb 14 at 13:01
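
One possible approach, sketched with hypothetical column names: scikit-learn's k-nearest-neighbors estimators do not offer partial_fit, so the usual route is to concatenate the DataFrames and fit once on the combined data.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical frames sharing the same columns ("x1", "x2", "label").
frames = [
    pd.DataFrame({"x1": [0.0, 0.1], "x2": [0.0, 0.2], "label": [0, 0]}),
    pd.DataFrame({"x1": [1.0, 0.9], "x2": [1.0, 0.8], "label": [1, 1]}),
]

# Stack all rows into one training set and fit a single model.
combined = pd.concat(frames, ignore_index=True)
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(combined[["x1", "x2"]], combined["label"])

prediction = knn.predict(pd.DataFrame({"x1": [0.95], "x2": [0.9]}))
```

Calling fit again on a second DataFrame would discard what was learned from the first, which is why the frames are combined before fitting.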

0 votes

0 answers

39 views

Tweedie Regression: power >= 2 raises "Some value(s) of y are out of the valid range of the loss", but y values are not

I'm running a Tweedie Regression, and for powers >= 2, I get an error telling me that my y values are out of the range of the HalfTweedieLoss. I understand the valid range of y for this loss to be &...

Laura Chutny

1

asked Feb 13 at 16:29

0 votes

1 answer

72 views

ValueError: X has 7 features, but ColumnTransformer expects 13 features

I have the following code where I try to predict the price of tools, for which I use Poisson regression. # --- Load and Prepare Data --- y = train['PriceToday'] X = train.drop(columns=['PriceToday']) # ...

H_H

17

asked Feb 12 at 23:51
