0 votes
0 answers
17 views

HalvingGridSearchCV cannot fit multi label DecisionTreeClassifier

I'm trying to use HalvingGridSearchCV to find the best DecisionTree model. My model performs a multi-label prediction on a single example; it is trained on a batch of data of size (n_samples x ...
Gabriele Benanti's user avatar
-3 votes
0 answers
19 views

When should I use Random Forest instead of XGBoost, and vice versa?

I’ve been using both Random Forest and XGBoost for classification tasks. In most cases, I notice that XGBoost gives slightly better accuracy. However, I’m unsure about the specific scenarios where one ...
0 votes
2 answers
33 views

Python Sklearn.Model_Selection giving error numpy.dtype size changed

I have this train/test split code: from sklearn.model_selection import train_test_split train_df, test_df = train_test_split(new_cleaned_df, test_size=0.05, random_state=42, shuffle=True) train_df....
Moh. Aflah Azzaky's user avatar
1 vote
1 answer
48 views

Linear regression prediction does not display properly

I want to fit 2 different linear regressions for 2 different plots, but on the same figure. I have a problem with y1_pred because it does not extend across the whole y axis where the scatter points are. model1 = ...
Gonzalo Martinez's user avatar
0 votes
1 answer
44 views

How to preprocess date in Isolation Forest sklearn [closed]

I am using sklearn's IsolationForest model to detect anomalies on a time-series dataset. One of the features is date with the format MM-YYYY, the other features are numeric values. What is the best ...
Mar's user avatar
  • 21
-2 votes
1 answer
87 views

Why does my RandomForestClassifier overfit despite using cross-validation? [closed]

I'm working on a binary classification problem using RandomForestClassifier from scikit-learn. My dataset has ~10,000 rows and ~20 numerical features. I used train_test_split and cross_val_score, but ...
Eshaan Saha's user avatar
0 votes
0 answers
25 views

Keras SKLearnClassifier wrapper can't fit MNIST data

I'm trying to use the SKLearnClassifier Keras wrapper to do some grid searching and cross validation using the sklearn library but I'm unable to get the model to work properly. def build_model(X, y, ...
Jesus Diaz Rivero's user avatar
-2 votes
0 answers
22 views

Which library is more reliable for LDA and perplexity: gensim or scikit-learn? [closed]

When calculating the perplexity of an LDA model for N topics using train-test split with KFold, I noticed that in Gensim, the perplexity consistently increases as the number of topics grows—resulting ...
O Basile's user avatar
2 votes
1 answer
35 views

How to fit scaler for different subsets of rows depending on group variable and include it in a Pipeline?

I have a data set like the following and want to scale the data using any of the scalers in sklearn.preprocessing. Is there an easy way to fit this scaler not over the whole data set, but per group? ...
ascripter's user avatar
  • 6,265
0 votes
1 answer
43 views

Confirm understanding of decision_function in Isolation Forest

I am looking to better understand sklearn's IsolationForest decision_function. My understanding from this previous Stack Overflow post, What is the difference between decision function and ...
Mar's user avatar
  • 21
1 vote
1 answer
38 views

Why does RandomForestClassifier in scikit-learn predict even on all-NaN input?

I am training a random forest classifier in Python with sklearn, see code below: from sklearn.ensemble import RandomForestClassifier rf = RandomForestClassifier(random_state=42) rf.fit(X = df.drop("...
lsr729's user avatar
  • 844
1 vote
2 answers
38 views

reg.predict is telling me I am not providing an array

It seems I have an issue with an array that I thought I coded correctly. When I ask for reg.score or reg.coef_ the code works great, but when I try to predict, it throws an error saying it is ...
Hoot's user avatar
  • 11
0 votes
0 answers
19 views

Get analytical equation of RF regressor model [duplicate]

I have the following dataset: X1 X2 X3 y 0 0.548814 0.715189 0.602763 0.264556 1 0.544883 0.423655 0.645894 0.774234 2 0.437587 0.891773 0.963663 0.456150 3 ...
quant's user avatar
  • 4,492
2 votes
0 answers
56 views

Different Feature Selection Results Between Local (Ubuntu VM) and Databricks Using sklearn's SequentialFeatureSelector

I am migrating my machine learning pipeline from VS Code with Ubuntu on a VM to Databricks. When I test the same dataset using the same code, I get different selected features from ...
Mattie's user avatar
  • 31
0 votes
0 answers
51 views

TabPFN feature selection raises KeyError(f"None of [{key}] are in the [{axis_name}]")

I trained a TabPFN model and then tried applying a sequential feature selector to it to select the important features. I've been getting this error: KeyError(f"None of [{key}] are in the [{axis_name}...
Adam's user avatar
  • 156
