All Questions
Tagged with python scikit-learn
21,907 questions
-1
votes
1
answer
36
views
Do I need to scale the RF model when creating a voting ensemble model? [closed]
So I'm kinda new to machine learning and I am trying to learn as I build my project. I'm building a classification model for sleep disorders using Voting Ensemble and I have three base models: ...
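A minimal sketch of one common arrangement, not taken from the question: tree-based models such as random forests are insensitive to monotonic feature scaling, so the scaler can be attached only to the base models that need it by wrapping those in their own pipeline. The excerpt does not name the three base models, so the logistic regression and SVC here, as well as the synthetic data, are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption; the question's dataset is not shown).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        # The random forest receives the raw features.
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        # Scale-sensitive models each get their own scaler inside a pipeline.
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("svc", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
    ],
    voting="soft",  # average predicted probabilities across the base models
)
ensemble.fit(X, y)
preds = ensemble.predict(X[:5])
```

Keeping the scaler inside each pipeline also means it is refit on the training folds only when the ensemble is cross-validated.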
0
votes
1
answer
38
views
Stratification fails in train_test_split
Please consider the following code:
import pandas as pd
from sklearn.model_selection import train_test_split
# step 1
ids = list(range(1000))
label = 500 * [1.0] + 500 * [0.0]
df = pd.DataFrame({...
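For reference, a minimal sketch of a stratified split on data shaped like the excerpt above. The DataFrame construction is truncated in the listing, so the column names "id" and "label" and the test_size value are assumptions, and this is not necessarily the code that fails in the question.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

ids = list(range(1000))
label = 500 * [1.0] + 500 * [0.0]

# Column names are hypothetical; the original DataFrame call is truncated.
df = pd.DataFrame({"id": ids, "label": label})

# stratify expects the class labels themselves, aligned row by row with df.
train_df, test_df = train_test_split(
    df, test_size=0.2, stratify=df["label"], random_state=0
)

# With a balanced 50/50 label column, both parts keep the same class ratio.
train_ratio = train_df["label"].mean()
test_ratio = test_df["label"].mean()
```

Stratification only preserves the proportions of whatever is passed to stratify, so a later split on a subset needs the labels of that subset, not of the full frame.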
0
votes
0
answers
43
views
HalvingGridSearchCV cannot fit multi label DecisionTreeClassifier
I'm trying to use HalvingGridSearch to find the best DecisionTree model. My model performs a multi-label prediction on a single example; it is trained on a batch of data of size (n_samples x ...
-4
votes
0
answers
27
views
When should I use Random Forest instead of XGBoost, and vice versa? [closed]
I’ve been using both Random Forest and XGBoost for classification tasks. In most cases, I notice that XGBoost gives slightly better accuracy. However, I’m unsure about the specific scenarios where one ...
0
votes
2
answers
35
views
Python sklearn.model_selection raises error: numpy.dtype size changed
I have the following train/test split code:
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(new_cleaned_df, test_size=0.05, random_state=42, shuffle=True)
train_df....
1
vote
1
answer
54
views
Linear regression prediction does not display properly
I want to make 2 different linear regressions for 2 different plots, but on the same figure. I have a problem with y1_pred because it does not extend over the full range of the y axis where the scatter points are.
model1 = ...
0
votes
1
answer
45
views
How to preprocess date in Isolation Forest sklearn [closed]
I am using sklearn's IsolationForest model to detect anomalies on a time-series dataset. One of the features is a date in the format MM-YYYY; the other features are numeric values.
What is the best ...
0
votes
0
answers
28
views
Keras SKLearnClassifier wrapper can't fit MNIST data
I'm trying to use the SKLearnClassifier Keras wrapper to do some grid searching and cross validation using the sklearn library but I'm unable to get the model to work properly.
def build_model(X, y, ...
2
votes
1
answer
37
views
How to fit scaler for different subsets of rows depending on group variable and include it in a Pipeline?
I have a data set like the following and want to scale the data using any of the scalers in sklearn.preprocessing.
Is there an easy way to fit this scaler not over the whole data set, but per group? ...
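One possible approach, sketched under assumptions: a small custom transformer that fits one clone of a scaler per group value. The class name GroupwiseScaler, the column names, and the toy data are hypothetical and not part of scikit-learn or of the question; the sketch assumes a pandas DataFrame with a unique index and that every group seen at transform time was also seen at fit time.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.preprocessing import StandardScaler


class GroupwiseScaler(BaseEstimator, TransformerMixin):
    """Hypothetical helper: fits one clone of `scaler` per value of `group_col`."""

    def __init__(self, group_col, scaler=None):
        self.group_col = group_col
        self.scaler = scaler

    def fit(self, X, y=None):
        base = StandardScaler() if self.scaler is None else self.scaler
        # Every column except the group column is treated as a feature.
        self.feature_cols_ = [c for c in X.columns if c != self.group_col]
        self.scalers_ = {
            key: clone(base).fit(part[self.feature_cols_])
            for key, part in X.groupby(self.group_col)
        }
        return self

    def transform(self, X):
        out = X.copy()
        # Cast to float so scaled values can be written back without dtype issues.
        out[self.feature_cols_] = out[self.feature_cols_].astype(float)
        for key, part in X.groupby(self.group_col):
            # Raises KeyError for a group that was not present during fit.
            scaled = self.scalers_[key].transform(part[self.feature_cols_])
            out.loc[part.index, self.feature_cols_] = scaled
        return out


# Toy data (assumption): two groups on very different scales.
df = pd.DataFrame({"group": ["a", "a", "b", "b"], "x": [1.0, 3.0, 10.0, 30.0]})
scaled_df = GroupwiseScaler("group").fit_transform(df)
```

Because the transformer returns the group column unchanged, a later pipeline step would need to drop or encode it before a model that only accepts numeric input.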
0
votes
1
answer
43
views
Confirm understanding of decision_function in Isolation Forest
I am looking to better understand sklearn IsolationForest decision_function. My understanding from this previous Stack Overflow post, What is the difference between decision function and ...
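A short sketch of how the three scoring methods relate in scikit-learn's IsolationForest: decision_function is score_samples shifted by the fitted offset_, and predict labels a sample -1 when decision_function is negative. The synthetic data and the outlier shift are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data (assumption): a Gaussian blob plus a few shifted points.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
X[:5] += 6

clf = IsolationForest(random_state=0).fit(X)

scores = clf.score_samples(X)        # lower means more abnormal
decision = clf.decision_function(X)  # same ranking, shifted so 0 is the threshold
pred = clf.predict(X)                # -1 for outliers, 1 for inliers

# decision_function equals score_samples minus the fitted offset_.
shift_ok = np.allclose(decision, scores - clf.offset_)
# predict is the sign of decision_function: negative values map to -1.
sign_ok = np.array_equal(pred, np.where(decision < 0, -1, 1))
```

With the default contamination="auto", offset_ is fixed at -0.5; with a numeric contamination it is instead derived from a percentile of the training scores.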
1
vote
1
answer
41
views
Why does RandomForestClassifier in scikit-learn predict even on all-NaN input?
I am training a random forest classifier in Python with sklearn; see the code below:
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(random_state=42)
rf.fit(X = df.drop("...
1
vote
2
answers
39
views
reg.predict is telling me I am not providing an array
It seems I have an issue with an array that I thought I coded correctly. When I ask for reg.score or reg.coef_ the code works great, but when I try to predict it throws an error that is saying it is ...
0
votes
0
answers
19
views
Get analytical equation of RF regressor model [duplicate]
I have the following dataset:
X1 X2 X3 y
0 0.548814 0.715189 0.602763 0.264556
1 0.544883 0.423655 0.645894 0.774234
2 0.437587 0.891773 0.963663 0.456150
3 ...
2
votes
0
answers
58
views
Different Feature Selection Results Between Local (Ubuntu VM) and Databricks Using sklearn's SequentialFeatureSelector
I am migrating my machine learning pipeline from VS Code with Ubuntu on a VM to Databricks. When I test the same dataset using the same code, I get different selected features from ...
0
votes
0
answers
54
views
TabPFN feature selection raises KeyError(f"None of [{key}] are in the [{axis_name}]")
I trained a TabPFN model and then tried applying a sequential feature selector to it to select the important features. I keep getting this error:
KeyError(f"None of [{key}] are in the [{axis_name}...