Newest 'scikit-learn' Questions - Data Science Stack Exchange

0 votes

0 answers

45 views

Qiskit Problem: this solution is a bit slow, is there a way to make it faster and increase the accuracy a little bit?

I'm currently making a small binary classification program using Quantum Machine Learning (EstimatorQNN to be more specific). My program classifies data inside the Wisconsin Breast Cancer database and ...

Andrea

1

asked Sep 18 at 11:48

5 votes

1 answer

92 views

Sklearn ROC Curve not square

I am using sklearn.metrics.roc_curve to calculate the points of a ROC curve. This is the output I obtain. This plot does not look as I would expect it to. The line ...

user2138149

151

asked Jul 24 at 17:56

3 votes

1 answer

103 views

Principal Component Analysis - how to determine the key features that contribute to PC1 using scikit-learn in Python

I am struggling to select the key features that contribute to PC1. I will use the public breast cancer dataset to illustrate the issue. Please feel free to point me to a previous post if this question has ...

WhiskerFeatures

31

asked Jul 19 at 23:13
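
For the PC1 question above, one common way to inspect feature contributions is to look at the loadings in `pca.components_[0]`. The following is only a minimal sketch under assumptions not stated in the excerpt: that the features are standardized first, that "key features" means the largest absolute loadings, and that the top 5 is a useful cutoff.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()

# Assumption: standardize so that features on large scales do not dominate PC1.
X_scaled = StandardScaler().fit_transform(data.data)

pca = PCA(n_components=2).fit(X_scaled)

# components_[0] holds the PC1 loadings, one value per original feature.
pc1_loadings = pca.components_[0]

# Rank features by absolute loading on PC1, largest first.
order = np.argsort(np.abs(pc1_loadings))[::-1]
top_features = [(str(data.feature_names[i]), float(pc1_loadings[i])) for i in order[:5]]

for name, loading in top_features:
    print(f"{name}: {loading:+.3f}")
```

The sign of a loading only indicates direction along the component; the magnitude is what this ranking uses.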

0 votes

0 answers

21 views

Runtime complexity of scikit-learn’s One-vs-Rest LogisticRegression (LBFGS) vs. RidgeClassifier

I’m working through the runtime analysis of scikit-learn’s OneVsRestClassifier for two cases: LogisticRegression (solver=lbfgs, ...

user184658

1

asked Jul 10 at 10:09

1 vote

1 answer

92 views

Sklearn's One-hot encoder adds an extra column for NaNs which are not there

I would appreciate your advice on how to resolve the following issue. I am working with a dataset that contains two categorical features (actually, more than two, but two are enough to illustrate the ...

S. N.

131

asked Jul 7 at 13:56

0 votes

0 answers

30 views

expected the model to forecast resolution time more accurately based on past ticket patterns. I was also hoping to unde

I want to build a model that forecasts ticket resolution time for data science software support tickets. I’ve calculated queuing time and resolution time from ...

Rebel Royals

11

asked Jun 26 at 10:03

2 votes

1 answer

56 views

Clarification about scaling a dataset for an MLP regression model and use of the scaler's inverse transform

I am quite confused about the pre-processing scaling process. I have a dataset with several meteorological quantities (pressure, temperature, wind direction, etc.) and I am using it to forecast the ...

cicciodevoto

171

asked Jun 18 at 15:56
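
For the scaling question above, a frequently used pattern is to fit the scalers on the training split only, train on scaled data, and map predictions back with `inverse_transform`. This is a hedged sketch, not the asker's setup: the data is synthetic, the target is hypothetical, and the network size and `max_iter` are arbitrary choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                        # synthetic stand-in features
y = 1000.0 + 50.0 * X[:, 0] + rng.normal(size=200)   # synthetic target on a large scale

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit both scalers on the training split only, to avoid leaking test statistics.
x_scaler = StandardScaler().fit(X_train)
y_scaler = StandardScaler().fit(y_train.reshape(-1, 1))

model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(
    x_scaler.transform(X_train),
    y_scaler.transform(y_train.reshape(-1, 1)).ravel(),
)

# Predictions come out in scaled units; map them back to the original units.
pred_scaled = model.predict(x_scaler.transform(X_test))
pred = y_scaler.inverse_transform(pred_scaled.reshape(-1, 1)).ravel()
```

Error metrics such as MAE are then easier to interpret when computed between `pred` and `y_test` in the original units.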

4 votes

1 answer

77 views

How to build a smooth model from several data points

I am trying to model the arc of a basketball free throw trajectory. Typically, the dataset has 6 points per person, each giving the height of the basketball at various seconds after the player ...

ChairmanMeow

163

asked Jun 12 at 16:53

1 vote

0 answers

40 views

Nested cross-validation: which implementation to use? different purpose?

I am learning Machine Learning and exploring nested cross-validation. I don't understand the example given in scikit-learn as the model seems to learn from the whole dataset and the evaluation is not ...

SamGG

11

asked Jun 4 at 20:23
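
For the nested cross-validation question above, the usual scikit-learn pattern passes a `GridSearchCV` object to `cross_val_score`, so that each outer fold refits the search on its own training portion only. The sketch below is an illustration under assumptions: the iris dataset, the SVC estimator, and the small parameter grid are placeholders, not taken from the question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Inner loop: hyperparameter selection. Outer loop: performance estimate.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)

param_grid = {"C": [1, 10], "gamma": [0.01, 0.1]}  # placeholder grid
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=inner_cv)

# For each outer split, the search is cloned and fitted on the outer-train part,
# then scored on the held-out outer-test part it never saw during tuning.
nested_scores = cross_val_score(search, X, y, cv=outer_cv)
print(nested_scores.mean())
```

The nested scores estimate how well the whole tuning procedure generalizes; they do not by themselves produce a final model.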

3 votes

2 answers

144 views

Much higher scoring metrics with classification_report than cross_validate

I'm training a classifier on the DAIGT dataset. The objective is to differentiate human from AI text and so this is a binary classification problem. As a baseline before I move on to an LLM classifier, ...

saladmobster

33

asked May 31 at 12:04

7 votes

2 answers

159 views

Loan prediction model relying almost entirely on Credit_History and ignoring other features

I'm building a machine learning model to predict loan approval rate. My dataset includes features like: Credit_History ...

Muhammed Erbay

71

asked May 11 at 16:30

5 votes

1 answer

81 views

"Singular values of x" in LinearRegression

LinearRegression has an attribute singular_ which returns "singular values of x". According to a definition I found: "singularity is ... when a ...

Moti

53

asked May 10 at 20:12
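
For the `singular_` question above, "singular values" refers to the singular value decomposition of the design matrix rather than to a matrix being singular. The sketch below compares `singular_` with NumPy's SVD on synthetic data. It assumes dense input and the default `fit_intercept=True`, in which case the fitted matrix appears to be the column-centered X; treat that detail as an assumption about current scikit-learn behavior.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                         # synthetic features
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

reg = LinearRegression().fit(X, y)

# Assumption: with fit_intercept=True the least-squares solve runs on centered X,
# so the singular values of the centered matrix are the ones to compare against.
X_centered = X - X.mean(axis=0)
sv = np.linalg.svd(X_centered, compute_uv=False)

print("singular_ :", np.sort(reg.singular_)[::-1])
print("numpy SVD :", sv)
```

A very small singular value relative to the largest one suggests nearly collinear features, which is the link to the "singularity" definition quoted in the question.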

4 votes

0 answers

79 views

Why is DecisionTree using same feature and same condition twice

When trying to fit scikit-learn DecisionTreeClassifier on my data, I am observing some weird behavior. x[54] (a boolean feature) ...

Krishna

141

asked May 3 at 12:25

4 votes

0 answers

27 views

Low Accuracy from Geospatial Random Forest ML modeling problem - Training Exported from QGIS, SCP

I am doing a geospatial assessment integrated with ML modeling. The problem is the very low accuracy, which gets lower as the number of training features increases. What could be the solution to such ...

Reem

41

asked Apr 21 at 18:45

1 vote

0 answers

38 views

Isolation Forest sample size

I am using sklearn's Isolation Forest as a model to detect anomalies. My dataset is relatively small, 50 records with only 2-3 features. To prevent any overfitting, what would you recommend to tune ...

Mar

165

asked Apr 21 at 18:28
