Newest 'classification' Questions - Data Science Stack Exchange

12 votes

1 answer

2k views

Use of training data that has been labeled by the AI model itself

I'm a software engineer working with medical device AI models that predict diseases and other conditions. For the most part, I don't design the models but I help with getting FDA clearance for them. ...

raner

223

asked Nov 3, 2025 at 20:32

1 vote

1 answer

49 views

Correlated Features In Classification Problem

I'm working on a binary classification problem to identify struggling students in university. I have some features that are correlated such as high_school_grade_1 that represents 75% of ...

Youness Belhaj

11

asked Oct 26, 2025 at 22:39

7 votes

1 answer

90 views

LDA linearly separates 2 out of 3 classes, what insight does it provide?

My dataset consists of board games data: each board game is rated with a categorical variable (low, medium, high). I've plotted the LDA projection to check whether classes are linearly separable. The ...

Giulio Lanza

73

asked Oct 26, 2025 at 10:36

0 votes

0 answers

23 views

Reporting results with a somewhat high standard deviation within nested CV

I'm working on a binary classification problem to identify struggling students. My dataset contains 10 features and 200 samples, and I use nested CV. The distribution of the target variable is 58%/...

Youness Belhaj

11

asked Oct 22, 2025 at 2:13

1 vote

0 answers

18 views

Scalar versus 2-element output for binary classification models

When building a binary classification model using a neural network, you have two options for outputs: output a single number from 0 to 1 using sigmoid activation, or output a probability distribution ...

Mach5

31

asked Oct 19, 2025 at 6:18

2 votes

0 answers

39 views

KDE classification with n>1 features

I'm working on an implementation of this paper and I have a question. The authors propose a model (KDE boosting classifier) which works with only n=1 feature and 1 dependent variable. I'm saying that ...

wolowizard

21

asked Oct 8, 2025 at 10:45

0 votes

0 answers

51 views

Qiskit Problem: this solution is a bit slow, is there a way to make it faster and increase the accuracy a little bit?

I'm currently making a small binary classification program using Quantum Machine Learning (EstimatorQNN to be more specific). My program classifies data inside the Wisconsin Breast Cancer database and ...

Andrea

1

asked Sep 18, 2025 at 11:48

2 votes

0 answers

54 views

Do you need paired data to train a multimodal model?

I have video, audio, and text data. The intent is to use a multimodal model for binary classification. However, the data is not paired (i.e., the audio and text are not from the same video recording). I've ...

myts999

21

asked Aug 31, 2025 at 18:53

4 votes

0 answers

104 views

Why is my model's classification performance so much worse than its regression performance?

I'm comparing different models on their performance for breathing detection. For every model, I try to predict a continuous breathing signal as a regression task as well as a binary classification for ...

sophie

41

asked Jul 31, 2025 at 16:44

3 votes

1 answer

124 views

Principal Component Analysis - how to determine the key features that contribute to PC1 using scikit-learn in Python

I struggle to select the key features that contribute to PC1. I will use the public breast cancer dataset to illustrate the issue. Please feel free to point me to a previous post if this question has ...

WhiskerFeatures

31

asked Jul 19, 2025 at 23:13

3 votes

1 answer

73 views

XGBoost for time-based prediction

I am trying to predict whether an event is likely to occur or not. I split my data into training and testing based on time. Let's say the 1st to 10th months as training data, then the 11th and 12th ...

Ocean

705

asked Jul 10, 2025 at 7:00

0 votes

0 answers

34 views

How to handle a small size of samples of a subcategory in a categorical variable in the training of a decision tree classifier?

I am learning how Decision Tree Classifiers work and I have a situation where the tree is trained with a dataset where one category has got 6 possible subcategories, for simplicity let's say A, B, C, ...

Mai

1

asked Jul 6, 2025 at 21:28

1 vote

0 answers

54 views

How to improve a classification model (item will sell that day or not) for a dataset with multiple sparse time series?

I am trying to create one big model (LightGBM) that forecasts sales for each product for a cosmetics chain store. The dataset I am working with covers the last 5 years and has these columns: ...

13aba

11

asked Jul 2, 2025 at 5:42

4 votes

1 answer

73 views

How do I downsample huge datasets with sparse asymptotes?

I'm rendering charts for time series data composed of millions of records. The charts need to be interactive and have lots of feature support so I need to downsample them. The problem I've encountered ...

nathan-w

41

asked Jun 21, 2025 at 6:14

8 votes

3 answers

540 views

Best CNN architecture for multiple aligned grayscale images per instance

I’m working on a binary classification problem in a biomedical context, with ~15,000 instances. Each instance corresponds to a single biological sample (a cell), and for each sample I have three co-...

Antonio Rossi

331

asked Jun 16, 2025 at 14:45
