Questions tagged [classification]
An instance of supervised learning that identifies the category or categories to which a new instance in a dataset belongs.
3,242 questions
12
votes
1
answer
2k
views
Use of training data that has been labeled by the AI model itself
I'm a software engineer working with medical device AI models that predict diseases and other conditions. For the most part, I don't design the models but I help with getting FDA clearance for them. ...
1
vote
1
answer
49
views
Correlated Features In Classification Problem
I'm working on a binary classification problem to identify struggling students at a university. I have some features that are correlated, such as high_school_grade_1 that represents 75% of ...
7
votes
1
answer
90
views
LDA linearly separates 2 out of 3 classes, what insight does it provide?
My dataset consists of board games data: each board game is rated with a categorical variable (low, medium, high).
I've plotted the LDA projection to check whether classes are linearly separable. The ...
0
votes
0
answers
23
views
Reporting results with a somewhat high standard deviation within nested CV
I'm working on a binary classification problem to identify struggling students. My dataset contains 10 features and 200 samples, and I use nested CV. The distribution of the target variable is 58%/...
1
vote
0
answers
18
views
Scalar versus 2-element output for binary classification models
When building a binary classification model using a neural network, you have two options for outputs: output a single number from 0 to 1 using sigmoid activation, or output a probability distribution ...
2
votes
0
answers
39
views
KDE classification with n>1 features
I'm working on an implementation of this paper and I have a question. The authors propose a model (KDE boosting classifier) which works with only n=1 feature and 1 dependent variable. I'm saying that ...
0
votes
0
answers
51
views
Qiskit Problem: this solution is a bit slow, is there a way to make it faster and increase the accuracy a little bit?
I'm currently making a small binary classification program using Quantum Machine Learning (EstimatorQNN to be more specific). My program classifies data inside the Wisconsin Breast Cancer database and ...
2
votes
0
answers
54
views
Do you need paired data to train a multimodal model?
I have video, audio, and text data. The intent is to use the multimodal model for binary classification.
However, the data is not paired (i.e., the audio and text are not from the same video recording).
I've ...
4
votes
0
answers
104
views
Why is my model's classification performance so much worse than its regression performance?
I'm comparing different models on their performance for breathing detection. For every model, I try to predict a continuous breathing signal as a regression task as well as a binary classification for ...
3
votes
1
answer
124
views
Principal Component Analysis - how to determine the key features that contribute to PC1 using scikit-learn in Python
I struggle to select the key features that contribute to PC1. I will use the public breast cancer dataset to illustrate the issue. Please feel free to point me to a previous post if this question has ...
3
votes
1
answer
73
views
XGBoost for time-based prediction
I am trying to predict whether an event is likely to occur or not. I split my data into training and testing sets based on time. Let's say the 1st to 10th months as training data, then the 11th and 12th ...
0
votes
0
answers
34
views
How to handle a small number of samples in a subcategory of a categorical variable when training a decision tree classifier?
I am learning how Decision Tree Classifiers work and I have a situation where the tree is trained with a dataset where one category has 6 possible subcategories, for simplicity let's say A, B, C, ...
1
vote
0
answers
54
views
How to improve classification model (item will sell that day or not) for dataset with multiple sparse time series?
I am trying to create one big model (LightGBM) that forecasts sales for each product for a cosmetics chain store. The dataset I am working with covers the last 5 years and has these columns:
...
4
votes
1
answer
73
views
How do I downsample huge datasets with sparse asymptotes?
I'm rendering charts for time series data composed of millions of records. The charts need to be interactive and support many features, so I need to downsample them.
The problem I've encountered ...
8
votes
3
answers
540
views
Best CNN architecture for multiple aligned grayscale images per instance
I’m working on a binary classification problem in a biomedical context, with ~15,000 instances.
Each instance corresponds to a single biological sample (a cell), and for each sample I have three co-...