Questions tagged [machine-learning-model]
A machine learning model is a simplified representation of a dataset, derived from statistics in the data, used to make predictions. It can represent patterns, behaviours or features within this dataset which have been learnt by the algorithm during training.
826 questions
3
votes
0
answers
35
views
What is the correct model selection protocol in order to generate the best prediction model?
We're evaluating a novel machine learning algorithm and would like to ensure that its predictions are comparable to, if not better than, those of the baseline models.
Let's say that we have a generic dataset &...
3
votes
1
answer
37
views
Why does my model perform worse after transforming the target?
I have a target with a skewed distribution, so I tried applying TransformedTargetRegressor from scikit-learn using np.log1p as the function and np.expm1 as the inverse function. However, when I evaluate it, ...
12
votes
1
answer
2k
views
Use of training data that has been labeled by the AI model itself
I'm a software engineer working with medical device AI models that predict diseases and other conditions. For the most part, I don't design the models but I help with getting FDA clearance for them. ...
10
votes
1
answer
324
views
How do I train a regression model on time series data containing a band of zeros?
I am trying to create some kind of regression model. The target is continuous and can be both negative and positive. However, the issue is that there is a region/band that I know is roughly -50 to 50, ...
33
votes
3
answers
5k
views
Is class imbalance really a problem in machine learning?
Following on from my recent post on the topic, my goal here is to synthesise the excellent community wisdom on it over at Cross Validated into a "canonical" Q&A for the data science SE :)...
6
votes
2
answers
271
views
What are some good resources to read about recommendation systems to help build your own?
I am working on a content-based recommendation system. I am planning to frame this as a binary classification problem (1 = click/0 = not click).
And I was looking for papers/readings on feature ...
2
votes
0
answers
39
views
How to train Vanna AI to distinguish between two similar tables and their column values?
I am working with Vanna AI (text-to-SQL) and I have two problems regarding my database schema and how the model interprets it:
Problem 1: Two similar tables
I have two tables: SellingDocuments, ...
0
votes
0
answers
73
views
RL - Updating rewards at every step based on filtering model, how to evaluate policies?
I am trying to apply Reinforcement Learning (RL) to the following partially observed setting. I would really appreciate hearing your thoughts on my question.
I have a Markov process that evolves as $p(...
0
votes
0
answers
45
views
churn prediction machine learning low precision
I am working on a churn prediction project, but my data is very imbalanced. I have tried many things, but this is the best model I can get. My main problem is that I want recall and precision ...
0
votes
0
answers
32
views
Discrete Feature Imputation: How to Choose an Appropriate Data Distribution Model?
I am working on a dataset containing features that are discrete frequency counts. I understand that knowing the underlying data distribution is important for selecting an appropriate imputation method....
1
vote
0
answers
39
views
Fine-tuning Llama 3 to generate task dependencies (industrial planning)
I'm working on fine-tuning a language model (Meta-Llama-3-8B-Instruct) to generate a dependency graph for industrial tasks. The idea is: given a list of unordered tasks, the model should output a ...
0
votes
0
answers
35
views
How to properly set up your X matrix for time-series classification
I am making predictions at the entity level, and for simplicity's sake, suppose there is only one feature. My goal is to set up my X matrix such that I can capture changes to the entity over different ...
1
vote
1
answer
79
views
Tips on how to fix sampling bias
I am trying to improve a classification model with a highly imbalanced dataset: the positive class has very few samples. To compensate, I added more positive-class samples to the training set only, ...
4
votes
1
answer
131
views
Which model is best suited for generating edges?
I'm trying to develop a model that can generate dependencies between industrial tasks. To do that, I went for the GNN solution: I have nodes = tasks, dependencies = edges, and have ...
2
votes
0
answers
68
views
DenseNet169 model accuracy not increasing on medical classification dataset
I am training a DenseNet model on a medical dataset that has gold-standard annotations. After training, I noticed the accuracy is just 60%. Later, I made the following changes, but still no luck. ...