Questions tagged [research]
The research tag has no summary.
71 questions
2 votes, 2 answers, 129 views
Best Practice for Group Based splitting (Train / Val / Test)
As an intro, group-based splitting means splitting data into Train / Test (Val) by some attribute such as patient_id, item_id or similar, to ensure that the same person ...
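A minimal sketch of the idea described in the excerpt above, using only the Python standard library. The function name `group_split`, the fractions, and the `patient_ids` values are hypothetical and chosen for illustration; they are not taken from the question. The point it shows is that the group ids are shuffled and partitioned, not the individual rows, so every row of a given group lands in exactly one subset.

```python
import random
from collections import defaultdict


def group_split(groups, val_frac=0.15, test_frac=0.15, seed=0):
    """Split row indices into train/val/test so that no group spans two subsets."""
    # Collect the row indices that belong to each group id.
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)

    # Shuffle the unique group ids (not the rows) for a reproducible split.
    ids = sorted(by_group)
    random.Random(seed).shuffle(ids)

    # Fractions apply to the number of groups; keep at least one group each.
    n_test = max(1, round(len(ids) * test_frac))
    n_val = max(1, round(len(ids) * val_frac))
    test_ids = ids[:n_test]
    val_ids = ids[n_test:n_test + n_val]
    train_ids = ids[n_test + n_val:]

    def rows(id_list):
        # Expand a list of group ids back into sorted row indices.
        return sorted(i for g in id_list for i in by_group[g])

    return rows(train_ids), rows(val_ids), rows(test_ids)


# Hypothetical patient_id column: several rows per patient.
patient_ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p5", "p6"]
train_idx, val_idx, test_idx = group_split(patient_ids)
```

Because the fractions here are applied to the count of groups, the row counts per subset can drift when group sizes are uneven. Libraries such as scikit-learn provide ready-made splitters for this purpose (for example `GroupShuffleSplit` and `GroupKFold`), which may be preferable in practice.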
2 votes, 1 answer, 68 views
Where can I find open/free Galton's Ox estimate/Wisdom of Crowd dataset and similar?
I am playing around with some thoughts on Wisdom of Crowd phenomena and wanted to do some analysis in R/Excel. Francis Galton pioneered this concept and I was hoping to use his dataset but I can't ...
2 votes, 0 answers, 96 views
Why can monotonic feature transformation influence the performance of hyperparameter-tuned tree-based models (e.g., random forest)?
I recently observed something unexpected: Although monotonic feature transformation does not affect the performance of decision tree-based models with default hyperparameters, it actually does affect ...
0 votes, 0 answers, 33 views
I would like to build an open source Traffic Signs Dataset solely for research purposes
I've recently been interested in researching different neural networks and how to contribute to Autonomous Vehicles. I used a couple of images to train a model and the results were different ...
1 vote, 0 answers, 33 views
Research in Machine Learning in the era of transformers
I'm a master's student in Machine Learning. I'm interested in pursuing research in the field, but I'm concerned about the recent advancements like ChatGPT, CLIP, and DiNO that require massive compute ...
0 votes, 0 answers, 52 views
Can one have a good understanding of a method without having direct experience with it?
This question is in line with these previous questions on other sites:
Is it possible to conduct scientific research without actually getting close to the sample/specimen? in Biology SE
Is it possible ...
0 votes, 0 answers, 36 views
How to use two independent datasets in machine learning PhD research work?
In order to develop an academic performance prediction model for a local Higher Ed Institution, I have collected the OULAD open dataset and the local Institution's dataset which I structured into the ...
1 vote, 0 answers, 54 views
ML paper reproducibility
How can I reproduce results in an ML paper if I don't have the same resources to train the models as in the paper? (In my case I only have a laptop-spec NVIDIA GPU and in most of the papers I ...
15 votes, 3 answers, 24k views
Why does everyone use BERT in research instead of LLAMA or GPT or PaLM, etc?
It could be that I'm misunderstanding the problem space and the iterations of LLAMA, GPT, and PaLM are all based on BERT like many language models are, but every time I see a new paper in improving ...
1 vote, 1 answer, 56 views
Which statistical technique should I use for a within-person repeated measures study?
I have collected all my data for a study and need to run my analysis but have come unstuck (I should have planned better beforehand, I know).
I'm looking to see whether personality traits (five trait ...
1 vote, 0 answers, 45 views
Influence functions on neural networks: Help with understanding of result and derivation
I'm working through a paper titled "Understanding Black-box Predictions via Influence Functions" where they introduce the notion of influence functions from robust statistics to approximate ...
0 votes, 2 answers, 150 views
Where can I find the applied data science research papers?
I'm trying to find conferences that have applied data science papers published. I'm only interested in top ranked conferences. And I notice quite a number of them are quite theoretical, e.g. IJAI, ...
0 votes, 2 answers, 112 views
The ideal function in R for fitting n LASSO Regressions on n data sets
As part of a statistical learning research paper I am collaborating on, I am running/fitting two hundred sixty thousand different LASSO Regressions on the same number of different randomly generated ...
4 votes, 1 answer, 82 views
Resources for Promotion/Demotion Strategies for ML Item Recommendation Systems?
We are looking to design a system where specific items or categories of items can be boosted/promoted up or relegated/demoted down the recommendation order.
What are the common strategies or standards ...
0 votes, 1 answer, 55 views
Which specific AWS service to use for running Benchmark Regressions on datasets far too large to run locally on my laptop [closed]
I am in the middle of a research project with a collaborator in which he has proposed a novel statistical learning processor for optimal variable selection, and I am running the 3 Benchmark Variable ...