Questions tagged [model-selection]

Model selection is the problem of judging which model from a set of candidates performs best. Popular methods include $R^2$, the AIC and BIC criteria, held-out test sets, and cross-validation. To some extent, feature selection is a subproblem of model selection.
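
As a rough illustration of criterion-based selection, the sketch below scores polynomial regressions of increasing degree with AIC and BIC under a Gaussian error assumption. The simulated data, the helper names `gaussian_aic_bic` and `compare_polynomials`, and the choice of polynomial candidates are all assumptions made for this example; they do not come from any question listed here.

```python
import numpy as np


def gaussian_aic_bic(y, y_hat, n_coefs):
    """Return (AIC, BIC) for a least-squares fit, assuming Gaussian errors.

    n_coefs counts the regression coefficients; the error variance is
    counted as one additional estimated parameter.
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    rss = float(np.sum((y - y_hat) ** 2))
    sigma2 = rss / n  # maximum-likelihood estimate of the error variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = n_coefs + 1
    aic = 2 * k - 2 * log_lik
    bic = k * np.log(n) - 2 * log_lik
    return aic, bic


def compare_polynomials(x, y, degrees):
    """Fit one polynomial per degree and collect its (AIC, BIC) pair."""
    scores = {}
    for d in degrees:
        coefs = np.polyfit(x, y, d)      # ordinary least squares
        y_hat = np.polyval(coefs, x)
        scores[d] = gaussian_aic_bic(y, y_hat, n_coefs=d + 1)
    return scores


# Hypothetical data: a quadratic signal plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 200)
y = 1.0 + 0.5 * x - 1.5 * x**2 + rng.normal(scale=0.5, size=x.size)

scores = compare_polynomials(x, y, degrees=range(0, 6))
best_by_aic = min(scores, key=lambda d: scores[d][0])
best_by_bic = min(scores, key=lambda d: scores[d][1])
print("degree chosen by AIC:", best_by_aic)
print("degree chosen by BIC:", best_by_bic)
```

Lower values are better for both criteria. BIC penalizes each extra parameter by $\log n$ rather than 2, so it tends to prefer smaller models than AIC once $n$ exceeds about 7.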

2,037 questions

0 votes

0 answers

44 views

How to plot AIC, BIC of all possible models?

Suppose I was given a data set, say, golf, in the form of an MLR model. Given that best subset selection is choosing the top 5 best models of each size, how would ...

DavyJonessss

asked Nov 16 at 0:57

1 vote

0 answers

22 views

3-way holdout for performance evaluation but 2-way for model selection

The paper https://arxiv.org/pdf/1811.12808 by Sebastian Raschka explains how to perform the 3-way holdout method, and also how to compute the final model (used in production). During computation of the ...

Ayrat

asked Nov 15 at 11:44

2 votes

0 answers

61 views

Do k-folds risk sampling bias and, if so, how do we avoid it?

In cross-validation, $k$-folds are a common way to train, compare and validate models. Often we want to find an optimal set of hyperparameters for our models. There are many ways to probe the ...

Markus Klyver

asked Oct 18 at 16:51

0 votes

0 answers

52 views

Dealing with high concurvity and variable selection in GAMMs with imbalanced data (mgcv::bam)

I am using GAMMs to model the probability of occurrence of a species, applying logistic regressions with mgcv::bam() to presence-pseudoabsence data. The dataset ...

airC

asked Oct 15 at 15:12

0 votes

0 answers

41 views

How do I conduct backward selection on my OLS regression with Newey-West standard errors?

I have run an OLS regression and detected that it contains autocorrelation and heteroskedasticity. To deal with this I intend to use Newey-West standard errors. But I am not sure what is the proper ...

Mateo Bergman

asked Oct 4 at 10:20

0 votes

0 answers

55 views

LASSO and cross validation when dealing with missing data

I want to simulate data with missing values and use them to compare the predictive performance of several machine learning algorithms, including LASSO. All analyses will be performed in R, using the ...

Benykō-Zamurai

asked Jul 23 at 12:38

0 votes

1 answer

76 views

How to model feeder choice in bees while ignoring unbalanced feeding events per bout?

I'm analyzing an experiment I ran with bumblebees, and really struggling with choosing the appropriate model. In the experiment, each bee made feeder choices across two temperature conditions: ...

bee-researcher

asked Jul 19 at 17:35

1 vote

0 answers

63 views

How to justify the number of background points in MaxEnt species distribution modeling?

I'm building a species distribution model using MaxEnt with 260 presence points, collected opportunistically within a relatively small study area (a single administrative department in France). I'm ...

Martin Eden

asked Jul 8 at 10:08

0 votes

0 answers

41 views

How to interpret AIC model selection and uninformative parameters

I have a model set with 36 candidate models and 4 models with an AIC less than or equal to 2.0. I do not want to model average because I don't think my candidate set really fits in with the caveats ...

Amanda Goldberg

asked Jul 4 at 23:50

1 vote

1 answer

43 views

DCC-GARCH: Valid to have different GARCH models for each series?

Most DCC-GARCH tutorials and guides I found online often use "replicate" in creating their DCC specification, i.e. ...

Matt

asked Jun 14 at 4:13

0 votes

1 answer

93 views

DCC-GARCH: Correct way of choosing between the normal distribution and t-distribution

DCC-GARCH comprises two stages: (1) estimating the univariate GARCH and (2) estimating the correlations through DCC. My time series (bond yields) are not normally distributed, as they rejected ...

Matt

asked Jun 11 at 14:36

1 vote

1 answer

65 views

DCC GARCH - Is there any merit in setting omega to zero?

I estimated the univariate GARCH models for each series, and all coefficients are statistically significant. However, upon putting them into one DCC-GARCH model with a DCC(1,1) spec, the individual ...

Matt

asked Jun 9 at 2:47

1 vote

1 answer

79 views

Can Goodness-of-Fit Test be used for Model Selection?

I would like to know whether Goodness of Fit Tests (like Pearson's Chi-squared test or the Kolmogorov-Smirnov test) can be used to select which probabilistic distribution model certain empirical observation ...

Luthfi Ahmad

asked Jun 4 at 3:12

0 votes

1 answer

52 views

Why do overfitted models in finite mixture regression sometimes have the smallest BIC despite the true number of components being selected frequently?

I'm learning about EM algorithms and finite mixture models, and I've run into a particularly unintuitive problem. I'm trying to fit a finite mixture regression model on simulated data, where the true ...

dancing_monkeys

asked May 22 at 20:56

0 votes

0 answers

76 views

Linear regression after multiple imputation: Should assumptions be checked before or after AIC-based model selection?

I’m currently working on multiple regression analyses with a small sample (n = 36), using multiple imputation via the mice package in R (5 imputed datasets). The ...

statsInPractice

asked May 22 at 8:01
