Timeline for Choosing splines terms for continuous predictors
Current License: CC BY-SA 4.0
Post Revisions
12 events
| when | what | | by | license | comment |
|---|---|---|---|---|---|
| Mar 13 at 13:05 | comment | added | EdM | My answer assumed that you were asking about restricted cubic regression splines, but I particularly like Gavin Simpson's answer with its extension to the broader class of generalized additive models (GAM). Based on questions I see on this site, I suspect that the other types of GAM smoothers/splines he discusses might be more "usually chosen in practice" these days. I'd recommend that you accept his answer as a better guide for future visitors to this page. See this page for an introduction to different types of splines. | |
| Mar 13 at 12:59 | history | became hot network question | | | |
| Mar 13 at 8:51 | answer | added | Gavin Simpson | timeline score: 7 | |
| Mar 13 at 6:27 | comment | added | Roland | "if using a spline, how many degrees of freedom to use?" If you use a mgcv::gam smoother instead, the penalization takes care of this. | |
| Mar 12 at 23:16 | answer | added | Peter Flom | timeline score: 3 | |
| Mar 12 at 21:31 | comment | added | whuber♦ | In all linear models, predictors are vector spaces. That's what splines are, too. This makes them statistically and mathematically the same as any other predictor in any linear model. You can see this by examining the model matrices your software creates: the columns of these matrices generate the subspaces. Indeed, the very fact that your software operates by creating a model matrix demonstrates my claim. | |
| Mar 12 at 20:55 | answer | added | EdM | timeline score: 8 | |
| Mar 12 at 20:32 | comment | added | Konstantinos Gkirgkiris | I am not sure what @whuber indicates about splines in the first comment. | |
| Mar 12 at 20:11 | comment | added | Stephan Kolassa | Is your objective hypothesis testing or prediction? If you feed the final (AIC-chosen) model into the standard NHST machinery, note that p values will be biased low. Ideally you would determine the model based on one data set and assess significance on a different one. (I know, easier said than done.) And just as @whuber writes, this holds for splines just like for other predictors. | |
| Mar 12 at 18:21 | comment | added | Konstantinos Gkirgkiris | Good point. I’m not assuming splines are special in principle—I’m mainly asking how their use and subsequently their complexity is usually chosen in practice within a multivariable model. | |
| Mar 12 at 17:49 | comment | added | whuber♦ | What do you see as special about splines that distinguishes them in this respect from any other kinds of explanatory variables in regression? | |
| Mar 12 at 16:48 | history | asked | Konstantinos Gkirgkiris | CC BY-SA 4.0 | |
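
whuber's comment in the timeline above suggests inspecting the model matrix to see that spline terms are ordinary columns in a linear model. A minimal sketch of that idea follows, assuming a plain cubic truncated power basis (not the restricted cubic splines EdM's answer assumed, which add linear-tail constraints, and not a penalized mgcv smoother). The function names, knots, and coefficients are hypothetical and chosen only for illustration.

```python
def truncated_power_basis(x, knots, degree=3):
    """Build model-matrix rows [1, x, ..., x^degree, (x - k)_+^degree per knot]."""
    rows = []
    for xi in x:
        poly = [xi ** p for p in range(degree + 1)]           # ordinary polynomial columns
        hinge = [max(xi - k, 0.0) ** degree for k in knots]   # one extra column per knot
        rows.append(poly + hinge)
    return rows


def linear_predictor(X, beta):
    """Compute X @ beta: the spline is a linear combination of the columns."""
    return [sum(b * v for b, v in zip(beta, row)) for row in X]


# Hypothetical predictor values, knots, and coefficients (illustration only).
x = [i / 10 for i in range(11)]
knots = [0.3, 0.7]
X = truncated_power_basis(x, knots)
beta = [1.0, -2.0, 0.5, 3.0, -4.0, 2.0]
fitted = linear_predictor(X, beta)

print(len(X), "rows x", len(X[0]), "columns")
```

With two knots and degree 3 the predictor contributes six columns (including the intercept), so "how many degrees of freedom" amounts to "how many columns", which is the sense in which the choice resembles any other model-selection decision discussed in the comments.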