Questions tagged [r]
Use this tag for any *on-topic* question that (a) involves `R` either as a critical part of the question or expected answer, & (b) is not *just* about how to use `R`.
27 questions from the last 30 days
1 vote, 1 answer, 22 views
What does the base prediction value actually imply in SHAP?
I know SHAP (or Shapley) values are the contribution of each input variable to the model prediction. Adding the base values to the sum of all SHAP values gives you the model prediction for any data ...
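The additivity property mentioned in this excerpt (base value plus the sum of the SHAP values equals the prediction) can be checked by hand on a tiny example. The following is a minimal sketch in plain Python, not the `shap` library's implementation: the toy `model`, the `background` rows, and the point `x` are all hypothetical, and the base value is assumed to be the mean prediction over the background data.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy model with an interaction term (not taken from the question).
def model(x):
    return 2.0 * x[0] + x[1] * x[2] - 1.0

# Hypothetical background data; it defines what "feature absent" means here.
background = [
    (0.0, 1.0, 2.0),
    (1.0, 0.0, 1.0),
    (2.0, 3.0, 0.0),
    (1.0, 2.0, 1.0),
]

def value(x, subset):
    """Mean prediction with the features in `subset` fixed to x and the
    remaining features taken from each background row in turn."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)

def shapley_values(x):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contrib = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(others, size):
                with_i = value(x, set(s) | {i})
                without_i = value(x, set(s))
                contrib += weight * (with_i - without_i)
        phi.append(contrib)
    return phi

x = (3.0, 1.0, 4.0)            # hypothetical observation to explain
base_value = value(x, set())   # mean prediction over the background data
phi = shapley_values(x)
prediction = model(x)

# Additivity: base value + sum of Shapley values reproduces the prediction.
print(base_value + sum(phi), prediction)
```

Under these assumptions the base value is simply the model's average output over the reference data, so each Shapley value describes how far its feature moves this particular prediction away from that average.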
1 vote, 2 answers, 53 views
Spearman2's rho or Chatterjee's xi correlation coefficient for non-monotonic data?
Let's assume the non-monotonic data below (right graph and data from here).
I would like to test if the two variables x and y are correlated or at least not independent, given the non-monotonic ...
0 votes, 1 answer, 43 views
Sample size estimation to compare three group means given a coefficient of variation and a percentage change taking pairwise comparisons into account
I would like to estimate a sample size to compare the means of three groups (A, B and C), taking into account three pairwise comparisons (A/B, B/C, A/C), given a coefficient of variation and an effect size as ...
0 votes, 0 answers, 37 views
Problem with data cleaning
The Union of India has undergone frequent political re-organizations since independence. The problem today (for me) is that I've been unable to account for certain data values of the following states/...
6 votes, 2 answers, 208 views
Singularity Problem with gls that isn't present in lm
I'm performing an IPD meta-analysis, and need to fit my models with study-specific variances (which is why I need to fit with nlme::gls instead of ...
7 votes, 2 answers, 275 views
How to decide if an interaction exists: graphically/interaction terms/contrasts of slopes
I have fit interaction models of the form: phenotype ~ genotype * environment, based on theory.
I am assigning environment (GFR, in this case) as the moderator.
I have three scenarios:
A: Non-parallel ...
2 votes, 1 answer, 51 views
Several 2D factor smooths of same variables in GAMs with mgcv
I aim to build a model that includes 2D smooths by different factors (to check for smooth differences; e.g., level 1 of factor 1 vs. level 2 of factor 1, level 1 of factor 2 vs. level 2 of factor 2), ...
2 votes, 1 answer, 45 views
How does the Cox hazard regression model change with a main vs. interactive model?
How do these two differ in terms of interpretation? When should one be used over the other?
...
1 vote, 1 answer, 50 views
Standardising predictors before using poly() in a GLMM
Crossposting from https://stackoverflow.com/q/79913674/19231816
I am building a GLMM to answer an ecological question. Most of my predictors were log-transformed and then z-score standardised ((x - ...
6 votes, 2 answers, 252 views
Do Spline Terms Affect the Interpretation of Linear Terms in Logistic Regression?
In a multivariable logistic regression model, some continuous predictors are modeled using spline transformations to allow for nonlinear effects, while other continuous predictors are entered as ...
0 votes, 0 answers, 41 views
Quantile density estimation [closed]
Let $X$ be a random variable with quantile function $Q$ and quantile density function $q$. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables from $F(x)$. I ...
6 votes, 1 answer, 149 views
Choosing distribution families when fitting GAMs for different response variables with mgcv
Considering the histograms and Q-Q-plots of these 6 response variables, which distribution families would you recommend for fitting GAMs with mgcv?
Var3 and Var6 follow a somewhat normal distribution ...
4 votes, 1 answer, 200 views
Survival analysis options for nested cancer data
I commonly work with cancer data that is on the patient-lesion level, as patients with metastases often have multiple treated lesions. While looking at patient survival is easy, I get a bit stuck with ...
2 votes, 1 answer, 29 views
PERMANOVA sample size and variable limitations
I am trying to test for vocal differences between 5 primate individuals with sample sizes of n = 102, 86, 115, 45, 12 recordings of their calls. The data is not normally distributed and the variance ...
3 votes, 1 answer, 44 views
Are my simulations of multi-level survival times correct?
I came across the article by Bender et al. (2005) and attempted to put it into R code to simulate survival times based on an empirical baseline hazard from existing data.
I compute survival times ...