Newest Questions
219,352 questions
1 vote, 0 answers, 7 views
Is GVIF meaningful for a reduced interaction block created from manually coded factor × treatment terms?
I’m fitting a survival model with a multi-level factor (HistologyClass) and binary treatment variables (Radiotherapy, Chemotherapy). I do not want the full HistologyClass * Treatment interaction, ...
2 votes, 1 answer, 25 views
Given a dataset of human patients which includes their age, sex, and over 100 measured disease biomarkers, what type of statistical analysis is best?
I was given a dataset of close to 100 human participants. This dataset includes values for 100+ measured hypothesized biomarkers of neurodegeneration, their age, sex, disease state (clinically normal, ...
3 votes, 1 answer, 24 views
How to handle calendar year as a continuous predictor with a mismatched train/test time horizon?
I am using Ordinal Semiparametric Regression (Frank Harrell's rms package) to model overall survival in patients with brain tumors.
My training data is from the SEER database (covering years 2004 to ...
1 vote, 1 answer, 17 views
What does the base prediction value actually imply in SHAP?
I know SHAP (or Shapley) values are the contribution of each input variable to the model prediction. Adding the base value to the sum of all SHAP values gives you the model prediction for any data ...
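The additivity property mentioned in this excerpt can be checked on a toy case. The following is a minimal sketch, not the `shap` library's implementation: it computes exact Shapley values by brute force for a hypothetical three-feature `model`, and assumes the base value is the mean prediction over a made-up `background` sample.

```python
from itertools import combinations
from math import factorial

import numpy as np


def model(X):
    # Hypothetical toy model, chosen only for illustration.
    return 2.0 * X[:, 0] + X[:, 1] * X[:, 2]


def value(subset, x, background):
    # Mean prediction when the features in `subset` are fixed to x's values
    # and the remaining features keep their background values.
    Xb = background.copy()
    idx = list(subset)
    if idx:
        Xb[:, idx] = x[idx]
    return model(Xb).mean()


def shapley_values(x, background):
    n = x.shape[0]
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                phi[i] += w * (value(S + (i,), x, background) - value(S, x, background))
    return phi


rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))  # assumed reference data
x = np.array([1.0, -0.5, 2.0])          # the observation being explained

base_value = model(background).mean()   # assumed definition of the base value
phi = shapley_values(x, background)
prediction = model(x[None, :])[0]

print(base_value + phi.sum(), prediction)
```

Under these assumptions the base value is simply what the model predicts on average when no feature of `x` is known, and the SHAP values account for the gap between that average and the prediction for `x`.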
1 vote, 2 answers, 49 views
Spearman2's rho or Chatterjee's xi correlation coefficient for non-monotonic data?
Let's assume the non-monotonic data below (right graph and data from here).
I would like to test if the two variables x and y are correlated or at least not independent, given the non-monotonic ...
2 votes, 0 answers, 14 views
Sample size calculation for Paired Data using ordinal models [closed]
I came across this blog post by Prof. Harrell, "Ordinal Models for Paired Data" (Statistical Thinking).
I was trying to replicate the simulation in order to estimate the sample size for my paired study. ...
4 votes, 1 answer, 86 views
Which test do I use to estimate the preference of species?
I have a question about how to analyse my dataset and would really appreciate your advice.
My data consist of observations of a set of target plant species collected during field surveys. The surveys ...
2 votes, 0 answers, 20 views
Is this train/validation/test split method considered data leakage?
Training has time steps 0-300.
Validation has time steps 200-400.
Testing has time steps 300-500.
The method uses N time steps as the observed past and the next N time steps as the future.
For example,...
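The three ranges listed above can be compared directly to see which time steps are shared between splits. This is a minimal sketch that assumes half-open ranges `[start, stop)`; the question does not say whether the endpoints are inclusive, and the helper name `overlap` is hypothetical.

```python
from itertools import combinations

# Split boundaries from the question, treated as half-open [start, stop).
# That convention is an assumption; the question does not specify it.
splits = {"train": (0, 300), "validation": (200, 400), "test": (300, 500)}


def overlap(a, b):
    # Return the shared (start, stop) range of two intervals, or None.
    start, stop = max(a[0], b[0]), min(a[1], b[1])
    return (start, stop) if start < stop else None


shared = {
    (first, second): overlap(splits[first], splits[second])
    for first, second in combinations(splits, 2)
}

for pair, rng in shared.items():
    print(pair, rng)
```

Whether a shared range amounts to leakage depends on how the N-step past and future windows are drawn from each split, which the truncated excerpt does not show.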
2 votes, 0 answers, 26 views
Markov State Transition Models vs the Win Ratio in clinical trials
I have recently become interested in Markov State Transition Models for the analysis of clinical trials with composite endpoints, such as the Markov Longitudinal Ordinal Model described in Frank ...
5 votes, 1 answer, 48 views
Exact maximum likelihood for ARMA models
I am trying to understand the unconditional or exact least squares and maximum likelihood estimation methods for ARMA models. I am struggling to reconcile the different formulations given in standard ...
1 vote, 0 answers, 44 views
Find the Fisher information of Gamma$(\alpha_0, \theta)$, where the first parameter is known and the second is not, for a sample $(x_1,\dots,x_n)$
Find the Fisher information of Gamma$(\alpha_0, \theta)$, where the first parameter is known and the second is not, for a sample $(x_1,\dots,x_n)$.
Or, to state the original question (which I restated ...
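A sketch of the calculation, assuming an i.i.d. sample and that $\theta$ is the scale parameter (the excerpt does not state the parameterization). The log-likelihood is

$$
\ell(\theta) = \sum_{i=1}^{n} \left[ (\alpha_0 - 1)\log x_i - \frac{x_i}{\theta} - \alpha_0 \log\theta - \log\Gamma(\alpha_0) \right],
$$

so that

$$
\frac{\partial^2 \ell}{\partial \theta^2} = \frac{n\alpha_0}{\theta^2} - \frac{2\sum_{i=1}^{n} x_i}{\theta^3}.
$$

Using $E[X_i] = \alpha_0\theta$,

$$
I_n(\theta) = -E\!\left[\frac{\partial^2 \ell}{\partial \theta^2}\right] = -\frac{n\alpha_0}{\theta^2} + \frac{2n\alpha_0\theta}{\theta^3} = \frac{n\alpha_0}{\theta^2}.
$$

If $\theta$ is instead taken as the rate parameter, the same steps with $E[X_i] = \alpha_0/\theta$ also give $n\alpha_0/\theta^2$.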
0 votes, 0 answers, 28 views
Testing the fit of one model on two separate datasets [closed]
I'm trying to compare the shape of two curves (from two datasets) to an expected exponential shape.
Specifically, I am interested in where people allocate their attention during a cognitive task. ...
3 votes, 2 answers, 69 views
Testing performance of numerical algorithm for penalised estimators
Assume $y \in \mathbb{R}^{n}$, $\beta \in \mathbb{R}^{k}$, and $X \in \mathbb{R}^{n\times k}$, and we are solving the following strictly convex optimisation problem
$$
\hat{\beta} = \arg \min ||y - X\beta |...
7 votes, 1 answer, 256 views
Survival proportion for censored data
Let's consider the below example as a life table for survival analysis:
At time 124, we have a censored patient, lost to follow-up.
Generally, I have noticed that we don't get from the software ...
2 votes, 1 answer, 26 views
Best method to determine sub-sensor error
I have a system measuring several outputs by sub-sensors and the total input to the sub-sensors is measured from a main sensor.
The above chart shows the error between the sum of sub-sensor readings ...