$\begingroup$

I am analysing data from a pilot VR study with a between-subjects manipulation of arousal (High, Neutral, Low), with 13 participants in each condition. Each participant completed three different VR tasks, each only once, and for each task we recorded one execution-time value.

A reviewer suggested fitting a linear mixed-effects model with Task as a within-subject fixed effect and Arousal × Task as the interaction of interest. However, I am unsure whether this is statistically appropriate given the data structure.

My concerns are:

Each participant contributes only one observation per task (i.e., 3 repeated measures total).

There are no trial-level repetitions within each task, so within-person variance cannot be reliably estimated.

The sample size per arousal group is very small (n = 13), which may lead to unstable fixed-effects estimates.

The design seems extremely underpowered for detecting an Arousal × Task interaction.

My question:

Is a mixed-effects model justified with such limited within-subject replication (three single observations)? Or would collapsing the three tasks into a composite measure be more defensible for a pilot study? How do mixed-effects modellers generally evaluate the viability of models when repeated-measures depth is extremely shallow?

Any guidance or references would be very helpful.

José Teles is a new contributor to this site.
$\endgroup$
  • $\begingroup$ Usually in these cases task would be a random effect. I wouldn’t follow the reviewer’s advice. EdM’s answer is very good however. $\endgroup$ Commented yesterday
  • $\begingroup$ Thank you very much for the reply. I am, however, concerned that because the order of the tasks was random, any Arousal × Task interaction effects would also be influenced by other factors such as practice or fatigue. Hence the composite was extracted to obtain a more general measure of ADLs. Any comments on this? $\endgroup$ Commented 6 hours ago
  • $\begingroup$ As the answer says, no, I wouldn’t create a composite, I would set task as a random effect along with participant ID. $\endgroup$ Commented 5 hours ago

1 Answer

$\begingroup$

In a pilot study, you get data to inform the design of a later, more definitive study. With a small study your estimates will necessarily be somewhat imprecise, but you want to do what you can with what you have.

With that in mind, the answer to one of your questions

... would collapsing the three tasks into a composite measure be more defensible for a pilot study?

would be no. You presumably want to get as much information as possible about each task/arousal combination. Collapsing would lose much of that information.

With respect to your other concerns:

Each participant contributes only one observation per task (i.e., 3 repeated measures total).

Quoting from this answer (which includes literature references): "The minimum sample size per cluster in a mixed-effects model is 1, provided that the number of clusters is adequate, and the proportion of singleton cluster is not 'too high.'"

You do need to have enough clusters (subjects, in your case); you have 39, each with 3 observations. That should be OK. See this page.

There are no trial-level repetitions within each task, so within-person variance cannot be reliably estimated.

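In your context, you use a mixed model to take within-person correlations among responses into account. Such a model doesn't need trial-level repetitions, because it doesn't try to estimate a separate within-person variance for each task. A simple mixed model with a random intercept would estimate the among-person variance in intercept values; whatever variation remains within a person across the 3 tasks goes into the residual variance.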
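As a rough illustration of the random-intercept idea, here is a minimal sketch on simulated data. It is not a mixed-model fit: it swaps in the classical method-of-moments (one-way ANOVA) estimator for balanced data, which targets the same two quantities, the among-person variance and the residual variance. The function name `variance_components`, the simulated standard deviations, and the absence of any fixed effects are all assumptions made for the illustration.

```python
import random
from statistics import mean

def variance_components(data):
    """Method-of-moments (one-way ANOVA) estimates for balanced data.

    data: list of per-person lists, each holding the same number k of values.
    Returns (among_person_variance, residual_variance, icc).
    """
    n, k = len(data), len(data[0])
    person_means = [mean(row) for row in data]
    grand = mean(person_means)
    # Mean square between persons and mean square within persons
    msb = k * sum((m - grand) ** 2 for m in person_means) / (n - 1)
    msw = sum((y - m) ** 2
              for row, m in zip(data, person_means)
              for y in row) / (n * (k - 1))
    among = max((msb - msw) / k, 0.0)  # truncate a negative estimate at zero
    icc = among / (among + msw) if among + msw > 0 else 0.0
    return among, msw, icc

# Hypothetical values: 39 persons, 3 observations each, no fixed effects.
rng = random.Random(1)
sim = []
for _ in range(39):
    u = rng.gauss(0, 2.0)  # person-specific intercept shift (SD 2, assumed)
    sim.append([10 + u + rng.gauss(0, 1.0) for _ in range(3)])  # residual SD 1 (assumed)

among, resid, icc = variance_components(sim)
print(f"among-person variance: {among:.2f}, residual variance: {resid:.2f}, ICC: {icc:.2f}")
```

The point of the sketch is only that 3 single observations per person are enough to separate among-person variation from residual variation when there are enough persons; a real analysis would fit the mixed model with the fixed effects included.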

The sample size per arousal group is very small (n = 13), which may lead to unstable fixed-effects estimates.

The design seems extremely underpowered for detecting an Arousal × Task interaction.

A model without that interaction, under treatment coding of factor variables, would already estimate 5 parameter values:

  • an intercept (estimated execution time at reference levels of both factors)

  • two coefficients for Task (differences from execution time at the reference level for the other 2 levels of Task)

  • two "main effects" of Arousal (differences from execution time at the reference level for the other 2 levels of Arousal)

The Arousal × Task interaction would seem to be of interest, unless you already know that the association between Arousal and outcome doesn't differ among the Tasks. That would require only estimating 4 more fixed-effect coefficients, for 9 total.

You have 39 subjects with 3 observations each, for a total of 117 observations. Even with the interaction term, that gives you 13 observations per fixed-effect coefficient. If the observations were independent, that would be a bit less than ideal, but not too far from recommendations for 15 observations per parameter estimate for a continuous outcome.
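The coefficient counts and the observations-per-coefficient figure can be checked by listing the fixed-effect columns that treatment coding would produce. This is just a bookkeeping sketch; the task labels and the choice of reference levels (the first entry of each list) are placeholders, not taken from your study.

```python
from itertools import product

arousal = ["High", "Neutral", "Low"]  # reference level: first entry (arbitrary choice)
task = ["Task1", "Task2", "Task3"]    # hypothetical task labels

def treatment_columns(include_interaction):
    """Names of the fixed-effect columns under treatment (dummy) coding."""
    cols = ["Intercept"]
    cols += [f"Task[{t}]" for t in task[1:]]
    cols += [f"Arousal[{a}]" for a in arousal[1:]]
    if include_interaction:
        cols += [f"Arousal[{a}]:Task[{t}]"
                 for a, t in product(arousal[1:], task[1:])]
    return cols

n_obs = 39 * len(task)            # 39 subjects x 3 tasks
additive = treatment_columns(False)
full = treatment_columns(True)

print(len(additive))              # coefficients without the interaction
print(len(full))                  # coefficients with the interaction
print(n_obs / len(full))          # observations per fixed-effect coefficient
```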

For a pilot study it would seem to be important to include the interaction, even given the limited sample size and the repeated measures. You might have large standard errors for the coefficient estimates, but those can be incorporated into later simulations to establish sample sizes adequate for a later, more definitive study.
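To give a flavour of what such a simulation could look like, here is a deliberately simplified sketch. It does not simulate the full 3 × 3 mixed model. Instead it looks at a single 2 × 2 interaction contrast (two arousal groups, two tasks), where the interaction equals the between-group difference in each subject's task B minus task A score, so the random intercept cancels. The function name `interaction_power` and the values of `delta` and `sd_diff` are hypothetical; in practice they would come from the pilot estimates and their standard errors. The test uses a normal critical value rather than a t critical value, which slightly overstates power at small group sizes.

```python
import random
from statistics import NormalDist, mean, stdev

def interaction_power(n_per_group, delta, sd_diff, n_sims=2000, alpha=0.05, seed=0):
    """Monte Carlo power for one 2 x 2 interaction contrast.

    Each subject's score is (time on task B) - (time on task A), so the
    subject's random intercept cancels. `delta` is the assumed difference
    between two arousal groups in the mean score; `sd_diff` is the assumed
    SD of the scores within a group.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # normal approximation
    hits = 0
    for _ in range(n_sims):
        g1 = [rng.gauss(0.0, sd_diff) for _ in range(n_per_group)]
        g2 = [rng.gauss(delta, sd_diff) for _ in range(n_per_group)]
        se = (stdev(g1) ** 2 / n_per_group + stdev(g2) ** 2 / n_per_group) ** 0.5
        if abs(mean(g2) - mean(g1)) / se > z_crit:
            hits += 1
    return hits / n_sims

# Hypothetical inputs, for illustration only.
for n in (13, 25, 50, 100):
    print(n, interaction_power(n, delta=1.0, sd_diff=2.0))
```

Repeating the run over a range of plausible `delta` and `sd_diff` values, rather than a single point estimate, is one way to carry the pilot's imprecision into the sample-size decision.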

$\endgroup$
