Parallelization of loo using mirai and mori by florence-bockting · Pull Request #378 · stan-dev/loo

florence-bockting · 2026-07-01T08:30:25Z

Summary

Fixes #308

Replaces parallel::mclapply() / parLapply() with mirai + mori for per-observation parallelism (cross-platform, including Windows).
Adds three parallelism modes:
- per-call cores,
- persistent session pool (loo.daemons / LOO_DAEMONS), and
- user-managed mirai::daemons() (remote/SSH/HPC).
Parallel output matches serial; only scheduling changes.
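The equivalence claim can be illustrated outside of loo with a minimal sketch that uses only mirai. It assumes mirai is installed; the toy matrix `ll` and the helper `log_mean_exp()` are hypothetical stand-ins for a per-observation computation and are not code from this PR.

```r
library(mirai)

set.seed(1)
# Toy log-likelihood matrix: 200 draws (rows) x 6 observations (columns)
ll <- matrix(rnorm(200 * 6, mean = -1), nrow = 200)
cols <- lapply(seq_len(ncol(ll)), function(j) ll[, j])

# Hypothetical per-observation task: numerically stable log-mean-exp
log_mean_exp <- function(x) {
  m <- max(x)
  m + log(mean(exp(x - m)))
}

# Serial reference result
serial <- vapply(cols, log_mean_exp, numeric(1))

# Same work on a temporary pool of two local daemons
daemons(2)
parallel <- unlist(mirai_map(cols, log_mean_exp)[])  # [] waits and collects
daemons(0)                                           # tear the pool down

# Only scheduling differs; the numbers should agree
stopifnot(isTRUE(all.equal(serial, parallel)))
```

Starting and stopping the daemons around a single map corresponds roughly to the per-call mode; a persistent or user-managed pool would keep the daemons alive across calls.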

What changed

Core (R/parallel.R): with_loo_daemons(), loo_map(), loo_pool_is_local(), loo_persist_config().

Parallelized functions: loo() (function method), psis()/sis()/tis(), relative_eff(), loo_subsample(), loo_moment_match(), loo_model_weights().

Pool precedence: a connected pool (user-managed or persistent) always takes precedence, in which case cores is ignored. Local pools use mori zero-copy for broadcast objects (e.g. draws); remote pools serialize them.
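The precedence rule can be sketched as a small decision function in base R. `resolve_mode()` and its arguments are hypothetical names used for illustration only; the actual logic lives in R/parallel.R and may be structured differently.

```r
# Hypothetical helper, not part of loo: decides which mode a call would use.
# pool_connected: TRUE if a user-managed or persistent pool is already connected
# cores:          the per-call cores argument
resolve_mode <- function(pool_connected, cores = 1L) {
  if (isTRUE(pool_connected)) {
    return("pool")      # a connected pool wins; cores is ignored
  }
  if (cores > 1L) {
    return("per-call")  # spawn a temporary pool for this call only
  }
  "serial"
}

resolve_mode(pool_connected = TRUE,  cores = 8L)
resolve_mode(pool_connected = FALSE, cores = 8L)
resolve_mode(pool_connected = FALSE, cores = 1L)
```

Under this rule a user who has already set up daemons never gets a second pool spawned on top of it, whatever value cores has.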

Also: mirai + mori in DESCRIPTION; vignettes/loo2-parallel.Rmd; tests/testthat/test_parallel.R; benchmark/ scripts + bench-comparison.md.

Review guide

vignettes/loo2-parallel.Rmd shows the user-facing model.
R/parallel.R contains the pool lifecycle and the loo_map() transport.
For an example call path, see loo.function: R/loo.R → with_loo_daemons() → loo_map(broadcast = list(draws = ...)).
tests/testthat/test_parallel.R covers serial/parallel equivalence and pool precedence.
benchmark/README.md describes a first, small comparison of the baseline against the new implementation (first results are in benchmark/bench-comparison.md).

Initial benchmarks (Linux, one machine): loo.function with large draws benefits most (~4× with a persistent pool); matrix psis() does not benefit (communication-bound); a per-call pool pays about 1 s of spawn/teardown on every call.
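To see why a fixed spawn/teardown cost matters on small problems, here is a back-of-the-envelope model in R. It assumes perfect scaling across workers plus a constant overhead of about 1 s. That is an idealization for illustration, not a measured result, and both function names are hypothetical.

```r
# Idealized model (an assumption): wall time = fixed overhead + serial time / workers
parallel_time <- function(serial_s, workers, overhead_s = 1) {
  overhead_s + serial_s / workers
}

# Serial runtime at which a per-call pool breaks even with serial execution:
# solve T = overhead + T / workers for T
break_even_serial <- function(workers, overhead_s = 1) {
  overhead_s * workers / (workers - 1)
}

break_even_serial(workers = 4)
parallel_time(serial_s = 0.5, workers = 4)  # compare with 0.5 s serial
parallel_time(serial_s = 20, workers = 4)   # compare with 20 s serial
```

In this model a persistent pool corresponds to paying the overhead once per session rather than once per call, so the break-even point moves toward much smaller problems.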

Follow-up work

Reviewing: Please review the current implementation for correctness and usability. Comments and suggested improvements are welcome.
Benchmarking: compare master vs this branch across problem sizes, all (or a selected subset of) parallelized functions, OSes (Linux/macOS/Windows), and metrics (wall-clock, allocation, peak RSS). One option is to extend benchmark/.
LSAT case study: posteriorDB lsat-data; showcase speedup and all three parallelism modes (one-off, persistent pool, simulation loop).
Remote SSH: two-machine test; verify correctness, measure speedup, document setup (mirai::daemons(url = ..., remote = ssh_config(...))).
Documentation: expand vignettes/loo2-parallel.Rmd with assumptions, when-to-use guidance, function-specific notes, memory model.

Current limitations of implementation

Matrix psis() rarely speeds up (large data shipped per worker).
Per-call cores > 1 can be slower than serial on small problems without loo.daemons.
Remote SSH is untested in CI.

github-actions · 2026-07-01T08:45:24Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 2bee14f is merged into master:

❗🐌loo_function: 1.98s -> 2.09s [+4.63%, +6.14%]
🚀loo_matrix: 1.9s -> 1.87s [-2.08%, -0.46%]
Further explanation regarding interpretation and methodology can be found in the documentation.

Florence Bockting added 6 commits June 30, 2026 12:38

- 115fdd0 feat: use mirai and mori in do_importance_sampling
- 4b6bbc3 refactor: use mirai/mori for parallelization
- d7ef7bc update: docs, benchmark, vignette
- 89187e1 ignore cores if daemons are set
- 5dd4b52 update benchmark scenarios
- 2bee14f update benchmark results

florence-bockting requested a review from VisruthSK July 1, 2026 08:30

florence-bockting mentioned this pull request Jul 1, 2026: Prepare loo version 3.0.0 #379 (draft, 6 tasks)

florence-bockting changed the base branch from master to loo-v3.0.0 July 1, 2026 09:33

florence-bockting wants to merge 6 commits into loo-v3.0.0 from parallelization
