
I am running inference on a trained PyTorch model using the same input tensor, fixed random seeds, and evaluation mode enabled.

import torch

torch.manual_seed(42)           # seed the CPU RNG
torch.cuda.manual_seed_all(42)  # seed the RNG on every visible GPU
model.eval()                    # disable dropout; use running BatchNorm statistics

Despite this, repeated inference calls produce slightly different outputs at the floating-point level.

Question
Which PyTorch or CUDA operations are non-deterministic during inference, and what exact configuration is required to guarantee deterministic results across runs?

  • The ordering that threads (warps actually) execute in is indeterminate. Commented Dec 29, 2025 at 18:22

1 Answer


Even with fixed random seeds, PyTorch selects non-deterministic cuDNN and cuBLAS kernels by default. Typical sources are cuDNN's autotuned convolution algorithms (chosen when benchmark mode is on) and CUDA reductions or scatter operations that use atomicAdd, where the order of floating-point additions varies between runs. To force deterministic behavior you need to add

  • torch.use_deterministic_algorithms(True),
  • torch.backends.cudnn.deterministic = True,
  • torch.backends.cudnn.benchmark = False,
  • and set os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8' (before the first CUDA call, so cuBLAS picks it up)

to ensure deterministic inference results on the same hardware and library versions. Note that torch.use_deterministic_algorithms(True) raises a RuntimeError for operations that have no deterministic implementation, rather than silently running a non-deterministic kernel.
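
Putting the pieces together, here is a minimal sketch; the small nn.Sequential model is a stand-in for your trained model, and it runs on CPU as well (where the cuDNN/cuBLAS settings are simply no-ops):

```python
import os

# Must be set before the first CUDA call, since cuBLAS reads it at
# context creation time.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch
import torch.nn as nn

torch.manual_seed(42)
torch.cuda.manual_seed_all(42)

# Error out on ops with no deterministic implementation instead of
# silently falling back to a non-deterministic kernel.
torch.use_deterministic_algorithms(True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False  # disable conv-algorithm autotuning

# Stand-in for a trained model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model.eval()

x = torch.randn(8, 16)
with torch.no_grad():
    out1 = model(x)
    out2 = model(x)

print(torch.equal(out1, out2))  # True: bitwise-identical outputs
```

The key ordering detail is that CUBLAS_WORKSPACE_CONFIG is set via os.environ before torch initializes CUDA; exporting it in the shell before launching Python works equally well.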


