Making timeout to acquire a new instance configurable within gitlab-runner

What does this MR do?

This change adds a configurable timeout for acquiring instances in GitLab Runner's autoscaler feature. Previously, the timeout was hardcoded to 15 minutes, but now users can customize it through a new instance_acquire_timeout configuration option. The default remains 15 minutes, which accommodates cloud providers that may take several minutes to provision instances, especially Windows ones. The code also improves error messaging when timeouts occur, making it clearer that the failure was due to the timeout being exceeded. Documentation has been updated to explain this new configuration option, which will be available in GitLab Runner 18.1.

Why was this MR needed?

Initially the timeout was set to 5 minutes. This was not sufficient, because some cloud providers take longer than that to start an instance, so the timeout was increased to 15 minutes. However, the value was hardcoded and could not be adjusted to users' needs. This MR therefore introduces a setting for it in config.toml.

What's the best way to test this MR?

  1. Build the gitlab-runner binary.
  2. Adapt config.toml and add something like:
  [runners.autoscaler]
    ..
    instance_acquire_timeout = "20m0s"
  3. When the autoscaler spins up a new instance that takes longer than the value specified in the config, the acquisition times out and triggers an error message.

The program was tested solely for our own use cases, which might differ from yours.

What are the relevant issue numbers?

#38795 (closed)

Info

Moritz Scheve moritz.scheve@mercedes-benz.com, Mercedes-AMG GmbH Provider Information

Co-authored-by: Oscar Villarraga oscar.villarraga@mercedes-benz.com, MBition GmbH Provider Information

Edited by 🤖 GitLab Bot 🤖