
Harbormaster builds get stuck in "Restarting" status
Closed, Invalid (Public)

Description

This is an ongoing issue with Harbormaster, but it has no reliable reproduction steps. It occurs occasionally after doing the following:

Occasional Reproduction Steps:

  1. Open a build or buildable
  2. Restart a build

Expected Result:

The build restarts.

Actual Result:

Occasionally the build gets stuck in the "Restarting" status, like this:

pasted_file (289×449 px, 21 KB)

When looking at the daemon console, you can see that the HarbormasterBuildWorker ran:

pasted_file (174×526 px, 17 KB)

But it doesn't appear to have actually done anything. There are no exceptions or errors in the daemon logs (bin/phd log).

Bug Lifetime:

This has been an ongoing issue for at least 18 months. I see it on my personal instance, which is reasonably close to HEAD:

phabricator rPf9a58fafba0d15e043f20e410bbe782357130183
arcanist rARCc13e5a629535f460ca1a16d0bfe6d95f43b70b78
phutil rPHUfb1e159d36402cc5f6cdb64726599acf784283b6

We also see it at work, where we're running a significantly older version of Phabricator.

Further Diagnosis:

Diagnosing the issue further is difficult due to the lack of logs and diagnostic tools. It might help if HarbormasterBuildWorker were more verbose when it runs as a background task, logging whether it had any commands to process and which commands it actually processed during its run.

Known Workarounds:

Right now the only workaround is to run bin/harbormaster update --background B1234 from the CLI to add a new update task into the daemon queue, which then appears to correctly pick up the restart command and process it.
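
When several buildables are stuck at once, the workaround can be scripted. The following is only a rough sketch: the function name requeue_buildables and the HARBORMASTER_BIN override variable are hypothetical names made up for illustration, and it assumes the script is run from the Phabricator root so that bin/harbormaster resolves. It does nothing beyond invoking the same update command once per buildable ID.

```shell
#!/bin/sh
# Hypothetical helper (not part of Phabricator): queue a new update task
# for each buildable ID given as an argument, e.g. B1234.
# HARBORMASTER_BIN is an assumed override; it defaults to bin/harbormaster,
# which only resolves when run from the Phabricator root directory.
requeue_buildables() {
  for id in "$@"; do
    num=${id#B}
    # Accept only IDs of the form B<digits>; skip anything else.
    case "$id" in
      B*) ;;
      *) echo "skipping '$id': expected an ID like B1234" >&2; continue ;;
    esac
    case "$num" in
      ''|*[!0-9]*) echo "skipping '$id': expected an ID like B1234" >&2; continue ;;
    esac
    # Same command as the manual workaround described above.
    "${HARBORMASTER_BIN:-bin/harbormaster}" update --background "$id"
  done
}
```

Usage would look like requeue_buildables B1234 B1235. This only automates the workaround; it does not explain why the original update task failed to process the restart command.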

Event Timeline

epriestley added a subscriber: epriestley.

There's no pathway forward here without reproduction steps.