I've run into this issue 3 times over the course of ~3000 builds, and have only ever seen it in production. I can't reproduce it on development systems, but the way these builds are scheduled and run is wholly homogeneous, and it works the vast majority of the time.
I'll air the dirty laundry up front: these builds are not scheduled by Herald or repository operations, which I suspect are the only two "officially supported" methods. Instead, they are scheduled like this:
  $buildable = HarbormasterBuildable::initializeNewBuildable($queue->getAgent())
    ->setBuildablePHID($integration_plan->getHarbormasterBuildablePHID())
    ->setContainerPHID($integration_plan->getHarbormasterContainerPHID())
    ->save();

  $build = $buildable->applyPlan($queue->getBuildPlan(), [], $submission->getSubmitterPHID());
An "integration plan" is a DAO object conforming to HarbormasterBuildableInterface. If this sounds like we've wandered too far into the deep end and isn't something the upstream will support, that's understandable. However, the way this issue manifests suggests to me that it's strictly a Harbormaster / Drydock problem, and that it's probably going to flare up during normal operations on regular installs.
Symptoms:
- Buildable created properly
- No errors in daemon logs
- "Restarts" says zero for the build
- Build status is "pending", no steps in the build plan have started
- First step in the build plan is "lease working copy", which has been configured correctly via Almanac etc., no trickery
- The first step doesn't actually show up under the build as having started
- Simply restarting the build fixes everything
We have our builds sitting in a queue for verification before they are applied to master, and in some cases the line for that queue can get long. When this happens, it holds up the line and doesn't inform anyone that there might be a problem. The build usually sits there for a while until we notice it and hit restart.
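A stopgap for the silent stall could be a periodic check along these lines. This is only a sketch: the record shape (`id`, `status`, `startedSteps`, `dateCreated`) and the `find_stalled_builds()` helper are made up for illustration and do not correspond to any Harbormaster API, and the 10-minute threshold is arbitrary. It just encodes the symptoms above: still pending, no steps started, and older than some cutoff.

```php
<?php
// Hypothetical helper: plain arrays stand in for build records. Nothing here
// calls Harbormaster or Drydock; field names are assumptions for illustration.

/**
 * Return the IDs of builds that look stalled: status is still "pending",
 * no steps have started, and the build is older than $max_age seconds.
 */
function find_stalled_builds(array $builds, int $now, int $max_age): array {
  $stalled = [];
  foreach ($builds as $build) {
    $is_pending = ($build['status'] === 'pending');
    $no_steps = ($build['startedSteps'] === 0);
    $too_old = (($now - $build['dateCreated']) > $max_age);
    if ($is_pending && $no_steps && $too_old) {
      $stalled[] = $build['id'];
    }
  }
  return $stalled;
}

// Example data: a fixed "now" keeps the example deterministic.
$now = 1000000;
$builds = [
  // Pending for an hour with nothing started: matches the symptoms.
  ['id' => 1, 'status' => 'pending', 'startedSteps' => 0, 'dateCreated' => $now - 3600],
  // Pending, but only for 30 seconds: probably just hasn't been picked up yet.
  ['id' => 2, 'status' => 'pending', 'startedSteps' => 0, 'dateCreated' => $now - 30],
  // Old, but actually building: not stalled.
  ['id' => 3, 'status' => 'building', 'startedSteps' => 1, 'dateCreated' => $now - 3600],
];

// Flag anything pending with no started steps for more than 10 minutes.
$stalled = find_stalled_builds($builds, $now, 600);
```

Something like this would only tell us a build is stuck so someone can hit restart sooner; it does nothing about the underlying cause.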
Given the relatively nebulous nature of this ticket, I'm totally willing to go track this down myself. I'm mainly wondering whether you can point me in the direction of likely causes.