Page MenuHomePhabricator

Implement "Wait for Previous Builds" build step
ClosedPublic

Authored by hach-que on Dec 8 2013, 11:15 PM.
Tags
None
Referenced Files
F18763612: D7745.id17527.diff
Tue, Oct 7, 1:06 AM
F18735834: D7745.id17529.diff
Wed, Oct 1, 3:30 AM
F18649668: D7745.id17529.diff
Sep 20 2025, 9:53 PM
F18649660: D7745.id17529.diff
Sep 20 2025, 9:50 PM
F18514181: D7745.diff
Sep 5 2025, 10:04 AM
F18511584: D7745.diff
Sep 5 2025, 6:51 AM
F18288029: D7745.id17508.diff
Aug 23 2025, 8:03 AM
F18191291: D7745.id17528.diff
Aug 17 2025, 2:30 AM

Details

Reviewers
epriestley
Group Reviewers
Blessed Reviewers
Maniphest Tasks
T1049: Implement Harbormaster
Commits
Restricted Diffusion Commit
rP270c8d27ab4e: Implement "Wait for Previous Builds" build step
Summary

This adds a build step which will block a build from continuing if there are previous builds of the build plan still running.

Test Plan

Configured a build plan with a wait of 60 seconds and a "wait for previous builds", then started a build. While that was still building, reconfigured the plan to have a wait time of 3 seconds, started it, and saw it move into the "Waiting" status. When the 60 second build finished, both builds passed.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

Finding Earlier Buildables: This is tricky. Testing for smaller IDs isn't sufficient: there's no guarantee builds triggered by Herald will be be in order. If commits A, B and C all get discovered at about the same time, they may make their way through the commit parsing pipeline at different speeds and their Herald rules may trigger in any order. The buildables will then get created in arbitrary order, and this rule will enforce arbitrary order.

Finding all earlier buidables seems quite difficult: it's not only possible that they have larger IDs, it's also possible that they don't even exist yet. We may be running the build plan for "C" before commits "A" and "B" even make it through Herald.

I think the rule needs to be:

  1. Load the buildable's parents (for revisions, nothing I guess; for commits, the commit's parents).
  2. Check that they have all completed their Herald phase. If not, wait.
  3. Check that their buildables are all complete. If not, wait.

In practice, this will look like:

  • Call diffusion.commitparentsquery with ConduitCall to find parents.
  • Check their importStatus for IMPORTED_HERALD to determine that Herald rules have triggered.
  • If all Herald rules have triggered, do a Buildable query for (automatic) buildables.
  • Check that all matching Buildables are complete.

Waiting: We also can't implement "wait" as sleep(). If we do, the entire task pipeline will block once the number of waiting tasks equals the number of daemons. This is an unreasonably low limit.

Wait has to be implemented as either releasing the task's lease for later reacquisition (which is easier, but less efficient) or recursing into subtask processing (which is harder, but more efficient). Drydock does the latter, but we need more metadata about queued tasks in order to do it. It also may be less suitable here: Drydock has many shallow trees, but this may have a small number of very deep trees.

The easy way is to throw new Exception("Blocked by other work!"); or similar. This is trivial to implement, but will gum up the log a bit and not re-execute as efficiently as it could. I'd do this for now, though -- the reentrant thing is super messy. A small step forward from this would be to implement a new type of exception like PhabricatorWorkerPermanentFailureException, but which represents a temporary failure, say PhabricatorWorkerWaitingForABitException or whatever. This could be made to not log in PhabricatorTaskmasterDaemon->run() (so it doesn't gum up the log) and could have a shorter or alternate retry policy at some point.

hach-que updated this revision to Unknown Object (????).Dec 9 2013, 11:52 PM

Ensure blocking detection works by checking commit parents and Herald

The throw vs sleep() thing is important so we can't deadlock the queue.

src/applications/harbormaster/query/HarbormasterBuildQuery.php
142โ€“148 โ†—(On Diff #17526)

Revert these changes now?

src/applications/harbormaster/step/WaitForPreviousBuildStepImplementation.php
60

We should do the throw new Exception("blocked...") here to prevent infinite deadlock.

117

I think this behavior is correct (i.e., wait only for other copies of this build plan) but the description could make it more clear that we aren't waiting for the entire build, just other copies of this plan.

(Pending step separation.)

hach-que updated this revision to Unknown Object (????).Dec 10 2013, 12:00 AM

Remove modifications to build query

hach-que updated this revision to Unknown Object (????).Dec 10 2013, 12:01 AM

More clarity in description of build step