Page MenuHomePhabricator

Improve several exception behaviors for Harbormaster workers
ClosedPublic

Authored by epriestley on Jan 12 2014, 8:00 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Mar 23, 4:59 PM
Unknown Object (File)
Sun, Mar 10, 8:35 AM
Unknown Object (File)
Jan 26 2024, 2:26 AM
Unknown Object (File)
Jan 22 2024, 12:00 PM
Unknown Object (File)
Dec 22 2023, 2:22 PM
Unknown Object (File)
Dec 19 2023, 7:59 AM
Unknown Object (File)
Nov 24 2023, 8:46 AM
Unknown Object (File)
Nov 23 2023, 12:01 PM
Subscribers

Details

Summary

Ref T2015. Several fixes:

  • checkForCancellation() no longer exists, and isn't relevant for resumable stops. Throw it away for now.
  • Fix an issue where a build could pass even if the final step failed.
  • phlog() exceptions so they show up in bin/harbormaster and the daemon logs.
  • Write an exception log if a step fails.
  • Add a "throw an exception" step to debug this stuff more easily.
Test Plan
  • Grepped for checkForCancellation().
  • Ran a failing build where the final step caused the failure.
  • Observed phlog() in bin/harbormaster output.
  • Observed log in web UI:

{F101168}

Diff Detail

Branch
hm1
Lint
Lint Passed
Unit
No Test Coverage

Event Timeline

src/applications/harbormaster/step/CommandBuildStepImplementation.php
73

I presume the reason for not using resolvex is that we want to dump stdout / stderr to the log first?

src/applications/harbormaster/step/WaitForPreviousBuildStepImplementation.php
33–36

Why is this removed? This will mean the build will not display waiting information to the user.

src/applications/harbormaster/step/CommandBuildStepImplementation.php
73

Yeah. This would maybe be a little bit nicer as resolvex + capture exception + log + rethrow, I just wanted to get rid of $build mutations in workers.

src/applications/harbormaster/step/WaitForPreviousBuildStepImplementation.php
33–36

Two reasons:

  • I've removed all updates of $build from the actual steps. This avoids a bunch of locking and race issues where multiple things are trying to write to $build at once: instead, the build update process does all build writes, and it uses locks to guarantee it is the only simultaneous writer. So these direct writes to $build violate that (and, e.g., could overwrite changes in other fields which had happened since $build loaded).
  • Now that steps can run in parallel, a single step being in the "waiting" state does not mean that the whole build is waiting: other steps may be running. Instead, we should set the step as waiting and fail temporarily. Elsewhere, in the build update process, we should set the whole build as waiting only if all of the current steps are waiting. Since the only thing that not doing this costs us is that statuses say "running" instead of "waiting" and it's somewhat complex, I left it out of this diff (as the other changes represent meaningful unbreaks). I plan to restore it in the near future.