Page MenuHomePhabricator

[drydock/core] Add more logging around why leases are broken
AbandonedPublic

Authored by epriestley on Sep 17 2014, 2:09 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jan 16, 9:58 PM
Unknown Object (File)
Thu, Jan 16, 3:20 AM
Unknown Object (File)
Fri, Jan 3, 3:14 AM
Unknown Object (File)
Sat, Dec 28, 2:27 PM
Unknown Object (File)
Sun, Dec 22, 1:57 AM
Unknown Object (File)
Fri, Dec 20, 6:51 PM
Unknown Object (File)
Dec 14 2024, 4:51 PM
Unknown Object (File)
Dec 13 2024, 10:34 AM

Details

Reviewers
hach-que
Group Reviewers
Blessed Reviewers
Maniphest Tasks
T2015: Implement Drydock
Summary

Ref T2015. Currently the 'Lease has been broken!' message is not very helpful when diagnosing when "Lease Host" targets fail. This adds more logging around why leases move to the broken status, and exposes this information easily.

Test Plan

Tested this by setting blueprints to have a maximum of 0 and attempting to lease.

Event Timeline

hach-que retitled this revision from to Add more logging around why leases are broken.
hach-que updated this object.
hach-que edited the test plan for this revision. (Show Details)
hach-que added a reviewer: epriestley.

I would greatly appreciate this feature.

Currently I have two blueprints, one working-copy and one host, both of which have one resource each (manually created using the drydock create-resource command), but still I get:

exception 'Exception' with message 'Lease has been broken!' in /opt/phabricator/phabricator/src/applications/drydock/storage/DrydockLease.php:171
Stack trace:
#0 /opt/phabricator/phabricator/src/applications/drydock/storage/DrydockLease.php(197): DrydockLease::waitForLeases(Array)
#1 /opt/phabricator/phabricator/src/applications/harbormaster/step/HarbormasterLeaseHostBuildStepImplementation.php(32): DrydockLease->waitUntilActive()
#2 /opt/phabricator/phabricator/src/applications/harbormaster/worker/HarbormasterTargetWorker.php(52): HarbormasterLeaseHostBuildStepImplementation->execute(Object(HarbormasterBuild), Object(HarbormasterBuildTarget))
#3 /opt/phabricator/phabricator/src/infrastructure/daemon/workers/PhabricatorWorker.php(91): HarbormasterTargetWorker->doWork()
#4 /opt/phabricator/phabricator/src/infrastructure/daemon/workers/storage/PhabricatorWorkerActiveTask.php(158): PhabricatorWorker->executeTask()
#5 /opt/phabricator/phabricator/src/infrastructure/daemon/workers/PhabricatorTaskmasterDaemon.php(20): PhabricatorWorkerActiveTask->executeTask()
#6 /opt/phabricator/libphutil/src/daemon/PhutilDaemon.php(179): PhabricatorTaskmasterDaemon->run()
#7 /opt/phabricator/libphutil/scripts/daemon/exec/exec_daemon.php(119): PhutilDaemon->execute()
#8 {main}

Drydock is currently not supported, and it's advised that you don't use it unless you know how it works. These diffs are probably still like 6 months away (if not more) from being reviewed.

Drydock is currently not supported, and it's advised that you don't use it unless you know how it works. These diffs are probably still like 6 months away (if not more) from being reviewed.

So the recommendation would be to use e.g. Jenkins instead?

The recommendation is to not use Harbormaster / Drydock (but what you use instead is entirely up to you).

epriestley edited edge metadata.

Why is this better than just logging the reason?

This revision now requires changes to proceed.Aug 8 2015, 6:17 PM
hach-que requested a review of this revision.Aug 24 2015, 6:13 AM
hach-que edited edge metadata.

By storing the reason for the lease breaking in a field, we can show the actual reason for the lease moving into this state inside Harbormaster. If we just used log() to put the broken reason in the Drydock logs, we wouldn't be able to show more useful information inside Harbormaster.

For example, the EC2 blueprint uses this to indicate when EC2 spot instances fail to lease because the spot price is too low. Just logging this information to Drydock and showing "Lease was broken" in Harbormaster doesn't give users actionable information to go on. By storing it in a separate field, we can show the actual "The spot price is too low to allocate a host in EC2" message in Harbormaster.

hach-que retitled this revision from Add more logging around why leases are broken to [drydock/core] Add more logging around why leases are broken.Aug 24 2015, 3:41 PM
epriestley edited reviewers, added: hach-que; removed: epriestley.

D14212 theoretically does this. I didn't add the "broken reason" stuff, but logs are a lot more structured now, so I think we could either just show recent logs or show recent logs of some subset of log types (e.g., "leasebrokenlogtype"). I also don't like shoving strings into the database if we can avoid it, since they aren't translatable (old logs weren't either, but new logs are as translatable as possible).

I think this is just a display logic issue for Harbormaster now, though.