
Allow Drydock Blueprints to control "supplemental allocation" behavior so all hosts in an Almanac pool get used
Closed, Public

Authored by epriestley on Oct 25 2018, 1:58 PM.

Details

Summary

Fixes T12145. Ref T13210. See PHI570. See PHI536.

Currently, when you give Drydock an Almanac host pool with more than one host, it never voluntarily builds a second host resource. The allocator tries to pack resources as tightly as possible, and there is no way to say "maximum X working copies per host" (only "maximum X global working copies") to make the first host overflow.

If you can force it to allocate the 2nd..Nth host, things will work reasonably well from there (it will spread working copies across the hosts randomly), but tricking it is very hard, especially before D19761.

To deal with this, give blueprints a new behavior around "supplemental allocations". The idea here is that a blueprint may decide that it would prefer to allocate a fresh new resource instead of allowing an otherwise valid acquisition to occur.

These supplemental allocations follow all the normal allocation rules (they can't exceed limits or actually replace existing resources), so they can only happen if there's free space in the resource pool. But a blueprint can elect for a supplemental allocation to provide a "grow the pool" hint.

The most useful policies here are probably "true" (immediately use all resources, like Almanac) and "false" (pack resources as efficiently as possible), but other policies might also be useful (perhaps "start growing the pool when we're getting a bit full even if we aren't at the limit yet, since our workload is bursty").
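To make the three policies above concrete, here is a minimal Python sketch. It is an illustrative model only, not Phabricator's PHP code: `PoolState`, `always_grow`, `never_grow`, and `grow_when_busy` are hypothetical names, and the 75% threshold is an arbitrary example value.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PoolState:
    """Hypothetical snapshot a blueprint might inspect when deciding."""
    active_leases: int  # leases currently held on the blueprint's resources
    capacity: int       # leases those existing resources can hold in total


# A policy answers: "would we rather build a fresh resource than reuse one?"
Policy = Callable[[PoolState], bool]


def always_grow(state: PoolState) -> bool:
    # The "true" policy: use every resource in the pool as soon as possible.
    return True


def never_grow(state: PoolState) -> bool:
    # The "false" policy: pack leases onto existing resources.
    return False


def grow_when_busy(threshold: float) -> Policy:
    # A "bursty workload" policy: ask for growth once utilization of the
    # existing resources reaches the given fraction.
    def policy(state: PoolState) -> bool:
        if state.capacity <= 0:
            return False  # nothing exists yet, so there is nothing to supplement
        return state.active_leases / state.capacity >= threshold
    return policy


bursty = grow_when_busy(0.75)
print(bursty(PoolState(active_leases=7, capacity=10)))
print(bursty(PoolState(active_leases=8, capacity=10)))
```

In every case the policy is only a hint. The normal allocation rules still decide whether a new resource can actually be built.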

Then, give Almanac host resources a "true" policy (always allocate supplemental resources) so they use all hosts once a similar number of concurrent jobs arrive.

One aspect of this approach is that we only consider a supplemental allocation if the normal allocation algorithm already decided that the best resource to acquire belongs to the same blueprint. I started with an approach like "look at all the blueprints and see if any of them want to be greedy", but then a not-very-desirable blueprint would end up filling its whole pool before we got past the supplemental allocation step and picked a different resource. That felt a bit silly, and this feels a little cleaner and more focused.
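The decision flow described in this summary can be sketched as a small Python model. This is a simplified illustration under stated assumptions, not the actual Drydock allocator: `Resource`, `Blueprint`, and `allocate` are hypothetical names, "best resource" is reduced to a random pick from the first blueprint that has any resources, and the only limit modeled is one resource per host in the pool.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Resource:
    host: str
    leases: int = 0


@dataclass
class Blueprint:
    hosts: list                 # every host in the pool
    supplemental: bool = False  # the "true" / "false" policy
    resources: list = field(default_factory=list)

    def free_hosts(self):
        used = {r.host for r in self.resources}
        return [h for h in self.hosts if h not in used]

    def build(self):
        # Build a resource on the next unused host in the pool.
        resource = Resource(self.free_hosts()[0])
        self.resources.append(resource)
        return resource


def allocate(blueprints, rng):
    """Lease one working copy and return the host it landed on."""
    # Step 1: ordinary selection (simplified stand-in for the real ranking).
    natural, owner = None, None
    for bp in blueprints:
        if bp.resources:
            natural, owner = rng.choice(bp.resources), bp
            break

    if natural is None:
        # No resources exist yet: build one in the first blueprint with room.
        for bp in blueprints:
            if bp.free_hosts():
                natural = bp.build()
                break
        if natural is None:
            raise RuntimeError("no blueprint can allocate a resource")
    elif owner.supplemental and natural.leases > 0 and owner.free_hosts():
        # Step 2: only the blueprint that owns the natural choice is asked,
        # only when that resource is already in use, and only if the pool
        # still has room under the normal limits.
        natural = owner.build()

    natural.leases += 1
    return natural.host


rng = random.Random(0)
packed = Blueprint(hosts=["A", "B"])
print([allocate([packed], rng) for _ in range(4)])  # ['A', 'A', 'A', 'A']

greedy = Blueprint(hosts=["A", "B"], supplemental=True)
print([allocate([greedy], rng) for _ in range(6)])  # starts A, B, then a mix
```

Because only the owner of the natural choice is consulted, a greedy but less-preferred blueprint listed after `packed` would never be asked to grow in this model.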

Test Plan
  • Without changing the Almanac blueprint policy, allocated hosts. Got A, A, A, A, ... (second host never used).
  • Changed the Almanac policy.
  • Allocated hosts, got A, B, random mix of A and B.
  • Destroyed B. Destroyed all leases on A. Allocated. Got A. This tests the "don't build a supplemental resource if there are no leases on the natural resource" rule.

Diff Detail

Repository
rP Phabricator
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

epriestley created this revision. Oct 25 2018, 1:58 PM
Owners added a subscriber: Restricted Owners Package. Oct 25 2018, 1:58 PM
epriestley requested review of this revision. Oct 25 2018, 2:00 PM
jcox awarded a token. Oct 25 2018, 2:54 PM
epriestley edited the summary of this revision. Oct 25 2018, 3:36 PM
epriestley updated this revision to Diff 47197. Oct 25 2018, 3:39 PM
  • Use a more conventional spelling of "supplemental".
amckinley accepted this revision. Oct 31 2018, 6:37 PM
This revision is now accepted and ready to land. Oct 31 2018, 6:37 PM
This revision was automatically updated to reflect the committed changes.