HomePhabricator

Allow Drydock Blueprints to control "supplemental allocation" behavior so all…

Authored by epriestley on Oct 25 2018, 1:28 PM.

Description

Allow Drydock Blueprints to control "supplemental allocation" behavior so all hosts in an Almanac pool get used

Summary:
Fixes T12145. Ref T13210. See PHI570. See PHI536.

Currently, when you give Drydock an Almanac host pool with more than one host, it never voluntarily builds a second host resource: there is no way to say "maximum X working copies per host" (only "maximum X global working copies") to make the first host overflow, and the allocator tries to pack resources as tightly as possible.

If you can force it to allocate the 2nd..Nth host, things will work reasonably well from there (it will spread working copies across the hosts randomly), but tricking it is very hard, especially before D19761.

To deal with this, give blueprints a new behavior around "supplemental allocations". The idea here is that a blueprint may decide that it would prefer to allocate a fresh new resource instead of allowing an otherwise valid acquisition to occur.

These supplemental allocations follow all the normal allocation rules (they can't exceed limits or actually replace existing resources), so they can only happen if there's free space in the resource pool. But a blueprint can elect for a supplemental allocation to provide a "grow the pool" hint.

The only useful policies here are probably "true" (immediately use all resources, like Almanac) or "false" (pack resources as efficiently as possible) but some other policies might be useful (perhaps "start growing the pool when we're getting a bit full even if we aren't at the limit yet, since our workload is bursty").

Then, give Almanac host resources a "true" policy (always allocate supplemental resources) so they use all hosts once a similar number of concurrent jobs arrive.

One aspect of this approach is that we only do supplemental resources if the normal allocation algorithm already decided that the best resource to acquire was part of the same blueprint. I started with an approach like "look at all the blueprints and see if any of them want to be greedy", but then a not-very-desirable blueprint would end up filling up its whole pool before we skipped the supplemental allocation part and ended up picking a different resource. That felt a bit silly and this feels a little cleaner and more focused.

Test Plan:

  • Without changing the Almanac blueprint policy, allocated hosts. Got A, A, A, A, ... (second host never used).
  • Changed the Almanac policy.
  • Allocated hosts, got A, B, random mix of A and B.
  • Destroyed B. Destroyed all leases on A. Allocated. Got A. This tests the "don't build a supplemental resource if there are no leases on the natural resource".

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13210, T12145

Differential Revision: https://secure.phabricator.com/D19762