
Allow Drydock Blueprints to control "supplemental allocation" behavior so all hosts in an Almanac pool get used
ClosedPublic

Authored by epriestley on Oct 25 2018, 1:58 PM.
Tags
None
Subscribers
Restricted Owners Package
Tokens
"Hungry Hippo" token, awarded by jcox.

Details

Summary

Fixes T12145. Ref T13210. See PHI570. See PHI536.

Currently, when you give Drydock an Almanac host pool with more than one host, it never voluntarily builds a second host resource. There is no way to say "maximum X working copies per host" (only "maximum X global working copies"), so the first host never overflows, and the allocator tries to pack resources as tightly as possible.

If you can force it to allocate the 2nd..Nth host, things will work reasonably well from there (it will spread working copies across the hosts randomly), but tricking it is very hard, especially before D19761.

To deal with this, give blueprints a new behavior around "supplemental allocations". The idea here is that a blueprint may decide that it would prefer to allocate a fresh new resource instead of allowing an otherwise valid acquisition to occur.

These supplemental allocations follow all the normal allocation rules (they can't exceed limits or actually replace existing resources), so they can only happen if there's free space in the resource pool. But a blueprint can elect for a supplemental allocation to provide a "grow the pool" hint.

The only useful policies here are probably "true" (immediately use all resources, like Almanac) or "false" (pack resources as efficiently as possible) but some other policies might be useful (perhaps "start growing the pool when we're getting a bit full even if we aren't at the limit yet, since our workload is bursty").
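The three policies described above can be sketched as simple predicates. This is a minimal Python illustration, not Drydock's PHP API: `PoolState`, its fields, the policy function names, and the 0.75 threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PoolState:
    """Hypothetical snapshot of one blueprint's resource pool."""
    resources: int            # resources currently built
    leases: int               # active leases across those resources
    leases_per_resource: int  # assumed capacity of a single resource


# A policy answers: "would this blueprint prefer a fresh resource?"
Policy = Callable[[PoolState], bool]


def always_supplement(state: PoolState) -> bool:
    """The "true" policy: use every resource as soon as possible."""
    return True


def never_supplement(state: PoolState) -> bool:
    """The "false" policy: pack resources as tightly as possible."""
    return False


def supplement_when_filling(threshold: float = 0.75) -> Policy:
    """A bursty-workload policy: grow once the pool is fairly full."""
    def policy(state: PoolState) -> bool:
        capacity = state.resources * state.leases_per_resource
        return capacity > 0 and state.leases / capacity >= threshold
    return policy
```

A policy like these only expresses a preference. Whether a new resource can actually be built is still decided by the normal allocation rules and limits.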

Then, give Almanac host resources a "true" policy (always allocate supplemental resources) so a pool puts all of its hosts to use once roughly as many concurrent jobs have arrived as there are hosts.

One aspect of this approach is that we only consider a supplemental allocation if the normal allocation algorithm has already decided that the best resource to acquire is part of the same blueprint. I started with an approach like "look at all the blueprints and see if any of them want to be greedy", but then a not-very-desirable blueprint would end up filling up its whole pool before we skipped the supplemental allocation part and ended up picking a different resource. That felt a bit silly and this feels a little cleaner and more focused.
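The decision order described above can be sketched for a single blueprint as a toy Python model. This is not the Drydock allocator: `Resource`, `Blueprint`, `build`, and `lease` are hypothetical names, and the "normal" choice is simplified to a random pick among existing resources. The model also includes the rule exercised in the test plan: no supplemental resource is built when the natural resource has no leases.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str        # host this resource is bound to
    leases: int = 0  # active leases on this resource


@dataclass
class Blueprint:
    hosts: list               # host names in the (hypothetical) pool
    wants_supplemental: bool  # the blueprint's supplemental policy
    resources: list = field(default_factory=list)

    def free_hosts(self):
        used = {r.name for r in self.resources}
        return [h for h in self.hosts if h not in used]


def build(blueprint):
    """Build a resource on the first unused host (normal limits apply)."""
    free = blueprint.free_hosts()
    if not free:
        raise RuntimeError("resource pool is full")
    resource = Resource(free[0])
    blueprint.resources.append(resource)
    return resource


def lease(blueprint, rng):
    """Acquire one lease and return the name of the host it landed on."""
    if not blueprint.resources:
        resource = build(blueprint)  # nothing to acquire yet
    else:
        # Step 1: the normal algorithm picks a "natural" resource.
        natural = rng.choice(blueprint.resources)
        # Step 2: only then may the owning blueprint ask for a fresh one,
        # and only if the natural resource is in use and space remains.
        if (blueprint.wants_supplemental and natural.leases > 0
                and blueprint.free_hosts()):
            resource = build(blueprint)
        else:
            resource = natural
    resource.leases += 1
    return resource.name
```

With `Blueprint(["A", "B"], wants_supplemental=False)`, repeated calls to `lease` keep returning A. With `wants_supplemental=True`, the second call builds B, and later calls land on a random mix of A and B.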

Test Plan
  • Without changing the Almanac blueprint policy, allocated hosts. Got A, A, A, A, ... (second host never used).
  • Changed the Almanac policy.
  • Allocated hosts, got A, B, random mix of A and B.
  • Destroyed B. Destroyed all leases on A. Allocated. Got A. This tests the "don't build a supplemental resource if there are no leases on the natural resource" rule.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Owners added a subscriber: Restricted Owners Package. Oct 25 2018, 1:58 PM
  • Use a more conventional spelling of "supplemental".
This revision is now accepted and ready to land. Oct 31 2018, 6:37 PM
This revision was automatically updated to reflect the committed changes.