Allocate software resources
Details
Oct 26 2022
An earlier patch here (rCORE6d6170f76463) switched the binlog format to MIXED and set a 24-hour binlog retention policy. This issue has not recurred in the cluster since that patch went out, but the root causes remain unresolved.
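For reference, the relevant MySQL settings are roughly the following (a sketch of what that change likely amounts to, not the patch itself; the retention variable differs by MySQL version):

  [mysqld]
  # Mixed statement/row binary logging.
  binlog_format = MIXED

  # Keep binlogs for 24 hours (MySQL 8.0; on 5.7 and earlier the
  # equivalent setting is "expire_logs_days = 1").
  binlog_expire_logs_seconds = 86400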
Jun 13 2022
- The drydock_resource table could use a (status, ...) key to satisfy common/default queries.
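For illustration only (the second column is a guess at the common query shape, not an actual schema change), such a key might look like:

  ALTER TABLE drydock_resource
    ADD KEY `key_status` (status, blueprintPHID);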
Jun 7 2022
May 9 2022
There may be additional work here, but I'm presuming this is more or less resolved until evidence to the contrary arises.
May 5 2022
When this mechanism is removed (by commenting out the logic that cares about the 25% limit), we'd expect Drydock to build 8 resources at a time (limited by number of taskmasters). It actually builds ~1-4...
May 4 2022
The outline above isn't quite sufficient because when the active resource list is nonempty, we don't actually reach the "new allocation" logic. Broadly, executeAllocator() is kind of wonky and needs some additional restructuring to cover both the D19762 case ("allocate up to the resource limit before reusing resources") and the normal set of cases. The proper logic is something like:
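As a rough sketch of that decision (a hypothetical standalone helper, not the actual executeAllocator() code):

  <?php
  // Sketch only: prefer growing the pool up to the blueprint's resource
  // limit before piling additional leases onto existing resources.
  function shouldAllocateNewResource(
    int $activeResources,
    int $resourceLimit,
    bool $blueprintCanAllocate): bool {

    // D19762 case: we're still below the resource limit and the blueprint
    // is willing to allocate, so build a new resource.
    if ($blueprintCanAllocate && $activeResources < $resourceLimit) {
      return true;
    }

    // Normal case: the pool is at its limit (or the blueprint refuses to
    // grow), so acquire a lease on an existing resource instead.
    return false;
  }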
This issue partially reproduces (consistent with the original report, not immediately consistent with my theorizing about a root cause in PHI2177 -- actually, looks like both parts are right, see below): Drydock builds ~1 working copy per minute serially until it reaches a pool size of 5 resources. Then, it begins allocating 2 simultaneous resources.
May 3 2022
This is somewhat resolved and neither the next steps nor the motivation is clear any longer, so I'm going to call it done until evidence to the contrary arises.
Perhaps a philosophical question here is: do we care about which repositories are checked out in a working copy resource?
Before, instant reclaim after lease destruction:
To create resource pressure, I'm now going to try this -- I guess I don't really need the --count flag, but it does make the terminal juggling slightly easier:
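That is, roughly (the blueprint ID, attributes file, and count here are illustrative placeholders):

  $ ./bin/drydock lease --type working-copy --blueprint 42 \
      --attributes attributes.json --count 4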
The blueprint thing was on the way toward creating allocation pressure, so D21802 allows you to select a blueprint (or a set of possible blueprints) with --blueprint. You can specify an ID or PHID:
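For example, either of these forms should work with D21802 applied (the values are placeholders):

  --blueprint 42
  --blueprint <blueprint PHID>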
That patch is reasonable, and shouldn't break anything as long as the list you provide is a subset of the possible list.
Let me fill in the details a bit.
After D21796:
(one orthogonal bug I found is that bin/drydock lease discards any blueprints provided in an attributes JSON)
Grab a test lease on the host with:
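(Presumably along these lines; the exact type and attributes depend on the blueprint being tested:)

  $ ./bin/drydock lease --type host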
Here's a fairly simple way to reproduce this:
Feb 7 2020
A saved state is likely something like this:
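Hypothetically, something along these lines (field names are illustrative only, not an actual schema; see the "chain of custody" note under Sep 27 2019 below for the two representations):

  {
    "repositoryPHID": "PHID-REPO-...",
    "representations": {
      "patch-list": {
        "base": "<commit the client diffed against>",
        "patches": ["<patch 1>", "<patch 2>"]
      },
      "ref-pointer": {
        "ref": "<tag or branch built by Drydock>"
      }
    }
  }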
Sep 27 2019
One broad problem here is "chain of custody" issues in T182. A "Saved State" can easily accommodate multiple representations, and the plan above imagines using Drydock to build tags/branches out of non-repository representations, so we'd have cases where a given "Saved State" has a way to build it with a "patch list" (from the client) or a "ref pointer" (from Drydock).
Aug 20 2019
Dec 12 2018
Dec 9 2018
I'm having some trouble getting this new behaviour to work (which, IIUC, basically means that leases should be load-balanced across the multiple hosts in a Drydock pool). In "active resources" I see three Drydock hosts, which all belong to the same Almanac service. In "active leases", however, I see only a single host lease and many working copy leases.
Nov 26 2018
Nov 10 2018
Nov 1 2018
Oct 30 2018
Fantastic, thanks very much @epriestley! I had indeed intended to take care of this myself, but was on other work this week and last and planned to come back to it. It also would have taken me much longer to realize that drydock.lease.search wasn't yet upstream and to figure out how to proceed from there, so I'm glad to see you were able to handle this so easily!