
Oct 26 2022

epriestley added a comment to T13684: Drydock resource accounting may put significant stress on the MySQL binlog if a lease is unsatisfiable.

An earlier patch here (rCORE6d6170f76463) swapped binlogs to MIXED and set a 24-hour retention policy. This issue has not reoccurred in the cluster since that patch went out, but the root causes remain unresolved.

Oct 26 2022, 7:58 PM · Drydock

Jun 13 2022

epriestley added a comment to T13684: Drydock resource accounting may put significant stress on the MySQL binlog if a lease is unsatisfiable.
  • The drydock_resource table could use a (status, ...) key to satisfy common/default queries.
Jun 13 2022, 1:16 PM · Drydock

Jun 7 2022

epriestley triaged T13684: Drydock resource accounting may put significant stress on the MySQL binlog if a lease is unsatisfiable as Normal priority.
Jun 7 2022, 3:04 AM · Drydock

May 9 2022

epriestley closed T13677: Drydock may grow resource pools too cautiously as Resolved.

There may be additional work here, but presuming this is more or less resolved until evidence to the contrary arises.

May 9 2022, 10:21 PM · Drydock
epriestley added a revision to T13677: Drydock may grow resource pools too cautiously: D21809: In Drydock, yield for reclaiming resources in the "released" state.
May 9 2022, 5:45 PM · Drydock
epriestley added a revision to T13677: Drydock may grow resource pools too cautiously: D21808: Remove the "25% of active pool" growth rate throttle from Drydock.
May 9 2022, 5:30 PM · Drydock
epriestley added a revision to T13677: Drydock may grow resource pools too cautiously: D21807: Adjust the Drydock allocator to limit each pending lease to one allocating resource.
May 9 2022, 5:24 PM · Drydock
epriestley added a revision to T13677: Drydock may grow resource pools too cautiously: D21806: Formalize some more Drydock conditions and bookkeeping.
May 9 2022, 5:09 PM · Drydock

May 5 2022

epriestley added a comment to T13677: Drydock may grow resource pools too cautiously.

When this mechanism is removed (by commenting out the logic that cares about the 25% limit), we'd expect Drydock to build 8 resources at a time (limited by number of taskmasters). It actually builds ~1-4...

May 5 2022, 11:29 PM · Drydock
epriestley added a revision to T13677: Drydock may grow resource pools too cautiously: D21805: Add "--all" flags to "release-lease" and "release-resource" workflows in "bin/drydock".
May 5 2022, 11:01 PM · Drydock

May 4 2022

epriestley added a comment to T13677: Drydock may grow resource pools too cautiously.

The outline above isn't quite sufficient because when the active resource list is nonempty, we don't actually reach the "new allocation" logic. Broadly, executeAllocator() is kind of wonky and needs some additional restructuring to cover both the D19762 case ("allocate up to the resource limit before reusing resources") and the normal set of cases. The proper logic is something like:

May 4 2022, 10:48 PM · Drydock
epriestley added a comment to T13677: Drydock may grow resource pools too cautiously.

This issue partially reproduces (consistent with the original report, not immediately consistent with my theorizing about a root cause in PHI2177 -- actually, looks like both parts are right, see below): Drydock builds ~1 working copy per minute serially until it reaches a pool size of 5 resources. Then, it begins allocating 2 simultaneous resources.

May 4 2022, 10:16 PM · Drydock
epriestley triaged T13677: Drydock may grow resource pools too cautiously as Normal priority.
May 4 2022, 9:37 PM · Drydock

May 3 2022

epriestley added a project to T10522: Build plan page doesn't warn about unapproved Drydock blueprints: Drydock.
May 3 2022, 11:25 PM · Drydock, Restricted Project, Harbormaster (v3)
epriestley closed T11694: Allow clients to generally reason about Drydock leases over the API as Resolved.

This is somewhat resolved and neither next steps nor motivation are clear any longer, so I'm going to call it done until evidence to the contrary arises.

May 3 2022, 11:17 PM · Restricted Project, Drydock
epriestley closed T11694: Allow clients to generally reason about Drydock leases over the API, a subtask of T11693: Make drydock command interfaces accessible via SSH workflows, as Resolved.
May 3 2022, 11:17 PM · Restricted Project, Drydock
epriestley closed T13676: Drydock may reclaim recently-used resources as Resolved.

Perhaps a philosophical question here is: do we care about which repositories are checked out in a working copy resource?

May 3 2022, 11:15 PM · Drydock
epriestley added a comment to T13676: Drydock may reclaim recently-used resources.

Before, instant reclaim after lease destruction:

May 3 2022, 10:56 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21803: Don't reclaim resources that have a destroyed lease less than 3 minutes old.
May 3 2022, 10:54 PM · Drydock
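
The rule named in D21803 can be illustrated with a minimal sketch. This is not Drydock's actual implementation (Drydock is written in PHP); the function name `can_reclaim`, the constant `RECLAIM_GRACE`, and the representation of destroyed leases as a list of timestamps are all hypothetical, chosen only to show the shape of the check:

```python
from datetime import datetime, timedelta
from typing import Iterable

# Assumption: the 3-minute window comes from the D21803 title; the name is hypothetical.
RECLAIM_GRACE = timedelta(minutes=3)


def can_reclaim(
    destroyed_lease_times: Iterable[datetime],
    now: datetime,
    grace: timedelta = RECLAIM_GRACE,
) -> bool:
    """Return True only if no destroyed lease on the resource is younger than `grace`.

    `destroyed_lease_times` holds the destruction time of each destroyed
    lease on one resource (a hypothetical representation).
    """
    return all(now - destroyed_at >= grace for destroyed_at in destroyed_lease_times)


now = datetime(2022, 5, 3, 22, 54)
recently_used = [now - timedelta(minutes=1)]
idle = [now - timedelta(minutes=5)]
print(can_reclaim(recently_used, now))  # lease destroyed 1 minute ago
print(can_reclaim(idle, now))           # lease destroyed 5 minutes ago
```

Under these assumptions, a resource whose last lease was destroyed one minute ago is skipped, while one that has been idle for five minutes remains a reclaim candidate.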
epriestley added a comment to T13676: Drydock may reclaim recently-used resources.

To create resource pressure, I'm now going to try this -- I guess I don't really need the --count flag, but it does make the terminal juggling slightly easier:

May 3 2022, 10:51 PM · Drydock
epriestley added a comment to T13676: Drydock may reclaim recently-used resources.

The blueprint thing was on the way toward creating allocation pressure, so D21802 allows you to select a blueprint (or a set of possible blueprints) with --blueprint. You can specify an ID or PHID:

May 3 2022, 10:07 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21802: Allow "bin/drydock lease ..." to select particular blueprints with "--blueprint".
May 3 2022, 10:03 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21801: Use the same logic in "bin/drydock lease" and LeaseUpdateWorker to identify candidate blueprints.
May 3 2022, 9:28 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21800: Allow "bin/drydock lease" to acquire many identical leases with "--count N".
May 3 2022, 9:10 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21799: Add an ArgumentParser helper for integers.
May 3 2022, 9:09 PM · Drydock
epriestley added a comment to T13676: Drydock may reclaim recently-used resources.

That patch is reasonable, and shouldn't break anything as long as the list you provide is a subset of the possible list.

May 3 2022, 8:41 PM · Drydock
jmeador added a comment to T13676: Drydock may reclaim recently-used resources.

fill in the details a bit.

May 3 2022, 7:06 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21798: Fix more PHP 8.1 "strlen(null)" callsites in PhutilURI.
May 3 2022, 7:00 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21797: Update "bin/drydock command" help text to use more standard quoting.
May 3 2022, 6:57 PM · Drydock
epriestley added a comment to T13676: Drydock may reclaim recently-used resources.

After D21796:

May 3 2022, 6:54 PM · Drydock
epriestley added a comment to T13676: Drydock may reclaim recently-used resources.

(one orthogonal bug I found is that bin/drydock lease discards any blueprints provided in an attributes JSON)

May 3 2022, 6:46 PM · Drydock
jmeador added a comment to T13676: Drydock may reclaim recently-used resources.

(one orthogonal bug I found is that bin/drydock lease discards any blueprints provided in an attributes JSON)

May 3 2022, 6:42 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21796: Fail in a more comprehensible way when a WorkingCopy lease omits or mangles "repositories.map".
May 3 2022, 6:36 PM · Drydock
epriestley added a comment to T13676: Drydock may reclaim recently-used resources.

Grab a test lease on the host with:

May 3 2022, 6:31 PM · Drydock
epriestley added a revision to T13676: Drydock may reclaim recently-used resources: D21795: Fix various "strlen(null)" PHP 8.1 issues on "bin/phd" and "bin/drydock" pathways.
May 3 2022, 6:10 PM · Drydock
epriestley added a comment to T13676: Drydock may reclaim recently-used resources.

Here's a fairly simple way to reproduce this:

May 3 2022, 5:55 PM · Drydock
epriestley triaged T13676: Drydock may reclaim recently-used resources as Normal priority.
May 3 2022, 5:35 PM · Drydock

Feb 7 2020

epriestley added a comment to T13426: Add a "Saved States" indirection layer on top of "Staging Areas".

A saved state is likely something like this:

Feb 7 2020, 4:54 PM · Harbormaster, Drydock, Arcanist

Sep 27 2019

epriestley added a comment to T13426: Add a "Saved States" indirection layer on top of "Staging Areas".

One broad problem here is "chain of custody" issues in T182. A "Saved State" can easily accommodate multiple representations, and the plan above imagines using Drydock to build tags/branches out of non-repository representations, so we'd have cases where a given "Saved State" has a way to build it with a "patch list" (from the client) or a "ref pointer" (from Drydock).

Sep 27 2019, 6:43 PM · Harbormaster, Drydock, Arcanist
epriestley triaged T13426: Add a "Saved States" indirection layer on top of "Staging Areas" as Normal priority.
Sep 27 2019, 6:28 PM · Harbormaster, Drydock, Arcanist

Aug 20 2019

epriestley closed T13383: Provide a "drydock.resource.search" API method as Resolved by committing rP721a86401ff4: Implement "drydock.resource.search".
Aug 20 2019, 8:07 PM · Conduit, Drydock
epriestley added a revision to T13383: Provide a "drydock.resource.search" API method: D20723: Implement "drydock.resource.search".
Aug 20 2019, 8:02 PM · Conduit, Drydock
epriestley triaged T13383: Provide a "drydock.resource.search" API method as Normal priority.
Aug 20 2019, 3:52 PM · Conduit, Drydock

Dec 12 2018

joshuaspence added a comment to T12145: Resource allocator does not create new host resources when one is already active.

I'm having some trouble getting this new behaviour (which IIUC basically means that multiple hosts in a Drydock pool should be load-balanced across). In "active resources" I see three Drydock hosts, which all belong to the same Almanac service. In "active leases", however, I see only a single host lease and many working copy leases.

Dec 12 2018, 10:15 AM · Bug Report, Drydock

Dec 9 2018

joshuaspence added a comment to T12145: Resource allocator does not create new host resources when one is already active.

I'm having some trouble getting this new behaviour (which IIUC basically means that multiple hosts in a Drydock pool should be load-balanced across). In "active resources" I see three Drydock hosts, which all belong to the same Almanac service. In "active leases", however, I see only a single host lease and many working copy leases.

Dec 9 2018, 11:39 PM · Bug Report, Drydock

Nov 26 2018

epriestley added a project to T13223: "Land Revision" builds a commit message as an omnipotent user, not the revision author or landing user: Drydock.
Nov 26 2018, 5:53 PM · Drydock, Policy, Differential, Security

Nov 10 2018

epriestley updated the task description for T13073: Plans: Drydock for normal software use cases where builds take more than 45 seconds.
Nov 10 2018, 1:31 PM · Plans, Drydock

Nov 1 2018

epriestley closed T12145: Resource allocator does not create new host resources when one is already active as Resolved by committing rPb950f877c50c: Allow Drydock Blueprints to control "supplemental allocation" behavior so all….
Nov 1 2018, 1:06 AM · Bug Report, Drydock

Oct 30 2018

jwarner added a comment to T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call.

Fantastic, thanks very much @epriestley! I had indeed intended to take care of this myself, but was on other work this week and last and planned to come back to this. It also would have taken me much longer to realize that drydock.lease.search wasn't yet upstream and how to proceed from there, so I'm glad to see you were able to handle this so easily!

Oct 30 2018, 9:39 PM · Drydock

Oct 26 2018

epriestley closed T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call as Resolved by committing rP5f3a7cb41b17: Expose Drydock leases via Conduit.
Oct 26 2018, 1:12 PM · Drydock
epriestley closed T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call, a subtask of T11694: Allow clients to generally reason about Drydock leases over the API, as Resolved.
Oct 26 2018, 1:12 PM · Restricted Project, Drydock

Oct 25 2018

epriestley added a comment to T12145: Resource allocator does not create new host resources when one is already active.

D19762 adds a "supplemental allocation" behavior, which basically lets blueprints say "I want to grow the pool instead of allowing this otherwise valid lease acquisition".

Oct 25 2018, 2:03 PM · Bug Report, Drydock
epriestley added a revision to T12145: Resource allocator does not create new host resources when one is already active: D19762: Allow Drydock Blueprints to control "supplemental allocation" behavior so all hosts in an Almanac pool get used.
Oct 25 2018, 1:59 PM · Bug Report, Drydock
epriestley added a comment to T12145: Resource allocator does not create new host resources when one is already active.

After that, both hosts will have resources and jobs will allocate randomly, which should be good enough.

Oct 25 2018, 12:57 PM · Bug Report, Drydock
epriestley added a revision to T12145: Resource allocator does not create new host resources when one is already active: D19761: When a Drydock host based on an Almanac blueprint has its binding disabled, stop handing out leases.
Oct 25 2018, 12:48 PM · Bug Report, Drydock
epriestley added a comment to T12145: Resource allocator does not create new host resources when one is already active.

I believe you can work around this today by disabling the binding to host "A" in Almanac, running one job (which will be forced to allocate on host "B"), then re-enabling the binding. After that, both hosts will have resources and jobs will allocate randomly, which should be good enough. This is exceptionally cumbersome and ridiculous, of course (and it's possible that it doesn't even work).

Oct 25 2018, 12:42 PM · Bug Report, Drydock
epriestley added a comment to T8153: Improve detection and recovery when resources are mangled outside of Drydock's control.

A specific subcase here is when the binding to an Almanac host has been disabled. We should possibly test this during Interface construction, treat it as a failure, then recover from it.

Oct 25 2018, 12:40 PM · Prioritized, Drydock
epriestley moved T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call from Backlog to Now on the Drydock board.
Oct 25 2018, 12:05 PM · Drydock
epriestley moved T12145: Resource allocator does not create new host resources when one is already active from Backlog to Now on the Drydock board.
Oct 25 2018, 12:03 PM · Bug Report, Drydock
epriestley added a comment to T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call.

I believe D16594 should implement this, one way or another, unless I'm misunderstanding the request.

Oct 25 2018, 11:49 AM · Drydock
epriestley added a parent task for T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call: T11694: Allow clients to generally reason about Drydock leases over the API.
Oct 25 2018, 11:48 AM · Drydock
epriestley added a subtask for T11694: Allow clients to generally reason about Drydock leases over the API: T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call.
Oct 25 2018, 11:48 AM · Restricted Project, Drydock
epriestley added a revision to T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call: D16594: Expose Drydock leases via Conduit.
Oct 25 2018, 11:40 AM · Drydock
epriestley added a comment to T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call.

Complicating this: there is no drydock.lease.search call upstream. So you're probably running some variation of D16594? But that already has ownerPHIDs.

Oct 25 2018, 11:38 AM · Drydock

Oct 24 2018

epriestley added a comment to T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call.

I'm happy to make these changes myself, or you mentioned wanting to contribute a patch?

Oct 24 2018, 12:17 AM · Drydock

Oct 23 2018

epriestley moved T13205: Perhaps, provide options for hardening long-lived and relatively stable directories in Drydock? from Backlog to Far Future on the Drydock board.
Oct 23 2018, 8:59 PM · Drydock

Oct 16 2018

epriestley added a project to T13212: Add 'ownerPHIDs' query constraint to 'drydock.lease.search' conduit call: Drydock.
Oct 16 2018, 2:32 PM · Drydock
epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Oct 16 2018, 1:36 PM · Plans, Harbormaster

Oct 12 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Oct 12 2018, 3:19 PM · Plans, Harbormaster

Oct 10 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Oct 10 2018, 11:42 PM · Plans, Harbormaster

Oct 1 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Oct 1 2018, 3:50 PM · Plans, Harbormaster

Sep 21 2018

epriestley triaged T13205: Perhaps, provide options for hardening long-lived and relatively stable directories in Drydock? as Wishlist priority.
Sep 21 2018, 11:27 PM · Drydock

Sep 19 2018

epriestley updated the task description for T13073: Plans: Drydock for normal software use cases where builds take more than 45 seconds.
Sep 19 2018, 12:53 PM · Plans, Drydock
epriestley updated the task description for T13073: Plans: Drydock for normal software use cases where builds take more than 45 seconds.
Sep 19 2018, 12:43 PM · Plans, Drydock
epriestley updated the task description for T13073: Plans: Drydock for normal software use cases where builds take more than 45 seconds.
Sep 19 2018, 12:37 PM · Plans, Drydock

Sep 14 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Sep 14 2018, 4:01 PM · Plans, Harbormaster

Sep 13 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Sep 13 2018, 2:44 PM · Plans, Harbormaster

Sep 7 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Sep 7 2018, 3:05 PM · Plans, Harbormaster

Aug 28 2018

epriestley added a revision to T13088: Plans: Harbormaster UI usability and interconnectedness: D19615: Allow unit test results to specify that their details are formatted with remarkup when reporting to "harbormaster.sendmessage".
Aug 28 2018, 8:24 PM · Plans, Harbormaster
epriestley added a comment to T13088: Plans: Harbormaster UI usability and interconnectedness.

The unit test results also don't currently show on individual builds, which is a little whack?

Aug 28 2018, 8:05 PM · Plans, Harbormaster
epriestley added a comment to T13088: Plans: Harbormaster UI usability and interconnectedness.

See T13189#240682 for some planning on the Unit Test result table.

Aug 28 2018, 7:52 PM · Plans, Harbormaster

Aug 27 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Aug 27 2018, 10:21 PM · Plans, Harbormaster
epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Aug 27 2018, 10:16 PM · Plans, Harbormaster

Aug 3 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Aug 3 2018, 7:23 PM · Plans, Harbormaster
epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Aug 3 2018, 7:21 PM · Plans, Harbormaster

Jun 20 2018

hach-que added a comment to T11195: Drydock's working copy should run "git lfs fetch && git lfs checkout" for repositories known to use Git LFS.

Back when this was originally reported, I'm pretty sure git lfs clone didn't exist (or at least I wasn't aware of its existence). The appropriate fix now is probably different to the fix suggested in the original report.

Jun 20 2018, 6:31 AM · Drydock, Feature Request
aeiser added a comment to T11195: Drydock's working copy should run "git lfs fetch && git lfs checkout" for repositories known to use Git LFS.

We have a similar issue - however I think the "fix" is probably worse than the workaround.

Jun 20 2018, 2:37 AM · Drydock, Feature Request

Jun 5 2018

joshuaspence added a member for Drydock: joshuaspence.
Jun 5 2018, 10:45 PM

Apr 16 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Apr 16 2018, 5:18 PM · Plans, Harbormaster
epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Apr 16 2018, 5:18 PM · Plans, Harbormaster
epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Apr 16 2018, 5:17 PM · Plans, Harbormaster

Apr 13 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Apr 13 2018, 3:46 PM · Plans, Harbormaster
epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Apr 13 2018, 1:58 PM · Plans, Harbormaster

Mar 16 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Mar 16 2018, 10:33 PM · Plans, Harbormaster
epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Mar 16 2018, 8:20 PM · Plans, Harbormaster

Mar 13 2018

epriestley added a revision to T13088: Plans: Harbormaster UI usability and interconnectedness: D19217: Add a UI element for reviewing older generations of Harbormaster builds.
Mar 13 2018, 11:12 PM · Plans, Harbormaster

Mar 12 2018

epriestley updated the task description for T13088: Plans: Harbormaster UI usability and interconnectedness.
Mar 12 2018, 11:43 PM · Plans, Harbormaster

Mar 7 2018

epriestley added a revision to T13088: Plans: Harbormaster UI usability and interconnectedness: D19187: Correct line highlighting behavior in Diffusion.
Mar 7 2018, 3:04 PM · Plans, Harbormaster

Mar 5 2018

epriestley moved T13073: Plans: Drydock for normal software use cases where builds take more than 45 seconds from Backlog to Future on the Plans board.

This is effectively paused until I'm more convinced that the stabilization changes really stabilized things -- I'm hoping to stabilize first, then work on improvements from there.

Mar 5 2018, 3:06 PM · Plans, Drydock
epriestley moved T13088: Plans: Harbormaster UI usability and interconnectedness from Backlog to Soon on the Plans board.
Mar 5 2018, 3:02 PM · Plans, Harbormaster