Fix an issue where newly created Drydock resources could be improperly acquired
Closed, Public

Authored by epriestley on Oct 14 2015, 12:02 PM.

Details

Reviewers

hach-que
chad

Maniphest Tasks

T9252: Unprototype Drydock (v1)

Commits

Restricted Diffusion Commit
rP083a321dad1b: Fix an issue where newly created Drydock resources could be improperly acquired

Summary

Ref T9252. This is mostly a fix for an edge case from D14236. Here's the setup:

There are no resources.
A request for a new resource arrives.
We build a new resource.

Now, if we were leasing an existing resource, we'd call canAcquireLeaseOnResource() before acquiring a lease on the new resource.

However, for new resources we don't do that: we just acquire a lease immediately. This is wrong, because we now allow and expect some resources to be unleasable when created.

In a more complex workflow, this can also produce the wrong result and leave the lease acquired sub-optimally (and, today, deadlocked).

Make the "can we acquire?" pathway consistent for new and existing resources, so we always do the same set of checks.
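As a rough illustration of the intent (a toy model in Python, not the actual Drydock code), the sketch below routes both existing and newly built resources through the same check. Everything here except the idea behind canAcquireLeaseOnResource() is an assumption made for the example: the class names, the `allocate` helper, and the rule that only active resources can be leased are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional


class ResourceStatus(Enum):
    PENDING = "pending"  # created, but not yet usable
    ACTIVE = "active"    # ready to be leased


@dataclass
class Resource:
    status: ResourceStatus


@dataclass
class Lease:
    # None means the lease is still pending (hypothetical representation).
    resource: Optional[Resource] = None


def can_acquire_lease_on_resource(resource: Resource, lease: Lease) -> bool:
    # Assumed rule for this sketch: only active resources are leasable.
    return resource.status is ResourceStatus.ACTIVE


def allocate(
    resources: List[Resource],
    lease: Lease,
    build_resource: Callable[[], Resource],
) -> bool:
    """Try to satisfy a lease, building a resource if none exist."""
    candidates = list(resources)
    if not candidates:
        # No resources yet: build one, but do NOT acquire it blindly.
        new_resource = build_resource()
        resources.append(new_resource)
        candidates = [new_resource]

    # New and existing resources go through the same check.
    for resource in candidates:
        if can_acquire_lease_on_resource(resource, lease):
            lease.resource = resource
            return True

    # Nothing leasable right now; the lease stays pending for a later retry.
    return False
```

In this model, a freshly built pending resource leaves the lease pending instead of attaching it to something that may never become usable. A later call to `allocate` can pick up the resource once it is active.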

Test Plan

Started daemons.
Deleted all working copy resources.
Ran two working-copy-using build plans at the same time.
Before this change, one would often [1] acquire a lease on a pending resource which never allocated, then deadlock.
After this change, the same thing happens except that the lease remains pending and the work completes.

[1] Although the race this implies is allowed (resource pool limits are soft/advisory, and it is expected that we may occasionally run over them), it's MUCH easier to hit right now than I would expect it to be, so I think there's probably at least one more small bug here somewhere. I'll see if I can root it out after this change.
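To make the "soft/advisory limit" point concrete, here is a deliberately simplified, deterministic model of the race (Python, with hypothetical names; it does not reflect how Drydock actually counts resources). Two workers each read the pool size before either one creates a resource, so both pass the check and the pool ends up over the limit.

```python
from typing import List


def allocate_with_soft_limit(pool: List[str], limit: int, observed_count: int) -> bool:
    """Create a resource if the count the worker *observed* is under the limit."""
    if observed_count < limit:
        pool.append("resource")
        return True
    return False


pool: List[str] = []
limit = 1

# Both workers read the pool size before either has created anything.
seen_by_worker_a = len(pool)
seen_by_worker_b = len(pool)

# Each worker then acts on its own (now stale) observation.
allocate_with_soft_limit(pool, limit, seen_by_worker_a)
allocate_with_soft_limit(pool, limit, seen_by_worker_b)

# The pool now holds 2 resources even though limit is 1.
```

Because nothing serializes the check and the create, occasional overruns like this are expected with an advisory limit. The surprising part in the test plan was only how often it happened.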

Diff Detail

Repository

rP Phabricator

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

epriestley updated this revision to Diff 34454. Oct 14 2015, 12:02 PM

epriestley retitled this revision to Fix an issue where newly created Drydock resources could be improperly acquired.

epriestley updated this object.

epriestley edited the test plan for this revision.

epriestley added a reviewer: chad.

epriestley added a task: T9252: Unprototype Drydock (v1).

epriestley mentioned this in D14274: Fix bad counting in SQL when enforcing Drydock allocator soft limits. Oct 14 2015, 12:26 PM

Although the race this implies is allowed (resource pool limits are soft/advisory, and it is expected that we may occasionally run over them), it's MUCH easier to hit right now than I would expect it to be, so I think there's probably at least one more small bug here somewhere. I'll see if I can root it out after this change.

D14274 has the fix for this; it turned out to be a bad query.

epriestley mentioned this in D14236: Fix unbounded expansion of allocating resource pool. Oct 14 2015, 12:28 PM

hach-que accepted this revision. Oct 14 2015, 1:12 PM

hach-que added a reviewer: hach-que.

This revision is now accepted and ready to land. Oct 14 2015, 1:12 PM

Closed by commit rP083a321dad1b: Fix an issue where newly created Drydock resources could be improperly acquired (authored by epriestley, committed by epriestley). Oct 14 2015, 1:16 PM

This revision was automatically updated to reflect the committed changes.

epriestley mentioned this in rPac7edf54afe4: Fix bad counting in SQL when enforcing Drydock allocator soft limits.

epriestley mentioned this in T9252: Unprototype Drydock (v1). Oct 14 2015, 8:04 PM