Implement optimistic "slot locks" in Drydock

Description

Summary:
See discussion in D10304. There's a lot of context there, but the general idea is:

  • Blueprints should manage locks in a granular way during the actual allocation/acquisition phase.
  • Optimistic "slot locks" might be a pretty good primitive to make that easy to implement and reason about in most cases.

The way these locks work is that you just pick some name for the lock (like the PHID of a resource) and say that it needs to be acquired for the allocation/acquisition to work:

...
->needSlotLock("mylock(PHID-XYZQ-...)")
...

When you fire off the acquisition or allocation, it fails unless it can acquire the slot with that name. This is really simple (no explicit lock management) and a pretty good fit for most of the locking that blueprints and leases need to do.
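As a rough illustration of the all-or-nothing behavior described above, here is a minimal Python sketch. It is not Drydock's code: the table name `slot_lock`, the helper names, and the use of SQLite with a unique key are all assumptions made for the example. The idea it models is that each named slot is a unique row, so a second attempt to take the same name simply fails.

```python
import sqlite3


def open_lock_table():
    # Hypothetical schema: one row per held slot. The primary key makes
    # a second insert of the same lock name fail.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE slot_lock (lock_key TEXT PRIMARY KEY, owner TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def acquire_slots(conn, owner, lock_keys):
    """Try to take every named slot at once.

    Returns True if all slots were taken, False if any was already held.
    On failure nothing is kept: the transaction is rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO slot_lock (lock_key, owner) VALUES (?, ?)",
                [(key, owner) for key in lock_keys],
            )
        return True
    except sqlite3.IntegrityError:
        return False


def held_slots(conn):
    # List the names of all slots currently held, in sorted order.
    rows = conn.execute("SELECT lock_key FROM slot_lock ORDER BY lock_key")
    return [row[0] for row in rows]
```

With this sketch, a first caller that asks for `["binding(1)"]` succeeds, and a second caller that asks for `["binding(2)", "binding(1)"]` fails without leaving `binding(2)` held. There is no explicit unlock or wait step in the acquisition path, which is what makes the approach optimistic.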

If you need limit-based locks (e.g., a maximum of 3 locks), you could number the slots and acquire one of them, like this:

mylock(whatever).slot(2)
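One way to picture the limit-based naming is the Python sketch below. It is only an illustration under assumptions: the in-memory dictionary stands in for whatever store holds the locks, the function names are hypothetical, and numbering the slots from 0 and trying them in order is one possible strategy, not necessarily the one a real blueprint would use.

```python
def try_slot(held, slot_name, owner):
    # Stand-in for an atomic "insert if absent" against a real lock store.
    if slot_name in held:
        return False
    held[slot_name] = owner
    return True


def acquire_limited(held, base_name, limit, owner):
    """Claim one of `limit` numbered slots under base_name.

    Returns the slot name that was claimed, or None if every slot is held.
    """
    for index in range(limit):
        slot_name = f"{base_name}.slot({index})"
        if try_slot(held, slot_name, owner):
            return slot_name
    return None
```

With a limit of 3, the first three callers get `slot(0)`, `slot(1)`, and `slot(2)`, and a fourth caller gets `None` until one of the slots is released.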

Blueprints generally only contend with themselves, so it's normally OK for them to pick whatever strategy works best for them in naming locks.

This may not work as well if you have a huge number of slots (e.g., 100TB you want to give out in 1MB chunks) or other complex locking needs (e.g., you have to synchronize access to some external resource). But slot locks don't need to be the only mechanism that blueprints use: if they run into a problem that slot locks aren't a good fit for, they can use something else instead. For now, slot locks seem like a good fit for the problems we currently face and most of the problems I anticipate facing.

(The release workflows have other race issues which I'm not addressing here. They work fine if nothing races, but aren't race-safe.)

Test Plan:
To create a race where the same binding is allocated as a resource twice:

  • Add sleep(10) near the beginning of allocateResource(), after the free bindings are loaded but before resources are allocated.
  • (Comment out slot lock acquisition if you have this patch.)
  • Run bin/drydock lease ... in two windows, within 10 seconds of one another.

This will reliably double-allocate the binding because both blueprints see a view of the world where the binding is free.

To verify the lock works, un-comment it (or apply this patch) and run the same test again. Now, the lock fails in one process and only one resource is allocated.
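The race in this test plan can also be modeled in a few lines of Python. This is a toy simulation, not the Drydock workflow: the two threads stand in for the two `bin/drydock lease` processes, a barrier plays the role of the `sleep(10)` by forcing both allocators to read the free bindings before either allocates, and all names are hypothetical.

```python
import threading


def run_race(use_slot_lock):
    """Run two allocators against one free binding; return the resources created."""
    free_bindings = ["binding-1"]        # hypothetical shared state
    resources = []                       # (allocator, binding) pairs
    slot_locks = set()                   # names of held slot locks
    guard = threading.Lock()             # makes each write atomic, like a row insert
    both_loaded = threading.Barrier(2)   # stands in for sleep(10)

    def allocate(name):
        candidates = list(free_bindings)  # load the free bindings
        both_loaded.wait()                # both allocators now share the same view
        binding = candidates[0]
        with guard:
            if use_slot_lock:
                key = f"binding({binding})"
                if key in slot_locks:
                    return                # slot already held: this allocation fails
                slot_locks.add(key)
            resources.append((name, binding))

    threads = [threading.Thread(target=allocate, args=(n,)) for n in ("A", "B")]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return resources
```

Calling `run_race(False)` yields two resources for the same binding, which mirrors the double allocation, while `run_race(True)` yields exactly one because the second allocator cannot take the slot.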

Reviewers: hach-que, chad

Reviewed By: hach-que, chad

Differential Revision: https://secure.phabricator.com/D14118

Details

Provenance
epriestley Authored on
epriestley Pushed on Sep 21 2015, 11:45 AM
Reviewer
hach-que
Differential Revision
D14118: Implement optimistic "slot locks" in Drydock
Parents
rP6e03419593a6: Implement a rough AlmanacService blueprint in Drydock