Paths

Table of Contentst

Differential D16565

Expose Drydock lease management from conduit
AbandonedPublic
Actions

Authored by yelirekim on Sep 17 2016, 1:13 AM.

Details

Reviewers

None

Group Reviewers

Blessed Reviewers

Summary

Blueprints, authorizations and leases have their search engines and dao's made conduit friendly here so that clients can reason about leases which belong to them and which blueprints they are allowed to use.

We also give blueprints more rigid definitions of the attributes they expect when creating leases, and expose documentation for those attributes on the conduit console.

The general expected client flow is:

use drydock.requestauthorization to gain access to blueprints
use drydock.createlease to obtain a lease
use drydock.destroylease when you're done with it

Test Plan

wrapped all of these things together in arcanist workflows, leased and released workspaces

Diff Detail

Repository

rP Phabricator

Branch

lease_management (branched from master)

Lint

Lint Passed

Unit

Tests Passed

Build Status

Buildable 13831
Build 17879: Run Core Tests
Build 17878: arc lint + arc unit

Event Timeline

yelirekim updated this revision to Diff 39874.Sep 17 2016, 1:13 AM

yelirekim retitled this revision from to Expose Drydock lease management from conduit.

yelirekim updated this object.

yelirekim edited the test plan for this revision. (Show Details)

Herald added a reviewer: Blessed Reviewers. · View Herald TranscriptSep 17 2016, 1:13 AM

Herald added a subscriber: epriestley. · View Herald Transcript

yelirekim added a child revision: D16566: Manage Drydock leases from arcanist.Sep 17 2016, 1:57 AM

@epriestley couple thoughts:

Does it actually make more sense for us to define "resource types" which specify lease attributes, so that blueprint implementations providing identical resource types would always require identical lease attributes? I can see how that is more coherent from an API standpoint, but less flexible from the standpoint of how blueprints allow you to obtain resources.

If we don't go down that path, should we define unique strings for the blueprint implementations that users can reference instead of the blueprint class names?

Are you requesting authorisation against user PHIDs?

Would it be better to authorise users using policies rather than authorisations (which are as I understand it primarily for object <-> object authorisations)?

Other than that, this is amazing.

hach-que awarded a token.Sep 18 2016, 3:32 AM

From my perspective, it makes sense for it to be the user requesting authorization, for our use cases a user (or a bot user in the case of other services consuming this) should be the "thing" that is requesting authorization.

A simple use case here is that we have an external service which runs all of our unit tests and integration tests constantly to see if any them suddenly start failing (through no fault of any change made to the codebase, usually some kind of external data has gone missing or changed or something, or the test was relying on being run in a particular order) so that we can react to that.

The system service has a bot user assigned to it, it requests authorization to a blueprint, then it can use the same machines phabricator uses in order to run the tests.

I have a futher diff coming which allows you to proxy ssh commands to the machines, but I'm waiting to make sure evan doesn't have some kind of fundamental change he wants made to this before I get that upstreamed.

I may have misunderstood you, were you advocating for application capabilities instead of authorizations?

@epriestley / @chad do you have pending comments on this? I can split it up into "obvious improvements", like bundling together search engine / dao / API methods per-dao. And then leave drydock.createlease and drydock.destroylease in their own diffs since I imagine they are the most contentious.

As is, this diff represents a cohesive whole that represents the improvements needed from the API in order to coherently manage leases from a client, and I figured that changes to any single concept here could easily propagate to related ideas, so I kept them as one revision.

Yeah, splitting this up feels like the best short-term way forward to me:

I think the *.search methods can all come upstream, these Engine-based API endpoints are feeling pretty stable to me and I think none of these are perilous are controversial.

drydock.destroylease might be better as drydock.lease.sendmessage or something in that vein (*.sendcommand?), to mirror harbormaster.sendmessage. This might be YAGNI because there are no other commands you can send today. (I think the name destroy is also slightly misleading, but that's a quibble.) I think this is probably premature to upstream, but should come upstream in some very similar form sooner or later. The behavior is basically what I expect, I'm just not sure about the particulars of the API yet.

drydock.requestauthorization should probably be drydock.authorization.edit, but that's a much larger change since the object isn't transactional yet. It seems like this object probably should become transactional if we're going to expand the role of authorizations. This workflow should probably also have UI associated with it (today, I guess you just go get someone to click the button manually? I haven't reviewed the other diff in detail yet). I could probably queue this up for @jcox if you want to get this upstream.

drydock.createlease should maybe be drydock.lease.create or similar, maybe using a different action word like .requisition to distinguish it from normal create/edit endpoints. This is probably furthest away from being upstreamable because I don't really have a concrete plan for leaseAttributes. This dovetails into the earlier question about attributes being particular to blueprints or resource types.

I don't yet have a good answer on where attributes come from. Currently, both resource types and lease attributes are very weak: they're fairly freeform and unstructured. My plan today is basically:

limp along with some stuff that sort of works until we have a pretty clear picture of what the 95% use case looks like;
retrofit a model on top of that.

For example, in the case of requesting a working copy, you might want to make sure you get one on a Unix machine. There are two extreme versions of this request:

"Most Magic": drydock.lease.requisition( type=workingcopy, os=unix, repository=Y, commit=Z )
"Least Magic": drydock.lease.requisition( blueprint=X, repository=Y, commit=Z )

In the most-magic version, you don't even specify a blueprint, and Drydock just considers all of your authorized blueprints to find any kind of constructible resource which matches the request. This is sort of a "Computer! Give me a working copy!" workflow.

In the least-magic version, you have separate Unix and Windows build pools and just request a working copy by specifying a blueprint that you know only comes from the Unix build pool. The only way Drydock can fulfill this request is by giving you a Unix machine, so it does.

I'm not sure how much magic we actually want Drydock to have, or how much magic the average use case expects/desires. I think more magic will generally create a number of difficult problems (for example, resources with seemingly identical attributes often have different security or provisioning or quota or network availability attributes -- a human "knows" a lot about hosts which Drydock currently doesn't, and letting it magic its way through these situations will probably means it needs to learn about these things). On the other hand, more magic makes arc do-drydock-things easier to use, provided the magic works.

I wasn't sure how useful arc do-drydock-things would actually be -- it seems cool, but also like the kind of thing that could just be conceptually cool and not terribly useful. However, it sounds like you're finding it pretty useful in practice. If we push down the path here, we may end up wanting a more-magical Drydock.

However, if this has a pretty narrow set of use cases behind it and most queries realistically have most of their parameters expressed in terms of an explicit blueprint, we don't need as much magic, and can keep resource types and attributes simpler and easier to administrate.

I think a more-magical Drydock probably sees lease attributes as sort of "interfaces" on resource types and maybe blueprints, and makes them heritable: so a lease requisition might be specifying attributes for the resource type or for child resource types, and possibly for particular blueprints. A less-magical Drydock might not need inheritance or multiple attribute interfaces, and could accommodate the cases where they're important by having the child blueprint repackage attributes on the parent blueprint explicitly.

There are also some types of more-powerful queries which we may want to support. Some examples are expressing a preference for a resource with a large amount of available CPU or memory, or saying "try to get me something with these specs for 30 minutes, and settle for these less-good specs if that doesn't work". I'm not sure if these queries are interesting. If they are, I'm not sure if they're interesting to have Drydock handle fully vs having a simpler mechanism like just putting a TTL on the request and having the client do a "must have good specs, ttl = 30 minutes" request first, then cancel the request, then submit a second "don't care about specs, ttl = infinity" request if it doesn't get a lease.

Upshot:

Split out the *.search stuff and I'll review that, that can come upstream more or less as-is although I may catch some, e.g., convention/consistency stuff when I look it over in detail.
I can queue up "put transactions on DrydockAuthorization and implement drydock.authorization.edit" for @jcox whenever you want to upstream that.
Other stuff is probably premature. I suspect we'll have very similar endpoints upstream eventually, but the actual APIs may be significantly different so I don't want to bring them upstream yet.

On the actual authorization thing, I think this approach is reasonable. I think @hach-que's suggestion to add a policy like this to Blueprints:

Can Use This Blueprint Without Authorization: ( Members of Engineering ▼ )

That is, add a new canView-style policy property to blueprints that allows some set of users to use the blueprint without needing an authorization.

This would make administration/usage a little easier, since users wouldn't need to muck around with authorization requests as much.

I dislike this slightly since, among other things, it means we only sometimes have an authorization object, which will probably lead to at least a few weird bugs. I think we could get the best of both worlds down the line like this:

Automatically Approve Authorization Requests: ( Members of Engineering ▼)

This is similar, but then we always have an authorization (and you can disable a particular user's authorization to easily de-authorize them) but users wouldn't have to go through the whole flow every time for blueprints which were widely usable. This generally seems a little better for administration and gives users the full benefit of less paperwork, at a moderate cost of workflows needing to check if they're OK to self-authorize before they ask the user to sign any forms.

epriestley mentioned this in D16566: Manage Drydock leases from arcanist.Sep 21 2016, 4:50 PM

The only reason I can come up with not to authorize User PHIDs is that we might want to authorize particular credentials instead, but that would probably be sort of weird/inconsistent with other per-credential policies in the future.

I vaguely imagine SSH keys and other access tokens getting a fairly involved "what can this credential do?" interface to let you set remote address options, Conduit access options, repository access options, etc. So this is already going to be pretty heavy, and adding a Drydock panel when all of this eventually gets out is probably reasonable. Then that panel could have options like:

(•) This credential can lease.
( ) No leasing ever.
( ) Leasing, but only these blueprints: { X, Y, Z ]
( ) No leasing these blueprints: [ X, Y, Z ]
( ) Some leasing, but only sometimes.

yelirekim mentioned this in T11693: Make drydock command interfaces accessible via SSH workflows.Sep 24 2016, 4:12 AM

yelirekim mentioned this in T11694: Allow clients to generally reason about Drydock leases over the API.Sep 24 2016, 4:16 AM

yelirekim abandoned this revision.Sep 24 2016, 5:16 AM