
Refactor Conduit auth to be stateless, token-based, and support wire encodings
Closed, Resolved (Public)

Description

Conduit authentication currently supports two mechanisms:

  • A session-based mechanism, where you use conduit.connect to establish a session.
  • An undocumented stateless mechanism likely used only by Facebook.

The first mechanism is unnecessarily complex and makes Conduit slow (extra round trips) and hard to use (can't use CURL, pain to write clients). We should move away from it and deprecate it.

The actual wire token is needlessly complicated. We transmit a proof-of-token, not the token itself. The proof we transmit is not a request signature, so all this does is make replay attacks moderately more difficult. In practice, it just causes a bunch of issues for users with bad client or server timestamps or goofy environmental problems. No one has ever expressed interest in upgrading to a request signature scheme.

Because the wire token is needlessly complicated, the token/handshake/certificate UIs are also needlessly complicated and users hit a bunch of issues using them. These UIs should just be "generate session", which gives you a durable token, with attendant session management/review capabilities.

Upshot:

  • Deprecate conduit sessions and conduit.connect.
  • Support direct token-based auth (?token=abdef123) and make this the standard (see the sketch after this list).
  • Leave room for a proof-of-token + request-signature flavor of this eventually.
  • Support SSH auth.
  • Support multiple request encodings (likely BSON, protobuf, or messagepack). Leave JSON as the default, but in cases where messages cannot be represented in JSON this gives us a plausible way forward.
  • Fix the UIs to make handshakes and session management straightforward.
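
For illustration, here is a minimal sketch of what a direct token-authenticated call might look like over plain HTTP. This is an assumption-laden example: the host and token are placeholders, user.whoami is just a convenient parameter-less method, and the "token" parameter name follows the proposal above rather than any final protocol.

import json
import urllib.parse
import urllib.request

# Sketch only: the host and token are placeholders, and the "token"
# parameter name mirrors the proposal above, not a finalized protocol.
data = urllib.parse.urlencode({"token": "abdef123"}).encode("utf-8")
request = urllib.request.Request(
    "https://phabricator.example.com/api/user.whoami",
    data=data,
)
with urllib.request.urlopen(request) as response:
    print(json.load(response))  # Conduit responds with a JSON envelope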

Related Objects

Status       Assigned      Task
Resolved     epriestley
Resolved     epriestley
Resolved     None
Wontfix      epriestley
Resolved     epriestley
Duplicate    None
Resolved     epriestley
Open         None
Resolved     epriestley
Resolved     epriestley
Open         None
Open         None
Open         None
Open         epriestley
Resolved     epriestley
Open         None
Duplicate    None
Duplicate    None
Resolved     epriestley
Resolved     btrahan

Event Timeline

epriestley removed a project: Phacility.

At least for now, this no longer blocks Phacility or T2783.

The utf8 stuff hasn't been coming up much; the token-based stuff is just blocked on docs and also hasn't come up much; I touched the protocol enough that I'm pretty confident we can move forward without backward compatibility breaks.

Is the above commit ready to use, or are there still a few steps to go? I gave it a quick try and it still wanted my bot user to install a certificate, despite providing an API token with the above parameter.

I'd expect it to work on a relatively up-to-date Phabricator. You may still need to have some valid-ish ~/.arcrc file, which we might be reading and then ignoring. I'll check that and fix it if that's the case -- we shouldn't require a ~/.arcrc if you specify --conduit-token.

Great, thanks. It appears that it does still need the certificate in a .arcrc file somewhere, as we removed that file as part of the above test and received the certificate prompt.

As a goofy workaround, you can probably write some dummy file:

dummy.arcrc
{
  "hosts": {
    "http://blah.yourcompany.com": {}
  }
}

...and specify that with --arcrc-file, but this is stupid; I'll just fix the actual problem.

Yeah, that's what we were doing, actually, but given our environment and Arcanist's checks on file permissions (that it's 600, etc.), it was messy due to shell restrictions and other constraints where our scripts run. This will be great for build agents posting back Harbormaster statuses.

@epriestley Is there an ETA on this? It's been 7 days since I cleared out failing Git parser tasks and there are another 20 I need to clear out now. This issue has a significant impact on our background processing because commits with non-UTF8 data in them effectively block the queue.

commits with non-UTF8 data in them effectively block the queue.

This is unexpected. What are you seeing that makes you think they're blocking the queue?

Tasks with at least one failure are automatically de-prioritized and executed after tasks with no failures. The expectation is that the screenshot in T8857 does not show a blocked queue, just a queue with some cruft at the end of it.

Generally we find that the commit / message parser tasks tend to get done first and take up taskmasters that could be processing Harbormaster builds. This delays builds because we have fewer taskmasters available.

If the commit / message parser tasks were being deprioritized, I would expect to see the currently leased tasks quickly replaced entirely with Harbormaster tasks when a build with 50+ targets starts. But we don't see that; instead we see the same commit / message parser tasks (with failure counts >2000) being leased, with the remaining slots used for builds (and search and other tasks).

Okay, can you file a separate issue for that and I'll try to reproduce it? That's bad and not the expected queue behavior.

eadler added a project: Restricted Project. (Jan 8 2016, 10:27 PM)
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board. (Jan 8 2016, 10:42 PM)

How can I use DiffusionBrowseController.php to browse my files with non-Latin character names correctly?

Support multiple request encodings (likely BSON, protobuf, or messagepack). Leave JSON as the default, but in cases where messages cannot be represented in JSON this gives us a plausible way forward.

I am increasingly coming to the viewpoint that we shouldn't do this.

Much of the binary data we potentially transmit over the API is also arbitrarily gigantic. Examples include diffs (like the output of git diff) and file content. JSON is clearly not a great format for uploading or downloading gigantic binary blobs, and the API has generally moved away from this anyway: methods like diffusion.filecontentquery return a pointer to a File object and support timeouts and size limits.

In most cases, I think this model is probably the right one for downloading binary blobs: we stream the data into Files, then return a pointer to a File object. Recent APIs which need to do this generally work this way anyway.
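
As a rough sketch of that download model, assuming an existing Conduit client callable and using illustrative parameter and field names (check the live method documentation before relying on them):

def download_file_content(conduit_call, repository, path, commit):
    # Ask the repository API to stage the blob into Files and return a
    # pointer, rather than inlining the raw bytes in the JSON response.
    # conduit_call(method, params) stands in for whatever Conduit client
    # you already have; parameter and field names are illustrative.
    pointer = conduit_call("diffusion.filecontentquery", {
        "repository": repository,
        "path": path,
        "commit": commit,
    })

    # Fetch the actual bytes through the Files API using that pointer.
    content = conduit_call("file.download", {
        "phid": pointer["filePHID"],
    })
    return content  # base64-encoded file data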

For uploading binary blobs, we'd do the reverse: have the client stream the blob into Files, then call an API method with a reference to the object. For convenience, we could continue to support providing the data inline in cases where it is small and can be represented in JSON. There is generally not much need for this today; it has been discussed elsewhere.

That leaves us with a small number of cases where data cannot be represented in JSON and is also reasonable to want to transmit over the API as a field in some sort of data structure. An example of this kind of data is repository path names: a repository may have binary path names that cannot be represented in JSON, but it's obviously silly to have an API call like diffusion.browse return a list of file pointers with each path name stored in an individual file.

In these cases, I think it's probably reasonable to just encode the data inline:

[
  {
    "path": {
      "readable": "path/to/fil<?>e.txt",
      "raw.base64": "ba!nfl1namef!NLF1f13"
    }
  }
]

The handful of clients that care about correctness in the face of binary paths can then implement handling correctly, and the majority of clients that don't can use the transcribed name and get the right behavior almost always.
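
As a sketch of that client-side handling, assuming a response shaped like the example above ("readable" and "raw.base64" are the field names proposed here, not an existing API):

import base64

def decode_path(path_field):
    # Prefer the exact bytes when the server provides them; otherwise fall
    # back to the human-readable transcription. "raw.base64" and "readable"
    # are the field names proposed above, not an existing API.
    raw = path_field.get("raw.base64")
    if raw is not None:
        return base64.b64decode(raw)  # exact, possibly non-UTF8, bytes
    return path_field["readable"].encode("utf-8")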

I think the only real trick we're left with is that we need to be mindful of this when exposing possibly-binary data to the API, but there are fairly few sources which really matter very much: basically path names, branch names, and committer/author names. And then maybe someday stuff like "imported JIRA task titles", but I think that kind of case is unlikely to be important.

epriestley claimed this task.

Deprecate conduit sessions and conduit.connect.
Support direct token-based auth (?token=abdef123) and make this the standard.

Although not all of this is dead, we've effectively moved to token-based authentication.

Leave room for a proof-of-token + request-signature flavor of this eventually.

This has been supported internally for some time. In particular, JSON/HTTP has supported authentication by signing requests with SSH keys for a while, and this is how we authenticate in the Phacility cluster (ConduitClient::AUTH_ASYMMETRIC). There's little external interest in this today so it's not well documented, but it works fine.

Support SSH auth.

This has been supported for a while, although T550 is a more focused followup. The major limitation today is that calls are one-shot so you pay connection overhead for each call.

Support multiple request encodings (likely BSON, protobuf, or messagepack). Leave JSON as the default, but in cases where messages cannot be represented in JSON this gives us a plausible way forward.

I no longer plan to support request or response encodings. See T13337 for the anticipated pathway forward.

Fix the UIs to make handshakes and session management straightforward.

This got cleaned up a while ago and no longer seems to be much of a pain point.

For uploading binary blobs, we'd do the reverse: have the client stream the blob into Files, then call an API method with a reference to the object.

Does anything actually use a flow like this today? Just looking for an example to see how hard it would be to create similar methods for Harbormaster blob content.

Yeah, diffusion.rawdiffquery and diffusion.filecontentquery both do this.

They have some mild weirdness (timeout/limit handling feels kind of ad-hoc) but the actual blob part is in pretty good shape, I think.

Am I going crazy, or do those methods only handle the "blob fetching" part of this equation? I was looking for examples of the "create a File, do a chunked upload, attach said File to an existing object" flow.

Oh, sorry, I misread which half you were asking about.

arc upload does the upload part via ArcanistFileUploader, which is basically file.allocate to create the file, then file.uploadchunk calls in a loop to stream it up. There's some additional code that uses file.querychunks to resume uploads, and falls back to file.upload for old server versions.

It doesn't attach files, but they don't need to be attached by the client if the same user is making a subsequent call to some other api.do-something-with-a-file method, since they'll be the author of the file and always be able to read it.
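
Roughly, the allocate-and-chunk upload loop described above might look like the following sketch, assuming an existing Conduit client callable; the parameter names are from memory and worth double-checking against the live method documentation:

import base64

CHUNK_SIZE = 4 * 1024 * 1024  # arbitrary chunk size for this sketch

def upload_blob(conduit_call, name, blob):
    # Allocate a File of the right size, getting back a PHID to write into.
    # conduit_call(method, params) stands in for an existing Conduit client;
    # parameter and field names here are illustrative.
    allocation = conduit_call("file.allocate", {
        "name": name,
        "contentLength": len(blob),
    })
    file_phid = allocation["filePHID"]

    # Stream the blob up one chunk at a time with file.uploadchunk.
    for offset in range(0, len(blob), CHUNK_SIZE):
        chunk = blob[offset:offset + CHUNK_SIZE]
        conduit_call("file.uploadchunk", {
            "filePHID": file_phid,
            "byteStart": offset,
            "dataEncoding": "base64",
            "data": base64.b64encode(chunk).decode("ascii"),
        })

    # Later API calls can reference the uploaded blob by this PHID.
    return file_phid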