Page MenuHomePhabricator

Track MFA "challenges" so we can bind challenges to sessions and support SMS and other push MFA

Authored by epriestley on Dec 13 2018, 11:44 PM.



Ref T13222. See PHI873. Ref T9770.

Currently, we support only TOTP MFA. For some MFA (SMS and "push-to-app"-style MFA) we may need to keep track of MFA details (e.g., the code we SMS'd you). There isn't much support for that yet.

We also currently allow free reuse of TOTP responses across sessions and workflows. This hypothetically enables some "spyglass" attacks where you look at someone's phone and type the code in before they do. T9770 discusses this in more detail, but is focused on an attack window starting when the user submits the form. I claim the attack window opens when the TOTP code is shown on their phone, and the window between the code being shown and being submitted is much more interesting than the window after it is submitted.

To address both of these cases, start tracking MFA "Challenges". These are basically a record that we asked you to give us MFA credentials.

For TOTP, the challenge binds a particular timestep to a given session, so an attacker can't look at your phone and type the code into their browser before (or after) you do -- they have a different session. For now, this means that codes are reusable in the same session, but that will be refined in the future.

For SMS / push, the "Challenge" would store the code we sent you so we could validate it.

This is mostly a step on the way toward one-shot MFA, ad-hoc MFA in comment action stacks, and figuring out what's going on with Duo.

Test Plan
  • Passed MFA normally.
  • Passed MFA normally, simultaneously, as two different users.
  • With two different sessions for the same user:
    • Opened MFA in A, opened MFA in B. B got a "wait".
    • Submitted MFA in A.
    • Clicked "Wait" a bunch in B.
    • Submitted MFA in B when prompted.
  • Passed MFA normally, then passed MFA normally again with the same code in the same session. (This change does not prevent code reuse.)

Diff Detail

rP Phabricator
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

epriestley created this revision.Dec 13 2018, 11:44 PM
Owners added a subscriber: Restricted Owners Package.Dec 13 2018, 11:44 PM
epriestley requested review of this revision.Dec 13 2018, 11:46 PM

Some of the TTL/window stuff is a little funky here, my expectation is that this change more of a "shaped roughly correctly/moving us in the right direction" kind of change than a polished product. D19890 improves things a bit. Changes in this sequence all make life harder for attackers, but until everything is in the actual security model the changes implement may have some weird holes in it.

epriestley added inline comments.Dec 14 2018, 4:01 PM

Particularly, this is eventually going to change to require you to submit a response to a specific challenge we issued earlier, not just a valid response for the current timestep. Until it does, a lot of the other checks can't be completely effective.

amckinley accepted this revision.Dec 14 2018, 7:52 PM
amckinley added inline comments.

Should this dialogue be scarier? "Another Phabricator session of yours is currently proceeding through an MFA challenge; if this is surprising to you, you should probably call your CISO and/or move to Belize," or similar.


I'm trying to think of a clever way of defeating this by changing the view permissions on an existing challenge to NO_ONE, but that shouldn't really make a difference. In any event, is there a reason to not just use the omnipotent viewer here?


This is probably a good time to talk about how AphrontWriteGuard works, because I understand it mostly as "magic herbs that ward off evil CSRF spirits". Why wouldn't we have access to CSRF tokens at this point? Is it because this request to generate a challenge is coming in via a GET request? Why are we calling save() instead of letting newResultFromIssuedChallenges implementations handle that (especially if those implementations might have their own special form of persistence)?


Do we have any kind of existing animated progress bar UI elements with a countdown? I looked around in /uiexample/ but didn't see anything obvious.


It's not worth changing if this code is about to go away, but I'm surprised it doesn't use getAllowedTimesteps.



This revision is now accepted and ready to land.Dec 14 2018, 7:52 PM
epriestley added inline comments.Dec 14 2018, 10:45 PM

I think this will probably happen reasonably often (log out + log in within a couple minutes, two browsers, or desktop + mobile) in the wild and probably doesn't mean anything ~100% of the time. It could mean something bad but I'm wary about crying wolf and we can't really give the user very good next steps other than "tell someone", and that someone is just going to ignore them after the first three times it happens I think.

(It's maybe worth looking at more awareness around sessions in general, e.g. login alerts, at some point -- those are actionable, and could do a better job of covering this. "Bind session to IP" also has a task somewhere and would help.)


Just that using the omnipotent viewer is inherently somewhat risky/dangerous since some future change could accidentally use it somewhere inappropriate. This has caused almost no actual damage historically, but T13223 is a rare example of a mild policy issue arising from inappropriate use of the omnipotent viewer.

I think "the risk that the omnipotent viewer goes somewhere it shouldn't" and "the risk that the policy system hides a challenge" are both virtually zero, but the former is maybe slightly riskier?

There's some reasonable argument that challenges shouldn't be a policy object, but, today, you can't get normal query behavior unless you extend PolicyAwareQuery and implement the policy interface.

Making them a policy object is also somewhat good if, say, a third party writes a "Any Object Debugger/Inspector" app without thinking about it too carefully, where you put in a PHID and get out object properties (very old versions of Phabricator had this, but it proved difficult to make both secure and useful once policies got more sophisticated), since you can get them indirectly with ObjectQuery.


Is it because this request to generate a challenge is coming in via a GET request?

Yeah -- the write guard is for the case where you load the form for the first time and it says "hey, answer this MFA challenge".

Sometimes that's in response to an actual form POST, but sometimes it's a GET or not a CSRF'd POST. One example is that when you GET into the "VCS Password" settings page, we immediately prompt you for MFA (admittedly, this is a little weird).

magic herbs that ward off evil CSRF spirits

A lot of the way it's hooked up is pretty magic, but the actual implementation is as dumb as rocks.

We tell the query layer "before you do any write, call this method" (in AphrontApplicationConfiguration):

$write_guard = new AphrontWriteGuard(array($request, 'validateCSRF'));

This object is magic and registers itself with the query layer. A more modern implementation would probably be a little less magic, but this was written in like 2011.

The query layer tests each query to see if it starts with "SELECT", "SHOW", or "EXPLAIN". If it doesn't, it determines that it's probably a write and calls the callback:

// NOTE: The opening "(" allows queries in the form of:
//   (SELECT ...) UNION (SELECT ...)
$is_write = !preg_match('/^[(]*(SELECT|SHOW|EXPLAIN)\s/', $raw_query);
if ($is_write) {
  // ....

The callback checks that $_REQUEST has a valid CSRF token. If it doesn't, it throws an exception.

Why are we calling save() instead of letting newResultFromIssuedChallenges implementations handle that (especially if those implementations might have their own special form of persistence)?

Mostly, just so that we don't have to have a lot of unguarded write boilerplate in every single implementation.

Challenge has a freeform $properties property, so my expectation is that any reasonable MFA factor implementation doesn't need other persistence. If that turns out to be untrue, maybe this was a bad call.

In theory, the write guard mechanism works with any service that does writes, although I'm not sure we ever bothered to implement it in any others. There's normally no way to do a service write without also doing a database write, and service writes are normally not interesting/dangerous (e.g., uploading data to S3 or whatever, I guess?).


Nothing client side / in JS right now. There's one in the background job thing (bulk editor / bulk exporter) but it "animates" by reloading the page. This is probably not terribly difficult to build and the current UI isn't super hot so maybe I'll take a stab at it.


Yeah, I just didn't want to break this by accident between now and when I (hopefully) make it vanish.

amckinley added inline comments.Dec 14 2018, 11:32 PM

It occurs to me that the code in the Countdown app is probably up to the challenge for just showing a little timer.

This revision was automatically updated to reflect the committed changes.