Page MenuHomePhabricator

Build Glorious High-Security Notification/Audit Infrastructure
Closed, ResolvedPublic

Assigned To
Authored By
May 4 2016, 3:57 AM
Referenced Files
F1652548: Screen Shot 2016-05-19 at 3.09.16 PM.png
May 19 2016, 10:20 PM
F1652546: Screen Shot 2016-05-19 at 3.09.14 PM.png
May 19 2016, 10:20 PM
F1652543: Screen Shot 2016-05-19 at 3.09.01 PM.png
May 19 2016, 10:20 PM
F1652541: Screen Shot 2016-05-19 at 3.08.57 PM.png
May 19 2016, 10:20 PM
F1652551: Screen Shot 2016-05-19 at 3.11.20 PM.png
May 19 2016, 10:20 PM
"Piece of Eight" token, awarded by benjumanji."Haypence" token, awarded by yelirekim.


We can improve the ability of installs to react to and recover from attacks that involve session compromise by emitting notifications when high-security actions are taken. The primary line of defense against this class of attacks should be proactive MFA, but improved reaction and recovery mechanisms can provide greater confidence and defense-in-depth.

In particular:

  • Account objects like SSH keys should use transaction infrastructure and have a full create/edit/delete log like most other objects do. There is no technical or product reason that they do not today.
  • High security actions, including modifying SSH keys, should tie into new notification infrastructure which can emit email today and SMS (or other out-of-band, high-urgency) notifications in the future. We don't need to build this today, but should put some infrastructure in place to anticipate it.
  • The existing primary email change notifications should likely fold into this new security event infrastructure.
  • (Possibly also reasonable is emitting security event notifications when an install exceeds an action rate limit? This is not normally a high-security action but falls into the same general class of sensitive events. I'm not sure if all of these overlap with high-security actions already or not.)

Original Report

Stick with me through this rather unfortunate tale.

Currently if a malicious actor gets on our network they are able to listen to some unencrypted traffic that is going over it to internal services. The problem for us is that LDAP auth is used for everything in our office and some systems do not securely transfer those credentials.

In our effort to be super secure™ we only allow SSH access to repos so all users must first add a public key or generate a new key. If the attacker is able to glean a users LDAP information they can log in and add an SSH key and pull down our repos.

Although sending an email out when an SSH key is added will not eliminate this issue it will alert the user to an action that they did not take and allow the user to quickly revoke the key as well as quickly enabling us to perform some kind of super corporate postmortem.

Event Timeline

I've generally been hesitant about making Phabricator send these emails because I'm not sure how valuable they are in practice, and a lot of users hate email (see T9161). If an attacker gains access to an instance even briefly, I think it's likely much of the damage is done before most users will be able to respond (in particular, a targeted attack can virtually guarantee you receive the email at a time when you are likely to be asleep). By its nature, this mitigation is a reactive one with a potentially long reaction time.

I've more actively pursued implementing a proactive mitigation, in MFA. If you configure MFA for your account, you must:

  • provide an MFA token to establish a session (so it is insufficient for an attacker to compromise LDAP credentials alone); and
  • provide an MFA token to modify SSH keys or take other sensitive actions.

You can require all users to configure MFA with security.require-multi-factor-auth.

Of course, this is very annoying for users. But it's an effective, proactive security measure which materially raises the barrier an attacker faces.

I'm not strictly opposed to sending email as well, I just worry it is so ineffective as a mitigation that it borders on security theater, and feels like serving the letter of a policy rather than solving a security/engineering problem.

I'd feel a little better about pursuing this if we targeted SMS as a preferred alert method (which technically works in HEAD, just isn't exposed in the UI -- see T920) to at least slightly reduce the chance an attacker can compromise an account at a time when they're virtually guaranteed to have 6-8 hours before a response.

SSH keys are also one of the few objects that Phabricator does not modify via the transactions infrastructure, and thus does not have a detailed edit log for. I think this is mostly because they were just built a long time ago. Improving tools for directly auditing security/authentication actions on account is a worthy goal -- we have a decent start with the "Activity Logs", but should have more comprehensive logs available for objects like email addresses and SSH keys, similar to how objects like tasks have complete edit logs.

The cases where we currently require high security (prompt for MFA) are:

  • establishing a web session;
  • linking a third-party account to an existing account;
  • deleting, editing, adding, or generating an SSH key;
  • deleting, editing, adding, or generating a Conduit token;
  • changing a user's HTTP VCS password;
  • establishing an OAuth session using Phabricator as an OAuth server;
  • creating, renaming, or granting/revoking administrative privileges from a user;
  • viewing or editing email addresses for an account;
  • viewing or editing MFA tokens for an account; and
  • viewing or editing passwords for an account.

Additionally, we require high security for these application actions:

  • signing a Legalpad document;
  • changing the availability of a Passphrase credential over Conduit;
  • revealing the plaintext of a Passphrase credential; and
  • configuring autopay for a subscription in Phortune.

We do also send advisory/notification email in one case, currently:

  • after changing your primary email address, both your old and new addresses are alerted to the change (this change requires that you click a link sent to the old address to confirm the change).

If we pursue changes in this vein, I'd generally like them to look something like:

  • generate mail for all MFA checkpoints, not just SSH keys;
  • make it easy for users to get SMS notifications instead of mail notifications;
  • improve transaction/audit logs for artifacts which currently have a limited or partial log (like SSH keys);
  • attempt to encourage both policymakers and users to favor proactive MFA over reactive notifications.

For the sake of completeness, I'll also mention plans to build a "quorum" class of checkpoint (T9515), where you must solicit trusted users to "turn the nuclear launch keys" and allow you to take some action with your account. The primary action this is aimed at is stripping MFA tokens, but it could conceivably be extended to other actions. I'm not sure this is actually useful, but the pathway to adding an SSH key could be made to be "MFA + convince 2 coworkers that you're legit", which would proactively require an attacker to execute both technical and social attacks to compromise an account, and/or compromise multiple distinct user accounts.

Broadly, there are a class of "security" changes which are really policy compliance changes. For example, it's common for large enterprises to have a password requirement like this:

Passwords must contain a prime number of lowercase letters.

When requirements serve no security or engineering goal (or justification is very weak, or substantially better solutions are available), we generally don't want to bear the cost of policy compliance.

I think this issue covers a lot of ground, and much of it is worthy goals with good security and engineering justifications which we're happy to pursue. But some of it may also border on pure policy compliance, which we're less eager about.

Particularly, if you want to completely defuse this entire class of attack and many adjacent attacks, require MFA. If you don't want to require MFA, we should start with a discussion about why MFA (which seems like a better/stronger mitigation here than reactive email notifications) isn't suitable, and see if we can fix those problems or remove the barriers to MFA adoption.

For example, maybe there are alternate approaches we can take (like T6115) to let you, e.g., require MFA for users who have access to Diffusion, without requiring it for all users.

Tasks like T9515 / T6549 focus on lowering the administrative burden of MFA.

T8787 / T920 and similar approaches could make it easier to deploy MFA.

(And, per above, the quorum mechanism is potentially a super-MFA.)

Once you require MFA (or we've gone as far as we can toward making MFA required in as many cases as possible) I think it's then reasonable to look at additional reactive steps we can take to further limit damage, make assessment easier, etc., including email notifications and other things discussed above. But these seem like they're premature from a security/engineering viewpoint if we haven't first pushed MFA as far as we reasonably can.

Upshot: what are the barriers you face today to enabling security.require-multi-factor-auth for all users, and how can we reduce them?

The best way to describe this request from our perspective is "third party requirement", which falls under your definition of a policy compliance change, mostly.

I'm in agreement with you for all of the security posturing stuff here, but, there might be simpler ways to let us send these emails ourselves and also work down a path aligned with phab's security principles.

My only idea would be converting actions surrounding SSH key management to transactions, which you're not explicitly stating here, but if we were to do that we would be able to use internal infa we have that reacts to transactions, and send email when a key is added.

We haven't yet enabled, but will be enabling MFA this week.

My only idea would be converting actions surrounding SSH key management to transactions, which you're not explicitly stating here, but if we were to do that we would be able to use internal infa we have that reacts to transactions, and send email when a key is added.

Sorry, yeah -- this got an indirect mention, but to be explicit I consider converting SSH keys to transactions to be highly desirable. There is no technical or product reason that they shouldn't be, just legacy stuff + we haven't gotten to it yet.

And I think the mail notifications are also largely reasonable, I just don't want to hack this one mail notification in and call it a day since this I think this should be approached more broadly (covering all high-security actions, not just SSH) and deeply (SMS, mail, transactions).

Basically this is a "yes, but first let's build glorious INFRASTRUCTURE", as most things tend to be on this project.

Basically this is a "yes, but first let's build glorious INFRASTRUCTURE", as most things tend to be on this project.


epriestley renamed this task from Send email when an SSH key is added for a user to Build Glorious High-Security Notification/Audit Infrastructure.May 4 2016, 9:45 PM
epriestley claimed this task.
epriestley triaged this task as Normal priority.
epriestley updated the task description. (Show Details)
epriestley added a project: Prioritized.
epriestley moved this task from Backlog to The Queue on the Prioritized board.
epriestley added a project: Auth.

(☞゚ヮ゚)☞ Prioritized ☜(゚ヮ゚☜)

Here's my technical plan on this:

Key Handling: Putting transactions on SSH keys is currently tricky because there's a UNIQUE KEY (publicKey) on the public key itself, so we currently need to delete keys to let them be reused. This is no good if we're recording transactions, since the old transactions need to still have something valid to link to.

I'm going to change that to a UNIQUE KEY (publicKey, isActive) with a nullable boolean column so we only check uniqueness of active keys. Then we keep inactive keys and have stable PHIDs for transactions to refer to. I'll update all the SSHKeyQuery callsites to only load active keys, unless loading inactive keys makes sense for some reason. This is probably the only tricky part of this.

Detail Page + SearchEngine + Handles: Keys have PHIDs but need dedicated detail pages, query UI, and handles so transactions can meaningfully link to them and work properly. This is all pretty straightforward to build out.

Transactions + EditEngine (?): Key edits should become transaction oriented, and probably be driven by EditEngine although maybe we can just skip this for now. Not sure if it EditEngine ends up being easier or not.

Mail Notification: Once we have transactions, we can hand them to some kind of new mail thing. I think it's probably most reasonable for this new API to accept transactions, but who knows.

This is mostly run-of-the-mill stuff in well-mapped territory and I don't currently anticipate any weird surprises.

In HEAD of master, keys now have a detail screen which shows a transaction record. Clicking the key name will take you there:

Screen Shot 2016-05-19 at 3.08.57 PM.png (1×1 px, 169 KB)

Screen Shot 2016-05-19 at 3.09.01 PM.png (1×1 px, 119 KB)

A new "View History" option allows you to review all keys which have ever been associated with an account, including keys which have been deactivated:

Screen Shot 2016-05-19 at 3.09.14 PM.png (1×1 px, 171 KB)

Screen Shot 2016-05-19 at 3.09.16 PM.png (1×1 px, 106 KB)

All changes to keys now send email to the account holder. This email bypasses notification preferences:

Screen Shot 2016-05-19 at 3.11.20 PM.png (320×829 px, 66 KB)

(The link style on the "Security Warning" section is a bug, remedied by D15952.)

These changes are live on this install, and you can review them by navigating to SettingsSSH Public Keys and adding or editing keys on your own account.

I don't have any more changes planned here, so I'm pausing this for feedback.

Preserve KeysD159430.5 HoursPreserve deactivated SSH keys instead of destroying them.
TransactionsD15946, D159471 HourUse transactions to edit keys, implement search/detail interfaces.
MailD159480.5 HoursSend mail about key edits.
Subtotal2 Hours
Cumulative Total2 Hours

I believe this is resolved and accounted for, let us know if anyone runs into issues.