Page MenuHomePhabricator

Determine if we need to mitigate BREACH
Closed, ResolvedPublic

Description

BREACH, a new attack against SSL based on compression, was announced at Black Hat recently. As I understand it, the attack scenario is something like this:

  • An attacker MITMs your SSL traffic (without attempting to decrypt it);
  • then they acquire the ability to make some set of requests on your behalf (e.g., by having you visit a webpage which runs Javascript they control);
  • the Javascript coordinates with the MITM'd proxy to make requests to the server with various parameters, e.g., /?q=a, /?q=b, etc.;
  • the server has some page which sends back the value of q in the response body (like a search results page) and is otherwise fairly static;
  • by examining the size of the response, they can determine when they're matching your CSRF token because the response will be smaller. For example, if your CSRF token starts a9f..., then the response to /?q=a9f will be slightly smaller than other responses because it can compress better;
  • they can quickly extend the query and discover the entire CSRF token (with just thousands of requests);
  • then the Javascript (which is still running elsewhere) can launch CSRF attacks.

There's some discussion here:

Some recommended mitigations are listed here http://breachattack.com/#mitigations.

Of these:

  1. Disable HTTP Compression: Very practical but implies a huge performance cost. This isn't something we can fix anyway in most cases (it's usually in layers above us), although we could try to detect it (see T2226).
  2. Separate secrets from user input: This is possible, but I think not very practical. We'd have to do something like manage CSRF using Javascript, and nothing would work without JS anymore. We can't move CSRF tokens into cookies or it defeats the whole purpose.
  3. Randomizing secrets per request: This is very practical and has no significant downsides.
  4. Masking secrets: As above.
  5. CSRF: Basically, this means that any page which echoes request data must have CSRF protection. This is possible, but is really hard to detect and would reduce functionality (you could no longer have GET query pages, for example), although with ApplicationSearch we might be closer than I think here. We could look at this in more detail, although I worry that the lack of mechanism to detect it make it perilous.
  6. Length hiding: Trivial, but not clearly effective
  7. Rate limiting: Undesirable and ineffective.

Revisions and Commits

Event Timeline

epriestley triaged this task as Normal priority.
epriestley added a project: Security.
epriestley added subscribers: epriestley, btrahan, chad, arice.

BREACH is a fairly surgical attack that must target specific software and individuals, must gain a high degree of access to mount, and only escalates into CSRF, which certainly isn't good but I believe isn't exceptionally dangerous in Phabricator. My inclination is to wait for a bit and see how other applications react to it, although I think masking/randomizing is very practical and very straightforward so it might be worth implementing preemptively. Essentially, instead of sending down "aaaa" as the CSRF token, we'd send down "aaaabbbb", where "bbbb" is random noise and "aaaa" xor "bbbb" produces the original token.

Man, that Rails pull request sure has a lot of discussion about the performance cost of XOR'ing two 32-byte strings.

D6686 is starting point for the masking approach. Since XOR'ing printable character strings without exposing information is messy, I just did salting / hashing instead.

BREACH can also be used to find other types of plaintext, but I think the only other meaningful credential is Conduit certificates, which appear on only one page that has no reflection. We actually should probably retool this (there's no need for users to ever access their Conduit certificate directly). It's hard to imagine an attacker mounting a BREACH attack purely to identify page content since it's not that fast, although we could append random data (e.g., an HTML comment) to further increase the difficulty.

Excellent summary. I don't believe you're missing anything major here. The CSRF token is the only secret in need of immediate attention, and I agree masking is the best approach here.

Might be worth noting that there are a long tail of installation-specific secrets that could theoretically be accessed with this attack when combined with some other out of band information leak. E.g., learning that a target has an private Apache repo "A" and then reading content from /diffusion/A/browse/master/.htpasswd (although diffusion doesn't seem to echo content either - differential might be a better example). Even so, these attacks are relatively far fetched without any obvious solutions from Phabricator itself. I'd say the attack doesn't feel in need of immediate attention beyond CSRF.

@arice Thanks!

I think Differential/Diffusion have no reflection (even with debugging modes enabled, we no longer deliver user content in the main response body), but it's hard to be exhaustively certain. I can at least do an audit pass on our handful of magic super-parameters (__profile__, etc.) and see if anything is suspicious.

I'll do that and clean up D6686 and land it, probably today or tomorrow, in the absence of any new developments in the meantime. I think the patch is very low risk even if further developments somehow render it ineffective or undesirable (which seems unlikely).

@wez, just a heads up on this -- is FB still way behind HEAD? It's probably not critical that this be deployed immediately after it lands since the attack is difficult to mount, but it does help mitigate a now widely-disclosed attack.

Thanks, we'll review. We're a bit behind but doing some catchup this week.

I cleaned up D6686 and looked through the __magic__ stuff. The stuff I found was:

  • If you have DarkConsole enabled, the URI itself (including query parameters) is echoed into the response body, albeit JSON encoded. This is just used to label a tab in the console. I'll simply replace this with some informational label like "Main Request", since it's always redundant with the URL.
  • You can force a specific __metablock__ in Javascript data responses, but this is (int) cast already to prevent other types of hyjinx so there's not much credible threat from BREACH (in theory, I suppose it's possible you could use it to BREACH purely numeric data). I think the window this represents is so tiny that it's not worth mitigating, and it's somewhat difficult to mitigate without impacting other things. If we did want to mitigate it, the cleverest thing I can come up with is swithcing identifiers over some threshold (say, 999) to just use large random values. In effect, this would mean that the first 999 Ajax requests a page made would be unchanged, and the 1000-Nth requests would be very slightly less efficient in their data representation (although after compression it's probably moot if an identifies are U1001_1 .. U1001_N or U8983987298329_1 .. U8983987298329_N).

The other magic parameters seem clean to me.

epriestley edited this Maniphest Task.

Okay, this stuff is in HEAD now and pushed. I'll post a note about it on the Phabricator page and keep my ears open for further developments, but as far as I know we're in reasonably good shape against realistic attacks now.

epriestley changed the visibility from "All Users" to "Public (No Login Required)".Aug 13 2015, 12:16 AM