Proof of concept mitigation of BREACH
ClosedPublic
Actions

Authored by epriestley on Aug 6 2013, 4:05 PM.

Details

Reviewers

btrahan

Maniphest Tasks

T3684: Determine if we need to mitigate BREACH

Commits

Restricted Diffusion Commit
rP7298589c86ec: Proof of concept mitigation of BREACH

Summary

Ref T3684 for discussion. This could be cleaned up a bit (it would be nice to draw entropy once per request, for instance, and maybe respect CSRF_TOKEN_LENGTH more closely) but should effectively mitigate BREACH.

Test Plan

Submitted forms; submitted forms after mucking with CSRF and observed CSRF error. Verified that source now has "B@..." tokens.

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

Reduce the number of magic numbers and magic strings.
Use PhabricatorStartup to store CSRF salt.

A note on the latter bit: this is basically a "static", except that some day we'd like to reuse PHP interpreters across multiple requests, and to make that work we can't put any data which varies from request to request into "static" variables. This definitely needs to vary from request to request. PhabricatorStartup globals are explicitly cleared at the beginning of each (logical) request, so they're safe to use as a quasi-static for data like this. We have very few types of static data which vary across requests, which is why this problem hasn't come up much yet.

The quick version of this is that we previously sent down some token:

csrf="TTTTTTTTTTTTTTTT"

This is vulnerable to BREACH. Now we generate a random salt (SSSSSSSS) on each page and compute H = HASH(T, S). Then we send down:

B@SSSSSSSSHHHHHHHHHHHHHHHH

Before, we verified the token T against known valid token V like this:

if (TTTTTTTTTTTTTTTT == VVVVVVVVVVVVVVVV) { ... }

Now, we verify the B@... string against known valid token V like this:

if (HASH(V, S) == HHHHHHHHHHHHHHHH) { ... }

Essentially, this is just a cosmetic change which means the CSRF token varies from page to page but can always be verified on the server side.

The slightly cleaner way would be to compute:

Z = RAND()
X = T xor Z
csrf="ZX"
if (V xor X == T) { ... }

However, we can't use xor safely on printable character strings which have limited byte ranges, and bringing base64 into the equation or doing a bunch of encode/decode of raw hex seemed like it would make the XOR version messy enough that this ends up being a little cleaner.