Ref T3684 for discussion. This could be cleaned up a bit (it would be nice to draw entropy once per request, for instance, and maybe respect CSRF_TOKEN_LENGTH more closely) but should effectively mitigate BREACH.
Details
- Reviewers
btrahan - Maniphest Tasks
- T3684: Determine if we need to mitigate BREACH
- Commits
- Restricted Diffusion Commit
rP7298589c86ec: Proof of concept mitigation of BREACH
Submitted forms; submitted forms after mucking with CSRF and observed CSRF error. Verified that source now has "B@..." tokens.
Diff Detail
- Branch
- BREACH
- Lint
Lint Passed Severity Location Code Message Advice src/applications/people/storage/PhabricatorUser.php:236 XHP16 TODO Comment - Unit
Tests Passed
Event Timeline
- Reduce the number of magic numbers and magic strings.
- Use PhabricatorStartup to store CSRF salt.
A note on the latter bit: this is basically a "static", except that some day we'd like to reuse PHP interpreters across multiple requests, and to make that work we can't put any data which varies from request to request into "static" variables. This definitely needs to vary from request to request. PhabricatorStartup globals are explicitly cleared at the beginning of each (logical) request, so they're safe to use as a quasi-static for data like this. We have very few types of static data which vary across requests, which is why this problem hasn't come up much yet.
The quick version of this is that we previously sent down some token:
csrf="TTTTTTTTTTTTTTTT"
This is vulnerable to BREACH. Now we generate a random salt (SSSSSSSS) on each page and compute H = HASH(T, S). Then we send down:
B@SSSSSSSSHHHHHHHHHHHHHHHH
Before, we verified the token T against known valid token V like this:
if (TTTTTTTTTTTTTTTT == VVVVVVVVVVVVVVVV) { ... }
Now, we verify the B@... string against known valid token V like this:
if (HASH(V, S) == HHHHHHHHHHHHHHHH) { ... }
Essentially, this is just a cosmetic change which means the CSRF token varies from page to page but can always be verified on the server side.
The slightly cleaner way would be to compute:
Z = RAND() X = T xor Z csrf="ZX" if (V xor X == T) { ... }
However, we can't use xor safely on printable character strings which have limited byte ranges, and bringing base64 into the equation or doing a bunch of encode/decode of raw hex seemed like it would make the XOR version messy enough that this ends up being a little cleaner.