Currently, we salt passwords and hash them with 1000 rounds of MD5. This is pretty okay given what we reasonably have access to (although we could do better on at least some systems with crypt()), but modern PHP introduces password_hash(). On systems where we have access to it, it would be better to prefer it (roughly, it's a thin layer on top of bcrypt).
Description
Revisions and Commits
Event Timeline
The classical guide with solid info: http://codahale.com/how-to-safely-store-a-password
Mentioned in said guide is a PHP library that may be of use: http://www.openwall.com/phpass
Not sure if the library itself is necessary, but there looks to be lots of good information in there.
Of note from their password guide:
phpass provides an easy to use abstraction layer on top of PHP's cryptographic hash functions suitable for password hashing. As of this writing, it supports three password hashing methods, including two via PHP's crypt() function - these are known in PHP as CRYPT_BLOWFISH and CRYPT_EXT_DES - and one implemented in phpass itself on top of MD5. All three employ salting, stretching, and variable iteration counts (configurable by an administrator, encoded/stored along with the hashes).
PHP 5.3.0 and above is guaranteed to support all three of these hashing methods due to code included into the PHP interpreter itself. Specific builds/installs of older versions of PHP may or may not support the CRYPT_BLOWFISH and CRYPT_EXT_DES methods - this is system-specific.
As a 7K single-file public-domain library, this might be a convenient way to handle the problems they've already solved.
Any thoughts on upgrade path? It looks like the phabricator_user table only has passwordSalt and passwordHash fields; the table will additionally need a passwordHashType or similar. Hopefully accompanied by a notification for the user to change their password.
Generally, I don't think the jump from 1000-iteration md5 to bcrypt is especially important, and particularly that it's not important enough to pressure installs to deal with installing more PHP extensions over. phpass itself falls back to roughly the same thing that we use (iterated md5 in PHP) if better mechanisms aren't available. My understanding is that the major weakness of iterated md5 is ASIC hashing, and bcrypt is only more resistant to ASIC hashing, but still has a fixed amount of memory-hardness.
That is, my thinking is that the approaches here are roughly like this:
Current | Target | Aggressive | Paranoid | Very Paranoid | |
---|---|---|---|---|---|
No Crypto Extensions | md5 | md5 | md5 | Setup Failure | Setup Failure |
With Crypto Extensions | md5 | md5 | bcrypt | bcrypt | Setup Failure |
With hash_password() | md5 | bcrypt | bcrypt | bcrypt | Setup Failure |
With scrypt Extension | md5 | bcrypt | scrypt | scrypt | scrypt |
I plan to pursue the "target" strategy, which is bascially "use bcrypt if we have it and it's easy". We could pursue more aggressive strategies, all the way up to refusing to run unless the user compiles and installs the nonstandard scrypt extension, but I don't think we'd benefit much. Particularly, I think the benefit of adding code to use the older crypt extensions or the scrypt extension if they're available probably isn't worth the complexity of that code.
Since I intend to maintain md5 capability indefinitely (or, at least, until we drop support for old PHP versions), there's no need to migrate hashes aggressively. We could opportunistically rehash passwords on login, and/or show the user a warning like "The hashing strength of this password can be upgraded, change your password to upgrade", but I don't plan to remove md5 hashing. Generally, I intend to just communicate this during setup and make sure installs have enough information to make informed choices ("You're about to enable password login, but here's some stuff you should know..."), but if you don't care and just mash "Next" I think it's fine if you end up with iterated md5.
(We don't strictly need to add a passwordHashType because the contents of passwordHash are always a hex string, so we could store newer hashes as, say, bcrypt:.... We do this kind of thing already with CSRF tokens and session keys, where newer styles are marked with some well-known prefix. We do probably need to extend the size of the passwordHash field, because bcrypt hashes may be longer than hex md5 hashes IIRC.)
Basically, my goals are just:
- Make a reasonable hashing function available;
- give installs enough information to make an informed choice between installing/updating stuff and accepting a weaker hash.
This approach seems reasonable. I support transparent upgrades on login, which makes a lot more sense than my idea, and using a hash-type prefix. During the database upgrade, it might be convenient to update md5 hashes to a similar md5:... prefix, to simplify the hash-type check (if substring worked, then match on type, else fail).
The main motivation behind my comments were to get phab on absolutely anything other than md5, as it's so hilariously broken (length extension stuff, etc) that I'm just assuming it's dramatically weaker than any other option. I didn't know scrypt was so easy to use via PHP, however bcrypt would be sufficient. Comments about ASIC attacks are relevant, but simply switching to SHA-512 would make GPU cracking two orders of magnitude slower.
At this point it should be noted that I am dangerously bad at cryptography (roughly, stage 2 of 4) and my total experience with PHP is "while waiting for a flight to a job interview, I once read the docs for about 10 minutes until I decided to treat PHP as C syntax inside ASP templates, then closed the laptop". So my advice should be treated with copious suspicion :)
I've marked D8272 as fixing this issue, since I think it covers pretty much everything that it should for now. Possible future work includes:
Audit Authentication: Give administrators a tool to audit users who have older password storage so they can bug them to update. This could generally show auth settings -- a related request was "find users who don't have linked LDAP accounts" in the past. After we implement two-factor authentication in T4398, finding users without two-factor configured might be useful too. I've seen a couple of "nice-to-have" requests for this kind of thing, but nothing too serious. Most of this can be figured out with database queries, so the current plan of attack is "tell users what queries to execute until that becomes cumbersome enough that building a tool is worthwhile".
Implement Other Hashers: The two interesting ones are probably scrypt (for the truly paranoid) and a bcrypt variant based on crypt(), for installs that are behind 5.5.0, but either have the crypt extension or don't mind installing it. These are now easy to implement and migration is free/automatic, but I'd like to wait for user requests / actual use cases before pursuing them. In theory, the greater exposure of hashing algorithms in the administrative UI and user UIs should help surface any need for this (e.g., users can see that it says "bcrypt" and be like "oh, I want scrypt, I'll file a request"). @kofalt, I'm assuming you're not in either of these buckets, but yell if you are (or send a patch :)).
User Experience for Bogus Stored Hashes: If administrators install and then uninstall hashers, the UX might be kind of garbage (e.g., exceptions preventing you from setting a new password). I'm planning to wait until this actually happens before fixing it, since it's hard to imagine many installs will jump at the opportunity to implement password hashing extensions and deploy them it into production immediately and I'd like to see any exceptions users hit until this code stabilizes in case there are upstream issues.
Automatic Difficulty Scaling: We could make the cost function of bcrypt automatically increase over time. It is easy to imagine various scenarios in which this goes hilariously wrong.
Thanks @epriestley.
Regarding scrypt, I'll let you know if a desire for additional tinfoil becomes unbearable :)
Moving to bcrypt is sufficient for my needs. Thanks for moving quickly on this one.
I'll migrate my instance sometime in the next few days; I'll hit IRC if there's any problems.