Remove SHA1 file content hashing and make Files work without any hashing
ClosedPublic

Authored by epriestley on Apr 4 2017, 10:24 PM.
Tags
None
Referenced Files
F13060751: D17619.diff
Fri, Apr 19, 6:21 PM
Subscribers
None

Details

Summary

Ref T12464. We currently use SHA1 to detect when two files have the same content so we don't have to store two copies of the data.

Now that a SHA1 collision is known, this is theoretically dangerous. T12464 describes the shape of a possible attack.

Before replacing this with something more robust, shore things up so Files works correctly if we don't hash at all. This mechanism is entirely optional; it only helps us store less data when some files are duplicates.

(This mechanism is also less important now than it once was, before we added temporary files.)
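
As a rough illustration of why the hash is only an optimization, here is a minimal Python sketch of a store where hashing is optional. It is not Phabricator's code: `FileStore`, `upload`, `read`, and the `hasher` parameter are hypothetical names invented for this example, and the byte-for-byte comparison on a hash match is an extra guard added in the sketch, not something this revision describes.

```python
import hashlib
import itertools
from typing import Callable, Dict, Optional


class FileStore:
    """Toy blob store (hypothetical) where content hashing is optional."""

    def __init__(self, hasher: Optional[Callable[[bytes], str]] = None):
        self.hasher = hasher                  # None means "do not hash at all"
        self.blobs: Dict[int, bytes] = {}     # blob id -> stored data
        self.files: Dict[int, int] = {}       # file id -> blob id
        self.by_hash: Dict[str, int] = {}     # content hash -> blob id
        self._ids = itertools.count(1)

    def upload(self, data: bytes) -> int:
        """Store data and return a new file id."""
        blob_id = None
        digest = None
        if self.hasher is not None:
            digest = self.hasher(data)
            candidate = self.by_hash.get(digest)
            # Assumption of this sketch: also compare the bytes, so that a
            # hash collision cannot make two different files share a blob.
            if candidate is not None and self.blobs[candidate] == data:
                blob_id = candidate
        if blob_id is None:
            # No hasher, or no matching blob: store a fresh copy.
            blob_id = next(self._ids)
            self.blobs[blob_id] = data
            if digest is not None:
                self.by_hash.setdefault(digest, blob_id)
        file_id = next(self._ids)
        self.files[file_id] = blob_id
        return file_id

    def read(self, file_id: int) -> bytes:
        return self.blobs[self.files[file_id]]


# Without a hasher, identical uploads simply store separate copies.
plain = FileStore()
first = plain.upload(b"same bytes")
second = plain.upload(b"same bytes")
print(len(plain.blobs))

# With a hasher, the second upload reuses the first blob.
dedup = FileStore(hasher=lambda d: hashlib.sha256(d).hexdigest())
dedup.upload(b"same bytes")
dedup.upload(b"same bytes")
print(len(dedup.blobs))
```

In both configurations every upload gets its own file id and reads back the same bytes; the only difference is how many copies of the data are kept.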

Test Plan

Uploaded multiple identical files; saw the uploads succeed and each file store its own copy of the same data.

Diff Detail

Repository
rP Phabricator
Branch
files9
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 16298
Build 21669: Run Core Tests
Build 21668: arc lint + arc unit