
Remove SHA1 file content hashing and make Files work without any hashing
Closed, Public

Authored by epriestley on Apr 4 2017, 10:24 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T12464. We currently use SHA1 to detect when two files have the same content so we don't have to store two copies of the data.

Now that a SHA1 collision is known, relying on SHA1 for this is theoretically dangerous. T12464 describes the shape of a possible attack.

Before replacing this with something more robust, shore up Files so that everything works correctly if we don't hash at all. This mechanism is entirely optional; it only helps us store less data if some files are duplicates.

(This mechanism is also less important now than it once was, before we added temporary files.)
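
As a rough illustration of the behavior described above (not the actual Phabricator code, which is PHP), here is a minimal sketch of a store where content hashing is optional. The class `FileStore`, its `hasher` parameter, and the in-memory dictionaries are all hypothetical names made up for this example. With no hasher, every upload gets its own copy of the data; with a hasher, uploads with the same digest share one stored copy.

```python
import hashlib
import itertools
from typing import Callable, Dict, Optional


class FileStore:
    """Toy in-memory file store. All names here are hypothetical."""

    def __init__(self, hasher: Optional[Callable[[bytes], str]] = None):
        # hasher=None models "no content hashing at all".
        self.hasher = hasher
        self.blobs: Dict[int, bytes] = {}    # storage handle -> raw data
        self.by_hash: Dict[str, int] = {}    # content hash -> storage handle
        self.files: Dict[int, int] = {}      # file ID -> storage handle
        self._file_ids = itertools.count(1)
        self._handles = itertools.count(1)

    def upload(self, data: bytes) -> int:
        """Store data and return a new file ID."""
        digest = self.hasher(data) if self.hasher is not None else None

        # Deduplication is purely an optimization: only attempt it
        # when a hash is available.
        handle = self.by_hash.get(digest) if digest is not None else None

        if handle is None:
            # No hash, or no earlier upload with this hash: store a new copy.
            handle = next(self._handles)
            self.blobs[handle] = data
            if digest is not None:
                self.by_hash[digest] = handle

        file_id = next(self._file_ids)
        self.files[file_id] = handle
        return file_id

    def read(self, file_id: int) -> bytes:
        return self.blobs[self.files[file_id]]


# Without hashing: identical uploads work, each with a separate copy.
plain = FileStore()
first = plain.upload(b"same bytes")
second = plain.upload(b"same bytes")

# With hashing (SHA-256 chosen only for illustration): one shared copy.
dedup = FileStore(hasher=lambda d: hashlib.sha256(d).hexdigest())
third = dedup.upload(b"same bytes")
fourth = dedup.upload(b"same bytes")
```

In the unhashed store, both uploads succeed and the data is stored twice, which matches the outcome reported in the test plan. The only cost of skipping the hash is the extra storage.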

Test Plan

Uploaded multiple identical files, saw the uploads work and the files store separate copies of the same data.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable