
Remove SHA1 file content hashing and make Files work without any hashing
ClosedPublic

Authored by epriestley on Apr 4 2017, 10:24 PM.

Details

Summary

Ref T12464. We currently use SHA1 to detect when two files have the same content so we don't have to store two copies of the data.

Now that a SHA1 collision is known, this is theoretically dangerous. T12464 describes the shape of a possible attack.

Before replacing this with something more robust, shore up Files so that it works correctly if we don't hash at all. This mechanism is entirely optional; it only helps us store less data when some files are duplicates.

(This mechanism is also less important now than it once was, before we added temporary files.)
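
As a rough illustration of why the hashing is optional, here is a minimal sketch of a store where content hashing is purely a space optimization. This is not Phabricator's code: `FileStore`, `upload`, `read`, and `sha256_hex` are hypothetical names invented for this example, and SHA-256 stands in for "something more robust" only as an assumption. With no hasher configured, every upload simply stores its own copy of the data.

```python
import hashlib
import itertools
from typing import Callable, Dict, Optional


def sha256_hex(data: bytes) -> str:
    """Example hasher (an assumption; the revision does not name a replacement)."""
    return hashlib.sha256(data).hexdigest()


class FileStore:
    """Toy in-memory store. All names are hypothetical, not Phabricator's API."""

    def __init__(self, hasher: Optional[Callable[[bytes], str]] = None):
        self.hasher = hasher                  # None means "do not hash at all"
        self.blobs: Dict[int, bytes] = {}     # blob id -> stored data
        self.by_hash: Dict[str, int] = {}     # content hash -> blob id
        self.files: Dict[int, int] = {}       # file id -> blob id
        self._ids = itertools.count(1)

    def upload(self, data: bytes) -> int:
        digest = self.hasher(data) if self.hasher is not None else None

        # Reuse an existing blob only when hashing is enabled and a match exists.
        blob_id = self.by_hash.get(digest) if digest is not None else None

        if blob_id is None:
            # No hashing, or no duplicate found: store a fresh copy.
            blob_id = next(self._ids)
            self.blobs[blob_id] = data
            if digest is not None:
                self.by_hash[digest] = blob_id

        file_id = next(self._ids)
        self.files[file_id] = blob_id
        return file_id

    def read(self, file_id: int) -> bytes:
        return self.blobs[self.files[file_id]]


# Without a hasher, identical uploads work and store separate copies.
plain = FileStore()
first = plain.upload(b"same bytes")
second = plain.upload(b"same bytes")
assert plain.read(first) == plain.read(second)
assert len(plain.blobs) == 2

# With a hasher, identical uploads share one stored copy.
dedup = FileStore(hasher=sha256_hex)
dedup.upload(b"same bytes")
dedup.upload(b"same bytes")
assert len(dedup.blobs) == 1
```

In this sketch, reads never consult the hash, so turning hashing off changes only how much data is stored, not whether uploads or reads succeed.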

Test Plan

Uploaded multiple identical files, saw the uploads work and the files store separate copies of the same data.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable