
Remove SHA1 file content hashing and make Files work without any hashing
Closed, Public

Authored by epriestley on Apr 4 2017, 10:24 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T12464. We currently use SHA1 to detect when two files have the same content so we don't have to store two copies of the data.

Now that a SHA1 collision is known, this is theoretically dangerous. T12464 describes the shape of a possible attack.

Before replacing this with something more robust, shore things up so that Files works correctly if we don't hash at all. This mechanism is entirely optional; it only helps us store less data when some files are duplicates.

(This mechanism is also less important now than it once was, before we added temporary files.)
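
As a rough illustration of why the hashing step is optional, here is a minimal sketch in Python. It is not Phabricator's code: `FileStore`, `upload`, `read`, and `sha256_hex` are hypothetical names, and SHA-256 is used only as a stand-in for "something more robust", which this revision does not name. With no hasher configured, every upload stores its own copy of the data; with a hasher, identical uploads share one stored copy.

```python
import hashlib
import itertools
from typing import Callable, Dict, Optional

# Hypothetical type: any function mapping file content to a hash string.
Hasher = Callable[[bytes], str]


class FileStore:
    """Toy in-memory file store with optional content-hash deduplication."""

    def __init__(self, hasher: Optional[Hasher] = None) -> None:
        self.hasher = hasher                  # None means "do not hash at all"
        self.blobs: Dict[int, bytes] = {}     # blob id -> stored data
        self.files: Dict[int, int] = {}       # file id -> blob id
        self.by_hash: Dict[str, int] = {}     # content hash -> blob id
        self._ids = itertools.count(1)

    def upload(self, data: bytes) -> int:
        """Store data and return a new file id."""
        file_id = next(self._ids)
        content_hash = self.hasher(data) if self.hasher else None

        # Only try to reuse an existing blob when a hash is available.
        blob_id = None
        if content_hash is not None:
            blob_id = self.by_hash.get(content_hash)

        if blob_id is None:
            # No hash, or no blob with this hash yet: store a fresh copy.
            blob_id = next(self._ids)
            self.blobs[blob_id] = data
            if content_hash is not None:
                self.by_hash[content_hash] = blob_id

        self.files[file_id] = blob_id
        return file_id

    def read(self, file_id: int) -> bytes:
        """Return the data for a previously uploaded file."""
        return self.blobs[self.files[file_id]]


def sha256_hex(data: bytes) -> str:
    # Assumption: SHA-256 chosen for illustration only.
    return hashlib.sha256(data).hexdigest()
```

For example, `FileStore()` keeps two blobs after two identical uploads, while `FileStore(hasher=sha256_hex)` keeps one. In both cases each upload gets its own file id and reads back the same data, so correctness does not depend on the hash.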

Test Plan

Uploaded multiple identical files; saw the uploads succeed and each file store a separate copy of the same data.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable