file chunks not encrypted when enabling file encryption
Closed, Resolved, Public

Description

The documentation on configuring encryption for the Files application does not mention this, but it seems that files which get split into chunks don't get encrypted.

I got the message "Stored as chunks, no data to encode directly" when I tried to re-encode a file from the raw chunk format to aes-256-cbc. If it is indeed the case that files using the chunked engine can't be used with the encryption mechanism, how do I turn off the chunk engine? Should I just override hasFilesizeLimit() from the abstract base class PhabricatorFileStorageEngine so that it returns false for the S3 engine, and call that good?

Event Timeline

phabricator b9f35351e9d486f7b825652ba2de0074b35cd2c5 (Aug 29 2016)
arcanist 10e5194752901959507223c01e0878e6b8312cc5 (Aug 26 2016)
phutil d6818e59c1764ede22cad56f8c5b1b527cb6a577 (Aug 26 2016)

Why do you believe that files which are split into chunks don't get encrypted?

When I run the command:

./bin/files encode --as aes-256-cbc <file> --key master.key

where <file> is a file that has been split into chunks, the tool returns "Stored as chunks, no data to encode directly" and the file is still shown as the RAW chunk type in the Files application. The other files that were not split into chunks, meaning they were smaller than the 8 MB limit, are shown as AES-256-CBC. This leads me to believe chunked files are not being encrypted.

When you upload a file that is stored as chunks, the chunks are also files. For example, file F123 may have chunks F124, F125, F126, etc. These files are not normally visible in the web UI but you can see that the next file you upload will skip some numbers (e.g., be F127).
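The ID-skipping behavior described above can be illustrated with a toy model. This is only a sketch, not Phabricator's code: the ToyFileStore class, its upload() method, and the 4 MB chunk size are assumptions made for illustration. Only the 8 MB threshold and the F123/F124-F126/F127 numbering come from this thread.

```python
import math
from itertools import count

CHUNK_THRESHOLD = 8 * 1024 * 1024  # files smaller than this are stored whole (from the thread)
CHUNK_SIZE = 4 * 1024 * 1024       # ASSUMPTION: illustrative chunk size, not confirmed here


class ToyFileStore:
    """Hypothetical model: every stored object, master or chunk, takes the next file ID."""

    def __init__(self, next_id=123):
        self._ids = count(next_id)

    def upload(self, size_bytes):
        # The master record is allocated first.
        master = f"F{next(self._ids)}"
        if size_bytes < CHUNK_THRESHOLD:
            return {"master": master, "chunks": []}
        # Each chunk is itself a file, so it consumes an ID of its own.
        n_chunks = math.ceil(size_bytes / CHUNK_SIZE)
        chunks = [f"F{next(self._ids)}" for _ in range(n_chunks)]
        return {"master": master, "chunks": chunks}


store = ToyFileStore(next_id=123)
big = store.upload(10 * 1024 * 1024)   # chunked: master plus three chunk files
small = store.upload(1024)             # next upload skips past the chunk IDs
print(big)
print(small)
```

In this model the encryption question applies to the chunk entries, since they hold the data, while the master entry is only a record that points at them.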

When you use a command like bin/files encode --all --as aes-256-cbc, all the chunk data will be encrypted. The master file record is never encrypted.

If you are just trying to encrypt everything, --all will encrypt all existing chunked files.

There is no way to selectively encrypt the chunks of a particular chunked file without looking up all the chunk numbers in the database. I believe this is unlikely to be useful/interesting, which is why we don't support it.

The "Format: Raw" refers to the chunk record itself. After D16636, this will read "Format: Chunks" instead.

The actual chunk data is encrypted if uploaded with encryption enabled or later encrypted with bin/files encode --all. You can verify this by examining the underlying datastore (MySQL, local disk, or S3) and looking at the content.
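One rough way to "look at the content" is to measure byte entropy on a chunk's raw stored bytes once you have pulled them from the datastore. The sketch below is a heuristic, not a Phabricator tool: the function names and the 7.5 bits-per-byte threshold are assumptions. It is only meaningful when the original file was low-entropy (for example, plain text), because compressed or media files score high even when unencrypted.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Heuristic only: ciphertext should be close to 8 bits per byte.

    ASSUMPTION: the 7.5 threshold is an arbitrary illustrative cutoff.
    """
    return shannon_entropy(data) >= threshold


# Stand-ins for bytes read from storage: repetitive text vs. random bytes.
plaintext_like = b"hello world, this is a plain text chunk\n" * 2000
random_like = os.urandom(64 * 1024)

print(looks_encrypted(plaintext_like))
print(looks_encrypted(random_like))
```

A simpler check for a text file is to open the stored blob and see whether the original text is readable.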