Add a chunking storage engine for files
Closed, Public

Authored by epriestley on Mar 13 2015, 12:10 AM.

Details

Summary

Ref T7149. This isn't complete and isn't active yet, but does basically work. I'll shore it up in the next few diffs.

The new workflow goes like this:

Client, file.allocate(): I'd like to upload a file with length L, metadata M, and hash H.

Then the server returns upload (a boolean) and filePHID (a PHID). These mean:

| upload | filePHID | means |
| ------ | -------- | ----- |
| false  | false    | Server can't accept file. |
| false  | true     | File data already known, file created from hash. |
| true   | false    | Just upload normally. |
| true   | true     | Query chunks to start or resume a chunked upload. |

All but the last case are uninteresting and work like existing uploads with file.uploadhash (which we can eventually deprecate).
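As a rough illustration of the four cases, here is a minimal Python sketch of how a client might branch on the file.allocate() response. The function name, the action labels, and the use of None for "no filePHID" are assumptions made for illustration; they are not part of this diff.

```python
from typing import Optional


def next_step(upload: bool, file_phid: Optional[str]) -> str:
    """Map a file.allocate() response to the client's next action.

    Hypothetical helper: the returned labels are invented for this sketch.
    """
    has_phid = file_phid is not None
    if not upload and not has_phid:
        return "rejected"         # server can't accept the file
    if not upload and has_phid:
        return "already-stored"   # data known by hash; file already created
    if upload and not has_phid:
        return "upload-normally"  # ordinary single-request upload
    return "upload-chunks"        # query chunks, then start or resume
```

Only the final branch leads into the chunk workflow; the other three end the conversation or fall back to the existing upload path.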

In the last case:

Client, file.querychunks(): Give me a list of chunks that I should upload.

This returns all the chunks for the file. Chunks have a start byte, an end byte, and a "complete" flag to indicate that the server already has the data.
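To make the chunk list concrete, here is a small Python sketch of how a file of a given length could be split into chunk records. The Chunk class, the plan_chunks name, and the choice to treat the end byte as exclusive are all assumptions; the diff may define the boundaries differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    start: int      # offset of the first byte in the chunk
    end: int        # offset one past the last byte (exclusive: an assumption)
    complete: bool  # True once the server already has this chunk's data


def plan_chunks(length: int, chunk_size: int) -> List[Chunk]:
    """Split a file of `length` bytes into consecutive chunks."""
    if length < 0 or chunk_size <= 0:
        raise ValueError("length must be >= 0 and chunk_size must be > 0")
    return [
        Chunk(start, min(start + chunk_size, length), False)
        for start in range(0, length, chunk_size)
    ]
```

With a hypothetical 570-byte file and 32-byte chunks, plan_chunks(570, 32) yields 18 chunks, the last one covering only the final 26 bytes.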

Then, the client fills in chunks by sending them:

Client, file.uploadchunk(): Here is the data for one chunk.
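The query-then-fill loop might look roughly like the Python sketch below. FakeChunkServer is an in-memory stand-in written for this example, not Phabricator code, and the method names and dictionary keys are assumptions rather than the real Conduit parameters.

```python
from typing import Dict, List


class FakeChunkServer:
    """In-memory stand-in for the server side (hypothetical, for illustration)."""

    def __init__(self, length: int, chunk_size: int) -> None:
        self.ranges = [
            (start, min(start + chunk_size, length))
            for start in range(0, length, chunk_size)
        ]
        self.data: Dict[int, bytes] = {}  # chunk start offset -> chunk bytes

    def query_chunks(self) -> List[dict]:
        # Mirrors file.querychunks(): every chunk, with a "complete" flag.
        return [
            {"start": s, "end": e, "complete": s in self.data}
            for s, e in self.ranges
        ]

    def upload_chunk(self, start: int, data: bytes) -> None:
        # Mirrors file.uploadchunk(): store the data for one chunk.
        self.data[start] = data

    def contents(self) -> bytes:
        return b"".join(self.data[s] for s, _ in self.ranges)


def upload_missing(server: FakeChunkServer, payload: bytes) -> int:
    """Send only the chunks the server lacks; return how many were sent."""
    sent = 0
    for chunk in server.query_chunks():
        if chunk["complete"]:
            continue  # already on the server, so a resumed upload skips it
        server.upload_chunk(chunk["start"], payload[chunk["start"]:chunk["end"]])
        sent += 1
    return sent
```

Because the client always asks which chunks are missing before sending anything, starting and resuming an upload share the same code path.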

This stuff doesn't work yet or has some caveats:

  • I haven't tested resume much.
  • Files need an "isPartial()" flag for partial uploads, and the UI needs to respect it.
  • The JS client needs to become chunk-aware.
  • Chunk size is set crazy low to make testing easier.
  • There are some debugging flags that I'll remove soon-ish.
  • Downloading works, but still streams the whole file into memory.
  • This storage engine is disabled by default (hardcoded as a unit test engine) because it's still sketchy.
  • Need some code to remove the "isPartial" flag when the last chunk is uploaded.
  • Maybe do checksumming on chunks.
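One way the missing "isPartial" bookkeeping could work is sketched below in Python. This is only a guess at the shape of the logic: the PartialFile class, its fields, and the method name are hypothetical and do not come from this diff.

```python
from typing import List


class PartialFile:
    """Hypothetical model of a chunked file that tracks per-chunk completion."""

    def __init__(self, chunk_count: int) -> None:
        self.complete: List[bool] = [False] * chunk_count
        # A file with outstanding chunks starts out partial.
        self.is_partial: bool = chunk_count > 0

    def mark_chunk_complete(self, index: int) -> None:
        """Record one uploaded chunk; clear the flag once none are missing."""
        self.complete[index] = True
        if all(self.complete):
            self.is_partial = False
```

Checking after every chunk, rather than only after the highest-numbered one, keeps the flag correct when chunks arrive out of order or an upload is resumed.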
Test Plan
  • Hacked up arc upload (see next diff) to be chunk-aware and uploaded a readme in 18 32-byte chunks. Then downloaded it. Got the same file back that I uploaded.
  • File UI now shows some basic chunk info for chunked files:

Screen_Shot_2015-03-12_at_5.08.29_PM.png (1×1 px, 159 KB)

Diff Detail

Repository: rP Phabricator
Lint: Lint Not Applicable
Unit: Tests Not Applicable

Event Timeline

epriestley retitled this revision to "Add a chunking storage engine for files".
epriestley updated this object.
epriestley edited the test plan for this revision.
epriestley added a reviewer: btrahan.
btrahan edited edge metadata.
This revision is now accepted and ready to land. (Mar 13 2015, 6:04 PM)
This revision was automatically updated to reflect the committed changes.