Ref T7149. This isn't complete and isn't active yet, but does basically work. I'll shore it up in the next few diffs.
The new workflow goes like this:
Client, file.allocate(): I'd like to upload a file with length L, metadata M, and hash H.
Then the server returns upload (a boolean) and filePHID (a PHID). These mean:
| upload | filePHID | Meaning |
|---|---|---|
| false | false | Server can't accept file. |
| false | true | File data already known; file created from hash. |
| true | false | Just upload normally. |
| true | true | Query chunks to start or resume a chunked upload. |
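Roughly, the client side of interpreting the allocate response looks like this. This is a sketch only; the conduit() callable and the parameter names ("contentLength", "contentHash", "metadata") are assumptions based on the description above, not a final API:

```python
# A minimal client-side sketch of interpreting the file.allocate
# response. The conduit() callable and the parameter names are
# assumptions, not a final API.

def allocate_file(conduit, length, content_hash, metadata):
    result = conduit('file.allocate', {
        'contentLength': length,      # L
        'contentHash': content_hash,  # H
        'metadata': metadata,         # M
    })
    upload = result['upload']
    file_phid = result['filePHID']

    if not upload and not file_phid:
        raise RuntimeError('Server cannot accept this file.')
    if not upload:
        # File data already known; file created from hash.
        return ('done', file_phid)
    if not file_phid:
        # Just upload normally.
        return ('simple', None)
    # Query chunks to start or resume a chunked upload.
    return ('chunked', file_phid)
```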
All but the last case are uninteresting and work like existing uploads with file.uploadhash (which we can eventually deprecate).
In the last case:
Client, file.querychunks(): Give me a list of chunks that I should upload.
This returns all the chunks for the file. Chunks have a start byte, an end byte, and a "complete" flag to indicate whether the server already has the data.
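So a resume amounts to fetching the chunk list and keeping only the ranges the server doesn't have yet. Sketch only; the field names ("byteStart", "byteEnd", "complete") are assumptions based on the description above:

```python
# Sketch of picking up a resumable upload: fetch the chunk list and
# keep only the incomplete ranges. Field names are assumptions.

def get_missing_chunks(conduit, file_phid):
    chunks = conduit('file.querychunks', {'filePHID': file_phid})
    return [
        (int(c['byteStart']), int(c['byteEnd']))
        for c in chunks
        if not c['complete']
    ]
```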
Then, the client fills in chunks by sending them:
Client, file.uploadchunk(): Here is the data for one chunk.
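Filling in the chunks then looks something like the sketch below. Reading each (start, end) range from the local file is implied by the description; base64 encoding and the "dataEncoding" parameter are assumptions:

```python
# Sketch of filling in the missing chunks with file.uploadchunk.
# Base64 encoding and the "dataEncoding" parameter are assumptions.

import base64

def upload_missing_chunks(conduit, file_phid, path, missing):
    with open(path, 'rb') as f:
        for start, end in missing:
            f.seek(start)
            data = f.read(end - start)
            conduit('file.uploadchunk', {
                'filePHID': file_phid,
                'byteStart': start,
                'data': base64.b64encode(data).decode('ascii'),
                'dataEncoding': 'base64',
            })
```

So the whole flow is: allocate, query chunks, then upload whatever is still incomplete. Interrupted uploads resume by running the same steps again.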
These things don't work yet, or come with caveats:
- I haven't tested resume much.
- Files need an "isPartial()" flag for partial uploads, and the UI needs to respect it.
- The JS client needs to become chunk-aware.
- Chunk size is set crazy low to make testing easier.
- There are some debugging flags that I'll remove soon-ish.
- Downloading works, but still streams the whole file into memory.
- This storage engine is disabled by default (hardcoded as a unit test engine) because it's still sketchy.
- Need some code to remove the "isPartial" flag when the last chunk is uploaded.
- Maybe do checksumming on chunks (rough sketch of the client side below).
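If per-chunk checksums happen, the client side could be as simple as hashing each chunk body before upload. SHA-256 is a placeholder here; nothing above picks an algorithm:

```python
# Hypothetical per-chunk checksum; algorithm choice is a placeholder.

import hashlib

def chunk_checksum(data):
    return hashlib.sha256(data).hexdigest()
```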