Page MenuHomePhabricator

Dramatically improve subprocess I/O for large buffers
ClosedPublic

Authored by epriestley on Dec 14 2013, 12:50 AM.
Tags
None
Referenced Files
F15434793: D7768.diff
Tue, Mar 25, 5:25 AM
F15432177: D7768.id17575.diff
Mon, Mar 24, 4:39 PM
F15430346: D7768.id17575.diff
Mon, Mar 24, 7:13 AM
F15398547: D7768.id17568.diff
Mon, Mar 17, 12:35 AM
F15392177: D7768.id.diff
Sat, Mar 15, 1:37 PM
F15391584: D7768.id17575.diff
Sat, Mar 15, 10:39 AM
F15391098: D7768.diff
Sat, Mar 15, 8:13 AM
F15390744: D7768.id17569.diff
Sat, Mar 15, 6:32 AM
Subscribers

Details

Summary

Ref T4189. When we write() a large buffer to an ExecFuture, the fwrite() loop currently looks like this:

$bytes_written = fwrite($socket, $buffer);
$buffer = substr($buffer, $bytes_written);

This is normally fine, but substr() is approximately O(N) in the size of the new string, since it has to allocate and copy it.

Instead, add PhutilRope, which stores a string as a list of small buffers. This allows us to remove bytes from the beginning of the string very cheaply.

In particular, this can occur when you git push a very large repository. If we read off the network faster than we write to git receive-pack, we end up with a very large internal buffer which is expensive and slow to write through.

Test Plan

I ran this test script before and after the changes:

<?php

require_once 'scripts/__init_script__.php';

$large = str_repeat('x', 1024 * 1024 * $argv[1]);

echo "OK...\n";

$future = new ExecFuture('cat');
$future->write($large);
$future->resolvex();

echo "OK.\n";

A 32MB write took 16s before the change and 400ms afterward. Generally, cost is close to O(N) now and was close to O(N^2) before, in the size of the buffer.

Diff Detail

Branch
rope
Lint
Lint Passed
Unit
Tests Passed

Event Timeline

epriestley updated this revision to Unknown Object (????).Dec 14 2013, 1:18 AM

Allow write(null, true) to close pipes even if they're already closed.

epriestley added a subscriber: Unknown Object (MLST).Dec 14 2013, 1:30 AM

Facebook, this is a substantial performance improvement in certain ExecFuture edge cases. Last I knew it was still in use in www/, and this might make some stuff faster.

This is pretty cool.

src/utils/PhutilRope.php
14

How come 16384 ?

src/utils/PhutilRope.php
14

Oh, yeah, I should leave a comment about this -- his is the just the process buffer size on OSX.