Page MenuHomePhabricator

Dramatically improve subprocess I/O for large buffers
ClosedPublic

Authored by epriestley on Dec 14 2013, 12:50 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Dec 21, 5:01 PM
Unknown Object (File)
Wed, Dec 18, 4:50 AM
Unknown Object (File)
Thu, Dec 12, 6:17 PM
Unknown Object (File)
Wed, Dec 11, 5:52 PM
Unknown Object (File)
Mon, Dec 9, 3:48 PM
Unknown Object (File)
Sun, Dec 8, 6:03 PM
Unknown Object (File)
Thu, Dec 5, 8:00 AM
Unknown Object (File)
Thu, Dec 5, 4:05 AM
Subscribers

Details

Summary

Ref T4189. When we write() a large buffer to an ExecFuture, the fwrite() loop currently looks like this:

$bytes_written = fwrite($socket, $buffer);
$buffer = substr($buffer, $bytes_written);

This is normally fine, but substr() is approximately O(N) in the size of the new string, since it has to allocate and copy it.

Instead, add PhutilRope, which stores a string as a list of small buffers. This allows us to remove bytes from the beginning of the string very cheaply.

In particular, this can occur when you git push a very large repository. If we read off the network faster than we write to git receive-pack, we end up with a very large internal buffer which is expensive and slow to write through.

Test Plan

I ran this test script before and after the changes:

<?php

require_once 'scripts/__init_script__.php';

$large = str_repeat('x', 1024 * 1024 * $argv[1]);

echo "OK...\n";

$future = new ExecFuture('cat');
$future->write($large);
$future->resolvex();

echo "OK.\n";

A 32MB write took 16s before the change and 400ms afterward. Generally, cost is close to O(N) now and was close to O(N^2) before, in the size of the buffer.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

epriestley updated this revision to Unknown Object (????).Dec 14 2013, 1:18 AM

Allow write(null, true) to close pipes even if they're already closed.

epriestley added a subscriber: Unknown Object (MLST).Dec 14 2013, 1:30 AM

Facebook, this is a substantial performance improvement in certain ExecFuture edge cases. Last I knew it was still in use in www/, and this might make some stuff faster.

This is pretty cool.

src/utils/PhutilRope.php
15

How come 16384 ?

src/utils/PhutilRope.php
15

Oh, yeah, I should leave a comment about this -- his is the just the process buffer size on OSX.