Page MenuHomePhabricator

Fix an issue where subprocesses could have data left on stdout/stderr
ClosedPublic

Authored by epriestley on Dec 31 2015, 9:59 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Nov 20, 8:32 AM
Unknown Object (File)
Fri, Nov 15, 9:52 PM
Unknown Object (File)
Thu, Nov 14, 7:43 AM
Unknown Object (File)
Tue, Nov 12, 10:45 AM
Unknown Object (File)
Sun, Nov 10, 3:11 PM
Unknown Object (File)
Fri, Nov 8, 1:57 AM
Unknown Object (File)
Oct 16 2024, 6:26 PM
Unknown Object (File)
Oct 14 2024, 10:55 PM
Subscribers
None

Details

Summary

Ref T9724. This is likely a fix for that issue, which could cause hangs on git clone and similar requests.

It appears that we were sometimes leaving data on the subprocess pipes. Specifically, here's what would happen:

  • we'd have full or nearly-full read buffers on the ExecFuture;
  • we'd update the subprocess;
  • it would read only some of the data, but not be able to read everything (since the buffers were full, or became full after reading a little bit);
  • however, since the subprocess had exited, we'd terminate the subprocess and close everything down.

In this situation, we would never read the final bytes off stdout or stderr.

This could lead to git clone correctly hanging, waiting for those last few bytes, which would never come.

Test Plan
  • Ran git clone ... against a large repository (Phabricator) in a loop for a long time.
  • Before the patch, it would very occasionally hang randomly, on the "Receiving objects: ..." step (on my local install, I could get this maybe 2-3% of the time?)
  • After the patch, I can't get it to hang anymore and have been running it for ~10x longer than it ever needed before.

Beyond this, I added a bunch of debugging/logging which seemed to confirm that my analysis/fix are correct. Specifically:

  • I added a log when we were tearing down the process to see if either stdout or stderr were not feof()'d. When things hung, this message logged; normally, it did not.
  • I added logging to show the final bytes read from the subprocess and the final bytes sent to the client. The correct final bytes look something like ...\002Total 190735 (delta 132738), reused 186896 (delta 129329)\n0000. When the error occurred, different final bytes were observed.

Diff Detail

Repository
rPHU libphutil
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

epriestley retitled this revision from to Fix an issue where subprocesses could have data left on stdout/stderr.
epriestley updated this object.
epriestley edited the test plan for this revision. (Show Details)
epriestley added a reviewer: chad.
chad edited edge metadata.
This revision is now accepted and ready to land.Dec 31 2015, 10:01 PM
This revision was automatically updated to reflect the committed changes.