We have one lingering, repeatedly-failing task that when it runs burns a CPU for a long time and consumes memory until it eventually fails with ENOMEM.
Symptoms are similar to T11743, though the exact pattern differs, and it seems to be in a different part of the code. (And D16682 doesn't fix this one.) Crucially for my personal immediate happiness, this one doesn't cause the system to go unavailable, or not as conspicuously so -- either because of the subtle differences in the symptoms, or just because we happen to have only one affected task instead of a posse of them.
In brief: the task consumes tons of memory while computing various diffs in a Differential revision. Ultimately it fails when it tries to fork a process (a simple diff, it looks like) after already getting up to about 16GB of memory. The most painful case here for the code appears to be when we try to compute a nice diff on a giant, multi-MB line.
I should go home and eat dinner and things, but I may come back tomorrow and attempt a repro. Or perhaps these traces will be enough already to debug the issue and make a fix.
TRACES
The backtrace from SIGHUP, in the middle of the hang while memory use is growing, looks like this (full copy below):
#2 /phab_path/libphutil/src/utils/utf8.php(789): phutil_utf8v_combine_characters(Array) #3 /phab_path/arcanist/src/difference/ArcanistDiffUtils.php(160): phutil_utf8v_combined('{"version":3,"s...') #4 /phab_path/arcanist/src/difference/ArcanistDiffUtils.php(59): ArcanistDiffUtils::computeIntralineEdits('{"version":3,"s...', '{"version":3,"s...') #5 /phab_path/phabricator/src/applications/differential/parser/DifferentialHunkParser.php(347): ArcanistDiffUtils::generateIntralineDiff('{"version":3,"s...', '{"version":3,"s...') #6 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(745): DifferentialHunkParser->generateIntraLineDiffs() #7 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(645): DifferentialChangesetParser->process()
The trace from --trace has several iterations like this:
>>> [75] <exec> $ pygmentize -O encoding=utf-8 -O stripnl=False -f html -l 'html+evoque' >>> [76] <exec> $ pygmentize -O encoding=utf-8 -O stripnl=False -f html -l 'html+evoque' <<< [75] <exec> 107,102 us <<< [76] <exec> 104,431 us >>> [77] <exec> $ diff '-bw' '-U65535' '-L' '/dev/universe 9999-99-99' '-L' '/dev/universe 9999-99-99' '/tmp/beqle54vuq88g8so/21297-Q4vlwR' '/tmp/dhoisyi69zk84csk/21297-c2v0aA' <<< [77] <exec> 8,708 us >>> [78] <event> differential.willMarkGenerated <listeners = 1> <<< [78] <event> 266 us
Then it does this, and then hangs for a while, spinning the CPU:
>>> [79] <exec> $ pygmentize -O encoding=utf-8 -O stripnl=False -f html -l 'ts' >>> [80] <exec> $ pygmentize -O encoding=utf-8 -O stripnl=False -f html -l 'ts' <<< [79] <exec> 111,197 us <<< [80] <exec> 107,945 us >>> [81] <event> differential.willMarkGenerated <listeners = 1> <<< [81] <event> 257 us >>> [82] <exec> $ diff '-bw' '-U65535' '-L' '/dev/universe 9999-99-99' '-L' '/dev/universe 9999-99-99' '/tmp/4rxajs43r7acscsg/21297-pSmAQk' '/tmp/10x24j128vjkoc8k/21297-a0nqz5' <<< [82] <exec> 161,331 us
At this point it's using about 3GB of memory, and the number isn't growing.
Then after a few minutes it actually prints another pair of lines, and after a few more minutes another:
>>> [83] <event> differential.willMarkGenerated <listeners = 1> <<< [83] <event> 420 us
During this time there are bursts of growing memory use, and periods of stable memory use. It's continuously burning one core of CPU.
And a few more minutes later, we get to the end, including an exception where fork fails for lack of memory. At this point the process has had stable memory use for a while -- at nearly 16GiB, nearly half the memory on the machine, which is presumably why the kernel says "no thanks" to forking that.
Handily, this exception comes with its own backtrace:
>>> [84] <event> differential.willMarkGenerated <listeners = 1> <<< [84] <event> 420 us >>> [85] <exec> $ diff '-bw' '-U65535' '-L' '/dev/universe 9999-99-99' '-L' '/dev/universe 9999-99-99' '/tmp/a0kgzfdxruo0k4wg/21297-dDSlRR' '/tmp/diolz5sd7e8sk8o8/21297-rZbt9D' >>> [86] <query> UPDATE `worker_activetask` SET `failureTime` = '1475805673', `taskClass` = 'PhabricatorApplicationTransactionPublishWorker', `leaseOwner` = NULL, `leaseExpires` = '1475805400', `failureCount` = '1754', `dataID` = '29804072', `priority` = '1000', `objectPHID` = 'PHID-DREV-olkoii564vunfq2hdkmw', `id` = '29804024' WHERE `id` = '29804024' <<< [86] <query> 4,992 us [2016-10-07 02:01:13] EXCEPTION: (Exception) Failed to `proc_open()`: proc_open(): fork failed - Cannot allocate memory at [<phutil>/src/future/exec/ExecFuture.php:675] arcanist(head=master, ref.master=d455ddde6a9c), phabricator(head=master, ref.master=fd5825fafdcb, custom=1), phutil(head=master, ref.master=6c65998b0b1c), pytest-phabricator(head=1550854ff89dca474176d307f6a485cd5eff0452, ref.master=1939f1e4f656) #0 ExecFuture::isReady() called at [<phutil>/src/future/Future.php:37] #1 Future::resolve() called at [<phutil>/src/future/exec/execx.php:37] #2 exec_manual(string, array, TempFile, TempFile) called at [<phabricator>/src/infrastructure/diff/PhabricatorDifferenceEngine.php:100] #3 PhabricatorDifferenceEngine::generateRawDiffFromFileContent(string, string) called at [<phabricator>/src/infrastructure/diff/PhabricatorDifferenceEngine.php:163] #4 PhabricatorDifferenceEngine::generateChangesetFromFileContent(string, string) called at [<phabricator>/src/applications/differential/parser/DifferentialChangesetParser.php:710] #5 DifferentialChangesetParser::process() called at [<phabricator>/src/applications/differential/parser/DifferentialChangesetParser.php:645] [... and then the same as from SIGHUP below.]
By my reading of these traces, the point where the exception is thrown isn't the main code at fault -- the real problem is that we're allocating such a huge amount of memory in the earlier phases.
Looking on the web at the diff the --trace output seems to refer to, it contains several large files that look like generated, minified CSS and JS files. In particular these have some gigantic lines which have changed. On the web we seem to actually successfully render most of the diffs, though I notice some busy apache2 processes for a while when I load the page... aha, in fact there's one file it tries to render and times out on, with the "Loading..." box sticking around forever after the apache2 process stops.
That file is one single line of 7387356 bytes (in both old and new versions), a giant JSON document with an extension .js.map that looks like a generated sourcemap. There are a few diffs scattered through it.
YET MORE DETAIL
Full backtrace from SIGHUP:
#0 /phab_path/libphutil/src/future/exec/PhutilSignalRouter.php(61): PhutilBacktraceSignalHandler->handleSignal(Object(PhutilSignalRouter), 1) #1 /phab_path/libphutil/src/utils/utf8.php(832): PhutilSignalRouter->routeSignal(1) #2 /phab_path/libphutil/src/utils/utf8.php(789): phutil_utf8v_combine_characters(Array) #3 /phab_path/arcanist/src/difference/ArcanistDiffUtils.php(160): phutil_utf8v_combined('{"version":3,"s...') #4 /phab_path/arcanist/src/difference/ArcanistDiffUtils.php(59): ArcanistDiffUtils::computeIntralineEdits('{"version":3,"s...', '{"version":3,"s...') #5 /phab_path/phabricator/src/applications/differential/parser/DifferentialHunkParser.php(347): ArcanistDiffUtils::generateIntralineDiff('{"version":3,"s...', '{"version":3,"s...') #6 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(745): DifferentialHunkParser->generateIntraLineDiffs() #7 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(645): DifferentialChangesetParser->process() #8 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(859): DifferentialChangesetParser->tryCacheStuff() #9 /phab_path/phabricator/src/applications/differential/mail/DifferentialChangeDetailMailView.php(53): DifferentialChangesetParser->render() #10 /phab_path/phabricator/src/applications/differential/editor/DifferentialTransactionEditor.php(1424): DifferentialChangeDetailMailView->buildMailSection() #11 /phab_path/phabricator/src/applications/differential/editor/DifferentialTransactionEditor.php(1271): DifferentialTransactionEditor->appendChangeDetailsForMail(Object(DifferentialRevision), Object(DifferentialDiff), 'Index: arp/arp/...', Object(PhabricatorMetaMTAMailBody)) #12 /phab_path/phabricator/src/applications/transactions/editor/PhabricatorApplicationTransactionEditor.php(2511): DifferentialTransactionEditor->buildMailBody(Object(DifferentialRevision), Array) #13 /phab_path/phabricator/src/applications/transactions/editor/PhabricatorApplicationTransactionEditor.php(2468): PhabricatorApplicationTransactionEditor->buildMailForTarget(Object(DifferentialRevision), Array, Object(PhabricatorMailTarget)) #14 /phab_path/phabricator/src/applications/transactions/editor/PhabricatorApplicationTransactionEditor.php(1150): PhabricatorApplicationTransactionEditor->buildMail(Object(DifferentialRevision), Array) #15 /phab_path/phabricator/src/applications/transactions/worker/PhabricatorApplicationTransactionPublishWorker.php(21): PhabricatorApplicationTransactionEditor->publishTransactions(Object(DifferentialRevision), Array) #16 /phab_path/phabricator/src/infrastructure/daemon/workers/PhabricatorWorker.php(123): PhabricatorApplicationTransactionPublishWorker->doWork() #17 /phab_path/phabricator/src/infrastructure/daemon/workers/storage/PhabricatorWorkerActiveTask.php(171): PhabricatorWorker->executeTask() #18 /phab_path/phabricator/src/infrastructure/daemon/workers/management/PhabricatorWorkerManagementExecuteWorkflow.php(52): PhabricatorWorkerActiveTask->executeTask() #19 /phab_path/libphutil/src/parser/argument/PhutilArgumentParser.php(438): PhabricatorWorkerManagementExecuteWorkflow->execute(Object(PhutilArgumentParser)) #20 /phab_path/libphutil/src/parser/argument/PhutilArgumentParser.php(330): PhutilArgumentParser->parseWorkflowsFull(Array) #21 /phab_path/phabricator/scripts/setup/manage_worker.php(21): PhutilArgumentParser->parseWorkflows(Array) #22 {main}