We have a lingering, repeatedly-failing task that when it runs burns a CPU for a long time before eventually failing. This is the same task that starred in T11744, and after that's fixed (with D16683) it gets farther and seems to reach a different bug.
Running the task now under bin/worker execute --trace, it runs for a little over 5 minutes, continuously spinning a CPU core; gets to the point of making a bunch of database INSERTs and UPDATEs; and after a bunch of those have apparently succeeded, there's this:
>>>  <query> INSERT INTO `metamta_mail` (`actorPHID`, `parameters`, `status`, `message`, `relatedPHID`, `phid`, `dateCreated`, `dateModified`) VALUES [redacted] <<<  <query> 233,365 us >>>  <query> UPDATE `worker_activetask` SET `failureTime` = '1475890723', `taskClass` = 'PhabricatorApplicationTransactionPublishWorker', `leaseOwner` = NULL, `leaseExpires` = '1475890693', `failureCount` = '1885', `dataID` = '29804072', `priority` = '1000', `objectPHID` = 'PHID-DREV-olkoii564vunfq2hdkmw', `id` = '29804024' WHERE `id` = '29804024' <<<  <query> 5,285 us [2016-10-08 01:38:43] EXCEPTION: (AphrontConnectionLostQueryException) #2006: MySQL server has gone away This error may occur if your MySQL 'wait_timeout' or 'max_allowed_packet' configuration values are set too low. at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:304] arcanist(head=master, ref.master=c8f70fab07d7), phabricator(head=master, ref.master=5973df176809), phutil(head=master, ref.master=6c65998b0b1c), pytest-phabricator(head=1550854ff89dca474176d307f6a485cd5eff0452, ref.master=1939f1e4f656) #0 AphrontBaseMySQLDatabaseConnection::throwCommonException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:348] #1 AphrontBaseMySQLDatabaseConnection::throwQueryCodeException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:289] #2 AphrontBaseMySQLDatabaseConnection::throwQueryException(mysqli) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:185] #3 AphrontBaseMySQLDatabaseConnection::executeRawQuery(string) called at [<phutil>/src/xsprintf/queryfx.php:8] #4 queryfx(AphrontMySQLiDatabaseConnection, string, string, string, array, string) #5 call_user_func_array(string, array) called at [<phutil>/src/aphront/storage/connection/AphrontDatabaseConnection.php:58] #6 AphrontDatabaseConnection::query(string, string, string, array, string) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1261] #7 LiskDAO::insertRecordIntoDatabase(string) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1106] #8 LiskDAO::insert() called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1075] #9 LiskDAO::save() called at [<phabricator>/src/applications/metamta/storage/PhabricatorMetaMTAMail.php:355] #10 PhabricatorMetaMTAMail::save() called at [<phabricator>/src/applications/transactions/editor/PhabricatorApplicationTransactionEditor.php:1178] #11 PhabricatorApplicationTransactionEditor::publishTransactions(DifferentialRevision, array) called at [<phabricator>/src/applications/transactions/worker/PhabricatorApplicationTransactionPublishWorker.php:21] [...]
For at least a couple of minutes of that time, it's sitting with no trace output happening, occupying under 250MB of memory (a great improvement!), and a SIGHUP finds it here:
#0 /phab_path/libphutil/src/future/exec/PhutilSignalRouter.php(61): PhutilBacktraceSignalHandler->handleSignal(Object(PhutilSignalRouter), 1) #1 /phab_path/arcanist/src/difference/ArcanistDiffUtils.php(137): PhutilSignalRouter->routeSignal(1) #2 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(574): ArcanistDiffUtils::applyIntralineDiff(Object(PhutilSafeHTML), Array) #3 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(812): DifferentialChangesetParser->applyIntraline(Array, Array, Array) #4 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(645): DifferentialChangesetParser->process() #5 /phab_path/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php(859): DifferentialChangesetParser->tryCacheStuff() [...]
That line 137 where it's interrupted appears to be
$buf .= $str[$i];
so it's building up a big string.
I'm not entirely sure why that MySQL query fails after all the others succeeded -- perhaps the message's guess is correct and the issue is that we took so long to get here? Or perhaps the metamta_mail INSERT, despite apparently succeeding, is so giant that it causes the connection to drop.
In any case, we don't really want to be spending several minutes of CPU rendering the transaction. Probably there's another spot where we ought to apply a limit and bail on fancy diffing, in addition to the one in D16683.