Page MenuHomePhabricator

MySQL connection may not respect timeout?
Closed, ResolvedPublic

Description

We got a user report of a repository failing to update in the cluster. The PullLocal daemon appeared to not be doing much of anything, but it started working again as soon as I attached to it with strace. We did get this in the logs:

Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000] [2016-06-29 17:56:56] EXCEPTION: (AphrontConnectionLostQueryException) #2013: Lost connection to MySQL server at 'reading initial communication packet', system error: 4 at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:297]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000] arcanist(head=stable, ref.master=172c930630a9, ref.stable=18b27b03fa3d), libcore(), phabricator(head=stable, ref.master=4b43667086ae, ref.stable=cadac75b82bb), phutil(head=stable, ref.master=aee76238d694, ref.stable=8aa8612a094b), services(head=stable, ref.master=58a0e84d4e0a, ref.stable=66cf59af4580)
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #0 AphrontBaseMySQLDatabaseConnection::throwCommonException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:334]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #1 AphrontBaseMySQLDatabaseConnection::throwConnectionException(integer, string, string, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:72]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #2 AphrontMySQLiDatabaseConnection::connect() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:101]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #3 AphrontBaseMySQLDatabaseConnection::establishConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:124]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #4 AphrontBaseMySQLDatabaseConnection::requireConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:15]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #5 AphrontMySQLiDatabaseConnection::escapeBinaryString(string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:11]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #6 AphrontMySQLiDatabaseConnection::escapeUTF8String(string) called at [<phutil>/src/xsprintf/qsprintf.php:178]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #7 xsprintf_query(AphrontMySQLiDatabaseConnection, string, integer, string, integer) called at [<phutil>/src/xsprintf/xsprintf.php:70]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #8 xsprintf(string, AphrontMySQLiDatabaseConnection, array) called at [<phutil>/src/xsprintf/qsprintf.php:64]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #9 qsprintf(AphrontMySQLiDatabaseConnection, string, string, string, string, string)
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #10 call_user_func_array(string, array) called at [<phutil>/src/xsprintf/queryfx.php:5]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #11 queryfx(AphrontMySQLiDatabaseConnection, string, string, string, string, string)
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #12 call_user_func_array(string, array) called at [<phutil>/src/xsprintf/queryfx.php:11]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #13 queryfx_all(AphrontMySQLiDatabaseConnection, string, string, string, string, string)
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #14 call_user_func_array(string, array) called at [<phutil>/src/aphront/storage/connection/AphrontDatabaseConnection.php:26]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #15 AphrontDatabaseConnection::queryData(string, string, string, string, string)
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #16 call_user_func_array(array, array) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:535]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #17 LiskDAO::loadRawDataWhere(string, string, string)
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #18 call_user_func_array(array, array) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:476]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #19 LiskDAO::loadAllWhere(string, string, string) called at [<phabricator>/src/applications/repository/daemon/PhabricatorRepositoryPullLocalDaemon.php:278]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #20 PhabricatorRepositoryPullLocalDaemon::loadRepositoryUpdateMessages() called at [<phabricator>/src/applications/repository/daemon/PhabricatorRepositoryPullLocalDaemon.php:457]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #21 PhabricatorRepositoryPullLocalDaemon::waitForUpdates(integer, array) called at [<phabricator>/src/applications/repository/daemon/PhabricatorRepositoryPullLocalDaemon.php:213]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #22 PhabricatorRepositoryPullLocalDaemon::run() called at [<phutil>/src/daemon/PhutilDaemon.php:184]
Daemon 1204 STDE [Wed, 29 Jun 2016 17:56:56 +0000]   #23 PhutilDaemon::execute() called at [<phutil>/scripts/daemon/exec/exec_daemon.php:127]

The expectation is that we set MYSQLI_OPT_CONNECT_TIMEOUT and won't spend more than a few seconds attempting to connect, so failures during connection shouldn't make daemons hang, but it's possible that this particular error bypasses that or that there's a configuration issue with timeout.

Event Timeline

D16194 may resolve this. I'll keep an eye on things after the deployment and close this if we don't see any more issues for a bit.

In the near future, T11044 will also unify the connection establishment pathways.

(Leaving this open to remind myself to keep an eye on it.)

This didn't have a reliable reproduction case, but from limited spot-checking of production logs I don't see it happening any more.