Phabricator should survive a restart and setup checks with an unreachable master
Open, Normal, Public

Description

See PHI36. Currently, restarting Phabricator with an unreachable master may fail during setup checks:

Attempt to connect to phabricator@p:db001.epriestley.com failed with error #2002: Operation timed out.

Underlying trace:

[Thu Aug 17 11:09:25.597583 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337] [2017-08-17 21:09:25] EXCEPTION: (AphrontConnectionQueryException) Attempt to connect to phabricator@p:db001.epriestley.com failed with error #2002: Operation timed out. at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:343]
[Thu Aug 17 11:09:25.598528 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337] arcanist(head=experimental, ref.master=5eda40337bb4, ref.experimental=dc65bfbe5434), corgi(head=master, ref.master=7e90a51a3172), instances(head=stable, ref.master=84a242a24abb, ref.stable=10e152155ed7), ledger(head=master, ref.master=4da4a24b8779), libcore(), phabricator(head=master, ref.master=c9986fd5dee6, custom=1), phutil(head=stable, ref.master=276f6d304b69, ref.stable=ee5ebf668ad4), secure(head=master, ref.master=988cf9bd7958), services(head=master, ref.master=08219d678cee)
[Thu Aug 17 11:09:25.598542 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #0 <#2> AphrontBaseMySQLDatabaseConnection::throwConnectionException(integer, string, string, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:76]
[Thu Aug 17 11:09:25.598546 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #1 <#2> AphrontMySQLiDatabaseConnection::connect() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:101]
[Thu Aug 17 11:09:25.598549 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #2 <#2> AphrontBaseMySQLDatabaseConnection::establishConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:124]
[Thu Aug 17 11:09:25.598553 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #3 <#2> AphrontBaseMySQLDatabaseConnection::requireConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:15]
[Thu Aug 17 11:09:25.598556 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #4 <#2> AphrontMySQLiDatabaseConnection::escapeBinaryString(string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:11]
[Thu Aug 17 11:09:25.598561 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #5 <#2> AphrontMySQLiDatabaseConnection::escapeUTF8String(string) called at [<phutil>/src/xsprintf/qsprintf.php:178]
[Thu Aug 17 11:09:25.598564 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #6 <#2> xsprintf_query(AphrontMySQLiDatabaseConnection, string, integer, string, integer) called at [<phutil>/src/xsprintf/xsprintf.php:70]
[Thu Aug 17 11:09:25.598567 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #7 <#2> xsprintf(string, AphrontMySQLiDatabaseConnection, array) called at [<phutil>/src/xsprintf/qsprintf.php:64]
[Thu Aug 17 11:09:25.598570 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #8 <#2> qsprintf(AphrontMySQLiDatabaseConnection, string, string) called at [<phutil>/src/xsprintf/queryfx.php:5]
[Thu Aug 17 11:09:25.598573 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #9 <#2> queryfx(AphrontMySQLiDatabaseConnection, string, string) called at [<phutil>/src/xsprintf/queryfx.php:13]
[Thu Aug 17 11:09:25.598576 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #10 <#2> queryfx_all(AphrontMySQLiDatabaseConnection, string, string) called at [<phutil>/src/xsprintf/queryfx.php:19]
[Thu Aug 17 11:09:25.598579 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #11 <#2> queryfx_one(AphrontMySQLiDatabaseConnection, string, string) called at [<phabricator>/src/infrastructure/storage/management/PhabricatorStorageManagementAPI.php:305]
[Thu Aug 17 11:09:25.598593 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #12 <#2> PhabricatorStorageManagementAPI::isCharacterSetAvailableOnConnection(string, AphrontMySQLiDatabaseConnection) called at [<phabricator>/src/applications/config/check/PhabricatorMySQLSetupCheck.php:329]
[Thu Aug 17 11:09:25.598597 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #13 <#2> PhabricatorMySQLSetupCheck::executeRefChecks(PhabricatorDatabaseRef) called at [<phabricator>/src/applications/config/check/PhabricatorMySQLSetupCheck.php:12]
[Thu Aug 17 11:09:25.598600 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #14 <#2> PhabricatorMySQLSetupCheck::executeChecks() called at [<phabricator>/src/applications/config/check/PhabricatorSetupCheck.php:63]
[Thu Aug 17 11:09:25.598604 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #15 <#2> PhabricatorSetupCheck::runSetupChecks() called at [<phabricator>/src/applications/config/check/PhabricatorSetupCheck.php:258]
[Thu Aug 17 11:09:25.598607 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #16 <#2> PhabricatorSetupCheck::runNormalChecks() called at [<phabricator>/src/applications/config/engine/PhabricatorSetupEngine.php:26]
[Thu Aug 17 11:09:25.598610 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #17 <#2> PhabricatorSetupEngine::execute() called at [<phabricator>/src/applications/config/check/PhabricatorSetupCheck.php:194]
[Thu Aug 17 11:09:25.598613 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #18 <#2> PhabricatorSetupCheck::willProcessRequest() called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:137]
[Thu Aug 17 11:09:25.598616 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #19 <#2> AphrontApplicationConfiguration::runHTTPRequest(AphrontPHPHTTPSink) called at [<phabricator>/webroot/index.php:17]
[Thu Aug 17 11:09:25.598619 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #20 phlog(AphrontConnectionQueryException) called at [<phabricator>/src/aphront/response/AphrontUnhandledExceptionResponse.php:20]
[Thu Aug 17 11:09:25.598622 2017] [php7:notice] [pid 3817] [client 127.0.0.1:65337]   #21 AphrontUnhandledExceptionResponse::setException(AphrontConnectionQueryException) called at [<phabricator>/webroot/index.php:21]

One way to simulate this:

  • Configure the databases as a master/replica cluster.
  • Give the master an invalid port.
  • Restart Apache.
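
To confirm that the invalid port from the steps above really does make the master unreachable from the web host, a quick probe can help before restarting Apache. This is a minimal sketch and not part of Phabricator: `probe_database_port` is a hypothetical helper, and the host and port passed to it are whatever values were configured for the master. Note that an invalid port on a reachable host typically fails quickly with a refused connection, while the trace above shows a timeout; both are connection failures, but they may not take the same amount of time.

```python
import socket
from typing import Optional, Tuple


def probe_database_port(
    host: str, port: int, timeout: float = 2.0
) -> Tuple[bool, Optional[str]]:
    """Try to open a TCP connection to host:port.

    Returns (True, None) when the connection succeeds, otherwise
    (False, "<error type>: <message>"). This checks only TCP reachability;
    it does not speak the MySQL protocol. Hypothetical helper.
    """
    try:
        # create_connection raises OSError (including timeouts and
        # refused connections) when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as ex:
        return False, f"{type(ex).__name__}: {ex}"
```

Calling `probe_database_port(host, port)` with the deliberately wrong master port should return `False` together with the error text, which indicates the simulated outage is in effect.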

Phabricator should survive setup checks and return to service, albeit in read-only mode. If it cannot, operations staff who restart webservers (or deploy new webservers) during an incident may be unable to bring them back into service.
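
The desired behavior can be sketched roughly as follows. This is an illustration in Python, not Phabricator's implementation: `DatabaseRef`, `ConnectionFailed`, `SetupResult`, and `run_setup_checks` are all hypothetical names. The idea is that a connection failure on one host is recorded as an issue instead of aborting the whole check run, and that losing the master degrades the install to read-only mode as long as some other host still responds.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class ConnectionFailed(Exception):
    """Raised when a database host cannot be reached (hypothetical)."""


@dataclass
class DatabaseRef:
    host: str
    role: str  # "master" or "replica"
    # Callable that raises ConnectionFailed when the host is unreachable.
    connect: Callable[[], None]


@dataclass
class SetupResult:
    read_only: bool
    issues: List[str] = field(default_factory=list)


def run_setup_checks(refs: List[DatabaseRef]) -> SetupResult:
    """Check every host, tolerating individual connection failures."""
    issues: List[str] = []
    reachable_roles = set()

    for ref in refs:
        try:
            ref.connect()
        except ConnectionFailed as ex:
            # Record the failure and keep checking the remaining hosts.
            issues.append(f"{ref.role} {ref.host} is unreachable: {ex}")
            continue
        reachable_roles.add(ref.role)

    if not reachable_roles:
        # With no reachable database at all there is nothing to serve from.
        raise ConnectionFailed("no database host is reachable")

    # Without a reachable master, come up in read-only mode.
    return SetupResult(read_only="master" not in reachable_roles, issues=issues)
```

With an unreachable master and a healthy replica, this sketch returns a result with `read_only` set and one recorded issue, rather than raising out of the setup phase as in the trace above.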

Additionally, we should clearly document the behavior of "disabled" and cluster.read-only, and possibly provide a new option like "pretend-all-connections-to-this-host-fail" (equivalent to changing the port to an invalid one). The documentation should also describe a specific "how to test failover" workflow. Although the existing options have at least somewhat-legitimate reasons to exist and to work the way they do, their behavior is not clear or obvious, and they look like test switches for failover.

See also T12965.