Fixes T9715. Adds a MySQL-based lock to ensure that schema migrations are not applied on multiple hosts simultaneously.
Details
- Reviewers
epriestley - Group Reviewers
Blessed Reviewers - Maniphest Tasks
- T9715: Prevent upgrade scripts from being run multiple times simultaneously
- Commits
- Restricted Diffusion Commit
rP71646062852b: Add a lock to storage upgrade and adjustment
Ran ./bin/storage upgrade concurrently. One invocation was successful whilst the other hit a PhutilLockException.
Diff Detail
- Repository
- rP Phabricator
- Branch
- master
- Lint
Lint Passed - Unit
Tests Passed - Build Status
Buildable 9168 Build 10832: Run Core Tests Build 10831: arc lint + arc unit
Event Timeline
I'm not entirely sure why the unit tests fail...
Exception Command failed with error #255! COMMAND php '/home/josh/workspace/github.com/phacility/phabricator/scripts/sql/manage_storage.php' upgrade --force --no-adjust --namespace 'phabricator_unittest_kucdz6axnh3ugtmzfbk76g2b' STDOUT (empty) STDERR [2015-11-11 23:15:08] EXCEPTION: (AphrontConnectionQueryException) Attempt to connect to root@localhost failed with error #1049: Unknown database 'phabricator_unittest_kucdz6axnh3ugtmzfbk76g2b_repository'. at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:72] arcanist(head=master, ref.master=2db40f995337), phabricator(head=master, ref.master=aa11cd4d2b7f, custom=1), phutil(head=master, ref.master=66bf71f94817) #0 AphrontMySQLiDatabaseConnection::connect() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:97] #1 AphrontBaseMySQLDatabaseConnection::establishConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:120] #2 AphrontBaseMySQLDatabaseConnection::requireConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:156] #3 AphrontBaseMySQLDatabaseConnection::executeRawQuery(string) called at [... (808 more bytes) ... (Run with `--trace` for a full exception trace.)
Don't use a static method.
The lock returned by PhabricatorGlobalLock::newLock is unique for the storage namespace, so it's not necessary to use a static method (in fact, it's wrong).
Unfortunately the unit tests are still failing...
Exception Command failed with error #255! COMMAND php '/home/josh/workspace/github.com/phacility/phabricator/scripts/sql/manage_storage.php' upgrade --force --no-adjust --namespace 'phabricator_unittest_6pf7om7fpm5jxmcjwt3losy7' STDOUT Loading quickstart template... Applying patch 'phabricator:db.xhpast'... Applying patch 'phabricator:20150906.mailinglist.sql'... Applying patch 'phabricator:20150930.drydock.log.1.sql'... Applying patch 'phabricator:20151001.drydock.rname.1.sql'... Applying patch 'phabricator:20151002.dashboard.status.1.sql'... Applying patch 'phabricator:20151002.harbormaster.bparam.1.sql'... Applying patch 'phabricator:20151009.drydock.auth.1.sql'... Applying patch 'phabricator:20151010.drydock.auth.2.sql'... Applying patch 'phabricator:20151013.drydock.op.1.sql'... Applying patch 'phabricator:20151023.harborpolicy.1.sql'... Applying patch 'phabricator:20151023.harborpolicy.2.php'... Applying patch 'phabricator:20151023.patchduration.sql'... Applying patch 'phabricator:20151030.harbormaster.initiator.sql'... Applying patch 'phabricator:20151106.editengine.1.table.sql'... Applying patch 'phabricator:20151106.editengine.2.xactions.sql'... Applying patch 'phabricator:20151106.phame.post.mailkey.1.sql'.... (691 more bytes) ... STDERR [2015-11-15 13:40:22] EXCEPTION: (Exception) Lock 'ph:phabric-7FvKCwb1Slte:PhabricatorStorageManagementWorkflow is not locked by this process! at [<phutil>/src/filesystem/PhutilLock.php:196] arcanist(head=master, ref.master=9dd6eafb5254), phabricator(head=master, ref.master=1363aae8687b, custom=1), phutil(head=master, ref.master=e9ed72483a14) #0 PhutilLock::unlock() called at [<phabricator>/src/infrastructure/storage/management/workflow/PhabricatorStorageManagementWorkflow.php:723] #1 PhabricatorStorageManagementWorkflow::upgradeSchemata(NULL, boolean, boolean) called at [<phabricator>/src/infrastructure/storage/management/workflow/PhabricatorStorageManagementUpgradeWorkflow.php:78] #2 PhabricatorStorageManagementUpgradeWorkflow::execute(PhutilArgumentParser) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:406] #3 PhutilArgumentParser::parseWorkflowsFull(array) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:301] #4 PhutilArgumentParser::pars... (79 more bytes) ... (Run with `--trace` for a full exception trace.)
You have to call $lock->lock() to actually acquire the lock: lock acquisition is not a side effect of lock construction.
This code calls $workflow->lock(), which does new Lock(), but nothing actually calls $lock->lock(). My expectation is that the lock is not acquired, which is why it fails to release.
Your test plan should involve running bin/storage upgrade multiple times in parallel and observing the processes serialize their execution (and/or abort with lock failures).
The unit tests still aren't working...
Exception Command failed with error #255! COMMAND php '/home/josh/workspace/github.com/phacility/phabricator/scripts/sql/manage_storage.php' upgrade --force --no-adjust --namespace 'phabricator_unittest_pwmcasqzaya64mzxrxyfkhua' STDOUT (empty) STDERR [2015-11-16 07:38:13] EXCEPTION: (AphrontConnectionQueryException) Attempt to connect to root@localhost failed with error #1049: Unknown database 'phabricator_unittest_pwmcasqzaya64mzxrxyfkhua_repository'. at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:72] arcanist(head=master, ref.master=66ab1c955d27), phabricator(head=master, ref.master=fce301da06dc, custom=1), phutil(head=master, ref.master=e9ed72483a14) #0 AphrontMySQLiDatabaseConnection::connect() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:97] #1 AphrontBaseMySQLDatabaseConnection::establishConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:120] #2 AphrontBaseMySQLDatabaseConnection::requireConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:156] #3 AphrontBaseMySQLDatabaseConnection::executeRawQuery(string) called at [... (1,179 more bytes) ... (Run with `--trace` for a full exception trace.)
I don't like requiring parent::execute() to be called -- can we final the method, get the lock, and then have it call some abstract protected function executeWithLock() or whatever? Or maybe don't final the execute() since a couple of these workflows don't need the lock (see next section).
We don't need a lock for shell or renamespace, and getting a lock for shell is probably negative/undesirable (limits 1 total active shell per instance?). Locking renamespace is probably moot, but the operation is purely a local file processing disk-oriented operation.
Locking probe, status, etc., isn't really necessary but does help reduce the possibility of generating misleading/confusing output, I think, so leaving those locks seems fine.
The unit tests are failing because PhabricatorGlobalLock uses this code to connect to the database:
// NOTE: Using the 'repository' database somewhat arbitrarily, mostly // because the first client of locks is the repository daemons. We must // always use the same database for all locks, but don't access any // tables so we could use any valid database. We could build a // database-free connection instead, but that's kind of messy and we // might forget about it in the future if we vertically partition the // application. $dao = new PhabricatorRepository(); // NOTE: Using "force_new" to make sure each lock is on its own // connection. $conn = $dao->establishConnection('w', $force_new = true);
...but on a clean database when you first run bin/storage upgrade, the Repository database won't exist yet.
I don't have a terribly clean fix for this. PhabricatorGlobalLock also tries to pool connections. I guess we can do this:
- In PhabricatorGlobalLock, add setUseThisSpecificConnection($conn) or whatever.
- If that property is set, ignore the connection pooling and connection establishment and just use the provided connection explicitly.
- In PhabricatorStorageManagementWorkflow->lock(), get a database-less connection with $this->getAPI()->getConn(null) and pass it to the lock before calling lock().
We'll need to figure out how to deal with this if/when we do shard hosts, but I don't think it makes that problem any more complex (we have the same set of issues in all storage management API code in general).
src/infrastructure/util/PhabricatorGlobalLock.php | ||
---|---|---|
65 | This should just be MySQLDatabaseConnection (or DatabaseConnection? Or something? Go one level up the class tree.), not MySQLi specifically. |