Page MenuHomePhabricator

Add a lock to storage upgrade and adjustment
ClosedPublic

Authored by joshuaspence on Nov 11 2015, 12:17 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, May 3, 6:00 AM
Unknown Object (File)
Thu, Apr 25, 1:00 AM
Unknown Object (File)
Wed, Apr 17, 12:58 PM
Unknown Object (File)
Sun, Apr 14, 3:12 PM
Unknown Object (File)
Fri, Apr 12, 7:57 PM
Unknown Object (File)
Thu, Apr 11, 8:44 AM
Unknown Object (File)
Wed, Apr 10, 12:43 PM
Unknown Object (File)
Wed, Apr 10, 9:48 AM
Subscribers

Details

Summary

Fixes T9715. Adds a MySQL-based lock to ensure that schema migrations are not applied on multiple hosts simultaneously.

Test Plan

Ran ./bin/storage upgrade concurrently. One invocation was successful whilst the other hit a PhutilLockException.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

joshuaspence retitled this revision from to Add a lock to storage upgrade and adjustment.
joshuaspence updated this object.
joshuaspence edited the test plan for this revision. (Show Details)
joshuaspence added a reviewer: epriestley.

I'm not entirely sure why the unit tests fail...

Exception
Command failed with error #255!
COMMAND
php '/home/josh/workspace/github.com/phacility/phabricator/scripts/sql/manage_storage.php' upgrade --force --no-adjust --namespace 'phabricator_unittest_kucdz6axnh3ugtmzfbk76g2b'

STDOUT
(empty)

STDERR
[2015-11-11 23:15:08] EXCEPTION: (AphrontConnectionQueryException) Attempt to connect to root@localhost failed with error #1049: Unknown database 'phabricator_unittest_kucdz6axnh3ugtmzfbk76g2b_repository'. at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:72]
arcanist(head=master, ref.master=2db40f995337), phabricator(head=master, ref.master=aa11cd4d2b7f, custom=1), phutil(head=master, ref.master=66bf71f94817)
  #0 AphrontMySQLiDatabaseConnection::connect() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:97]
  #1 AphrontBaseMySQLDatabaseConnection::establishConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:120]
  #2 AphrontBaseMySQLDatabaseConnection::requireConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:156]
  #3 AphrontBaseMySQLDatabaseConnection::executeRawQuery(string) called at [... (808 more bytes) ...
(Run with `--trace` for a full exception trace.)

Don't use a static method.

The lock returned by PhabricatorGlobalLock::newLock is unique for the storage namespace, so it's not necessary to use a static method (in fact, it's wrong).

Unfortunately the unit tests are still failing...

Exception
Command failed with error #255!
COMMAND
php '/home/josh/workspace/github.com/phacility/phabricator/scripts/sql/manage_storage.php' upgrade --force --no-adjust --namespace 'phabricator_unittest_6pf7om7fpm5jxmcjwt3losy7'

STDOUT
Loading quickstart template...
Applying patch 'phabricator:db.xhpast'...
Applying patch 'phabricator:20150906.mailinglist.sql'...
Applying patch 'phabricator:20150930.drydock.log.1.sql'...
Applying patch 'phabricator:20151001.drydock.rname.1.sql'...
Applying patch 'phabricator:20151002.dashboard.status.1.sql'...
Applying patch 'phabricator:20151002.harbormaster.bparam.1.sql'...
Applying patch 'phabricator:20151009.drydock.auth.1.sql'...
Applying patch 'phabricator:20151010.drydock.auth.2.sql'...
Applying patch 'phabricator:20151013.drydock.op.1.sql'...
Applying patch 'phabricator:20151023.harborpolicy.1.sql'...
Applying patch 'phabricator:20151023.harborpolicy.2.php'...
Applying patch 'phabricator:20151023.patchduration.sql'...
Applying patch 'phabricator:20151030.harbormaster.initiator.sql'...
Applying patch 'phabricator:20151106.editengine.1.table.sql'...
Applying patch 'phabricator:20151106.editengine.2.xactions.sql'...
Applying patch 'phabricator:20151106.phame.post.mailkey.1.sql'.... (691 more bytes) ...

STDERR
[2015-11-15 13:40:22] EXCEPTION: (Exception) Lock 'ph:phabric-7FvKCwb1Slte:PhabricatorStorageManagementWorkflow is not locked by this process! at [<phutil>/src/filesystem/PhutilLock.php:196]
arcanist(head=master, ref.master=9dd6eafb5254), phabricator(head=master, ref.master=1363aae8687b, custom=1), phutil(head=master, ref.master=e9ed72483a14)
  #0 PhutilLock::unlock() called at [<phabricator>/src/infrastructure/storage/management/workflow/PhabricatorStorageManagementWorkflow.php:723]
  #1 PhabricatorStorageManagementWorkflow::upgradeSchemata(NULL, boolean, boolean) called at [<phabricator>/src/infrastructure/storage/management/workflow/PhabricatorStorageManagementUpgradeWorkflow.php:78]
  #2 PhabricatorStorageManagementUpgradeWorkflow::execute(PhutilArgumentParser) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:406]
  #3 PhutilArgumentParser::parseWorkflowsFull(array) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:301]
  #4 PhutilArgumentParser::pars... (79 more bytes) ...
(Run with `--trace` for a full exception trace.)

Remove some unrelated changes

You have to call $lock->lock() to actually acquire the lock: lock acquisition is not a side effect of lock construction.

This code calls $workflow->lock(), which does new Lock(), but nothing actually calls $lock->lock(). My expectation is that the lock is not acquired, which is why it fails to release.

Your test plan should involve running bin/storage upgrade multiple times in parallel and observing the processes serialize their execution (and/or abort with lock failures).

epriestley edited edge metadata.
This revision now requires changes to proceed.Nov 15 2015, 3:34 PM

You have to call $lock->lock() to actually acquire the lock: lock acquisition is not a side effect of lock construction.

Weird, I thought I had done that... will fix now.

The unit tests still aren't working...

Exception
Command failed with error #255!
COMMAND
php '/home/josh/workspace/github.com/phacility/phabricator/scripts/sql/manage_storage.php' upgrade --force --no-adjust --namespace 'phabricator_unittest_pwmcasqzaya64mzxrxyfkhua'

STDOUT
(empty)

STDERR
[2015-11-16 07:38:13] EXCEPTION: (AphrontConnectionQueryException) Attempt to connect to root@localhost failed with error #1049: Unknown database 'phabricator_unittest_pwmcasqzaya64mzxrxyfkhua_repository'. at [<phutil>/src/aphront/storage/connection/mysql/AphrontMySQLiDatabaseConnection.php:72]
arcanist(head=master, ref.master=66ab1c955d27), phabricator(head=master, ref.master=fce301da06dc, custom=1), phutil(head=master, ref.master=e9ed72483a14)
  #0 AphrontMySQLiDatabaseConnection::connect() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:97]
  #1 AphrontBaseMySQLDatabaseConnection::establishConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:120]
  #2 AphrontBaseMySQLDatabaseConnection::requireConnection() called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:156]
  #3 AphrontBaseMySQLDatabaseConnection::executeRawQuery(string) called at [... (1,179 more bytes) ...
(Run with `--trace` for a full exception trace.)
joshuaspence edited edge metadata.
epriestley edited edge metadata.

I don't like requiring parent::execute() to be called -- can we final the method, get the lock, and then have it call some abstract protected function executeWithLock() or whatever? Or maybe don't final the execute() since a couple of these workflows don't need the lock (see next section).


We don't need a lock for shell or renamespace, and getting a lock for shell is probably negative/undesirable (limits 1 total active shell per instance?). Locking renamespace is probably moot, but the operation is purely a local file processing disk-oriented operation.

Locking probe, status, etc., isn't really necessary but does help reduce the possibility of generating misleading/confusing output, I think, so leaving those locks seems fine.


The unit tests are failing because PhabricatorGlobalLock uses this code to connect to the database:

// NOTE: Using the 'repository' database somewhat arbitrarily, mostly
// because the first client of locks is the repository daemons. We must
// always use the same database for all locks, but don't access any
// tables so we could use any valid database. We could build a
// database-free connection instead, but that's kind of messy and we
// might forget about it in the future if we vertically partition the
// application.
$dao = new PhabricatorRepository();

// NOTE: Using "force_new" to make sure each lock is on its own
// connection.
$conn = $dao->establishConnection('w', $force_new = true);

...but on a clean database when you first run bin/storage upgrade, the Repository database won't exist yet.

I don't have a terribly clean fix for this. PhabricatorGlobalLock also tries to pool connections. I guess we can do this:

  • In PhabricatorGlobalLock, add setUseThisSpecificConnection($conn) or whatever.
  • If that property is set, ignore the connection pooling and connection establishment and just use the provided connection explicitly.
  • In PhabricatorStorageManagementWorkflow->lock(), get a database-less connection with $this->getAPI()->getConn(null) and pass it to the lock before calling lock().

We'll need to figure out how to deal with this if/when we do shard hosts, but I don't think it makes that problem any more complex (we have the same set of issues in all storage management API code in general).

This revision now requires changes to proceed.Nov 23 2015, 4:33 PM
joshuaspence edited edge metadata.

Add useSpecificConnection method

epriestley edited edge metadata.
epriestley added inline comments.
src/infrastructure/util/PhabricatorGlobalLock.php
65

This should just be MySQLDatabaseConnection (or DatabaseConnection? Or something? Go one level up the class tree.), not MySQLi specifically.

This revision is now accepted and ready to land.Nov 30 2015, 9:15 PM
joshuaspence marked an inline comment as done.
joshuaspence edited edge metadata.

Change typehint

This revision was automatically updated to reflect the committed changes.