Page MenuHomePhabricator

Show task duplicates as related objects in Maniphest and migrate old duplicates

Authored by amckinley on May 27 2017, 1:14 AM.
Referenced Files
Unknown Object (File)
Sun, Jul 3, 9:45 AM
Unknown Object (File)
Sun, Jun 26, 7:48 AM
Unknown Object (File)
Sun, Jun 26, 12:27 AM
Unknown Object (File)
Fri, Jun 24, 3:17 AM
Unknown Object (File)
Mon, Jun 20, 11:18 AM
Unknown Object (File)
Thu, Jun 16, 4:26 PM
Unknown Object (File)
Wed, Jun 15, 2:31 AM
Unknown Object (File)
Mon, Jun 13, 12:39 AM
"Mountain of Wealth" token, awarded by epriestley.



Does the UI work that's part of T12234 and adds migrations for both of the old-style duplicate transactions.

Test Plan
  • Started with a clean DB.
  • Checked out really old code that marks tasks as dupes using comments.
  • Made a bunch of tasks and closed some as dupes. Made a bunch of additional comments.
  • Checked out D10427 and did a storage upgrade.
  • Made a bunch more new tasks and dupes.
  • Snapshotted DB.
  • Ran migration repeatedly until all expected edges showed up in the phabricator_maniphest.edgetable.

Diff Detail

rP Phabricator
Lint Not Applicable
Tests Not Applicable

Event Timeline

  • add migration for D10427-style task merges

The new UI stuff looks right, some probable improvements for the migration part.


Prefer the more modern withStatuses(<list of specific statuses>) -- I'd like to get rid of withStatus() eventually.


It's possible that we'll get more than one merge here. A task can be merged, reopened, then merged again. Or just merged, then merged. In these cases, executeOne() will find multiple rows and throw an exception ("Expected exactly zero or one rows!").


It's possible this will come back empty. Examples include:

  • Deleted users.
  • Operations performed by non-user actors, like applications, although merges "shouldn't" have any of these.

We also probably don't actually need to load the User object (see below).


Instead of doing this much legwork (which will also dump a "zombie" transaction at the end of the transaction list), I think it's sufficient to only write a new ManiphestTaskHasDuplicateTaskEdgeType::EDGECONST edge, using EdgeEditor. Then we won't put a duplicate merge from years ago into the transaction log, don't need to load the merged-into object, don't need to load the author, and don't need to care if any of those are invalid or have been deleted (it's perfectly fine to write an edge to a nonexistent object).

This revision now requires changes to proceed.May 29 2017, 4:25 PM

Tested the migration by reverting to a pre-D10427 revision, making a bunch of tasks with dupes, and then running the migration. Observed expected new rows in phabricator_maniphest.edge table.

Code for creating the relationship was stolen from PhabricatorSearchRelationshipController and banged on until it started working.


Is there a better content source for migrations?


This creates the edges as expected, but it also adds new transactions of type core:edge. Not sure if that's a problem or not (or if the transactions should be post-dated to the timestamp of the original close-as-dupe action).


What about the inverse edge type, ManiphestTaskIsDuplicateOfTaskEdgeType? Will that get written automatically when trying to add an`ManiphestTaskHasDuplicateTaskEdgeType`?

Yeah, the editor will automatically write an inverse edge and detect cycles. So the actual edit will prrrrrobably look something like this:

try {
  $editor = ...
} catch (PhabricatorEdgeCycleException $ex) {
  // Some earlier or later merge made this invalid, just skip it.

(It knows to write the inverse edge because ManiphestTaskHasDuplicateTaskEdgeType->getInverseEdgeConstant() identifies which one to write.)

amckinley edited edge metadata.
  • switch to using EdgeEditor and LiskMigrationIterator

Hmmm this code is definitely creating the new edges (and is definitely not writing any new transactions), but now the UI changes aren't actually showing any dupes.

Oh, maybe this is actually the wrong half of the edge? Should "mergedinto" become ManiphestTaskIsDuplicateOfTaskEdgeType (instead of HasDuplicate)? If so, the dupes UI would probably show up on the other task? The code looks right to me...

Oh, derp, you're right. Edge went the wrong way.

Should I bang out the 2nd migration as part of this diff?

Seems reasonable to me, the actual code isn't too bad so far once we figure out what we actually need.

  • migrate both types
  • update to correctly handle both kinds of dupe logic
  • cleanup
amckinley retitled this revision from Show task duplicates as related objects in Maniphest to Show task duplicates as related objects in Maniphest and migrate old duplicates.Jun 7 2017, 8:04 PM
amckinley edited the summary of this revision. (Show Details)
amckinley edited the test plan for this revision. (Show Details)

add_duplicate_edge() has global scope, so possibly consider a construct like this instead to avoid defining a function:

foreach ($stuff as $thing) {
  $add_edges = array();
  if (1) {
   $add_edges[] = ..;
  } else if (2) {
   $add_edges[] = ...;
  if ($add_edges) {
    id(new EdgeEditor())->...

Defining the function doesn't cause any concrete problems, it's just a bit of a ragged edge that could cause surprises with third-party migrations, third-party libraries, etc., some day.

This revision is now accepted and ready to land.Jun 7 2017, 8:17 PM
This revision was automatically updated to reflect the committed changes.

Ran into this upgrading on old instance on my laptop - anything I can look into?

Applying patch "phabricator:20170528.maniphestdupes.php" to host "localhost"...
[2017-06-07 21:58:46] EXCEPTION: (AphrontQueryException) #1406: Data too long for column 'contentHash' at row 1 at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:355]
arcanist(head=master, ref.master=129d51fa0936), phabricator(head=master, ref.master=c3f905af6c7e), phutil(head=master, ref.master=74a1350416eb)
  #0 AphrontBaseMySQLDatabaseConnection::throwQueryCodeException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:289]

Is that the entire stack trace? The next few frames would probably give us a clearer idea of what happened.


Applying patch "phabricator:20170528.maniphestdupes.php" to host "localhost"...
[2017-06-08 06:43:41] EXCEPTION: (AphrontQueryException) #1406: Data too long for column 'contentHash' at row 1 at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:355]
arcanist(head=master, ref.master=129d51fa0936), phabricator(head=master, ref.master=c3f905af6c7e), phutil(head=master, ref.master=74a1350416eb)
  #0 AphrontBaseMySQLDatabaseConnection::throwQueryCodeException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:289]
  #1 AphrontBaseMySQLDatabaseConnection::throwQueryException(mysqli) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:185]
  #2 AphrontBaseMySQLDatabaseConnection::executeRawQuery(string) called at [<phutil>/src/xsprintf/queryfx.php:8]
  #3 queryfx(AphrontMySQLiDatabaseConnection, string, string, string, array, string)
  #4 call_user_func_array(string, array) called at [<phutil>/src/aphront/storage/connection/AphrontDatabaseConnection.php:58]
  #5 AphrontDatabaseConnection::query(string, string, string, array, string) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1262]
  #6 LiskDAO::insertRecordIntoDatabase(string) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1106]
  #7 LiskDAO::insert() called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1075]
  #8 LiskDAO::save() called at [<phabricator>/src/applications/files/storage/PhabricatorFile.php:153]
  #9 PhabricatorFile::save() called at [<phabricator>/src/applications/files/storage/PhabricatorFile.php:157]
  #10 PhabricatorFile::saveAndIndex() called at [<phabricator>/src/applications/files/storage/PhabricatorFile.php:404]
  #11 PhabricatorFile::buildFromFileData(string, array) called at [<phabricator>/src/applications/files/storage/PhabricatorFile.php:419]
  #12 PhabricatorFile::newFromFileData(string, array) called at [<phabricator>/src/applications/files/storage/PhabricatorFile.php:1128]
  #13 PhabricatorFile::loadBuiltins(PhabricatorUser, array) called at [<phabricator>/src/applications/files/storage/PhabricatorFile.php:1167]
  #14 PhabricatorFile::loadBuiltin(PhabricatorUser, string) called at [<phabricator>/src/applications/project/query/PhabricatorProjectQuery.php:382]
  #15 PhabricatorProjectQuery::didFilterPage(array) called at [<phabricator>/src/infrastructure/query/policy/PhabricatorPolicyAwareQuery.php:273]
  #16 PhabricatorPolicyAwareQuery::execute() called at [<phabricator>/src/applications/phid/type/PhabricatorPHIDType.php:91]
  #17 PhabricatorPHIDType::loadObjects(PhabricatorObjectQuery, array) called at [<phabricator>/src/applications/phid/query/PhabricatorObjectQuery.php:153]
  #18 PhabricatorObjectQuery::loadObjectsByPHID(array, array) called at [<phabricator>/src/applications/phid/query/PhabricatorObjectQuery.php:73]
  #19 PhabricatorPolicyAwareQuery::execute() called at [<phabricator>/src/applications/phid/query/PhabricatorHandleQuery.php:46]
  #20 PhabricatorHandleQuery::loadPage() called at [<phabricator>/src/infrastructure/query/policy/PhabricatorPolicyAwareQuery.php:236]
  #21 PhabricatorPolicyAwareQuery::execute() called at [<phabricator>/src/applications/phid/handle/pool/PhabricatorHandlePool.php:73]
  #22 PhabricatorHandlePool::loadPHIDs(array) called at [<phabricator>/src/applications/phid/handle/pool/PhabricatorHandleList.php:44]
  #23 PhabricatorHandleList::loadHandles() called at [<phabricator>/src/applications/phid/handle/pool/PhabricatorHandleList.php:49]
  #24 PhabricatorHandleList::getHandle(string) called at [<phabricator>/src/applications/phid/handle/pool/PhabricatorHandleList.php:130]
  #25 PhabricatorHandleList::current()
  #26 iterator_to_array(PhabricatorHandleList) called at [<phabricator>/src/applications/transactions/query/PhabricatorApplicationTransactionQuery.php:151]
  #27 PhabricatorApplicationTransactionQuery::willFilterPage(array) called at [<phabricator>/src/infrastructure/query/policy/PhabricatorPolicyAwareQuery.php:247]
  #28 PhabricatorPolicyAwareQuery::execute() called at [<phabricator>/resources/sql/autopatches/20170528.maniphestdupes.php:26]
  #29 require_once(string) called at [<phabricator>/src/infrastructure/storage/management/PhabricatorStorageManagementAPI.php:285]
  #30 PhabricatorStorageManagementAPI::applyPatchPHP(string) called at [<phabricator>/src/infrastructure/storage/management/PhabricatorStorageManagementAPI.php:241]
  #31 PhabricatorStorageManagementAPI::applyPatch(PhabricatorStoragePatch) called at [<phabricator>/src/infrastructure/storage/management/workflow/PhabricatorStorageManagementWorkflow.php:1094]
  #32 PhabricatorStorageManagementWorkflow::doUpgradeSchemata(array, NULL, boolean, boolean) called at [<phabricator>/src/infrastructure/storage/management/workflow/PhabricatorStorageManagementWorkflow.php:840]
  #33 PhabricatorStorageManagementWorkflow::upgradeSchemata(array, NULL, boolean, boolean) called at [<phabricator>/src/infrastructure/storage/management/workflow/PhabricatorStorageManagementUpgradeWorkflow.php:78]
  #34 PhabricatorStorageManagementUpgradeWorkflow::didExecute(PhutilArgumentParser) called at [<phabricator>/src/infrastructure/storage/management/workflow/PhabricatorStorageManagementWorkflow.php:107]
  #35 PhabricatorStorageManagementWorkflow::execute(PhutilArgumentParser) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:441]
  #36 PhutilArgumentParser::parseWorkflowsFull(array) called at [<phutil>/src/parser/argument/PhutilArgumentParser.php:333]
  #37 PhutilArgumentParser::parseWorkflows(array) called at [<phabricator>/scripts/sql/manage_storage.php:249]

I'm setting up an old laptop for my trip, so it might have been a year since my last update, could be other issues (mac os)

Yuck, that's quite a mess.

It looks like we're loading transaction, which loads handles to make sure you can see the objects the transactions are about. This loads project handles, and projects load their profile images when their handles load. This may make us generate and load a new profile image, which involves writing a new file with a content hash.

The size of the content hash was 40 bytes until D17620, when it became 64 bytes. But since it's an adjustment, not a migration, it runs after migrations.

The cleanest fix is probably to sneak a retroactive 20170404 migration into the migration list, which explicitly changes the field width to 64 bytes.

Even with D18107 this may eventually break if the files or transactions tables see another change in the future, but that will probably be 6-12 months down the road and we can reasonably say "we tried our best", no-op this, and installs that only upgrade once every 3 years will just lose these old edges, which is completely fine.

This transaction could instead be made more robust by replacing ManiphestTransactionQuery and ManiphestTaskQuery with raw queries against the underlying tables, but this would be a large amount of extra work and I don't think it's ultimately worthwhile -- we've already gone above the call of duty to get this migration working.