Page MenuHomePhabricator

Herald rules with mailing lists are broken after mailing list -> user migration
Closed, ResolvedPublic

Description

We have many Herald rules whose action was to email a mailing list. After the migration in D13128, all of these Herald rules now say:

Add emails to CC Unknown Object (MLST)

(I have yet to test what happens when such a Herald rule is triggered.)

Given that we have ~500 Herald rules total, probably several dozen with mailing list actions, and the rule doesn't say in the UI which mailing list it used to refer to, I see no reasonable manual way to fix all of these. It would be great if there was a migration we could run to automatically fix all these Herald rules.

Event Timeline

jhurwitz raised the priority of this task from to Needs Triage.
jhurwitz updated the task description. (Show Details)
jhurwitz added a project: Restricted Project.
jhurwitz added subscribers: jhurwitz, angie.

Worth adding: most of these were not intentionally created to refer to mailing lists. I think they used to just refer to plain email addresses, but then some migration in the past detected all of them, automatically created mailing lists for them, and automatically changed the Herald rules to refer to the mailing lists. So, I wouldn't be surprised if many installs run into similar issues.

When I run arc diff and attempt to trigger such a Herald rule:

  1. arc diff succeeds, and the diff is created
  2. the arc diff output includes:
Exception
[HTTP/500] Internal Server Error
{"result":null,"error_code":"ERR-CONDUIT-CORE","error_info":"Edges are not available for objects of type 'MLST'!"}
  1. in the diff view (on the website), it looks as if no Herald rules were matched/triggered
jhurwitz moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jun 6 2015, 4:10 AM
epriestley triaged this task as Normal priority.Jun 6 2015, 12:19 PM

Worth adding: most of these were not intentionally created to refer to mailing lists. I think they used to just refer to plain email addresses, but then some migration in the past detected all of them, automatically created mailing lists for them, and automatically changed the Herald rules to refer to the mailing lists. So, I wouldn't be surprised if many installs run into similar issues.

I don't think we've ever let you add a raw address or migrated raw addresses to mailing lists.

It's possible I'm wrong, but I didn't remove such a migration when I folded mailing lists into users recently, and don't remember writing one, and can't find any such migration by looking through all the migrations for "herald" or "mailing".

Mailing lists were a bit easier to create previously (didn't require administrative capabilities) -- is it possible users created them during that era of easy access to the tool?

If you haven't run into it, T8398 has general context and guidance on this change. I've updated it to include the information here.

I don't think this really matters if you've decided to fix the issue, but in case you're curious:

is it possible users created them during that era of easy access to the tool?

The naming scheme of our earliest mailing lists was consistent enough that they were almost certainly created by a migration and not a dozen different people all creating their own lists using the tool.

I think the migration ran when we upgraded on April 20, at which point we ran all the migrations from ~mid-January up to that point.

I believe the error is now fixed at HEAD.

D13184 will try to update rules automatically. It's a little bit conservative so it may not get everything, but let me know if it doesn't work and/or misses things.

If there are a handful of unusual cases left, the data is intact in the database, just no longer linked up in the UI. Specifically, you can find it in these tables:

  • Take the rule ID of the problematic rule, like H123.
  • phabricator_herald.herald_condition and phabricator_herald.herald_action have the conditions and actions for this rule in the rows where ruleID = 123.
  • These are stored as JSON blobs with lists of PHIDs, which will include PHID-MLST-... for mailing lists.
  • You can identify the addresses for these lists in phabricator_metamta.metamta_mailinglist, by finding the PHID in the phid column and reading the address from the email column.

The migration in D13184 will basically try to do this automatically, so my expectation is that everything will just be correct after that runs, but let me know how it goes.

We saw a bunch of lines like:

Updated mailing lists in Herald action 1466.

Followed by:

Storage is up to date. Use 'storage status' for details.

And the Herald rules seem up-to-date and functional! Unfortunately bin/storage upgrade hung after printing out the "up to date" line, I think on the adjust step due to D13164. After ~5m that was still at 0.0% progress so we ctrl+c'ed out of it, but I think that's a separate discussion to have.

It's only issuing one statement so it will jump from 0% to 100% after adding the key, but it might take a bit if you have a lot of data in the table.

You don't need the key for now, it just improves the performance of bin/remove destroy for complex/large objects.

jhurwitz moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jun 9 2015, 5:17 AM

Presuming resolved, yell if anyone's still running into issues.

Not sure the migration works as intended.

For us, the migration from mailing list to mailing list user crashed due to not being able to create a user account which had the same email address as an existing user. (We had old disabled users with Flowdock inbox email addresses which we used for notifications before the introduction of mailing lists and the "send email to X" herald action.)

Rerunning the migrations, they passed (presumably skipping the bad migration), but now we have "Unknown Object (MLST)" as subscribers everywhere (and we had bad Herald rules).

To resolve the issue I manually edited the database to change the email address of the old disabled users so that proper new "mailing list" users could be created. Then I created said users and updated the Herald rules by hand. What seems more difficult is to delete the Unknown Objects; attempts result in:

Edges are not available for objects of type 'MLST'!

(Running Phabricator stable 7c6320d2112109a2c7ec600be8ffd9285f137eb6).

Do you still happen to have the original error message? That was unexpected, and the migration specifically accounted for the case of email collisions and I recall testing that case locally in detail, so I'm curious what I missed. The expected behavior is that that list wouldn't migrate (and would emit a message), but other lists would:

https://secure.phabricator.com/diffusion/P/browse/master/resources/sql/autopatches/20150602.mlist.2.php;bb54957d9687a9278caae41b2f0918de548254a6$56-66

Since you've already moved on and repaired things by hand, we may not meaningfully be able to identify or resolve the root cause. Since it sounds like you're comfortable working with the data, your best bet for cleaning up the "Unknown Objects" might be just removing them from the value column in phabricator_herald.herald_condition and phabricator_herald.herald_action -- just identify any JSON value which is a list of PHIDs including -MLST- PHIDs, and scrub those out. Not sure if that's satisfactory.

I wish I had saved the logs, but alas.

Right, I see you covered the case. Just writing from memory so it may be that your code for existing emails actually did work. Maybe I just misread its informative message as the source of the crash. The upgrade did crash, that I know, because I had to rerun it, but I might not know why.

Thanks for your note on how to fix things. Actually I fixed the Herald rules by merely editing them in the web UI. The trouble is in diffs where the mailing list was a subscriber. Taking a quick look at the data, there are many such references in phabricator_differential:

SELECT * FROM `edge` WHERE `dst` LIKE '%-MLST-%';
...
src	type	dst	dateCreated	seq	dataID
PHID-DREV-23kmmjdr2c2sisng35sk	21	PHID-MLST-pqodhs6mujgmbtzzchqb	1429714357	0	NULL
PHID-DREV-24cvgbh6aenodim4vi4b	21	PHID-MLST-2hio7tm747hfjfoz2aus	1413391633	0	NULL

Based on the code you linked to I can see how to UPDATE these myself to the new mailing list users: just swap the old PHID to the new one. I'll give that a shot later.

Going back to tracking down the original problem, I found one error in /var/tmp/phd/log/daemons.log. I guess it's unrelated since the upgrader is not a daemon, but here it is for reference:

[14-Jul-2015 13:54:44 UTC] [2015-07-14 13:54:44] EXCEPTION: (PhutilProxyException) Permanent failure while executing Task ID 402241. {>} (PhabricatorWorkerPermanentFailureException) Unable to load message! at [<phabricator>/src/applications/metamta/PhabricatorMetaMTAWorker.php:17]
[14-Jul-2015 13:54:44 UTC] arcanist(head=stable, ref.master=407af00ef613, ref.stable=d54cb072facd), phabricator(head=stable, ref.master=46c5e055a291, ref.stable=7c6320d21121), phutil(head=stable, ref.master=c2cd90ee7aec, ref.stable=83f09f6c5a03)
[14-Jul-2015 13:54:44 UTC]   #0 <#2> PhabricatorMetaMTAWorker::doWork() called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorWorker.php:91]
[14-Jul-2015 13:54:44 UTC]   #1 <#2> PhabricatorWorker::executeTask() called at [<phabricator>/src/infrastructure/daemon/workers/storage/PhabricatorWorkerActiveTask.php:162]
[14-Jul-2015 13:54:44 UTC]   #2 <#2> PhabricatorWorkerActiveTask::executeTask() called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorTaskmasterDaemon.php:22]
[14-Jul-2015 13:54:44 UTC]   #3 PhabricatorTaskmasterDaemon::run() called at [<phutil>/src/daemon/PhutilDaemon.php:183]
[14-Jul-2015 13:54:44 UTC]   #4 PhutilDaemon::execute() called at [<phutil>/scripts/daemon/exec/exec_daemon.php:125]

I also checked /var/log/syslog. Not sure if there are other files Phabricator logs to.