Page MenuHomePhabricator

MetaMTA worker can win a race against MTAMail despite both being inserted in the same transaction, because they aren't actually inserted in the same transaction
Closed, ResolvedPublic

Description

To reproduce:

  • Add sleep(15) in PhabricatorMetaMTAMail->save() before $this->saveTransaction().
  • Run bin/phd debug task in two windows.
  • Take an action which sends mail (like commenting on a task).

Expect:

  • Mail sent 15 seconds later.

Actual:

  • MetaMTA worker is picked up immediately and fails to load the mail row.
2017-06-14 9:55:49 PM [STDE] [2017-06-14 21:55:49] EXCEPTION: (PhutilProxyException) Task "2171049" encountered a permanent failure and was cancelled. {>} (PhabricatorWorkerPermanentFailureException) Unable to load mail message (with ID "82852") while preparing to deliver it. at [<phabricator>/src/applications/metamta/PhabricatorMetaMTAWorker.php:35]

My understanding of how transaction isolation levels work is either missing some pieces, or there's some sort of other bug here.

Event Timeline

Oh, this doesn't isolate things because they're on different databases, and thus we establish different connections. The daemon insert does not happen inside a transaction.

If it did, this would work like I expect:

MySQL-A
mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO worker_activetask (id, taskClass, failureCount, priority) values (123, "A", 0, 0);
Query OK, 1 row affected (0.00 sec)
...
MySQL-B
mysql> SELECT * FROM worker_activetask;
Empty set (0.00 sec)
MySQL-A
...
mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)
MySQL-B
mysql> SELECT * FROM worker_activetask;
+-----+-----------+------------+--------------+--------------+--------+-------------+----------+------------+
| id  | taskClass | leaseOwner | leaseExpires | failureCount | dataID | failureTime | priority | objectPHID |
+-----+-----------+------------+--------------+--------------+--------+-------------+----------+------------+
| 123 | A         | NULL       |         NULL |            0 |   NULL |        NULL |        0 | NULL       |
+-----+-----------+------------+--------------+--------------+--------+-------------+----------+------------+
1 row in set (0.01 sec)

The code just doesn't work, and I overlooked that since I assumed I would only write working code.

We could fake this but I'm just going to move the task insertion outside of the transaction since we have bin/mail resend anyway if this legitimately fails for some bizarre reason. This is unusual and not difficult to recover from.

epriestley renamed this task from MetaMTA worker can win a race against MTAMail despite both being inserted in the same transaction to MetaMTA worker can win a race against MTAMail despite both being inserted in the same transaction, because they aren't actually inserted in the same transaction.Jun 14 2017, 7:19 PM