Workers obtain 2-hour leases by default.
If a TransactionPublishWorker runs for more than two hours, it will lose its lease and be unable to update the task after completion.
In the original case, a very large change was pushed that touched approximately 100 packages and projects and presumably had a recipient list of several hundred users. From a --trace, much of the time appears to have been spent inefficiently rebuilding the diff, probably during patch rendering. This cost can probably be brought down to nearly zero: recipients need to individually survive a visibility check, but they do not need to individually rebuild the diff content.
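A minimal sketch of the "render once, check visibility per recipient" idea, written in Python for illustration only. The names `build_mail_bodies`, `can_see`, and `render_patch` are hypothetical and do not correspond to real functions in the codebase; the sketch assumes the rendered patch is identical for every recipient who passes the visibility check.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def build_mail_bodies(
    recipients: Iterable[str],
    can_see: Callable[[str], bool],
    render_patch: Callable[[], str],
) -> List[Tuple[str, str]]:
    """Pair each visible recipient with a patch that is rendered at most once.

    All three parameters are hypothetical stand-ins:
    can_see is the per-recipient visibility check, and
    render_patch is the expensive diff/patch rendering step.
    """
    patch: Optional[str] = None
    mails: List[Tuple[str, str]] = []
    for user in recipients:
        # The visibility check still runs individually for every recipient.
        if not can_see(user):
            continue
        # The expensive render happens lazily, on the first visible recipient,
        # and the result is reused for everyone after that.
        if patch is None:
            patch = render_patch()
        mails.append((user, patch))
    return mails
```

With this shape, the rendering cost is independent of the number of recipients, and nothing is rendered at all if no recipient can see the change.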
More broadly, we can verify and/or renew the lease before moving from mail construction to mail queueing. If the lease has already been lost, the worker can stop before any mail is queued. This would reduce the bad case (mail loop) to a minor nuisance (worker loop with no side effects).
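The lease check between construction and queueing could look roughly like the following Python sketch. Everything here is an assumption made for illustration: `Lease`, `LeaseLostError`, `publish`, and the rule that an expired lease cannot be renewed are hypothetical, and only the 2-hour default comes from the description above.

```python
import time
from dataclasses import dataclass
from typing import Callable, List

LEASE_SECONDS = 2 * 60 * 60  # 2-hour default lease, per the description above


class LeaseLostError(Exception):
    """Raised when the worker no longer holds its lease (hypothetical)."""


@dataclass
class Lease:
    expires_at: float

    def renew(self, now: float, duration: float = LEASE_SECONDS) -> bool:
        # Assumption: an expired lease cannot be renewed by its old holder.
        if now >= self.expires_at:
            return False
        self.expires_at = now + duration
        return True


def publish(
    lease: Lease,
    build_mail: Callable[[], List[str]],
    queue_mail: Callable[[str], None],
    clock: Callable[[], float] = time.time,
) -> int:
    # Phase 1: slow and side-effect free. This may outlive the lease.
    mails = build_mail()
    # Gate: verify/renew the lease before producing any side effects.
    if not lease.renew(clock()):
        raise LeaseLostError("lease expired during mail construction")
    # Phase 2: queueing, reached only while the lease is still held.
    for mail in mails:
        queue_mail(mail)
    return len(mails)
```

If construction overruns the lease, `publish` raises before anything is queued, so a retry repeats the work but sends no duplicate mail.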