Page MenuHomePhabricator

Allow the SMTP mailer to be configured as a "Does Not Support Message-ID Header" mailer
Closed, ResolvedPublic

Description

When sending mail via SMTP, the outbound pathway may be something like SMTPSES. In this case, we can't set a "Message-ID" header: even though it may be respected by the local MTA, it will be discarded by SES.

Today, we'll incorrectly believe that we can set a "Message-ID" header: we can with vanilla SMTP servers, and configuring a problematic service via SMTP is somewhat unusual because we have direct bindings for the most popular ones. Still, it's reasonable to configure them via SMTP instead.

When we guess wrong about whether we can set "Message-ID", we get less-than-optimal threading behavior.

To allow these mailers to be configured so we guess right, I'm planning to:

  • Add a true/false/null flag to the SMTP and sendmail mailers, called something like 'supports-message-id'.
  • A value of true or false explicitly configures our detection of supportsMessageIDHeader().
  • A value of null (default) causes us to guess based on the SMTP hostname. This doesn't have to be terribly exhaustive/reliable, but we can get it right in some reasonable cases (e.g., if we're doing SMTP directly to SES), and problems here are generally difficult to detect/identify (they look like "threading is bad", which is many steps removed from "configure an obscure secondary option in your outbound mailer setup") so there's probably enough value in this to justify a list of regexps living in the codebase.

Originally

We have upgraded to 2019 Week 8. Gmail, even when "Vary Subjects" is turned off, now threads the first email about a Differential revision in a separate thread from all subsequent emails about the same revision. I believe the difference is that the first email no longer specifies a (fictional) In-Reply-To identifier.

Before the upgrade:

# First email
Message-ID: <010101693a7e588f-7613c74e-e583-4270-8b10-02decd239f82-000000@us-west-2.amazonses.com>
In-Reply-To: <differential-rev-PHID-DREV-yoje66fdu52hyk3rahir-req@dropbox.com>

# Second email
Message-ID: <010101693a87e12d-0d5987de-7f7e-4f05-835f-4dca8f317ddd-000000@us-west-2.amazonses.com>
In-Reply-To: <differential-rev-PHID-DREV-yoje66fdu52hyk3rahir-req@dropbox.com>

This previous behavior violated standards — the In-Reply-To identifier is supposed to be the Message-ID of a previously delivered email — but at least it wove a consistent fiction for mail clients. But following the upgrade:

# First email
Message-ID: <010101693a7e588f-7613c74e-e583-4270-8b10-02decd239f82-000000@us-west-2.amazonses.com>

# Second email
Message-ID: <010101693a87e12d-0d5987de-7f7e-4f05-835f-4dca8f317ddd-000000@us-west-2.amazonses.com>
In-Reply-To: <differential-rev-PHID-DREV-yoje66fdu52hyk3rahir-req@dropbox.com>

I vaguely suspect b5797ce60a35e1b5d395598dec5dc22d57288120 but am not at leisure to actually do any bisection today (or to figure out the new email order of operations to learn when the header gets added).

Would it be possible either:

  • For the fictional In-Reply-To to be restored to the initial email in each series.
  • Or, for the initial email to have as its Message-ID the same identifier that all subsequent In-Reply-To's will specify?

Either tweak would get threading working again for our myriad developers.

While I'm typing, I should share that the fusillade of tags that Phabricator now attaches to the bottom of every email have revolutionized our ability to use Gmail's weak filtering system with Phabricator notifications, and has made many developers much happier with their experience! Thanks for implementing that.

Event Timeline

Have you possibly changed mailers from "SES" to "SES via SMTP" or something like that? The logic here is basically:

if ($is_first && $mailer->supportsMessageIDHeader()) {
  $headers[] = $this->newEmailHeader('Message-ID',  $thread_id);
} else {
  $headers[] = $this->newEmailHeader('In-Reply-To', $in_reply_to);
  $headers[] = $this->newEmailHeader('References',  $references);
}

...and does not appear to have changed in a very long time (it's present in approximately the same form in D13878 in 2015 as it is today as of D19955).

If the value of $mailer->supportsMessageIDHeader() changed, that could change the behavior. My expectation is that this value is false for SES, but the SES Message-ID might be SES-via-SMTP, and we assume we can set a message ID on SMTP.

Another possibility is that $is_first has changed because of Differential draft interactions, although I don't think those have been touched in about a year.

Upshot:

  • Does cluster.mailers configure SES as "ses", or as "smtp"?
  • How old were things before the upgrade? (Roughly: less or more than about a year old?)

If SES is configured as SMTP a possible workaround is to configure it as SES instead. Upstream, we could add a "this SMTP mailer will overwrite message ID flags" flag and hard-code a list of known services that overwrite things as the default behavior if this is the issue.

This previous behavior violated standards — the In-Reply-To identifier is supposed to be the Message-ID of a previously delivered email — but at least it wove a consistent fiction for mail clients.

See also T13204, perhaps.

Briefly, I claim that any standard/assertion that mail can't reference a fictional message is a big dum dum, because anyone can "Reply All" to an email and add a new recipient. That new recipient will receive a mail with headers that reference messages they did not receive. Clients must handle this mail sensibly when they receive it, and transport agents, at least, must be willing to transmit it.

I'm not sure what standard you're referring to exactly, but this rule could only take the shape of a "please play nice" recommendation suggesting that clients generating their own mail PRETTY PLEASE SHOULD NOT make up Message-IDs, I think?

(It's moot anyway because the message ID is unknowable when transmitting with several different mailers which ignore Message-ID values we supply in favor of their own (including SES), and we'd obviously prefer sensible client behavior over standards adherence even if we have to BE QUITE RUDE about fabricating message IDs. When the transmission pathway allows us to select a Message-ID, we attempt to use authentic references.)

How old were things before the upgrade? (Roughly: less or more than about a year old?)

More than a year old. This upgrade was our move to a cluster.mailers setting. We use sendmail directing mail to a local Postfix config that sometimes uses SES and sometimes Gmail to work around limits built into both systems.

Thanks VERY much for this context, it's identified the exact issue. I'll for the moment edit our supportsMessageIDHeader() for sendmail to return false and report back on whether it mitigates. I'll leave it up to you to decide whether that's too much of an edge case to support with a separate setting that could flip that method response from a config option.

Briefly, I claim that any standard/assertion that mail can't reference a fictional message is a big dum dum, because anyone can "Reply All" to an email and add a new recipient.

Agreed! Mail readers find themselves jumping in mid-thread all the time, and Gmail was clearly working great before with no real message ever having had the specified Message-ID. I merely wanted to have both possible resolutions on the table, one of which was to give the first message the ID later used In-Reply-To. (I was referencing RFC 2822's definition of the field.)

I merely wanted to have both possible resolutions on the table, one of which was to give the first message the ID later used In-Reply-To.

And I should add: a possibility that wouldn't have worked with SES in any case, because even if a matching message ID were generated, it would be promptly overwritten.

report back on whether it mitigates

Sounds good. Let me know how it goes, I think adding a flag for this is pretty reasonable even though it's a little messy.

It worked!

   public function supportsMessageIDHeader() {
+    // Dropbox addition: since SES is downstream of Postfix, message IDs
+    // get clobbered anyway.  So the first email about a revision won't
+    // thread with the rest of its emails, unless we tell Phabricator to
+    // use an alternative means (In-Reply-To) of threading them together.
+    return false;
+
     return true;
   }

I'm happy that Phabricator already includes a clever hack to get emails threading that otherwise would not thread, and that it was just a matter of flipping this switch to get it working again in our unusual case. I'll leave it to you whether this deserves a config option or not. Thanks for the guidance and background on which workarounds existed, and why they had turned off in our case.

epriestley renamed this task from Restore In-Reply-To field to first email about a Differential revision to Allow the SMTP mailer to be configured as a "Does Not Support Message-ID Header" mailer.Mar 14 2019, 1:03 PM
epriestley triaged this task as Normal priority.
epriestley updated the task description. (Show Details)
epriestley added a project: Mail.

There's probably some argument here for removing this behavior and assuming we can never generate a "Message-ID", but I'm hesitant to make changes like that since client threading is largely unknowable black magic which we can not test, and the current approach seems to generally produce acceptable results without an overwhelming amount of collateral complexity.

The only effect I can imagine of always setting the In-Reply-To would be to encourage threading where it's not currently happening, so removing the behavior and deleting all these little methods would be very attractive to me if it were my project. :)

I can imagine that cases like this may exist:

  • Exchange/Outlook can only thread if "Message-ID" works.
  • Companies that use Exchange/Outlook as a primary mail client almost always use Microsoft Active Mail Services For Server Windows Mailers 2019 Pro, which supports "Message-ID", so things work fine by accident.

Although we would probably expect that these hypothetical clients likely suffer broken threading when someone is added to a thread midstream, adding people mid-thread is somewhat rare and we might just not have heard of it, and it might be a minor annoyance in the scheme of broken client behavior.

Or they may use other headers to implement threading behavior in a subset of cases, possibly including "Thread-Index". We know that many versions of Apple "Mail.app" use the presence of a "Re:" prefix in the "Subject" header to control some threading, and T13204 suggests that at least one client (KMail) may not handle missing "Message-ID" headers well.

Removing the "Message-ID" support would possibly break things in these cases, and it's possible this is a relatively large number of users.

While I'm eager to remove the code, I'm not eager to deal with "threading doesn't work any more" bug reports, since they're very rarely anywhere near as easy to resolve as this one was.

I was imagining keeping the Message-ID, while having the first email also include the same value for In-Reply-To — I completely agree that removing the Message-ID would not work out. If the first email always included both fields, thus referencing itself as a reply to itself, I suppose there could be MUAs that would go into infinite loops. So, maybe too risky.

I also vaguely think some mailer API historically rejected mail with a "Message-ID" header with an actual error ("You can't set this header, we're going to set it for you.") rather than just ignoring/replacing it (maybe SendGrid?), but I might be misremembering or that behavior may have changed. We could still always set the header and drop it in the Adapter nowadays, this whole thing is just a nest of hornets and kicking it has pretty limited upside.