Page MenuHomePhabricator

Refactor mail to produce an intermediate "bag of strings" object in preparation for SMS
ClosedPublic

Authored by epriestley on Jan 4 2019, 6:06 PM.
Tags
None
Referenced Files
F14803668: D19955.diff
Sat, Jan 25, 4:02 PM
Unknown Object (File)
Sat, Jan 25, 3:05 AM
Unknown Object (File)
Sat, Jan 25, 3:05 AM
Unknown Object (File)
Sat, Jan 25, 3:04 AM
Unknown Object (File)
Tue, Jan 21, 9:57 AM
Unknown Object (File)
Fri, Jan 17, 9:28 AM
Unknown Object (File)
Fri, Jan 17, 9:28 AM
Unknown Object (File)
Fri, Jan 17, 9:24 AM
Subscribers
None

Details

Summary

Depends on D19954. Ref T920. This is a step toward a world where "Mailers" are generic and may send messages over a broader array of channels (email, SMS, postal mail).

There are a few major parts here:

  • Instead of calling $mailer->setSubject(), $mailer->setFrom(), etc., build in intermediate $message object first, then pass that to the mailer.
    • This breaks every mailer! This change on its own does not fix them. I plan to fix them in a series of "update mailer X", "update mailer Y" followups.
    • This generally makes the API easier to change in the far future, and in the near future supports mailers accepting different types of $message objects with the same API.
  • Pull the "build an email" stuff out into a PhabricatorMailEmailEngine. MetaMTAMail is already a huge object without also doing this translation step. This is just a separation/simplification change, but also tries to fight against MetaMTAMail getting 5K lines of email/sms/whatsapp/postal-mail code.
  • Try to rewrite the "build an email" stuff to be a bit more straightforward while making it generate objects. Prior to this change, it had this weird flow:
foreach ($properties as $key => $prop) {
  switch ($key) {
    case 'xyz':
      // ...
  }
}

This is just inherently somewhat hard to puzzle out, and it means that processing order depends on internal property order, which is quite surprising.

Test Plan

This breaks everything on its own; adapters must be updated to use the new API. See followups.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed.Jan 4 2019, 6:07 PM
Harbormaster failed remote builds in B21484: Diff 47631!

This isn't a standalone change, but provides context for the next set of changes.

(But wait until I actually get things working again before reviewing it.)

I believe this diff has no logical/behavioral changes except the two UI things that show mail headers for unsent mail to the user (see inline), although I might have done something minor on purpose and forgotten about it.

I'll fix the newVoid/DefaultEmailAddress callers inline before this heads toward review, unit tests would catch them anyway. Tests currently fatal themselves to death since we try to call bad methods on mailers.

src/applications/metamta/controller/PhabricatorMetaMTAMailViewController.php
190–192

generateHeaders() is no longer a method on MetaMTAMail, since this is email-specific.

This (and the other similar callsite removed elsewhere) were mostly backward-compatibility code for older messages that did not store headers-as-delivered, so removing them now shouldn't really impact anything.

For sent messages, the only behavioral change is that very, very old messages will no longer show headers. These have likely all been GC'd by now anyway.

For unsent messages, this will no longer show the headers we plan to send. That seems like it's probably fine. We could build an Engine and pre-generate the headers if it turns out to be weird.

src/applications/metamta/storage/PhabricatorMetaMTAMail.php
600–602

I'm going to remove this method in the upcoming API changes, so I'm just dropping the callsite for now since I'm breaking everything anyway.

653–654

I'll fix this TODO while going through the mailers to update them, most of them are exception-only nowadays anyway.

702

This line is the primary goal the change as a whole is accomplishing. Software engineering!

1477–1491

Oh, these have a caller in PhabricatorMailUtil now that needs to be updated.

Oh, these have a caller in PhabricatorMailUtil now that needs to be updated.

D19957 fixes this and makes all the tests pass.

So far I haven't actually found any real problems with this change, but I have like 13 more adapters to convert and test.

epriestley retitled this revision from (DEFINITELY A DRAFT) Refactor mail to produce an intermediate "bag of strings" object in preparation for SMS to Refactor mail to produce an intermediate "bag of strings" object in preparation for SMS.

I think there are a couple of very small adjustments in followup diffs (like D19957 fixing some of the test/util stuff), but everything is converted now and it looks like this didn't actually need any significant changes.

This was pretty exhausting to review because I had to give up on figuring out what was just moved and what was actually new, so I just read all of everything and spot checked that a few things like the must encrypt implementation got moved faithfully.

As a side question, how does the moved code detection give up on changes like this? I guess there isn't a good visual way to express moves like "this file got its guts split into six chunks and hidden in six different horcruxes".

Also, wasn't the mailer implementation explicitly designated as a reasonable 3rd party extension point? I worry that we're going to get a lot of pitchforks for breaking someone's Aritic mailer or whatever.

Last question: I could try and tackle T12404 as something orthogonal to these changes but still relevant to breaking all our email code simultaneously if you think now's the time.

src/applications/metamta/engine/PhabricatorMailEmailEngine.php
367

Does this code just predate the introduction of foreach ($value as $key => $value) into PHP?

src/applications/metamta/storage/PhabricatorMetaMTAMail.php
672

The real policy check for file visibility happened much earlier in the email construction stack, right? I'm trying to come up with an attack where someone exfiltrates Phabricator objects by manually specifying the To address as russian-dropbox@sketchy.ru , but if you already control an account that can see the secret objects or edit their subscribers, the game is over.

This revision is now accepted and ready to land.Jan 16 2019, 6:52 PM

As a side question, how does the moved code detection give up on changes like this? I guess there isn't a good visual way to express moves like "this file got its guts split into six chunks and hidden in six different horcruxes".

Yeah, I think it was a mix of lots of (mostly surface-level) rewriting, separation, and general complexity.

Also, wasn't the mailer implementation explicitly designated as a reasonable 3rd party extension point? I worry that we're going to get a lot of pitchforks for breaking someone's Aritic mailer or whatever.

Yes, but I'm not aware of anyone actually building a mailer subclass (and I don't get any hits for "extends PhabricatorMailImplementationAdapter" on Google), and the actual rewrites for the Adapters are pretty straightforward. We could theoretically provide a "glue" adapter that simplifies the API changes, but I suspect the value is near-zero. If anyone shows up with a reasonable mailer and difficulty rewriting it, I'm (probably) happy to rewrite it for them, but we haven't really seen too much of this in the past unless I did something especially tricky.

Last question: I could try and tackle T12404 as something orthogonal to these changes but still relevant to breaking all our email code simultaneously if you think now's the time.

If you want to go down this road, I think the way forward is:

  • Build a ThingThatBuildsARawEmailFromAMessage. You can reference PHPMailerLite as a starting point, delete all the "quoted printable encoding" and "DKIM signature" code, and hopefully end up with something reasonable. Input should be a PhabricatorMailExternalMessage, output should be a raw mail blob string that Sendmail/SMTP/SES accept.
  • Convert SESAdapter to use ThingThatBuildsARawEmailFromAMessage instead of PHPMailerLite to build the raw message blob.
  • Also, convert SESAdapter to use an AWSFuture subclass and show the user the "You haven't verified your From address" error (T13235). Remove SimpleEmailService.
  • Convert SendmailAdapter to use ThingThatBuildsARawEmailFromAMessage instead of PHPMailerLite, and add the (hopefully trivial?) amount of wrapper code to pass that to sendmail. Should just be "pipe through with execfuture" pretty much. Add a timeout.
  • Remove PHPMailerLite.
  • Convert SMTPAdapter to use ThingThatBuildsARawEmailFromAMessage instead of PHPMailer, then implement SMTP. Add a timeout.
  • Remove PHPMailer.
  • Resolve T5969.

You can go through however much of that as you feel like; each step moves us forward, but the whole sequence is likely a ton of work. ThingThatBuildsARawEmailFromAMessage can have unit test coverage in a pretty natural way and SES/SMTP are reasonably easy to test, but I have no idea how to actually configure sendmail and we can't really unit test SES / SMTP so those bits are liable to be more fragile.

This is definitely nice to have (and maybe has significant security value if PHPMailer has more RCE issues) but a fair chunk of work and the current approach isn't super-obviously-broken so I cut it out of scope even though it would be great to kill this stuff off.

src/applications/metamta/engine/PhabricatorMailEmailEngine.php
367

It's possible for the same header to be specified more than once. This doesn't come up very often (ever?) with mail, and not every adapter supports sending the same header name more than once, but it crops up occasionally with HTTP.

For now, I'm just being as faithful as possible to intent, so ->addHeader('X', 'y')->addHeader('X', 'z') sends:

X: y
X: z

That might or might not be the right program behavior, but it's most closely what the code asked us for.

If we use a map/dictionary, the second call will overwrite the first one instead of adding a second header.

(After mail makes it through real MTAs it's common for it to end up with multiple Received: headers, at least.)

src/applications/metamta/storage/PhabricatorMetaMTAMail.php
672

On the actual "attach to the mail using the omnipotent user" part (which isn't quite what you asked for, but I typed all this up...), this relies on some claim like "you can't get attach random files to messages and queue them to be sent" being true.

Today, my belief is that you can't: the "attached files" are always new files created from the mail attachments as we prepare to send the message.

We could possibly get rid of this and require Editors to attach anything they're sending to some appropriate application object first. That would be reasonable, but is bit more complex, and would cause some weird side effects in obscure cases.

One such case (maybe the only case?) is:

  • You get mail about /D123 with a changes.diff.
  • Then, someone hides /D123 from you.
  • You click the /mail/456/ link to view the mail in the web UI.
  • Can you see the changes.diff?

Today, the answer is "yes": the file is attached to the mail and you can see the mail, so you can see the attachment. This seems expected/consistent/reasonable since you got a copy of the actual attachment in your mailbox.

If we didn't attach here, the answer could become "no". This seems purely paranoid/confusing since the actual file got mailed to you anyway, although it's pretty moot in practice since the UI could easily just say "Something was attached to this originally, but you can't see it anymore." if we wanted to go down that road for some other reason.

672

On "send stuff to a random email address somehow", I think there are about 6 layers of stuff preventing that right now. It's bad in the general case too, even if there aren't attachments (since the mail content/subject/etc could be sensitive).

A possible future case where we need to navigate this is "add an accounts-payable address to Phortune payment accounts".

src/applications/metamta/engine/PhabricatorMailEmailEngine.php
367

If we use a map/dictionary, the second call will overwrite the first one instead of adding a second header.

Yeah, my thought is that would be a good thing because

not every adapter supports sending the same header name more than once

I was just thinking about how hypothetical mail code that wants to overwrite some of the "default" headers generated here basically can't do that. None of these headers seem like things we'd ever want to overwrite though, so, c'est la vie.