No mail deliveries: Mail daemon throwing "Undefined property" exceptions in a loop
Closed, Resolved · Public

Description

After upgrading to 82f97b6b339aff261cc30c39261f77ea1b7212ed, all outbound mail has been failing. Upgrading to HEAD this morning didn't improve things.

Example output from /var/tmp/phd/log/daemons.log:

{P1966}

Example failing task ./bin/mail show-outbound --id 14220 output:

{P1967}

Example send-test output:

{P1968}

Task ID 14220 has a failure count of 208/250.

This looks similar to D7660. It's possible that this is a problem with our SES config (which hasn't changed recently). A cursory glance at our SES console doesn't show any issues, and other tools using the same SES sender address are behaving correctly. Let me know if you'd like some temporary SES credentials to help reproduce this and I'll provide them out-of-band.

Event Timeline

It looks like this is occurring during error handling so there's likely some other root error that this is masking, but we can fix this and see what it unveils.
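To illustrate how an exception raised inside error handling can hide the root error, here is a minimal Python sketch. It is an analogy only, not the code in ses.php: `FakeResponse`, `send`, `buggy_handler`, and `fixed_handler` are hypothetical names, and Python's `AttributeError` stands in for PHP's "Undefined property" error.

```python
class FakeResponse:
    """Hypothetical stand-in for a failed response; it has no `error` attribute."""

    def __init__(self, code):
        self.code = code


def send(handler):
    """Simulate a send that fails, then hand the failure to an error handler."""
    try:
        # The root error: the part we actually want to see in the logs.
        raise ConnectionError("could not reach endpoint")
    except ConnectionError as root:
        return handler(FakeResponse(0), root)


def buggy_handler(response, root):
    # Bug: reads an attribute that does not exist. This raises AttributeError
    # while the root error is being handled, so the log shows the wrong problem.
    return "failed: " + response.error["message"]


def fixed_handler(response, root):
    # Read the optional attribute defensively and fall back to the root error.
    detail = getattr(response, "error", None)
    message = detail["message"] if detail else str(root)
    return "failed: " + message
```

With `buggy_handler`, the caller sees an `AttributeError` and the `ConnectionError` survives only as its `__context__`. With `fixed_handler`, the root error message is reported directly.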

Your SES config probably should have changed recently: amazon-ses.endpoint must now be specified explicitly. If it isn't set up properly, you should see a giant warning banner at the top of the page about this.

You should get a little further now in HEAD of master.

The underlying issue might just be a missing amazon-ses.endpoint configuration, and maybe the setup issue cache didn't get cleared after upgrading, or maybe you have a bunch of other issues you haven't finished sorting through so you missed it. That's the only recent change I can think of.

If that isn't the issue, let us know what the new error is.

You're right; our amazon-ses.endpoint wasn't set. We just upgraded to HEAD and (incorrectly) configured amazon-ses.endpoint to email-smtp.us-east-1.amazonaws.com and got some exciting new errors that also might be worth fixing:

{P1969}

Correctly configuring amazon-ses.endpoint to email.us-east-1.amazonaws.com fixed our delivery issues.

Cool, let me see if I can repro that locally.

(All the stuff in ses.php is third-party code with interesting new ideas about how errors work.)

After D15632:

  • The underlying error should work properly and be generally more reasonable.
  • This specific issue (reading the wrong column out of AWS and picking an SMTP endpoint) should be flagged explicitly with a more helpful message pointing you at the correct resolution.
  • An unrelated issue in T10476 should be flagged helpfully, too.
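As a rough idea of what flagging the SMTP-endpoint mistake could look like, here is a small Python sketch. It is not the implementation in D15632: the function name `check_ses_endpoint` and the message text are made up for illustration. It relies only on the two hostnames from this task, where the `email-smtp.` host failed and the `email.` host worked.

```python
from typing import Optional


def check_ses_endpoint(endpoint: str) -> Optional[str]:
    """Return a warning if the endpoint looks like an SES SMTP host, else None.

    Hypothetical helper: the prefix check is an assumption based on the two
    hostnames reported in this task.
    """
    host = endpoint.strip().lower()
    if host.startswith("email-smtp."):
        # Suggest the matching API host by dropping the "-smtp" part.
        suggestion = "email." + host[len("email-smtp."):]
        return (
            "amazon-ses.endpoint looks like an SMTP endpoint; "
            "try the API endpoint instead: " + suggestion
        )
    return None
```

For example, `check_ses_endpoint("email-smtp.us-east-1.amazonaws.com")` returns a warning that suggests `email.us-east-1.amazonaws.com`, while the correctly configured value returns `None`.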

This should now be fully resolved in HEAD of master. Thanks for the report! Your details and stack traces were very helpful in understanding and resolving the issue.

Let us know if you run into anything else.