Last night I queued a large number of background jobs (reparsing of commit messages and reindex search documents). A few hours later, I noticed that our two Phabricator daemons hosts had died (they quite possibly died earlier than this, but it took me a few hours to notice.
It is worth noting that I have `phd.taskmasters` set to `8`. The instance is a `c3.large`. The following groups show the CPU usage over the past 24 hours:
{F630134}
My theory is that the taskmasters are autoscaling themselves and then eventually running out of memory and dying miserably. I've attached some relevant log files:
- {F630145}
- Actually the daemon log files aren't particularly useful here.