In remote deploy, we currently restart MySQL and then try to connect to it shortly afterward.
Sometimes, probably when MySQL has a large amount of data (one affected host here has 76GB), the socket is not yet listening when the restart command exits, and the next command fails with this error:
[db010] [2017-10-14 12:19:13] EXCEPTION: (CommandException) Command failed with error #1!
[db010] COMMAND
[db010] echo 'DELETE FROM mysql.user WHERE User = "root" AND Host != "localhost"' | mysql -uroot
[db010]
[db010] STDOUT
[db010] (empty)
[db010]
[db010] STDERR
[db010] ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
We should probably make sure the socket is listening before continuing past the service mysqld restart.
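As a rough sketch of what "make sure the socket is listening" could look like, the deploy step could poll the socket until a connection succeeds or a deadline passes. This is not the deploy tool's actual code: the function name wait_for_unix_socket and the timeout and interval values are hypothetical, and only the socket path is taken from the error above. It only checks that something accepts connections on the socket, not that MySQL will answer queries.

```python
import socket
import time


def wait_for_unix_socket(path, timeout=60.0, interval=0.5):
    """Poll a Unix domain socket until it accepts a connection.

    Returns True as soon as a connection succeeds, or False once
    `timeout` seconds have passed without one. The name and defaults
    are illustrative assumptions, not part of the deploy tooling.
    """
    deadline = time.monotonic() + timeout
    while True:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return True
        except OSError:
            # Socket file missing (errno 2, as in the error above)
            # or present but not accepting connections yet.
            pass
        finally:
            sock.close()
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Path copied from the error message; the timeout is a guess.
    ok = wait_for_unix_socket("/var/run/mysqld/mysqld.sock", timeout=120.0)
    raise SystemExit(0 if ok else 1)
```

Run right after the restart, a nonzero exit would let the deploy fail with a clear "MySQL did not come up" condition instead of the ERROR 2002 from a later command.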
In this case, the two affected hosts (db010 and db014) haven't been purged in a while and have some large test instances, so I expect I can reduce the data to a manageable size fairly easily with the current workflow (bin/host destroy --instance-kinds test --instance-statuses suspended,disabled). I'm running the destruction workflows now.