
Remove "if sent two graceful shutdowns, terminate" logic from PhutilDaemonOverseer
Closed, Wontfix (Public)

Description

This logic prevents us from queuing multiple reloads, because the second reload terminates any processes that are already shutting down gracefully.

I can't see any reason for this logic: anyone who wants to forcefully terminate the daemons can run bin/phd stop --graceful 0 from another shell to send terminate signals to everything immediately.

Logic is in https://secure.phabricator.com/diffusion/PHU/browse/master/src/daemon/PhutilDaemonOverseer.php;124e5a41e0864ea68f7c921aafb01c5d457623a2$350

Event Timeline

hach-que raised the priority of this task to Needs Triage.
hach-que updated the task description.
hach-que added subscribers: hach-que, epriestley.

The reason for this logic is so that you can ^C twice to get out if a daemon is hung and you're running it in the foreground. This avoids the need to bring up a second terminal, figure out the PID of the daemon running in the first terminal, kill it, hope you got the PID right, etc., or abandon the first terminal entirely.

You could use ^\ / SIGQUIT to force the non-graceful exit when the daemon's in the foreground and keep ^C as graceful only.
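To make that suggestion concrete, here is a rough sketch of the split: SIGINT only ever requests a graceful exit, no matter how many times it arrives, and SIGQUIT is the sole way to force one. This is Python rather than the overseer's PHP, it is not how PhutilDaemonOverseer is written, and `ShutdownState` and `install_handlers` are made-up names for illustration (Unix only, since SIGQUIT does not exist on Windows).

```python
import signal


class ShutdownState:
    """Hypothetical holder for the shutdown mode a process has been asked for."""

    def __init__(self):
        self.graceful_requested = False
        self.force_requested = False

    def on_sigint(self, signum, frame):
        # Repeated ^C is idempotent: it never escalates to a forced exit.
        self.graceful_requested = True

    def on_sigquit(self, signum, frame):
        # ^\ is the only path to a non-graceful exit.
        self.force_requested = True


def install_handlers(state):
    """Register the handlers; must be called from the main thread."""
    signal.signal(signal.SIGINT, state.on_sigint)
    signal.signal(signal.SIGQUIT, state.on_sigquit)
```

A main loop would then check `state.force_requested` first and `state.graceful_requested` second on each iteration.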

epriestley claimed this task.

Queuing multiple graceful exits just isn't a use case we're going to support in the upstream. You can conceivably have your over-overseer decline to send the second SIGINT if an overseer has already been SIGINT'd once. bin/phd stop <pid> --graceful ... can send the signals selectively. I'd maaaaaybe accept a bin/phd status --json since parsing bin/phd status is a bit of a mess, but the other tools to build this already exist.
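As a rough illustration of the "decline to send the second SIGINT" idea, an over-overseer could remember which overseer PIDs it has already signalled. This is only a sketch in Python: `GracefulSignaller` and `request_graceful_stop` are hypothetical names, nothing here is part of bin/phd, and it assumes the over-overseer is the only thing sending SIGINT to those PIDs.

```python
import os
import signal


class GracefulSignaller:
    """Hypothetical helper: sends SIGINT to each overseer PID at most once."""

    def __init__(self, send=os.kill):
        self._signalled = set()
        self._send = send  # injectable so the logic can be exercised without real processes

    def request_graceful_stop(self, pid):
        """Return True if SIGINT was sent, False if this PID was already signalled."""
        if pid in self._signalled:
            return False
        self._send(pid, signal.SIGINT)
        self._signalled.add(pid)
        return True
```

Because PIDs can be reused by the operating system, a longer-lived version would need to forget a PID once that overseer has exited.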

There's no way an over-overseer can know when the previous reload is finished; all it can do is call bin/phd stop --graceful 999999 in the background and leave it to run. In addition, a previous reload can take several hours to run; if you need to deploy another set of code before the reload is finished, you have no safe way of doing so until the first one is entirely complete.

You can't even shut down the machine safely, because once you issue the first graceful shutdown, there's no way to know whether all the old daemon processes have gracefully exited (unless you want to wait 999999 seconds).

> There's no way an over-overseer can know when the previous reload is finished

It can look for the overseer PID. Once it exits, the shutdown finished.
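A minimal sketch of that check, assuming a Unix host and that the over-overseer already knows the overseer's PID. It is written in Python for brevity, and `pid_is_running` and `wait_for_exit` are hypothetical helper names, not anything bin/phd provides. Sending signal 0 with `os.kill` delivers nothing; it only reports whether the PID exists.

```python
import os
import time


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def wait_for_exit(pid, poll_interval=1.0, timeout=None):
    """Poll until the PID disappears. Return False if the timeout expires first."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while pid_is_running(pid):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

Two caveats: an exited child that its parent has not yet reaped still counts as existing, and a PID can be reused by an unrelated process, so this is best treated as a heuristic.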

> there's no way to know if all the old daemon processes have gracefully exited

If all the overseers have exited, all the daemons have exited. If not, they haven't.

Okay, but you still have no way of safely issuing another bin/phd stop without it stopping both overseers?

You have to bin/phd stop <only the second overseer PID>.