Ref T7352. We currently run one overseer per daemon. I want to run one overseer for a group of daemons, to reduce the minimum memory footprint of an instance.
One barrier is how hang detection works: we detect daemon hangs by requiring them to send a periodic heartbeat. If a daemon doesn't heartbeat for a while, we assume it has hung and restart it.
Currently, this heartbeat is sent by having the daemons send SIGUSR1 to the overseer. When the overseer receives the signal, it extends the deadline for the next heartbeat.
However, the overseer can't tell where the signal came from. Right now it can only come from one place, but in a world where overseers run multiple daemons it could have come from any of the children.
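For illustration, the deadline logic described above can be sketched roughly as follows. This is a minimal sketch, not the overseer's actual code: the `HeartbeatDeadline` class, its method names, and the injectable `clock` parameter are all hypothetical.

```python
import time


class HeartbeatDeadline:
    """Tracks when the next heartbeat is due (hypothetical helper)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        # `clock` is injectable so the logic can be exercised without sleeping.
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.deadline = clock() + timeout_seconds

    def heartbeat(self):
        # A heartbeat arrived: extend the deadline for the next one.
        self.deadline = self.clock() + self.timeout_seconds

    def is_hung(self):
        # With no heartbeat before the deadline, assume the daemon has hung.
        return self.clock() > self.deadline
```

With a single daemon per overseer, one such deadline is enough. With several children, the overseer would need one deadline per child, which is why it has to know which child sent each heartbeat.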
Instead of using signals, this turns the daemon's stdout (which we already consume) into a structured message pipeline, and sends the heartbeat over stdout.
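As a rough sketch of the idea, a structured message pipeline over stdout could look something like the following. The wire format here (one JSON object per line, with `type` and `payload` keys) and the function names are assumptions made for illustration; they are not the format this diff actually uses. Treating unparseable lines as plain output is also an assumption.

```python
import json


def encode_message(kind, payload=None):
    # Daemon side: serialize one message as a single line on stdout.
    # The {"type": ..., "payload": ...} shape is hypothetical.
    return json.dumps({"type": kind, "payload": payload}) + "\n"


def decode_line(line):
    # Overseer side: turn one line of daemon stdout into (type, payload).
    try:
        msg = json.loads(line)
    except ValueError:
        # Not valid JSON: treat it as ordinary daemon output.
        return ("stdout", line.rstrip("\n"))
    if not isinstance(msg, dict) or "type" not in msg:
        # Valid JSON but not a message envelope: also ordinary output.
        return ("stdout", line.rstrip("\n"))
    return (msg["type"], msg.get("payload"))
```

Because each daemon has its own stdout pipe, the overseer knows which child a message came from simply by which pipe it read the line from, which a bare signal cannot provide.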
In a future diff, the overseer will be able to attribute heartbeats to the correct child process.