Restart daemons automagically
Closed, Resolved · Public

Assigned To

Authored By

	joshuaspence
	Jan 27 2015, 11:46 AM

Description

I've often wondered if we can restart the daemons automagically. Essentially, I was imagining running a very lightweight "watchdog" daemon which could detect configuration changes and then restart the other daemons accordingly.

Revisions and Commits

rPHU libphutil
	D14452	rPHU66bf71f94817 Add daemon overseer modules to allow daemons to be externally reloaded
rP Phabricator
	D14446	rP321c61a853d9 Remove daemon envHash and envInfo
	D14458	rPa07a8aca2462 Add a daemon overseer module to restart daemons when config changes

Related Objects

Mentioned In: Blog Post: Development Notes (2015 Week 47)
T9322: Avoid spawning setup errors when changing parameters in application "Config"
T9702: sane defaults for phd.variant-config

Event Timeline

joshuaspence created this task. Jan 27 2015, 11:46 AM

joshuaspence raised the priority of this task from to Needs Triage.

joshuaspence updated the task description.

joshuaspence added a project: Daemons.

joshuaspence added a subscriber: joshuaspence.

sascha-egerer added a subscriber: sascha-egerer. Jan 28 2015, 6:21 AM

I'm sure this question has been asked previously, but why can't the daemons read the updated configuration themselves? I imagine that the configuration would be cached, but surely it could be purged periodically?

@epriestley, I'm happy to work on this if you give me some pointers

joshuaspence claimed this task. Feb 11 2015, 8:52 PM

This is more possible after the centralization of the overseer and the introduction of phd reload and the SIGHUP handling. Specifically, we could do this:

Make the overseer periodically emit some sort of timer event (e.g., once a minute).
Have an event listener in Phabricator query the database to look for config changes.
If it detects a config change, SIGHUP the process.
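
The polling steps above can be sketched roughly as follows. This is only an illustration in Python, not how the overseer is implemented: `ConfigWatchdog`, `config_fingerprint`, the `load_config` callable (standing in for the database query), and the injected `send_signal` are all hypothetical names, and SIGHUP is assumed to be available (POSIX only).

```python
import hashlib
import json
import os
import signal


def config_fingerprint(config):
    """Return a stable hash of a configuration mapping (hypothetical helper)."""
    encoded = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class ConfigWatchdog:
    """Call tick() on every timer event; SIGHUP the target when config changes."""

    def __init__(self, load_config, pid, send_signal=os.kill):
        self._load_config = load_config  # assumed stand-in for the database query
        self._pid = pid                  # process to signal
        self._send_signal = send_signal  # injectable so it can be tested safely
        self._last = config_fingerprint(load_config())

    def tick(self):
        """Return True if a change was detected and SIGHUP was sent."""
        current = config_fingerprint(self._load_config())
        if current == self._last:
            return False
        self._last = current
        self._send_signal(self._pid, signal.SIGHUP)
        return True
```

With a one-minute timer, a change is picked up on the first tick after it lands, so the worst-case delay is about one timer interval.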

However, this implies a fairly long delay. Better would be:

Let the overseer connect to the notification server.
Push a notification.

This could also let the overseer react to "queue is no longer empty" events and send SIGUSR2 to daemons to wake them up.
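
As a rough sketch of the push-based variant, the overseer could map incoming notification types to signals. The event names (`config.changed`, `queue.nonempty`) and the `dispatch` helper are made up for illustration; only the SIGHUP and SIGUSR2 pairing comes from the discussion above, and both signals are POSIX only.

```python
import signal

# Hypothetical event names; the signal choices follow the comment above.
EVENT_SIGNALS = {
    "config.changed": signal.SIGHUP,   # reload after a config change
    "queue.nonempty": signal.SIGUSR2,  # wake up sleeping daemons
}


def dispatch(event_type, pids, send_signal):
    """Send the signal mapped to event_type to each pid.

    Returns the number of processes signalled; unknown events are ignored.
    """
    sig = EVENT_SIGNALS.get(event_type)
    if sig is None:
        return 0
    pids = list(pids)
    for pid in pids:
        send_signal(pid, sig)
    return len(pids)
```

In a real process `send_signal` would be `os.kill`; it is passed in here so the mapping can be exercised without signalling anything.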

This is somewhat more interesting to solve in the upstream now because the behavior is worse in the Phacility cluster (users must go to the admin console to restart daemons, which isn't obvious). In general, though, I still think this is an enormous amount of work for a tiny amount of benefit.

devurandom added a subscriber: devurandom. Mar 3 2015, 12:30 PM

joshuaspence triaged this task as Normal priority. May 14 2015, 8:57 AM

raylillywhite added a subscriber: raylillywhite. Aug 13 2015, 7:12 PM

Herald added a subscriber: eadler. Aug 13 2015, 7:12 PM

epriestley mentioned this in T9702: sane defaults for phd.variant-config. Nov 3 2015, 9:14 PM

joshuaspence added a revision: D14446: Remove daemon envHash and envInfo. Nov 9 2015, 11:04 AM

joshuaspence added a revision: D14452: Add daemon overseer modules to allow daemons to be externally reloaded. Nov 10 2015, 10:08 AM

joshuaspence added a revision: D14458: Add a daemon overseer module to restart daemons when config changes. Nov 10 2015, 8:08 PM

epriestley merged a task: T9702: sane defaults for phd.variant-config. Nov 10 2015, 8:37 PM

epriestley added a subscriber: avivey.

joshuaspence closed this task as Resolved by committing rPa07a8aca2462: Add a daemon overseer module to restart daemons when config changes. Nov 10 2015, 9:44 PM

joshuaspence added a commit: rPHU66bf71f94817: Add daemon overseer modules to allow daemons to be externally reloaded.

joshuaspence added a commit: rPa07a8aca2462: Add a daemon overseer module to restart daemons when config changes.

joshuaspence added a commit: rP321c61a853d9: Remove daemon envHash and envInfo. Nov 10 2015, 10:01 PM

epriestley awarded a token. Nov 10 2015, 10:14 PM

epriestley mentioned this in T9322: Avoid spawning setup errors when changing parameters in application "Config".

One possible minor followup to maybe keep on the radar is showing some extra warnings in a few cases:

When editing the small amount of config which can't auto-restart (phd.user, phd.start-taskmasters). I think phd.user already has an existing warning anyway.
Some notification stuff requires restarts but this isn't really material to the phd stuff and we haven't seen issues with it.
We're now essentially training users that the daemons auto-restart, but they don't when you bin/config. So maybe that should just have a "restart the daemons" message.

We can wait for these to actually be real issues, though.

This also seems to work properly on this host, although I hit the "daemons are not running" setup warning once, presumably by racing their restart. If this is an issue we can tweak how that warning works, I think.

In T7053#144090, @epriestley wrote:

When editing the small amount of config which can't auto-restart (phd.user, phd.start-taskmasters). I think phd.user already has an existing warning anyway.

Yeah, I would like to handle this properly but I'm not sure if it is worthwhile doing so?

Some notification stuff requires restarts but this isn't really material to the phd stuff and we haven't seen issues with it.

Did you mean that some config changes require Aphlict to be restarted?

We're now essentially training users that the daemons auto-restart, but they don't when you bin/config. So maybe that should just have a "restart the daemons" message.

I think this is worth fixing, but doing so involves implementing some form of config.hash or config.id. In the general case we can't just tell users to run ./bin/phd reload, because there could be daemons running on multiple hosts.
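
To make the config.hash idea a bit more concrete, here is a minimal sketch, assuming each daemon host records the hash of the config it last loaded. `stale_hosts` and the host-to-hash mapping are hypothetical; nothing like this is confirmed to exist in phd.

```python
def stale_hosts(current_hash, running_hashes):
    """Return the hosts whose daemons were started with an outdated config.

    current_hash: hash of the config as it is stored now.
    running_hashes: mapping of host name -> config hash its daemons loaded
                    (assumed to be reported by each host somehow).
    """
    return sorted(
        host for host, loaded in running_hashes.items() if loaded != current_hash
    )
```

A bin/config edit could then compare hashes and name the specific hosts that still need a restart, rather than printing a generic "restart the daemons" message.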

In T7053#144095, @epriestley wrote:

This also seems to work properly on this host, although I hit the "daemons are not running" setup warning once, presumably by racing their restart. If this is an issue we can tweak how that warning works, I think.

Yeah, I've hit the setup warning a couple of times myself.

epriestley mentioned this in Blog Post: Development Notes (2015 Week 47). Nov 22 2015, 9:55 PM

urzds added a subscriber: urzds. Jul 12 2017, 11:13 AM

Ray added a subscriber: Ray. Jun 19 2018, 7:01 AM
