
"Daemon out-of-date" detection shows incorrect status in high availability setup
Closed, Resolved | Public

Description

Noticed this with my testing "high availability" setup. Basically viewing /daemon/ on host1 will show daemons running on host2 as being out-of-date, even though when you view /daemon/ on host2 they report fine.

Event Timeline

hach-que raised the priority of this task from to Needs Triage.
hach-que updated the task description.
hach-que added a project: Phabricator.
hach-que added subscribers: hach-que, epriestley.

Can you give me some more details on the "high availability" setup? Apologies if there's lots of context I should probably have already on that. :/

Yeah I figured that "high availability" was probably not going to be a huge amount of information.

Basically I've set myself up a "high availability" setup, which consists of 3 isolated Docker instances running on the same machine. One Docker instance is the storage tier (running the pull daemon and storing repos). Another is the taskmaster daemon tier, where the taskmasters and the GC daemon run. And the final instance is the web tier instance. They are all isolated and talk to each other via Conduit over HTTP (and they point at the same database).

Basically I'm planning on tackling a bunch of stuff related to T4209 and I'm documenting what currently breaks / doesn't work right in this scenario.

(I can give you explicit instructions on setting up a setup like this, and provide you the configuration files, if you want to get a similar setup going)

Also, none of these HA tasks are "this is breaking some production or real-world system"; it's just me documenting what does and doesn't work so there's a clear picture of the outstanding work for HA to be usable in production systems (or for Phacility SaaS).

Do you intentionally have different configuration on host1 and host2?

We're basically just hashing the config to detect out-of-date daemons, so I'd expect the same results if the machines have the same configuration.
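To illustrate why differing per-tier configuration trips this check, here is a minimal Python sketch of hash-based staleness detection. It is not Phabricator's actual implementation: the function names and the config keys (`header`, `repo_path`) are hypothetical, and SHA-256 over sorted JSON is just one assumed way to get a stable hash.

```python
import hashlib
import json


def config_hash(config: dict) -> str:
    """Return a stable digest of a configuration mapping (hypothetical helper)."""
    # Sort keys so the same settings always serialize to the same string.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def daemon_is_out_of_date(daemon_hash: str, viewer_config: dict) -> bool:
    """Compare the hash a daemon recorded at launch with the viewer's config."""
    return daemon_hash != config_hash(viewer_config)


# Hypothetical per-tier configs: they differ only in tier-specific settings.
web_config = {"header": "web tier", "repo_path": None}
storage_config = {"header": "storage tier", "repo_path": "/var/repo"}

# A daemon launched on the storage tier records the storage tier's hash.
storage_daemon_hash = config_hash(storage_config)

# Viewed from the web tier, the daemon looks stale even though nothing changed.
stale_from_web = daemon_is_out_of_date(storage_daemon_hash, web_config)
# Viewed from its own tier, it looks fine.
stale_from_storage = daemon_is_out_of_date(storage_daemon_hash, storage_config)
```

With configs like these, any tier-specific setting is enough to make every remote daemon appear out-of-date from another host.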

Ah actually you're right; the configurations are different between hosts in the local config because different tiers have different configurations.

So the solution here is probably just an "am I on the same host as the daemons? If not, don't perform the check".
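A rough Python sketch of that idea, assuming each daemon record carries the hostname it was launched on and the config hash it started with. The function name, parameters, and status strings are all hypothetical, not anything Phabricator defines.

```python
import socket
from typing import Optional


def daemon_status(
    daemon_host: str,
    daemon_hash: str,
    local_hash: str,
    local_host: Optional[str] = None,
) -> str:
    """Classify a daemon, skipping the staleness check for remote hosts."""
    if local_host is None:
        # Default to the hostname of the machine rendering the page.
        local_host = socket.gethostname()
    if daemon_host != local_host:
        # The local config hash says nothing about another host's config.
        return "unknown"
    return "current" if daemon_hash == local_hash else "out-of-date"


# Viewing from host1: a daemon on host2 is not compared at all.
remote = daemon_status("host2", "abc", "xyz", local_host="host1")
# A daemon on host1 itself is still checked against the local hash.
local_stale = daemon_status("host1", "abc", "xyz", local_host="host1")
local_ok = daemon_status("host1", "xyz", "xyz", local_host="host1")
```

The trade-off is that remote daemons never get a staleness verdict from this host; they would need to be checked from their own tier.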

chad triaged this task as Normal priority. Sep 2 2014, 8:35 PM
chad edited projects, added Daemons; removed Phabricator.

What configuration differs between tiers? I sort of imagine moving all the tier-dependent stuff elsewhere, although I don't have a clear vision for it yet.

In this case I think it was just me configuring different UI headers for different tiers.

The only other one I changed was repo storage path because the web / daemon tiers shouldn't need to store any repos.

btrahan added a subscriber: btrahan.

Putting this up for grabs, as we'll deal with this more as part of the high availability stuff.

epriestley claimed this task.

We got rid of this warning in favor of automatically reloading after config changes, which should survive HA setups without additional work.