Provide tools to drop severed nodes from load balancer pools by failing status checks
Open, Normal, Public

Assigned To

Authored By

	epriestley
	Apr 10 2016, 10:44 AM

Description

When a web node is unable to reach any database replica, it should report an unavailable status from /status/.

This would allow a configuration across multiple datacenters (where some replicas are mutually unreachable) to automatically stop sending traffic to web nodes in the bad datacenter after losing services there.
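As a rough illustration of the behavior described above, here is a minimal Python sketch (not Phabricator's actual implementation). The names `status_response` and `tcp_reachable`, the plain TCP connect probe, and the choice of HTTP 503 as the "unavailable" status are all assumptions made for the example.

```python
import socket
from typing import Callable, Iterable, Tuple

# Hypothetical representation of a replica: (hostname, port).
Replica = Tuple[str, int]


def tcp_reachable(replica: Replica, timeout: float = 1.0) -> bool:
    """Assumed probe: treat a replica as reachable if a TCP connect succeeds."""
    host, port = replica
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def status_response(
    replicas: Iterable[Replica],
    probe: Callable[[Replica], bool] = tcp_reachable,
) -> Tuple[int, str]:
    """Return the (HTTP status code, body) a /status/ endpoint could report.

    The node reports healthy if at least one replica is reachable, and
    unavailable (503, an assumed choice) if none are.
    """
    if any(probe(replica) for replica in replicas):
        return 200, "OK"
    return 503, "No database replica reachable"
```

A load balancer health check pointed at `/status/` would then remove the node from its pool once the check starts returning the failing status. Passing `probe` as a parameter keeps the decision logic testable without a real database.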

Doing this for SSH might be a little trickier, but you should be able to use the same health check to decide whether to connect to a box over SSH. I think the configuration in the Phacility cluster (where SSH application servers and web application servers share the same nodes) is generally a sensible one, so we may not really need more than this.

Related Objects

		Status	Assigned	Task
		Resolved	epriestley	T10751 Make Phabricator Highly Available
		Open	epriestley	T10768 Provide tools to drop severed nodes from load balancer pools by failing status checks

Event Timeline

epriestley created this task. · Apr 10 2016, 10:44 AM

This is probably fairly straightforward, but I'd like to see some evidence that installs would actually benefit from it before pursuing it.

In the case of this host, all database replicas are reachable from all web nodes, so there's no way web nodes can become severed from the service.

Herald added a subscriber: eadler. · Apr 15 2016, 9:17 PM

epriestley mentioned this in T10751: Make Phabricator Highly Available. · May 12 2016, 3:18 PM

chad added projects: Database, Clusters. · Jun 8 2016, 4:51 PM
