Run "DatabaseSetup" checks against all configured hosts
Closed · Public

Authored by epriestley on Nov 21 2016, 1:39 PM.

Details

Reviewers

chad

Maniphest Tasks

T10759: Run PhabricatorDatabase/MySQLSetupCheck against all configured replicas

Commits

rP78040e0ff577: Run "DatabaseSetup" checks against all configured hosts

Summary

Ref T10759. Currently, these checks run only against configured masters. Instead, check every host.

These checks also cheat, in a sense, to get through a restart during a recovery, when some hosts will be unreachable: they test for "disaster" by checking whether no masters are reachable, and skip all the checks in that case.

This is bad for at least two reasons:

After recent changes, it is possible that some masters are dead but it's still OK to start. For example, "slowvote" may have no master, but everything else is reachable. We can safely run without slowvote.
It's possible to start during a disaster and miss important setup checks completely, since we skip them, get a clean bill of health, and never re-test them.

Instead:

Test each host individually.
Fundamental problems (lack of InnoDB, bad schema) are fatal on any host.
If we can't connect, raise it as a warning to make sure we check it later. If you start during a disaster, we still want to make sure that schemata are up to date if you later recover a host.
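
The per-host policy above can be sketched roughly as follows. This is an illustrative Python model, not the Phabricator implementation (which is PHP); `Host`, `Issue`, `check_host`, `run_database_setup_checks`, and `can_start` are hypothetical names invented for the sketch, and the two boolean flags stand in for the real InnoDB and schema checks.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    WARNING = "warning"
    FATAL = "fatal"


@dataclass
class Host:
    # Hypothetical stand-in for one configured database host.
    name: str
    reachable: bool = True
    has_innodb: bool = True
    schema_ok: bool = True


@dataclass
class Issue:
    host: str
    severity: Severity
    message: str


def check_host(host):
    """Check a single host; an unreachable host yields a warning, not a skip."""
    if not host.reachable:
        # We can't inspect the host now, so record a warning to re-check later.
        return [Issue(host.name, Severity.WARNING, "unable to connect; re-check later")]

    issues = []
    # Fundamental problems are fatal on any host, master or replica.
    if not host.has_innodb:
        issues.append(Issue(host.name, Severity.FATAL, "InnoDB is not available"))
    if not host.schema_ok:
        issues.append(Issue(host.name, Severity.FATAL, "schema is out of date"))
    return issues


def run_database_setup_checks(hosts):
    """Test each configured host individually and collect all issues."""
    issues = []
    for host in hosts:
        issues.extend(check_host(host))
    return issues


def can_start(issues):
    """Startup is allowed as long as no issue is fatal."""
    return not any(issue.severity is Severity.FATAL for issue in issues)
```

Under this model, a lost "slowvote" master produces a warning while the remaining hosts are still checked, so startup proceeds and the unreachable host remains flagged for a later re-test.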

In particular, I'm going to add these checks soon:

Fatal if a "master" is replicating.
Fatal if a "replica" is not replicating.
Fatal if a database partition config differs from web partition config.
When we let a database off with a warning because it's down, and later upgrade it to a fatal because we discover it is broken after it comes up again, fatal everything. Currently, we keep running if we "discover" the presence of new fatals after surviving setup checks for the first time.
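
As a rough illustration of the first three planned checks, here is a Python sketch. Everything in it is an assumption made for illustration: `HostState`, `planned_fatals`, and the representation of partition configuration as a plain dict are hypothetical and do not describe how Phabricator stores or compares this data.

```python
from dataclasses import dataclass, field


@dataclass
class HostState:
    # Hypothetical snapshot of what a setup check might observe on one host.
    name: str
    role: str  # "master" or "replica"
    is_replicating: bool
    partition_config: dict = field(default_factory=dict)


def planned_fatals(host, web_partition_config):
    """Return fatal messages for one host under the planned checks."""
    fatals = []
    if host.role == "master" and host.is_replicating:
        fatals.append(f"{host.name}: configured as a master but is replicating")
    if host.role == "replica" and not host.is_replicating:
        fatals.append(f"{host.name}: configured as a replica but is not replicating")
    if host.partition_config != web_partition_config:
        fatals.append(f"{host.name}: partition config differs from web partition config")
    return fatals
```

In this sketch, a host whose role and replication state agree, and whose partition config matches the web tier, returns an empty list.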

Test Plan

Configured with multiple masters, intentionally broke one (simulating a disaster where one master is lost), and saw Phabricator still start up.
Tested individual setup checks by intentionally breaking them.

Diff Detail

Repository

rP Phabricator

Branch

partition1

Lint

Lint Passed

Unit

Tests Passed

Build Status

Buildable 14585
Build 19033: Run Core Tests
Build 19032: arc lint + arc unit

Event Timeline

epriestley updated this revision to Diff 40690. Nov 21 2016, 1:39 PM

epriestley retitled this revision to Run "DatabaseSetup" checks against all configured hosts.

epriestley updated this object.

epriestley edited the test plan for this revision.

epriestley added a reviewer: chad.

epriestley added a task: T10759: Run PhabricatorDatabase/MySQLSetupCheck against all configured replicas.

Also, run bin/storage destroy against ALL configured masters by default.
While this is probably the best behavior anyway, it directly makes unit test cleanup work correctly.

chad accepted this revision. Nov 21 2016, 3:24 PM

chad edited edge metadata.

This revision is now accepted and ready to land. Nov 21 2016, 3:24 PM

Closed by commit rP78040e0ff577: Run "DatabaseSetup" checks against all configured hosts (authored by epriestley, committed by epriestley). Nov 21 2016, 11:49 PM

This revision was automatically updated to reflect the committed changes.