Make repository synchronization safer when leaders are ambiguous
Summary:
Ref T4292. Right now, repository versions only get marked when a write happens.
This potentially creates a problem: if I pushed all the sync code to secure and enabled secure002 as a repository host, the daemons would create empty copies of all the repositories on that host.
Usually, this would be fine. Most repositories have already received a write on secure001, so that working copy has a verison and is a leader.
However, when a write happened to a rarely-used repository (say, rKEYSTORE) that hadn't received any write recently, it might be sent to secure002 randomly. Now, we'd try to figure out if secure002 has the most up-to-date copy of the repository or not.
We wouldn't be able to, since we don't have any information about which node has the data on it, since we never got a write before. The old code could guess wrong and decide that secure002 is a leader, then accept the write. Since this would bump the version on secure002, that would make it an authoritative leader, and secure001 would synchronize from it passively (or on the next read or write), which would potentially destroy data.
Instead:
- Refuse to continue in situations like this.
- When a repository is on exactly one device, mark it as a leader with version "0".
- When a repository is created into a cluster service, mark its version as "0" on all devices (they're all leaders, since the repository is empty).
This should mean that we won't lose data no matter how much weird stuff we run into.
Test Plan:
- In single-node mode, used repository update to verify that 0 was written properly.
- With multiple nodes, used repository update to verify that we refuse to continue.
- Created a new repository, verified versions were initialized correctly.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T4292
Differential Revision: https://secure.phabricator.com/D15761