
Make repository synchronization safer when leaders are ambiguous
ClosedPublic

Authored by epriestley on Apr 19 2016, 7:37 PM.

Details

Summary

Ref T4292. Right now, repository versions only get marked when a write happens.

This potentially creates a problem: if I pushed all the sync code to secure and enabled secure002 as a repository host, the daemons would create empty copies of all the repositories on that host.

Usually, this would be fine. Most repositories have already received a write on secure001, so that working copy has a version and is a leader.

However, when a write happened to a rarely-used repository (say, rKEYSTORE) that hadn't received any write recently, it might be sent to secure002 randomly. Now, we'd try to figure out if secure002 has the most up-to-date copy of the repository or not.

We wouldn't be able to: the repository has never received a write, so we have no information about which node actually holds the data. The old code could guess wrong and decide that secure002 is a leader, then accept the write. Since this would bump the version on secure002, that would make it an authoritative leader, and secure001 would synchronize from it passively (or on the next read or write), which would potentially destroy data.

Instead:

  • Refuse to continue in situations like this.
  • When a repository is on exactly one device, mark it as a leader with version "0".
  • When a repository is created into a cluster service, mark its version as "0" on all devices (they're all leaders, since the repository is empty).

This should mean that we won't lose data no matter how much weird stuff we run into.
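
The three rules above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Phabricator code: the names `resolve_versions`, `initial_versions`, `leaders`, and `AmbiguousLeaderError` are hypothetical, and versions are modeled as a plain device-to-integer map.

```python
from typing import Dict, List


class AmbiguousLeaderError(Exception):
    """Raised when no device can safely be identified as a leader."""


def initial_versions(devices: List[str]) -> Dict[str, int]:
    # A newly created repository is empty everywhere, so every device
    # can safely start as a leader at version 0.
    return {device: 0 for device in devices}


def resolve_versions(devices: List[str], versions: Dict[str, int]) -> Dict[str, int]:
    # Keep only versions recorded for devices that currently host the repository.
    known = {d: v for d, v in versions.items() if d in devices}
    if known:
        # At least one device has a recorded version, so leaders are unambiguous.
        return known
    if len(devices) == 1:
        # Exactly one device holds the repository: it must be the leader.
        return {devices[0]: 0}
    # Several devices and no version information: guessing could let an
    # empty copy become authoritative, so refuse to continue.
    raise AmbiguousLeaderError(
        "No version information for devices: " + ", ".join(sorted(devices))
    )


def leaders(versions: Dict[str, int]) -> List[str]:
    # Leaders are the devices holding the highest recorded version.
    top = max(versions.values())
    return sorted(d for d, v in versions.items() if v == top)
```

In this sketch, the rKEYSTORE scenario corresponds to calling `resolve_versions(["secure001", "secure002"], {})`, which raises instead of guessing that secure002 is a leader.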

Test Plan
  • In single-node mode, used repository update to verify that 0 was written properly.
  • With multiple nodes, used repository update to verify that we refuse to continue.
  • Created a new repository, verified versions were initialized correctly.

Diff Detail

Repository
rP Phabricator
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

epriestley retitled this revision to Make repository synchronization safer when leaders are ambiguous.
epriestley updated this object.
epriestley edited the test plan for this revision.
epriestley added a reviewer: chad.
chad edited edge metadata.
This revision is now accepted and ready to land. (Apr 19 2016, 7:53 PM)
This revision was automatically updated to reflect the committed changes.