Disentangle "repoXYZ = dbXYZ" in the Phacility cluster
Closed, Resolved, Public

Description

Followup from T12798.

Currently, instances always allocate on the same repo and db shards (for example, db013 is always paired with repo013).

There's no technical reason for this; it just made ops work a little easier, especially when we had only a handful of hosts.

Overall, this rule is probably slightly harmful in the long term:

  • It prevents us from having different numbers of instances per db shard vs per repo shard.
  • As a corollary, it prevents us from having different db vs repo tier sizes.
  • It would have made cases like T12798 much more difficult if we had actually adhered to it (we would have had to move a bunch of instances between db shards, too).
  • It doesn't let us manage db vs repo load separately (e.g., particular instances may be large on one tier but small on the other tier).

These are all fairly advanced situations, but we're at a size now where this rule is probably hurting more than helping.

After T12798, we already have instances which violate this rule (all db012 instances are on repo025 after the move).

This rule is ONLY used by the service selection algorithm that runs when instances are created and by the "Shards" UI that monitors service selection, so fixing it is a matter of:

  • Change InstancesShard to represent one service, not a service pair.
  • Change InstancesShardQuery to return a list of services.
  • Change the "Shards" UI to show two tables, one for db and one for repo.
  • Change the call in InstancesInstance that selects new services to pick one db and one repo service, instead of one db/repo pair.
  • Also, getServiceNumber() should be done away with as it's a concept we don't want to retain in the future, but that will probably fall out of these other changes.
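
The core of these changes, picking one db service and one repo service independently instead of a numbered pair, could look roughly like the sketch below. It is written in Python purely for illustration and is not the actual implementation: the `Shard` class is a hypothetical stand-in for the reworked InstancesShard, the "fewest instances wins" policy is an assumption (the real selection criteria are not described here), and the instance counts are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """One service on one tier (hypothetical stand-in for InstancesShard)."""
    name: str            # e.g. "db013" or "repo025"
    tier: str            # "db" or "repo"
    instance_count: int  # instances currently allocated to this service


def select_service(shards, tier):
    """Pick the service on `tier` with the fewest instances (assumed policy)."""
    candidates = [s for s in shards if s.tier == tier]
    if not candidates:
        raise ValueError(f"no services available on tier {tier!r}")
    # Tie-break on name so the choice is deterministic.
    return min(candidates, key=lambda s: (s.instance_count, s.name))


def select_services_for_new_instance(shards):
    """Choose db and repo services independently, not as a numbered pair."""
    return select_service(shards, "db"), select_service(shards, "repo")


# Illustrative data only; the counts are invented for this example.
shards = [
    Shard("db012", "db", 40),
    Shard("db013", "db", 25),
    Shard("repo013", "repo", 60),
    Shard("repo025", "repo", 30),
]
db, repo = select_services_for_new_instance(shards)
print(db.name, repo.name)
```

Because each tier is selected on its own, nothing in this sketch needs a shared service number, which is why getServiceNumber() has no equivalent here. The two tiers can also have different numbers of services.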

This stuff should all be pretty self-contained.

I think the only negative effect of this is that connecting to a db or repo host for a particular instance will become very slightly more involved. We could add bin/remote ssh repo@instance or something, perhaps, if this proves cumbersome? But this is theoretically something we "shouldn't" be doing often anyway.
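
If a helper along those lines were added, the lookup behind it might be sketched as follows. This is only an illustration in Python: the `tier@instance` target syntax follows the suggestion above, while `resolve_target`, the `allocations` mapping, and the instance name "example" are all hypothetical and do not correspond to any existing tool.

```python
def resolve_target(target, allocations):
    """Resolve a "tier@instance" target to the service hosting that tier."""
    tier, sep, instance = target.partition("@")
    if not sep or tier not in ("db", "repo") or not instance:
        raise ValueError(
            f"expected 'db@<instance>' or 'repo@<instance>', got {target!r}"
        )
    try:
        return allocations[instance][tier]
    except KeyError:
        raise LookupError(
            f"no {tier} service recorded for instance {instance!r}"
        ) from None


# Hypothetical allocation record: an instance on db012 and repo025,
# like the instances moved in T12798.
allocations = {"example": {"db": "db012", "repo": "repo025"}}
print(resolve_target("repo@example", allocations))
```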
