Page MenuHomePhabricator

Phacility cluster account/service sync issues (account identifiers, instance refs)
Closed, ResolvedPublic

Description

  • When bin/services sync targets an admin instance (e.g., locally) it exits with no error and no message. This is correct, but confusing.
  • bin/services sync --instance X exits with no error if X does not exist. This is strictly incorrect.
  • AccountIdentifiers do not sync during setup after T13493.

Revisions and Commits

Event Timeline

epriestley triaged this task as Normal priority.May 5 2020, 3:10 PM
epriestley created this task.

bin/services sync --instance X exits with no error if X does not exist.

The actual command includes the poorly-named --weak flag ("weaker error semantics") which means "don't do anything if the instance doesn't exist". So I think the primary issue here was that we should not use --weak during a database upgrade. I removed this flag in fdf5ec54abb4. It might be nice to change the name of this flag to something like --if-instance-exists.

With --weak, a database upgrade could fail if the instance does not exist yet when the upgrade command runs on the db host. This is definitely the wrong behavior, but it's unclear how this could happen (or if it did in this case). I identified 9 other instances in the last year which are stuck in "Initializing" (all but one are test instances), so that's a sort of upper bound on badness here.

Offhand, I'm not sure if a pathway exists which allows instances.state to execute and fail to return an instance after transactions have committed but before they've published. This seems difficult, but a handful of random failures per year is consistent with a rare race like this.

AccountIdentifiers do not sync during setup after T13493.

This was a straightforward issue. In T13493, external accounts are identified by a separate AccountIdentifier, not the on-object accountID. The hard-coded setup code for administrator account initialization didn't install this identifier properly.

This didn't prevent registration, but did prevent the default account from being recognized as linked to the external account, so first-time setup went down the "normal user" workflow. This wasn't a great experience since the automatic stub account had reserved the manager's email address and username, but only affected first-time setup.

In b8ae9d748aa8, I fixed this so the setup code writes the separate identifier row.

epriestley added a commit: Restricted Diffusion Commit.May 5 2020, 6:31 PM
epriestley added a commit: Restricted Diffusion Commit.

There was a related AccountIdentifier issue with InstancesShadowUserQuery: we loaded shadow users based on accountID, but this is no longer consistently populated after T13493.

In ce8f9c245a99, I changed the query to read the identifier table properly.

epriestley added a commit: Restricted Diffusion Commit.May 8 2020, 6:49 PM