This task originally reported a fairly mild, self-healing issue with clustered Git repositories. However, it's adjacent to a more practical issue.
In Differential's "Land Revision" workflow, we currently give the user a tokenizer and ask them to select a target branch, often, say, "master". This kind of workflow (where users select refs using a tokenizer) is likely to become more prevalent in the future as we do more repository operations.
This tokenizer is backed by the RefCursor table and uses PHIDs from that table, but the rows in that table are currently transient, so the PHIDs are neither stable nor persistent. If a repository is moving quickly enough, the PHIDs may shift between the time the user opens the dialog and the time they submit the form, or even while they're typing.
The format of the table is currently <type, name, commit>, e.g. <"branch", "master", "abcd1234">. There is no unique key on <type, name> because Mercurial repositories may have multiple active heads for a single branch.
This table format does not lend itself well to identifying ref names with a single, stable, persistent PHID. A better format would be to use two tables: one for names (<type, name>) and one for the zero or more things the ref is actually pointing at (<refID, objectHash>).
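As a rough illustration of the two-table layout, here is a minimal sketch using Python's built-in sqlite3 module. The table names (ref_name, ref_position), column names, and PHID strings are placeholders invented for this example, not the actual schema, and a real table would presumably also be scoped by repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per ref name; this row (and its PHID) stays put as the ref moves.
CREATE TABLE ref_name (
    id       INTEGER PRIMARY KEY,
    phid     TEXT NOT NULL UNIQUE,
    ref_type TEXT NOT NULL,
    name     TEXT NOT NULL,
    UNIQUE (ref_type, name)
);
-- Zero or more rows per ref: the objects the ref currently points at.
CREATE TABLE ref_position (
    id          INTEGER PRIMARY KEY,
    ref_id      INTEGER NOT NULL REFERENCES ref_name(id),
    object_hash TEXT NOT NULL,
    UNIQUE (ref_id, object_hash)
);
""")


def add_ref(conn, phid, ref_type, name, hashes):
    """Insert one ref name plus the objects it points at; return the ref ID."""
    cur = conn.execute(
        "INSERT INTO ref_name (phid, ref_type, name) VALUES (?, ?, ?)",
        (phid, ref_type, name),
    )
    ref_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO ref_position (ref_id, object_hash) VALUES (?, ?)",
        [(ref_id, h) for h in hashes],
    )
    return ref_id


def heads(conn, ref_id):
    """Return the sorted object hashes a ref currently points at."""
    rows = conn.execute(
        "SELECT object_hash FROM ref_position WHERE ref_id = ? ORDER BY object_hash",
        (ref_id,),
    )
    return [row[0] for row in rows]


# A Git-style branch with a single position (placeholder PHID).
git_id = add_ref(conn, "PHID-EXAMPLE-0001", "branch", "master", ["abcd1234"])
# A Mercurial-style branch with two active heads under one name and one PHID.
hg_id = add_ref(conn, "PHID-EXAMPLE-0002", "branch", "default", ["1111aaaa", "2222bbbb"])
```

In this sketch, the unique key on (ref_type, name) lives on the name table, so multiple Mercurial heads become multiple ref_position rows rather than multiple names.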
The way this table is updated also currently has at least two other issues:
- The mild probable-race originally reported here, where Git repositories may end up with duplicate pointers to the same ref. We could make this state impossible by putting a unique key on the ref table.
- Incredibly high churn rate on row IDs (see comment below).
Taking slightly more care in performing writes to this table can likely clear up both of these issues.
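One way "slightly more care" could look is to diff the stored positions against the newly observed ones and write only the difference. The sketch below assumes the two-table layout proposed above, with hypothetical table and column names and the PHID column omitted for brevity; update_ref is an illustrative helper, not existing code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ref_name (
    id       INTEGER PRIMARY KEY,
    ref_type TEXT NOT NULL,
    name     TEXT NOT NULL,
    UNIQUE (ref_type, name)
);
CREATE TABLE ref_position (
    id          INTEGER PRIMARY KEY,
    ref_id      INTEGER NOT NULL REFERENCES ref_name(id),
    object_hash TEXT NOT NULL,
    UNIQUE (ref_id, object_hash)
);
""")


def update_ref(conn, ref_type, name, new_hashes):
    """Make the stored positions for one ref match new_hashes.

    Only rows that actually changed are deleted or inserted, so unchanged
    refs keep their row IDs. Returns the (stable) ID of the ref name row.
    """
    with conn:  # one transaction: commits on success, rolls back on error
        # Create the name row only if it is missing; an existing row is untouched.
        conn.execute(
            "INSERT OR IGNORE INTO ref_name (ref_type, name) VALUES (?, ?)",
            (ref_type, name),
        )
        ref_id = conn.execute(
            "SELECT id FROM ref_name WHERE ref_type = ? AND name = ?",
            (ref_type, name),
        ).fetchone()[0]

        old = {
            row[0]
            for row in conn.execute(
                "SELECT object_hash FROM ref_position WHERE ref_id = ?", (ref_id,)
            )
        }
        new = set(new_hashes)

        # Remove positions the ref no longer points at.
        conn.executemany(
            "DELETE FROM ref_position WHERE ref_id = ? AND object_hash = ?",
            [(ref_id, h) for h in old - new],
        )
        # Add positions that are new; the unique key rejects duplicates.
        conn.executemany(
            "INSERT INTO ref_position (ref_id, object_hash) VALUES (?, ?)",
            [(ref_id, h) for h in new - old],
        )
    return ref_id


def position_rows(conn, ref_id):
    """Return sorted (row ID, object hash) pairs for one ref."""
    return conn.execute(
        "SELECT id, object_hash FROM ref_position WHERE ref_id = ? ORDER BY id",
        (ref_id,),
    ).fetchall()


first_id = update_ref(conn, "branch", "master", ["abcd1234"])
rows_before = position_rows(conn, first_id)
# Re-running discovery with the same state writes nothing.
second_id = update_ref(conn, "branch", "master", ["abcd1234"])
rows_after = position_rows(conn, second_id)
# Moving the branch replaces the position row but keeps the name row.
third_id = update_ref(conn, "branch", "master", ["ef567890"])
```

With this approach, repeated discovery passes over an unchanged repository leave row IDs alone, and the unique key on (ref_id, object_hash) makes duplicate pointers to the same ref an impossible state rather than a self-healing one.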
Browsing to rP (here) right now shows a "The ref "master" is ambiguous in this repository. View Alternatives" warning, but only one value is listed behind that link.