
Make it more clear what renaming a user does and doesn't affect
Open, Needs Triage, Public

Description

We warned new users in our instance that they would want to select an appropriate user name from the beginning, but left them with the autonomy to do so. Some of our users chose to name themselves in ways they regret.

It would be nice if there was a rename procedure that propagated that rename to all the objects the user interacted with. Currently the rename feature doesn't retain referential integrity, stating:

Be careful when renaming users!

The old username will no longer be tied to the user, so anything which uses it (like old commit messages) will no longer associate correctly. (And, if you give a user a username which some other user used to have, username lookups will begin returning the wrong user.)

It is generally safe to rename newly created users (and test users and so on), but less safe to rename established users and unsafe to reissue a username.

Users who rely on password authentication will need to reset their password after their username is changed (their username is part of the salt in the password hash).

The user will receive an email notifying them that you changed their username, with instructions for logging in and resetting their password if necessary.
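The password note in the dialog follows from the username being part of the salt. A minimal sketch of that effect, assuming a hypothetical PBKDF2-based scheme (this is not Phabricator's actual hashing code; `password_digest` and the salt layout are made up for illustration):

```python
import hashlib


def password_digest(username: str, password: str) -> str:
    """Hash a password with a salt that includes the username (assumed scheme)."""
    salt = ("example-salt:" + username).encode("utf-8")
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 10_000)
    return digest.hex()


# Digest stored while the account was still called catlover74.
stored = password_digest("catlover74", "correct horse")

# Same password, same username: the digest matches.
matches_before_rename = password_digest("catlover74", "correct horse") == stored

# Same password, new username: the salt changed, so the digest no longer
# matches and the user has to set a new password.
matches_after_rename = password_digest("doglover74", "correct horse") == stored
```

Because the stored digest cannot be recomputed without knowing the password, a rename under a scheme like this cannot carry the old digest along.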

We would be happy to financially prioritize this if you consider it an acceptable feature to add.

Event Timeline

Can you describe the behavior you expect, given the text of the message? I think this is an issue with the text being unclear, but I'm not sure how you've interpreted it.

I've interpreted the text as saying that renaming a user will break references to objects the user has authored or interacted with. That is, a single transaction occurs and the user who was named A is now named B, but all objects they authored or interacted with still hold a reference to user A, which no longer exists.

What I'd like is for the rename to kick off a transaction that updates every object the user interacted with (likely a costly transaction), so that objects formerly associated with username A are now associated with username B.

I think the (potential) opposition is that they'd have to process all the remarkup as well

Integrity is broken only in cases where something literally uses the old username ("the old username ... anything which uses it").

This basically means "commit message text" (which may say Reviewers: catlover74 or Auditors: catlover74) and "comments" (which may say @catlover74, do you have any feedback?). We can't fix commit messages without rewriting every repository which mentions the user, and we can't fix comment text without a similarly huge amount of work. And we can never fix text which left the system via email, HTTP hooks, etc.

All relationships which are not based on persistent text written by humans are preserved. For example, if the user was an author, reviewer, subscriber, member, participant, owner, auditor, etc., etc., they still will be.

A rename is fully reversible, so you can try the rename and see if you like it or not. If you don't, just put it back.

I didn't anticipate that; Conpherence entries would be... a lot of stuff to slog through.

Correct, I may not be 100% on this but I believe they are using a regex (sorry to trivialize if this is true/untrue/complex/etc.) to do some of the linking for things like me saying @enckse here...not to mention commit messages and the like.

I did a local test of a rename, and the fact that things like what I mention above would be a problem is the reason I'm not renaming current (not new) users.
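The concern about text-based linking can be sketched as follows. This assumes mentions are found with a simple regex and looked up by current username; the pattern, the `resolve_mentions` helper, and the PHID value are all hypothetical and not Phabricator's actual remarkup rule:

```python
import re

# Assumed mention syntax: "@" followed by letters, digits, or underscores.
MENTION = re.compile(r"@([A-Za-z0-9_]+)")


def resolve_mentions(text, phid_by_username):
    """Map each @mention in the text to a user PHID, or None if unknown."""
    return {name: phid_by_username.get(name) for name in MENTION.findall(text)}


comment = "@catlover74, do you have any feedback?"

# Before the rename, the stored text resolves to the user.
before = resolve_mentions(comment, {"catlover74": "PHID-USER-alice"})

# After the rename, the same stored text no longer resolves to anyone.
after = resolve_mentions(comment, {"doglover74": "PHID-USER-alice"})
```

The stored comment text never changes; only the lookup table does, which is why old mentions stop associating with the renamed user.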

I believe I understand now, but I have one question. Would there be a way to keep some sort of reference between the old username and the new one, so that if someone clicked or hovered on that mention, the lookup would have some context for a rename that occurred in the past? It would have to keep a chain, because people might rename more than once.
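A minimal sketch of the chain idea, assuming renames are recorded as a plain old-name to new-name mapping. The `current_username` helper and the `birdlover74` name are hypothetical, and nothing like this exists in Phabricator as far as this thread indicates:

```python
def current_username(name, renamed_to):
    """Follow recorded renames from an old username to the newest one."""
    seen = set()
    # The "seen" set guards against loops, e.g. a rename that was later reverted.
    while name in renamed_to and name not in seen:
        seen.add(name)
        name = renamed_to[name]
    return name


# Two successive renames of the same account.
renames = {"catlover74": "doglover74", "doglover74": "birdlover74"}

latest = current_username("catlover74", renames)   # "birdlover74"
untouched = current_username("enckse", renames)    # "enckse"
```

Note that this only works while old usernames are never reissued; once a name is handed to someone else, a single mapping can no longer say which person an old mention meant.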

(We could silently rewrite @catlover74 in comments to be stored internally as {mention PHID-USER-asdlnasdlbn}, which would at least partially retain integrity for comments. However, I don't think there's much we can do about commit messages. We currently don't perform this class of invisible rewrite for any other rules -- the remarkup you edit is exactly the remarkup we store. This isn't out of the question, but I'd like a stronger set of motivating use cases for it since I think it's potentially fairly involved.)
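As a rough illustration of the "invisible rewrite" idea, here is a sketch that stores mentions by PHID and renders them with whatever username is current. The regexes and the `to_internal` / `to_display` helpers are assumptions for illustration, not existing Phabricator code:

```python
import re

MENTION = re.compile(r"@([A-Za-z0-9_]+)")
INTERNAL = re.compile(r"\{mention (PHID-USER-[A-Za-z0-9]+)\}")


def to_internal(text, phid_by_username):
    """Rewrite @username into {mention PHID-USER-...} before storing."""
    def repl(match):
        phid = phid_by_username.get(match.group(1))
        # Unknown names are left exactly as the author typed them.
        return "{mention %s}" % phid if phid else match.group(0)
    return MENTION.sub(repl, text)


def to_display(text, username_by_phid):
    """Rewrite stored {mention ...} back into the user's current @username."""
    def repl(match):
        name = username_by_phid.get(match.group(1))
        return "@" + name if name else match.group(0)
    return INTERNAL.sub(repl, text)


stored = to_internal(
    "@catlover74, do you have any feedback?",
    {"catlover74": "PHID-USER-asdlnasdlbn"},
)

# After the rename, the stored text renders with the new username.
shown = to_display(stored, {"PHID-USER-asdlnasdlbn": "doglover74"})
```

This only helps for text the system stores and re-renders itself; commit messages and anything already sent out by email stay as written.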

I would argue "rename" isn't really "rename" if all linked references don't get updated BUT I also would be super-opposed to things like git history/commits being re-written (and there may be other cases I don't know of that I would be pro/con to). At some point you'd want to draw the line and I, personally, don't have a solid use case for moving the line.

We could retain a separate "@catlover74 is now named @doglover74" record, but I believe it would mostly only be useful in this case, although there is some overlap with T4267 and T1731 (now merged into T12164).

T12164 would let the new @doglover74 claim old commits made under "Cat Lover" <cat.lover@aol.com> or similar on its own, without a separate record.

If essentially the only thing we're resolving is old @username mentions in human-authored text like comments and chat messages, I'd possibly be more inclined to pursue the "invisible rewrite" approach -- which is potentially cleaner as a solution, even though it's more involved to build.

The "X is now Y" runs into issues like this, too:

  • @catlover74 was Alice until June 2018.
  • Then she was renamed to @doglover74.
  • In October 2018, Bailey joined the company as @catlover74.
  • References prior to June 2018 refer to Alice, references after October 2018 refer to Bailey.
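The timeline above implies that an "X is now Y" record would need dates attached to be unambiguous. A small sketch, assuming the changes happened on the first of each month (the thread only gives months) and using a hypothetical `who_was` lookup:

```python
from datetime import date

# (username, valid_from, valid_until, person); valid_until is exclusive and
# None means "open-ended". Exact days are assumptions.
HISTORY = [
    ("catlover74", None, date(2018, 6, 1), "Alice"),
    ("doglover74", date(2018, 6, 1), None, "Alice"),
    ("catlover74", date(2018, 10, 1), None, "Bailey"),
]


def who_was(username, when):
    """Return the person who held a username on a given date, if anyone."""
    for name, start, end, person in HISTORY:
        if name != username:
            continue
        if (start is None or start <= when) and (end is None or when < end):
            return person
    return None


early = who_was("catlover74", date(2018, 1, 15))   # "Alice"
gap = who_was("catlover74", date(2018, 8, 1))      # None, nobody held the name
late = who_was("catlover74", date(2018, 11, 1))    # "Bailey"
```

Even with dates recorded, a mention only resolves correctly if the date the text was written is known, which is not always the case for text that left the system.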

Obviously, installs "should" never do this, but users "should" never pick @xXxBlackDragonHacker420tOpkEkxXx as a professional username.

(And there are somewhat more legitimate cases, like @cadams marrying and wanting to become @csmith, even though I think reissuing usernames is never the answer.)

You can sort of fake this today like this:

  • Rename catlover74 to doglover74.
  • Create a new catlover74 account.
  • As their title or description, just write: "This is an old account name for @doglover74."

Basically, I think we already have mostly reasonable behavior here (we do all the easy stuff correctly already).

It sounds like the dialog could be more clear, although it's already a page of text and I worry that making it describe the operation more clearly is going to take like four pages. See T8830 for a similar case of "an attempt to err on the side of caution while summarizing a complicated operation conveys the wrong meaning".

I don't think anyone wants us to rewrite repository history, nor to write durable references which humans can not read like Reviewers: {object PHID-USER-asdklnf} into commit messages, so I think there's nothing we can ever really do in that case to completely fix history.

To improve comments we could do aliases (formally store when a user is renamed) or invisible references (internally, store {phid PHID-USER-...} instead of @catlover in the comment text, but never show it to users). However, these are so complex for their value that I'd be hesitant to bring them upstream without stronger use cases. I lean toward the second being a much better solution than the first, and it's a fairly major infrastructure project which would change how remarkup is structured in all applications (we must begin distinguishing between "internal form" remarkup, like comments we've parsed and rewritten, and "external form" remarkup, like commit message text, which we can not directly rewrite).

This isn't necessarily off the table, but any quote is probably in the $10K range with a timeline of multiple months.

I'm thinking that the "faking" suggestion from above is likely the best solution, given the user mistake of choosing an initial name that becomes undesirable at a later date. I had wrongly thought that the remarkup was "internal form" already, which was a poor assumption.

If other use cases emerge where it makes sense to maintain internal representation that generalizes the PHID-USER- concept then I think it'd be interesting to talk about prioritization.

What I think we'd be really happy with is something like the 'Alias' discussion in T4267.

Another user coincidentally ran into some questions here yesterday; I'm going to retarget this to take a stab at improving the documentation/description.

For the rest of it, I'd like to complete T12164 before doing other alias work, since I think that's currently about 90% of the real alias-related issues. We could look at T4267 again after that. My current gut feeling is that the complexity/cost are way out of line with the benefits over just creating a fake @catlover74 account, but maybe some other use cases will turn up or we'll figure out a way to do it cheaply/easily.

Offhand, some cases where it gets tricky might be:

  • What does the API do when you query for an alias?
  • How do aliases appear in typeaheads?
  • How do aliases work in remarkup? What if aliases are changed? Does this just give us the same set of problems?
  • If aliases aren't user accounts, how do we guarantee that aliases and usernames are mutually unique (no single unique key can span both tables)?
  • Updating all the registration UX and CLI tooling to understand aliases.
  • Probably a whole big mess with invite flows, and particularly with the custom Phacility invite stuff (e.g., recognizing aliases for the purpose of linking accounts when importing an install's data).
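To make the uniqueness question in the list above concrete: one conceivable workaround is a single shared table of claimed names that both usernames and aliases must register in. This is only a sketch using SQLite; the `claimed_name` table, its columns, and the `claim` helper are hypothetical and not part of Phabricator's schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE claimed_name (
        name      TEXT PRIMARY KEY,  -- one unique key shared by both kinds
        kind      TEXT NOT NULL CHECK (kind IN ('user', 'alias')),
        user_phid TEXT NOT NULL
    )
    """
)


def claim(name, kind, user_phid):
    """Try to reserve a name; return False if it is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO claimed_name (name, kind, user_phid) VALUES (?, ?, ?)",
                (name, kind, user_phid),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = claim("doglover74", "user", "PHID-USER-alice")
alias = claim("catlover74", "alias", "PHID-USER-alice")
# A new account cannot take a name that is already held as an alias.
reissue = claim("catlover74", "user", "PHID-USER-bailey")
```

A shared table like this moves the problem rather than removing it: every code path that creates or renames a user would have to go through it, which is part of the cost discussed above.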
epriestley renamed this task from Rename User to Make it more clear what renaming a user does and doesn't affect. May 2 2017, 1:48 PM