Page MenuHomePhabricator

Diffusion pings accidentally tasks because the same task numbers are mentioned in cloned commit messages
Closed, ResolvedPublic

Description

Welcome to a World with multiple Phabricator instances cloning each other's repositories!

  1. @epriestley merges this patch here mentioning T1191
  2. https://phabricator.wikimedia.org/T1191 gets an action: "Diffusion added a commit: rPHUTIL0135e57181a9: Assume utf8mb4 support"

Needless to say, T1191 here and the same task number in Wikimedia are totally unrelated. Yet both get the same ping. Of course, Remarkup has no idea and cannot do better...

Event Timeline

qgil raised the priority of this task from to Needs Triage.
qgil updated the task description. (Show Details)
qgil added projects: Diffusion, Wikimedia.
qgil added subscribers: ghost.of.chasemp, demon.
qgil added a subscriber: qgil.

...or what I had just written up, with an added aspect of different levels of information being revealed between web vs mail:

The fix for https://secure.phabricator.com/T1191 was backported to the Wikimedia Phabricator instance.
This triggered a comment in https://phabricator.wikimedia.org/T1191 which displays for me (being an admin in that instance):

Restricted Application added a commit: Restricted Commit.

in the web interface. The email notification for that ticket however stated:

"Diffusion added a commit: rPHUTIL0135e57181a9: Assume utf8mb4 support."

This triggered a comment in https://phabricator.wikimedia.org/T1191 which displays for me (being an admin in that instance):

Restricted Application added a commit: Restricted Commit.

You can ignore this, not relevant to the bug reported here. Currently in Wikimedia Phabriator, Diffusion is only visible to the members of https://phabricator.wikimedia.org/tag/diffusion/ . @aklapper, join that project and you will see the commit.

@aklapper, T6367 discusses the email thing in more detail.

Fix for this is probably some kind of "ignore references in this repository" flag. Prior to adding object mentions you could effectively do this by disabling autoclose, I think, but I believe mentions now punch through autoclose.

We require a full URI for "Differential Revision: ..." to avoid this problem there, but that's not practical in the general case.

Does it also mean that each clone allocates a new PHID for each commit (Because "edges")?

Each copy of a commit in a tracked repository has a unique PHID, yes. This is intentional -- it would be weird/confusing if commenting on rXabcdef1234 mirrored to rYabcdef1234 just because the "X" and "Y" repositories were copied from one another at some point in history. The two commits might also have totally different view/edit policies, etc.

ooh, I was thinking "high availability" clone....

Today Wikimedia Phabricator has been hit badly when we started importing the Phabricator repository (having forgotten about this problem). We stopped the import as soon as we saw our mailboxes flooded with notifications of commits and tasks marked as resolved.

This happened just an hour ago. We are tracking the repairs at https://phabricator.wikimedia.org/T91488

Some related discussion in T6887.

I think the root issue is that repositories created via the Conduit API (e.g., with the script in T6887) are not correctly flagged as "importing" at initial creation. Repositories created via the web UI do get flagged. Most users use the web UI to create repositories, and I always used the web UI in testing this, which is why I missed the issue.

When a repository has the "importing" flag, we mostly ignore all the commit content until we catch up to HEAD and process everything in the repository.

Had the repository been flagged correctly by the API endpoint, most of this damage (in particular, all the status changes) would have been averted. As we made new commits in the usptream their side effects would leak out, but that would have been caught quickly and autoclose disabled in the repository, affecting maybe a handful of tasks (maybe a few dozen if you, say, merged from the upstream after a few weeks).

Object mentions might still have fired. There's an argument that this behavior is correct in some cases, although we should probably disable them if they aren't already disabled. I'll look into this.

Those two patches should generally prevent problems here. With those applied, the behavior would have been:

  • Nothing happens during initial import (i.e., we do no publishing of any kind).
  • The next time the remote repository updates (on WMF Gerrit), new commits incorrectly publish. If the Gerrit repository is synchronized to the upstream, this would have been approximately one commit. If it's updated manually, it would have been however many commits we made between the update periods. So somewhere between one and "a handful" of tasks would have been undesirably affected.
  • Someone notices, goes into the settings, and disables "Publish/Notify" and "Autoclose" for the repository.
  • The repository never publishes again.

These patches aren't granular enough to fully support importing external repositories without affecting local objects. In particular, you could not do this:

  • Import Phabricator;
  • have it not affect local objects;
  • but still write Herald rules against it to learn when we push certain stuff.

Specifically, the "affect local objects" and "run Herald rules" parts are not separable, so you have to disable both or neither. I'm not sure if this is actually an interesting use case. If it is, we can separate the "Publish/Notify" option into a "Publish/Notify" option and a "Objects mentioned in this repository are local objects." option.

epriestley claimed this task.

I think this is fixed now. In particular, the behavior today would be:

  • You import a repository which also uses Phabricator.
  • You don't turn off publish/notify.
  • Nothing bad happens during import.
  • Import finishes.
  • The next time a new commit which affects a local task is pushed to the imported upstream, the local task is affected.
  • Hopefully, you notice this and figure out that you can go turn off Publish/Notify. At a minimum, this should probably be documented more clearly.

As noted above, disabling Publish/Notify also disables Herald. We don't have "affects local objects" and "activates Herald" as separable capabilities right now, but aren't aware of any installs with an interest in separating them. We can separate them in the future if we receive requests.