Page MenuHomePhabricator

Users can mistakenly create revisions with no repository through confusion or misconfiguration
Open, Needs TriagePublic


Root problem:

  • Some existing differential items on my install have no repository information attached
    • These are certainly based on a diffusion respository and have properly linked commit items
    • Seems to be due to user-side configuration issues with arcanist and svn repos
  • We are enabling spaces roughly in line with the 4th use case described in /book/phabricator/article/spaces/
    • Namely, we want a small cordoned off area for consultants
  • Because the diffs described above are not attached to any repo, they are visible to this lower-privilege space
    • This is undesirable; all the properly-created diffs are hidden as their underlying repo is not visible to the space

That's my basic problem. Please let me know if it's unclear or there's anything else I could provide.

Below are some of the ways I considered solving it:

  • There's no way to search for revisions which are not attached to any repositories (at say /differential/query/advanced/)
    • In that there's no way to specify "None" in the repository filter
    • I've worked around this by revoking all repository visibility permissions from a user and seeing which revisions remain visible
    • These are exactly the diffs I'm trying to prevent my unprivileged users from seeing
  • Differential support for spaces (as mentioned in T8493) would solve this for me
  • Similarly, inferring the Space for the differential item based on the Space of the eventual commit/diffusion item
  • Some sort of batch edit for differential so that I can get these differential items attached to the correct repository
    • Command-line would be fine, but suffers from having to populate the list
    • Barring any input from the upstream, my current plan is to attempt to do this directly in the database

Any input on any of those solutions or any other possibilities I didn't consider would be greatly appreciated.

Event Timeline

epriestley added subscribers: swisspol, epriestley.

We've seen a couple of flavors of this from @swisspol, too. Although the symptom there is a little different (not a policy issue, exactly), the root cause is mostly similar. In particular, we saw:

  • One case of a user using an SSH alias to alias something like repo to; see also T11882. arc can not translate SSH aliases so it could not figure out that git://repo/X/ is the same as git://
  • One case of a user using a personal GitHub fork as a remote. arc can not reasonably recognize that git:// and git:// are the same in the general case, and it would sometimes be wrong if it tried to infer this from available information (that is, even if it made calls to the GitHub API to see that "Z" was truly a fork of "X", this is not sufficient to unambiguously identify user intent).

No matter what the case, the affected user (or users) can run arc which to figure out why the repository is not being identified. This might be because they're using an SSH alias, or a personal fork, or for some other reason.

The easiest way to fix this is usually to add "repository.callsign" to .arcconfig, which will unambiguously bind the working copy to a Diffusion repository.

I think a blanket way to stem the bleeding here is to write this Herald rule:

Global Rule for Differential Diffs
[ Repository ] [ does not exist ]
Take these actions:
[ Block diff with message ] [ "Repository unrecognized. See for details." ]

That link could point users at documentation which says stuff like:

  • Maybe add repository.callsign to .arcconfig.
  • Maybe run arc which.
  • Maybe don't use SSH aliases / personal forks.

The exact instructions would depend on your install, but if you write this rule it should prevent users from disclosing information by accident.

Also note that I plan to replace "repository.callsign" with something like "repository" "soon", which will accept a callsign, ID, short name, or monogram.

A possible change here is to make some global option available like "Always Require a Repository". The reason we don't require a repository is to ease onboarding: in some environments, it's a lot simpler to evaluate Phabricator or deploy it alongside other software if it doesn't have to touch repositories and can purely layer on top of them. It's also occasionally useful to, e.g., create changes from third-party repositories in an ad-hoc way (e.g., to discuss a problem in a dependency or whatever).

We could let installs configure a "require a repository" option, but if it's on-by-default it partly defeats the purpose of easing onboarding, and if it's off-by-default it would likely be only somewhat less obscure than the Herald rule. Plus, T8227.

We could do something like prompting users, but in his case (where the repository is important for security reasons) it's particularly bad to let users "y" through an "ignore security" prompt. So you'd still have to back it up with something like the Herald rule.

In Git and Mercurial, we could also try to use commit hashes to identify a repository (we already use UUIDs for Subversion, I believe). However, I worry this may lead to significantly more false positives than the current URI-based repository identification schemes. Still, maybe worth looking into.

epriestley renamed this task from differential items with no repository attached cannot be hidden from spaces to Users can mistakenly create revisions with no repository through confusion or misconfiguration.May 29 2017, 2:42 PM
epriestley added a project: Arcanist.

I had indeed implemented precisely that herald rule once I realized that these diffs were being created without repository info attached. I also added repository.callsign to /etc/arcconfig to eliminate the problem going forward.

That roughly fixes the issue going forward (for me at least). My main issue now is how to gracefully handle the existing differential items s.t. I can apply a Space policy to them (i.e. is there a graceful way to backfill or to infer the space from the commit that eventually showed up).