Page MenuHomePhabricator

Tagged commits which are not ancestors of any branch head don't get imported
Closed, ResolvedPublic

Description

Tag a commit that's not already on a branch and run git push --tags. It doesn't import, and daemons don't show any queued tasks.

tags-dont-import.png (773×1 px, 102 KB)

In that screenshot, the "Importing..." commits are only available via the tag, and the commit at the bottom is on master.

Related Objects

Event Timeline

staticshock raised the priority of this task from to Needs Triage.
staticshock updated the task description. (Show Details)
staticshock added a subscriber: staticshock.

My git foo is somewhat weak. Can you provide a sequence of git, text editing, and arc commands to reproduce this?

Yeah, sure. I'm going to assume you're at the root of an existing git repo with at least two commits.

git checkout HEAD~
echo garbage > new-file
git add new-file
git commit -m "committing garbage"
git tag garbage
git push --tags

Then go /diffusion/CALLSIGN/history/;garbage

epriestley renamed this task from Tagged commits don't get imported to Tagged commits which are not ancestors of any branch head don't get imported.Oct 17 2015, 12:30 AM
epriestley added subscribers: chad, lfarkas.

for me it seem the imported git repository is a fully functional repo and the original git repo is fully imported only the diffusion's database changeset import partition.

Probaly because the paths are hardcoded please see https://github.com/phacility/phabricator/blob/master/src/applications/diffusion/query/lowlevel/DiffusionLowLevelGitRefQuery.php#L43

Possibly we should also hardcode refs/changes/ and refs/meta/changes paths.

eadler added a project: Restricted Project.Jan 20 2016, 7:27 PM
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.

Specifically, the issue here is that we fetch and import branch heads, but not other refs, including tags. We also aren't totally consistent about which refs we import across hosted/imported repositories.

We should fetch and import all refs (tags, branches, and custom refs used by software like Gerrit and, in some past and future cases, Phabricator). This is likely straightforward. First, we need to fetch more aggressively when importing:

$future = $repository->getRemoteCommandFuture(
  'fetch origin %s --prune',
  '+refs/heads/*:refs/heads/*');

That should just fetch refs/*, not refs/heads/*.

Second, the logic in DiffusionLowLevelGitRefQuery needs to be made more general and callers need to be updated. The API is a little nonsense and inconsistent right now. Callsites need to be examined and updated:

  • withIsOriginBranch() should probably be removed, and the query should always examine the origin remote for non-bare working copies and local refs for modern bare working copies.
  • withIsTag() should probably become withRefPrefix('refs/tags/') or similar.
  • Callers interested in only branches should use withRefPrefix('refs/heads/').
  • We could implement withIsTag() and withIsBranch() in terms of these operations.
  • PhabricatorRepositoryRefCursor probably needs a TYPE_REF type for non-tag, non-branch refs, but maybe that doesn't interact here.

Wikimedia is definitely interested in this one.

@epriestley: What would happen with other types of refs that are neither tags nor branches? Specifically, gerrit's refs/changes/*. I don't know if it's a good idea to show them in the diffusion ui but they should at least get imported into the repo so that git clients can fetch those refs.

If phabricator supported custom refs then we could set the staging repo url to point to the origin url and arcanist could push differential changesets to refs/phabricator/diff/$id instead of needing a separate repository for change handoff. (see T8238: Formally support side-band change handoff in external repositories)

I probably wouldn't show them in the UI for now, although I don't think adding an "Other Refs" table underneath tags/branches on the main screen with, say, 5 entries and a "See All" button would necessarily be terrible if anyone can come up with a reason that users might ever care about it.

This UI is probably a bit useful for debugging/verifying that things are working, but I suspect it might not ever really be useful to normal users so burying it somewhere down a dark alley might be better if no one can think up a reasonable use case.

(I think "Edit Repository" is going to become more accessible so users can see things like the edit/push policy even if they can't edit the repository, and it might make sense to add it there instead of the home screen depending on how that turns out.)

We could push to secret refs instead of tags fairly easily, but they're harder for users to interact with and understand, and not always compatible with other tools. I think proxying the protocol is probably more promising as a way forward, but it's possible that we end up with refs too, or as an option, or some mix of things. I'm definitely open to exploring and expanding this. I resisted adding a "push to branches" option in T9456, but only because the blocker/motivation there was a slow-moving third party and I didn't want to completely bend over backward if they weren't willing to budge at all.

@20after4 and @epriestley may we can do

at https://github.com/phacility/phabricator/blob/973b8ace8616f96d77cede4ec620248be262e3ff/src/applications/repository/engine/PhabricatorRepositoryPullEngine.php#L315

this

$ref-types = PhabricatorEnv::getEnvConfig('files.ref-types.');

$future = $repository->getRemoteCommandFuture(

'fetch origin %s --prune',
$ref-types

);

I doint know if we can make it an array.

But I also doint know if it will work.

It will also look like in yaml.

files.ref-types:

'+refs/heads/*:refs/heads/*'
'+refs/changes/*:refs/changes/*'

or

files.ref-types:

'+refs/*:refs/*'

which will get all the refs but will not expose them in diffusion only allowing diffusion to view the refs if the user manually adds to the url the commit.

I also doint know if we need to also update https://github.com/phacility/phabricator/blob/973b8ace8616f96d77cede4ec620248be262e3ff/src/applications/repository/engine/PhabricatorRepositoryPullEngine.php#L319

"Autoclose" and "Track Only" configurations are limited to branches, and do not account for arbitrary tags/refs. For now, I think we can assume tags are always autoclose/track, but may need to refine this in the future. I'd ideally like to replace autoclose with "fork" (T10691) rather than introducing tag(regexp(...)) or whatever, but that may be a long road.

For now, I think we can assume tags are always autoclose/track, but may need to refine this in the future.

We host our staging repo in phabricator, won't this just autoclose all revisions as soon the they show up? And expose all of its commits in the search, etc, which is also undesirable?

Yeah, the actual rule D16129 implements is this:

  • Tags and refs are always tracked.
  • Tags and refs never autoclose.

Because refs are now tracked, it's possible that there will be some ref leakage into things like search and monograms. See also T10667. I think this stuff will be minor, but you could let it sit on this server for a bit and see how bad it is before upgrading.

A quick mitigation might be to just blacklist refs/phabricator from tracking for now.

D16134 just hard-codes phabricator/ tags as untracked, which should minimize collateral damage.

black-listing phabricator/ sounds good to me :)

I think this is resolved, but see T9028 for related followups.

This can cause a storm of commit mail when upgrading. If there are old tags which weren't imported at the time, when Phabricator is upgraded to a version containing D16129 it can fire off thousands of commit emails, flooding inboxes and making some folks grumpy at their Phabricator administrators.

Not that I speak from experience.

Fixing this requires some sort of flag on first-import after an upgrade past this commit.

This has happened about 2 months ago, so a lot of the damage has already happened. We had a listing somewhere about what to do, but I can't find it now, so I'll re-write it here for posterity:

If you're upgrading across rP2949905c045b603a95be6b43668fb2bb463efe8d (June 16th), and might have previously un-discovered refs/commits:

  • Before starting the daemons after the upgrade, run ./bin/repository mark-imported --mark-not-imported <repo> for all your repositories.

This will prevent the daemons from sending out email about newly discovered commits.
Once the import is complete (No more tasks queued in the Daemons console), the repositories should automatically be marked as "imported", but if not, use ./bin/repository mark-imported again to enable emails and herald.