Phabricator handles commits which are published and later deleted poorly
Closed, ResolvedPublic
Actions

Assigned To

Authored By

	jhurwitz
	Jul 30 2015, 11:51 PM

Description

We believe that the following happened (though we haven't tried to repro so we're not 100% sure):

A commit was pushed to a remote branch
This requested an audit
Some time later, someone went through and deleted many remote branches as part of a clean-up effort
The auditor was working through his queue and saw an old (incomplete) audit corresponding to a commit on one of the deleted branches
Clicking on the audit shows:

Command failed with error #128!
COMMAND
git log -n 1 --format='%P' '08d5e6c2a164f10679fdf412dc187c50f2acb289'

STDOUT
(empty)

STDERR
fatal: bad object 08d5e6c2a164f10679fdf412dc187c50f2acb289

The problem we encountered is two-fold:

It wasn't obvious (since the clean-up happened a while ago) that the reason for this error was that the commit was on a deleted branch. We wasted quite some time figuring out what was going on.
There is no way to delete, close, or resign from this audit via the Phabricator UI, so it will forever be in the auditor's list of "open audits."

Potential solution ideas:

Improve the error message and provide a button to close the audit
Have a daemon that processes deleted commits/branches and deletes associated (incomplete?) audits
Give us a script we can use that deletes a branch and cleans up its audits, or that given the name of a just-deleted branch cleans up its audits, or similar

Revisions and Commits

rP Phabricator
	D16136	rP28eb5628999b Ignore unrecognized refs in "refs/remotes/"
	D16135	rP8032a142232d Mark unreachable commits handles as "closed"
	D16134	rP7c8f9d7ba26c Don't track "phabricator/" staging area tags
	D16133	rP1c63ac6a3adf When a ref is moved or deleted, put it on a list; later, check for reachability
	D16132	rP02d7bb8604f3 Add "bin/repository mark-reachable" for fixing commit reachability flags
	D16131	rP77ee518d88af Make daemons ignore "Unreachable" commits and avoid duplicate work
	D16130	rPec89c7d63e26 Add an "Unreachable" flag for commits and revive them during discovery
	D16129	rP2949905c045b Fetch and discover all Git ref types, not just branches
	D14227	rP0db86cce7df7 Improve Diffusion behavior for no-longer-existing commits

Related Objects
Search...

		Status	Assigned	Task
		Resolved	epriestley	T9028 Phabricator handles commits which are published and later deleted poorly
		Resolved	None	T6878 Tagged commits which are not ancestors of any branch head don't get imported

Event Timeline

jhurwitz created this task.Jul 30 2015, 11:51 PM

jhurwitz raised the priority of this task from to Needs Triage.

jhurwitz updated the task description. (Show Details)

jhurwitz added projects: Restricted Project, Audit.

jhurwitz added subscribers: jhurwitz, angie, aj, • rameshdharan.

jhurwitz updated the task description. (Show Details)Jul 30 2015, 11:57 PM

jhurwitz updated the task description. (Show Details)Jul 30 2015, 11:59 PM

jhurwitz updated the task description. (Show Details)

Improve the error message and provide a button to close the audit

This seems reasonable.

Have a daemon that processes deleted commits/branches and deletes associated (incomplete?) audits

I think we probably don't want to do this. If someone accidentally deletes master and no one realizes for a little while, it would be bad to potentially destroy years worth of history, state and discussion contained in audits. Even just auto-closing them (without actually removing any comments/discussion) seems a bit bad.

Give us a script we can use that deletes a branch and cleans up its audits, or that given the name of a just-deleted branch cleans up its audits, or similar

You can use bin/audit delete to do this, although you'll have to massage things a bit. A working approach might be:

Use bin/audit delete --dry-run --repositories X to get a list of open audits for affected repositories.
Separately, remove all the audits from that list which are for commits which still exist in the remote, using some small script you write.
You can then use bin/audit delete --ids ... to close the remaining obsolete audits.

In T9028#129241, @epriestley wrote:

Have a daemon that processes deleted commits/branches and deletes associated (incomplete?) audits

I think we probably don't want to do this. If someone accidentally deletes master and no one realizes for a little while, it would be bad to potentially destroy years worth of history, state and discussion contained in audits. Even just auto-closing them (without actually removing any comments/discussion) seems a bit bad.

What about somehow ghosting them so that they don't show up in the UI, without actually deleting the data?

jhurwitz edited subscribers, added: ramesh_dharan; removed: • rameshdharan.Aug 4 2015, 1:14 AM

You can use bin/audit delete to do this, although you'll have to massage things a bit.

This won't really address the issue for us, since with our setup running bin/audit requires superadmin access, i.e. even as the admin of a particular project I don't have access to ssh to the machine actually hosting our Phabricator installation to run bin/audit.

Improve the error message and provide a button to close the audit

This seems reasonable.

Just wanted to chime in and say that I'd be perfectly happy with this fix for now.

Any updates on this? I've still got these annoying audits in my queue and they continue to frustrate me.

I think my preference would be the "Improve the error message and provide a button to close the audit" option. @epriestley, how long would this take? Assuming it's small, I think we'd want to prioritize this.

This is probably about an hour of work, although it may be a bit involved to distinguish between "the commit no longer exists in the repository" and other errors. If you want to prioritize this I can give you an hour-ish fix or a better estimate if it looks more complicated once I get into the code.

jhurwitz moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Oct 1 2015, 9:06 PM

epriestley added a revision: D14227: Improve Diffusion behavior for no-longer-existing commits.Oct 2 2015, 11:04 PM

D14227 should fix our mishandling of these commits:

Screen Shot 2015-10-02 at 4.01.27 PM.png (1×1 px, 219 KB)

As for a button to get rid of any audits, if you've been asked to audit you can already accept the commit -- if a concern has been raised, I think this is just T2393. Or is there another state I'm missing?

epriestley added a commit: rP0db86cce7df7: Improve Diffusion behavior for no-longer-existing commits.Oct 2 2015, 11:11 PM

angie moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Oct 2 2015, 11:22 PM

(I might not include this when I cut stable for this since the controller is a little bit of a mess, it's possible I missed something, and I don't expect it to be vetted much between 5PM Friday and 5AM Saturday.)

epriestley mentioned this in T9646: Duplicates commits after doing git reset.Oct 27 2015, 7:29 PM

chad moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Feb 2 2016, 4:46 AM

chad moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Feb 2 2016, 4:50 AM

epriestley added a project: Prioritized.May 25 2016, 9:19 PM

Herald added a subscriber: eadler. · View Herald TranscriptMay 25 2016, 9:19 PM

epriestley moved this task from Backlog to Preflight on the Prioritized board.May 25 2016, 9:20 PM

A sub-problem here is that if daemons discover commits but those commits are deleted before they can be processed, the tasks fail forever. This is normally unusual (git won't GC the commits for a few weeks, which often gives the daemons enough time to process them) but cases like fiddling with tracking, re-cloning a repository, everyone going on vacation for a couple weeks, or clustering an observed repository can exacerbate them.

I believe we can flag these commits as deleted, and revive them from the dead if they are later rediscovered fairly easily, to at least clear the daemon issues.

aj removed a subscriber: aj.May 25 2016, 9:27 PM

scode added a project: Restricted Project.May 25 2016, 9:36 PM

scode added a subscriber: scode.

epriestley renamed this task from Deleting a branch does not delete associated audits to Phabricator handles commits which are published and later deleted poorly.May 25 2016, 9:38 PM

eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jun 1 2016, 10:37 PM

To revive deleted commits when they are re-pushed, I think it is sufficient to ignore deleted commits when filling the discovery cache (PhabricatorRepositoryDiscoveryEngine->fillCommitCache()) and allow recordCommit() to detect existing commits and revive them.

This should work even if you delete a branch in the remote, add new commits to it locally, then push the branch back to the remote, so revived commits arrive underneath new commits.

I think this is straightforward, and using a free bit in the importStatus bitfield should be OK for storing the "deleted" flag for this purpose.

Marking commits as deleted is more complicated. Some strategies I can come up with:

Mark in Daemons: We can mark commits as deleted in the daemons. When a daemon tries to work on a commit and finds it does not exist, it marks the commit deleted and stops working on the task.

I think this approach has a lot of downsides:

Daemons perform a lot of different repository operations which might fail because of a deleted commit, and not all of them can easily or unambiguously detect "commit does not exist" vs other types of failure, so it would be somewhat hard to detect when we need to run the "mark deleted" code.
We could make the daemons all run an explicit "does this commit exist?" check up front to fix this, but this is relatively expensive.
This causes potential race conditions and contention between the daemons (marking commits deleted) and the discovery process (marking commits revived).
This can leave us with sparse deleted commit information: deleted commits will potentially be marked in random order, and newer deleted commits may be marked deleted before older ones. If we use the approach to revival described above, revival will stop when it encounters the first commit which is not deleted, but there may be more commits beyond it which still need to be revived.

Mark in Discovery: We can mark commits as deleted in discovery:

The refs step can record a list of previously known branch/tag/ref heads. When a ref is updated or deleted, the old value of the ref is added to the list.
The discovery step can consume this list, and identify all ancestors of the head which are known and no longer reachable.

We record commit parent relationships and can determine "known" fairly easily, but testing "no longer reachable" is tricky. A naive approach would be to run git branch --contains or similar on each commit, but this is likely to be painfully slow in large repositories (see T9898) and isn't correct anyway since ref or tag reachability should be sufficient.

I think the best way to do this is probably this mess:

Use PhabricatorGitGraphStream to pull a stream of every reachable commit.
Follow the database edges backward and test them against the stream.

This should perform well until a repository has too many commits to fit the hashes into memory.

We can use a similar strategy to implement bin/repository mark-everything-that-has-been-deleted-as-deleted.

One other thought is that there's a sort of race condition here where a commit is deleted and revived before it is processed. This could leave two sets of daemons in queue and possibly, e.g., evaluate Herald rules twice. We can mitigate this at least partially by making "natural" tasks more proactive about checking if they've already been completed, which is easy.

We should also probably resolve T6878 here.

This marginally increases our overhead in the common case, where a valid branch is amended with a small number of valid commits. I think there isn't much we can do here. This will boil down to approximately one extra git log --all, which is modest. Possibly, refs can be smart enough to avoid recording a ref on the old refs list when it's a straight amend. This largely just changes the cost to a similar git merge-base, although it's probably a little better in terms of complexity.

Overall, I plan to:

Resolve T6878 (fetch and discover all refs, not just branches -- this will become a bigger issue after we start marking things deleted).
Add a DELETED flag.
Provide UI messaging for the DELETED flag ("This commit has been deleted...").
Ignore deleted commits when filling the commit cache; revive deleted commits when re-recording them.
Add a mechanism for using GitGraphStream to efficiently test for the bulk existence of commits.
Add a bin/repository ... script for synchronizing commit deletion state in an existing repository.
Make "natural" daemons more proactive about aborting processing of already-processed commits.
Make all daemons abort processing of deleted commits.
Add an "old refs" list to refs which is populated with removed and rewritten refs.
Make discover consume the old refs and test their ancestors for deletion.

epriestley moved this task from Preflight to The Queue on the Prioritized board.Jun 16 2016, 11:44 AM

epriestley added a subtask: T6878: Tagged commits which are not ancestors of any branch head don't get imported.

epriestley added a revision: D16129: Fetch and discover all Git ref types, not just branches.Jun 16 2016, 12:59 PM

epriestley added a revision: D16130: Add an "Unreachable" flag for commits and revive them during discovery.Jun 16 2016, 2:30 PM

epriestley added a revision: D16131: Make daemons ignore "Unreachable" commits and avoid duplicate work.Jun 16 2016, 3:04 PM

epriestley added a revision: D16132: Add "bin/repository mark-reachable" for fixing commit reachability flags.Jun 16 2016, 4:04 PM

epriestley added a revision: D16133: When a ref is moved or deleted, put it on a list; later, check for reachability.Jun 16 2016, 5:48 PM

epriestley added a revision: D16134: Don't track "phabricator/" staging area tags.Jun 16 2016, 5:57 PM

epriestley mentioned this in T5715: Diffusion commit count is incorrect after forced push.Jun 16 2016, 6:01 PM

epriestley merged a task: T5715: Diffusion commit count is incorrect after forced push.

epriestley added subscribers: • JThramer, wotte, adityar7, asherkin.

epriestley mentioned this in T10742: PhabricatorRepositoryCommitParserWorker fails on deleted commits.Jun 16 2016, 6:08 PM

epriestley merged a task: T10742: PhabricatorRepositoryCommitParserWorker fails on deleted commits.

epriestley added subscribers: benjumanji, pwithnall, fooishbar.

I'm vaguely suspicious that I may have missed some weird stuff here since this pipeline is complicated, but I'm going to land this stuff now and just plan to hold the release a bit if it feels shaky tomorrow.

epriestley added a commit: rP2949905c045b: Fetch and discover all Git ref types, not just branches.Jun 16 2016, 6:20 PM

epriestley closed subtask T6878: Tagged commits which are not ancestors of any branch head don't get imported as Resolved.

epriestley added a commit: rPec89c7d63e26: Add an "Unreachable" flag for commits and revive them during discovery.

epriestley added a commit: rP77ee518d88af: Make daemons ignore "Unreachable" commits and avoid duplicate work.

epriestley added a commit: rP02d7bb8604f3: Add "bin/repository mark-reachable" for fixing commit reachability flags.

epriestley added a commit: rP1c63ac6a3adf: When a ref is moved or deleted, put it on a list; later, check for reachability.

epriestley added a commit: rP7c8f9d7ba26c: Don't track "phabricator/" staging area tags.Jun 16 2016, 6:22 PM

epriestley mentioned this in T6878: Tagged commits which are not ancestors of any branch head don't get imported.Jun 16 2016, 6:22 PM

I've pushed this here, and enabled dangerous changes on rGITTEST if anyone wants to try anything.

I pushed rGITTEST450fc0d1b09b to doomed-branch, then deleted the branch, and things seem to have worked correctly: the commit is marked unreachable/deleted, and the repository summary reverted to 87 commits with the older commit as the marquee commit.

I also pushed rGITTEST38447a6c to the tag watermelon only (it is unreachable from any branch head) and it was properly discovered.

This is a relatively high-complexity change, but these operations don't seem to have left us with any spinning daemons or other horrible side effects.

Because this is a relatively complex change I'd recommend waiting a little bit before anyone puts this into production.

When you do, you can use bin/repository mark-reachable <repository> to fix the reachability flags on an existing repository. If you aren't running into problems currently you may not need to bother, but this will let any daemons that are stuck trying to operate on unreachable commits fail gracefully.

Going forward, Phabricator should automatically detect when commits become unreachable (or are revived by being pushed as ancestors of some new head).

Additionally, Phabricator now tracks all tags (except those starting with phabricator/) and all freeform refs. We may refine these rules in the future.

(One other small change I'll probably make is marking unreachable commit handles as "closed", so they get a strikethru style.)

Beyond that, I don't have any further worked planned here unless issues crop up.

Goal	Ref	Time	Notes
Discover Everything	D16129, D16134	1 Hour	Discover commits only reachable from tags and refs.
Unreachable Behaviors	D16130, D16131	1 Hour	Make UI and daemons behave properly in the presence of unreachable commits.
Detect Unreachability	D16132, D16133	1 Hour	Detect when commits become unreachable.
Subtotal		3 Hours
Cumulative Total		3 Hours

epriestley added a revision: D16135: Mark unreachable commits handles as "closed".Jun 16 2016, 6:52 PM

If the mark-reachable command is used, is it subject to races in the sense that in-flight legit newly importing commits may be marked unreachable, or does it just trigger the process of going back and re-evaluating true reachability of all commits?

It basically runs git log --all, then checks every commit in the database against that list and marks the missing ones as unreachable.

It doesn't examine or interact with import status or the daemon pipeline (at least, directly).

Importing commits may be marked unreachable if all heads have already been deleted. That is, if a branch was pushed and then deleted, tasks trying to import those commits may be left in the queue. mark-reachable will identify those commits as unreachable and the tasks will fail permanently the next time they attempt to execute.

Not sure if that answers your question or not, let me know if I can clarify.

epriestley added a commit: rP8032a142232d: Mark unreachable commits handles as "closed".Jun 16 2016, 8:01 PM

Yeah. I just wanted to make sure whether mark-reachable was safe to run without fear of accidentally marking something unreachable that isn't (i.e., if the command were to imply "mark everything in the import queue as unreachable no matter what"). Sorry it was probably a bit of a lazy question since a more careful re-reading of the rest of the ticket would probably have answered it for me.

Thanks!

Updated our Phabricator install just now. One issue I discovered is that we are having a bunch of audits triggered for commits that are newly discovered. We have Diffusion configured to track only the master branch, but now Diffusion is importing commits from all branches, it seems.

We have Diffusion configured to track only the master branch, but now Diffusion is importing commits from all branches, it seems.

The expectation is that these commits will be imported:

all ancestors of tracked branches;
all ancestors of tags not beginning phabricator/; and
all ancestors of other refs.

You can use bin/repository discover --verbose <repository> to see exactly what we're tracking. For example:

epriestley@orbital ~/dev/phabricator $ ./bin/repository discover --verbose R55
Discovering "R55"...
Discovering commits in repository "R55".
Examining ref "master", at "a940a9a297039d7db0fa7dae6312c3a7c7e4abc2".
Skipping, HEAD is known.
Examining ref "circle-test-1", at "4ece35174bcfd1454440a3ddba03ecbf78a5e22d".
Skipping, HEAD is known.
Examining ref "branch-and-tag", at "9537e472fefbac22273bcd4d45c5b8bbbd0b2baa".
Skipping, HEAD is known.
Examining ref "lfs-self-track", at "40378a2807bd80876fb9c13b5546159ba8e264ee".
Skipping, HEAD is known.
Examining ref "featurex", at "fd88d4c54f86bdf628ec1b88f6ef346542eb4e64".
Skipping, HEAD is known.
Examining ref "notauto", at "e25dc9c7c96031623e6723005aff3e22b1dfab11".
Skipping, HEAD is known.
Examining ref "blerps", at "118c70cf3f7b0e157c5c987550d3cd2beb2e0671".
Skipping, HEAD is known.
Examining ref "feature2", at "b5ee37cce43970dcc34cfe9dc805fe01e501e8cc".
Skipping, HEAD is known.
Examining ref "feature", at "6cf5f6d0c8c06c4c73b8783666d9b3ecce138244".
Skipping, HEAD is known.
Examining ref "quack2", at "5a9c51e86615f6e1097b2a4a73ef0fe75981c1dd".
Skipping, HEAD is known.
Examining ref "quackquack", at "b74896aa087cd643118604099bb2c10958722aaf".
Skipping, HEAD is known.
Examining ref "blarp", at "ad17c19ae1d71538bdb3d24b5485fd2290d10cd0".
Skipping, HEAD is known.
Examining ref "phabricator/diff/403", at "4ece35174bcfd1454440a3ddba03ecbf78a5e22d".
Skipping, ref is untracked.
Examining ref "phabricator/diff/404", at "4ece35174bcfd1454440a3ddba03ecbf78a5e22d".
Skipping, ref is untracked.
Examining ref "phabricator/diff/402", at "1ef0e21e517a29d5b09ec9ecd89cfda19e95d5a7".
Skipping, ref is untracked.
Examining ref "phabricator/diff/401", at "02a0f69c1ffe3dfa13fe678239f0a8a42d8a0ccc".
Skipping, ref is untracked.
Examining ref "phabricator/diff/400", at "f97829f2df15202189eebf380e25f48d78a2c458".
Skipping, ref is untracked.
Examining ref "refs/pull/2/merge", at "9c56735e3bd99d2554486c7d285eebc381f6bbf1".
Skipping, HEAD is known.
Examining ref "phabricator/base/401", at "4dcd4c492b31d82f0fec6f447cf3719b9cc079e0".
Skipping, ref is untracked.
Examining ref "phabricator/base/403", at "4dcd4c492b31d82f0fec6f447cf3719b9cc079e0".
Skipping, ref is untracked.
Examining ref "phabricator/base/402", at "4dcd4c492b31d82f0fec6f447cf3719b9cc079e0".
Skipping, ref is untracked.
Examining ref "phabricator/base/400", at "4dcd4c492b31d82f0fec6f447cf3719b9cc079e0".
Skipping, ref is untracked.
Examining ref "phabricator/base/404", at "4dcd4c492b31d82f0fec6f447cf3719b9cc079e0".
Skipping, ref is untracked.
Examining ref "phabricator/diff/54321", at "8c133a4e08fc600753042805ba5662aaed3b27f1".
Skipping, ref is untracked.
Examining ref "phabricator/diff/262", at "8c133a4e08fc600753042805ba5662aaed3b27f1".
Skipping, ref is untracked.
Examining ref "phabricator/diffs/54321", at "8c133a4e08fc600753042805ba5662aaed3b27f1".
Skipping, ref is untracked.
Examining ref "tag1", at "458f403ae28fe6eabd380912e1a000a7cdcbcf4c".
Skipping, HEAD is known.
Examining ref "phabricator/diff/261", at "d4b2710c051061eabbb40adb5c73d3bf08df2e12".
Skipping, ref is untracked.
Examining ref "phabricator/diff/260", at "94bc447a5492746a9811648997ba04e63a6f4fdd".
Skipping, ref is untracked.
Examining ref "phabricator/diff/241", at "121862a764ed6bb36a031f5cf027256192d5c8ea".
Skipping, ref is untracked.
Examining ref "phabricator/diff/240", at "9ec71ab2c21aa88bc775a6ffefc5d051ce68b534".
Skipping, ref is untracked.
Examining ref "phabricator/diff/238", at "41ba4f16074a81b58e328afbdcd462d5bd8bdf4f".
Skipping, ref is untracked.
Examining ref "refs/secret/spaghetti", at "4238a0de5ef9afc262350ca87326c94afe7d2674".
Skipping, HEAD is known.
Examining ref "derp3", at "2140215815bc3d68e67f240fe43e182852a1fcc4".
Skipping, HEAD is known.
Examining ref "refs/pull/3/head", at "118c70cf3f7b0e157c5c987550d3cd2beb2e0671".
Skipping, HEAD is known.
Examining ref "refs/pull/2/head", at "6cf5f6d0c8c06c4c73b8783666d9b3ecce138244".
Skipping, HEAD is known.
Examining ref "quack", at "c551ba8a245695daf5e9a4d139beda575b801f76".
Skipping, HEAD is known.
Examining ref "qwer", at "683d658264a5f10466671c2eeda16edb652e36a7".
Skipping, HEAD is known.
Examining ref "asdf", at "683d658264a5f10466671c2eeda16edb652e36a7".
Skipping, HEAD is known.
Done.

OK, running ./bin/repository discover --verbose rX gives me a bunch of output including this:

Examining ref "refs/remotes/origin/XYZ", at "441ca44686e5afb9054f3e25b19b70e3aefd21f7".
Skipping, HEAD is known.

Running git branch --all --contains 7a048137e59cad0f1ea1feb4c6daaa45689e91da in the working directory for this repository on our repository host (7a048137e59cad0f1ea1feb4c6daaa45689e91da is one of the commits that triggered an audit`) produces the following output:

git branch --all --contains 7a048137e59cad0f1ea1feb4c6daaa45689e91da
  remotes/origin/XYZ

Interestingly though, this branch no longer exists in the remote repository. Running git branch --all from my local machine and I don't see this branch.

Hrrm -- is the repository a non-bare working copy (e.g., has files and folders in it instead of just config/, hooks/, info/, etc)?

Is there a lack of a git remote prune origin or similar?

Also: FWIW, examining every ref like that is a complete no-go for our situation (tens of thousands of refs, huge repository, hundreds of thousands of commits in the average ref). It would work as a one-off that just runs for days, but not for any kind of recurring/incremental process.

We currently have a small patch to force the fetching to only consider the refs we care about though, so hopefully we shouldn't have an actual problem with this. But only because we're filtering it out.

(We really want Phab to only ever consider the master branch no matter what, but we're limited to the autoclose functionality right now and that doesn't affect importing or fetch scope.)

In T9028#180926, @epriestley wrote:

Hrrm -- is the repository a non-bare working copy (e.g., has files and folders in it instead of just config/, hooks/, info/, etc)?

Nope, it is definitely bare.

We fetch bare working copies with git fetch origin +refs/*:refs/* --prune.

We fetch non-bare working copies with git fetch all --prune.

I would expect --prune to remove the branch in both cases, but I may be misunderstanding the behavior of --prune.

I wonder if the actual authoritative repository has a refs/remotes/origin/XYZ ref which we're mirroring?

We can ignore refs/remotes/ refs for bare repositories to "fix" this, but I'd like to understand what's going on. I'm not getting any refs/remotes/... refs in, e.g., local test repositories I'm observing from GitHub.

I think checking if git ls-remote origin in the working copy lists a refs/remotes/origin/XYZ ref in the authoritative remote is the easiest way to check, although I'm not 100% sure that refs/remotes/ isn't magic with respect to ls-remote.

Ah interesting. From our GitLab host, I see that git branch --all shows a whole bunch of local branches and only a single branch of the form remotes/origin/url_shortener.

In T9028#180932, @joshuaspence wrote:

Ah interesting. From our GitLab host, I see that git branch --all shows a whole bunch of local branches and only a single branch of the form remotes/origin/url_shortener.

Wierd... I can't find any reference to this branch in refs/heads or refs/remotes.

okay so basically ghosts?

iiam

epriestley added a revision: D16136: Ignore unrecognized refs in "refs/remotes/".Jun 16 2016, 11:01 PM

I think D16136 will "fix" the problem and I can't come up with any problems it creates, although it would be nice to figure out where the mystery ref is coming from.

OK, so I just manually removed the branch from info/refs and packed-refs and I think that maybe that worked?

epriestley added a commit: rP28eb5628999b: Ignore unrecognized refs in "refs/remotes/".Jun 16 2016, 11:03 PM

FWIW, examining every ref like that is a complete no-go for our situation

In your case, I expect that keeping your "master only" patch will be fine, and running mark-reachable will clean up any clutter left over from earlier adventures.

I generally expect this approach to hold up in tens-of-thousands of refs / millions-of-commits situations since I wasn't able to reproduce any degenerate behavior with a 10K+-ref repository in T10368, but keeping the master restriction should also be fine since this doesn't seem to hold in your environment.

(The largest upstream repository on this host, rSTAGING, has 4700 refs and costs about 500ms to repository discover vs about 200ms for rP; I'd expect this cost to be linear.)

This appears to be stable, and is now accounted for.

epriestley mentioned this in T11259: UNRECOVERABLE FATAL ERROR after upgrading to 2016 Week 27.Jul 2 2016, 4:35 PM

epriestley mentioned this in T11522: Improve support for guiding users through repository rewrites.Aug 24 2016, 1:09 AM

	F859275: Screen Shot 2015-10-02 at 4.01.27 PM.png
	Oct 2 2015, 11:06 PM

Phabricator handles commits which are published and later deleted poorlyClosed, ResolvedPublicActions

Description

Revisions and Commits

Related ObjectsSearch...

Event Timeline

Phabricator handles commits which are published and later deleted poorly
Closed, ResolvedPublic
Actions

Related Objects
Search...