
Use "git ls-remote" to guess if "git fetch" is a no-op

Authored by epriestley on Mar 14 2017, 10:25 PM.



Ref T12296. Ref T12392. Currently, when we're observing a remote repository, we periodically run git fetch ....

Instead, periodically run git ls-remote (to list refs in the remote) and git for-each-ref (to list local refs) and only continue if the two lists are different.
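The comparison described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Phabricator implementation (which is PHP). The function names are hypothetical. It assumes the local repository is a bare mirror, so local ref names match remote ref names one-to-one. It also assumes that pseudorefs such as HEAD and peeled tag entries (names ending in "^{}") should be ignored, since git for-each-ref does not list them.

```python
import subprocess


def parse_ref_lines(output):
    """Parse '<hash><whitespace><refname>' lines into a {refname: hash} map."""
    refs = {}
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        commit_hash, name = parts[0], parts[1].strip()
        # Assumption: drop pseudorefs (e.g. HEAD) and peeled tags ("^{}"),
        # which have no counterpart in for-each-ref output.
        if not name.startswith("refs/") or name.endswith("^{}"):
            continue
        refs[name] = commit_hash
    return refs


def list_remote_refs(remote_uri):
    """List refs advertised by the remote using git ls-remote."""
    result = subprocess.run(
        ["git", "ls-remote", remote_uri],
        check=True, capture_output=True, text=True,
    )
    return parse_ref_lines(result.stdout)


def list_local_refs(repo_path):
    """List refs in the local repository using git for-each-ref."""
    result = subprocess.run(
        ["git", "for-each-ref", "--format=%(objectname) %(refname)"],
        cwd=repo_path, check=True, capture_output=True, text=True,
    )
    return parse_ref_lines(result.stdout)


def fetch_would_change_refs(remote_refs, local_refs):
    """Return True if the two ref maps differ, i.e. a fetch is not a no-op."""
    return remote_refs != local_refs
```

A caller would run `fetch_would_change_refs(list_remote_refs(uri), list_local_refs(path))` and only proceed to git fetch when it returns True.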

The motivations for this are:

  • In T12296, it appears that doing this is faster than doing a no-op git fetch. This effect seems to reproduce locally in a clean environment (900ms for ls-remote + 100ms for for-each-ref vs about 1.4s for fetch). I don't have any explanation for why this is, but there it is. This isn't a huge change, although the time we're saving does appear to mostly be local CPU time, which is good for us.
  • Because we control all writes, we could cache git for-each-ref in the future and do fewer disk operations. This doesn't necessarily seem too valuable, though.
  • This allows us to tell if a fetch will do anything or not, and make better decisions around clustering (in particular, simplify how observed repository versioning works). With git fetch, we can't easily distinguish between "fetch, but nothing changed" and "legitimate fetch".

If a repository updates very regularly we end up doing slightly more work this way (that is, if ls-remote always comes back with changes, we do a little extra work), but this is normally very rare.

This might not get non-bare repositories quite right in some cases (i.e., incorrectly detect them as changed when they are unchanged) but we haven't created non-bare repositories for many years.

Test Plan

Ran bin/repository update --trace --verbose PHABX, saw sensible construction of local and remote maps and accurate detection of whether a fetch would do anything or not.

Diff Detail

rP Phabricator
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline


These aren't precisely related, but bin/repository update --verbose and similar were just printing "%s" since log() doesn't actually take a pattern.

This revision is now accepted and ready to land. Mar 14 2017, 10:41 PM
This revision was automatically updated to reflect the committed changes.
joshuaspence added inline comments.

I think you can just pass --refs to git ls-remote to avoid it returning pseudorefs.



This flag appears to work, but seems to be completely undocumented? How did you discover it?


man git ls-remote on Git 2.11.0

$ man git ls-remote | grep -- --refs
No manual entry for ls-remote
epriestley@orbital ~ $ git help ls-remote | grep -- --refs
epriestley@orbital ~ $
epriestley@orbital ~ $ man git-ls-remote | grep -- --refs
epriestley@orbital ~ $