
Use "git ls-remote" to guess if "git fetch" is a no-op
Closed, Public

Authored by epriestley on Mar 14 2017, 10:25 PM.
Referenced Files
F13088257: D17497.diff

Details

Summary

Ref T12296. Ref T12392. Currently, when we're observing a remote repository, we periodically run git fetch ....

Instead, periodically run git ls-remote (to list refs in the remote) and git for-each-ref (to list local refs) and only continue if the two lists are different.
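As a rough illustration of the comparison described above (not the code in this diff, which is PHP in PhabricatorRepositoryPullEngine.php), here is a minimal Python sketch. The function names are hypothetical. It assumes the local repository is a bare mirror whose ref names match the remote's, and it passes --refs to git ls-remote so that HEAD and peeled tags do not show up as spurious differences.

```python
import subprocess


def parse_ref_lines(text):
    """Parse '<hash><whitespace><refname>' lines into a {refname: hash} map."""
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        object_hash, ref_name = line.split(None, 1)
        refs[ref_name] = object_hash
    return refs


def fetch_would_change(remote_refs, local_refs):
    """A fetch is only worth running if the two ref maps differ."""
    return remote_refs != local_refs


def run_git(args, cwd):
    """Run a git subcommand in `cwd` and return its stdout as text."""
    result = subprocess.run(
        ["git"] + args, cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def needs_fetch(repo_path, remote_uri):
    """Hypothetical helper: compare remote refs against local refs."""
    # Assumption: --refs is available in the installed Git version.
    remote = parse_ref_lines(
        run_git(["ls-remote", "--refs", "--", remote_uri], repo_path)
    )
    local = parse_ref_lines(
        run_git(["for-each-ref", "--format=%(objectname) %(refname)"], repo_path)
    )
    return fetch_would_change(remote, local)
```

Keeping the parsing and comparison separate from the git invocations means the "would a fetch do anything" decision can be exercised against canned output without touching a real repository.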

The motivations for this are:

  • In T12296, it appears that doing this is faster than doing a no-op git fetch. This effect seems to reproduce locally in a clean environment (900ms for ls-remote + 100ms for for-each-ref vs about 1.4s for fetch). I don't have any explanation for why this is, but there it is. This isn't a huge change, although the time we're saving does appear to mostly be local CPU time, which is good for us.
  • Because we control all writes, we could cache git for-each-ref in the future and do fewer disk operations. This doesn't necessarily seem too valuable, though.
  • This allows us to tell if a fetch will do anything or not, and make better decisions around clustering (in particular, simplify how observed repository versioning works). With git fetch, we can't easily distinguish between "fetch, but nothing changed" and "legitimate fetch".

If a repository updates very regularly we end up doing slightly more work this way (that is, if ls-remote always comes back with changes, we do a little extra work), but this is normally very rare.
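To put rough numbers on that tradeoff, here is a small Python sketch using only the local timings quoted above (900ms + 100ms for the check, about 1.4s for a no-op fetch). The function names are hypothetical, and the model assumes that a fetch which actually transfers changes costs the same whether or not the check ran first, so that cost cancels out of the comparison.

```python
# Timings from the summary above, in seconds (one local measurement, not a benchmark).
CHECK_S = 0.9 + 0.1   # git ls-remote + git for-each-ref
NOOP_FETCH_S = 1.4    # git fetch when nothing has changed


def extra_cost_per_update(p_changed):
    """Expected extra seconds per update for check-first versus always-fetch.

    When the remote changed, check-first pays CHECK_S on top of the fetch.
    When it did not, check-first pays CHECK_S instead of NOOP_FETCH_S.
    A negative result means check-first is cheaper on average.
    """
    return p_changed * CHECK_S + (1 - p_changed) * (CHECK_S - NOOP_FETCH_S)


def break_even_change_rate():
    """Fraction of updates with real changes at which both strategies cost the same."""
    return 1 - CHECK_S / NOOP_FETCH_S
```

Under these assumptions the check-first approach comes out ahead as long as fewer than roughly 29% of update cycles find real changes, which fits the expectation that most polls of an observed repository are no-ops.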

This might not get non-bare repositories quite right in some cases (i.e., incorrectly detect them as changed when they are unchanged) but we haven't created non-bare repositories for many years.

Test Plan

Ran bin/repository update --trace --verbose PHABX, saw sensible construction of local and remote maps and accurate detection of whether a fetch would do anything or not.

Diff Detail

Repository
rP Phabricator
Branch
ref1
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 15984
Build 21182: Run Core Tests
Build 21181: arc lint + arc unit

Event Timeline

src/applications/repository/engine/PhabricatorRepositoryPullEngine.php
161

These aren't precisely related, but bin/repository update --verbose and similar were just printing "%s" since log() doesn't actually take a pattern.

This revision is now accepted and ready to land. Mar 14 2017, 10:41 PM
This revision was automatically updated to reflect the committed changes.
joshuaspence added inline comments.
src/applications/repository/engine/PhabricatorRepositoryPullEngine.php
425

I think you can just pass --refs to git ls-remote to avoid it returning pseudorefs.
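For context, a minimal Python sketch of roughly what filtering the output by hand would look like without the flag. This is an illustration with a hypothetical function name, not the code under review; it assumes the usual ls-remote output of one "<hash><TAB><refname>" pair per line, and treats anything outside refs/ (such as HEAD) as a pseudoref and any name ending in ^{} as a peeled tag.

```python
def drop_pseudorefs(ls_remote_output):
    """Return (refname, hash) pairs, skipping pseudorefs and peeled tags."""
    kept = []
    for line in ls_remote_output.splitlines():
        if not line.strip():
            continue
        object_hash, ref_name = line.split(None, 1)
        if not ref_name.startswith("refs/"):
            continue  # pseudoref such as HEAD
        if ref_name.endswith("^{}"):
            continue  # peeled entry for an annotated tag
        kept.append((ref_name, object_hash))
    return kept
```

Letting git do this with --refs avoids maintaining the filter locally, provided the installed Git version supports the flag.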

iiam

src/applications/repository/engine/PhabricatorRepositoryPullEngine.php
425

This flag appears to work, but be completely undocumented? How did you discover it?

src/applications/repository/engine/PhabricatorRepositoryPullEngine.php
425

man git ls-remote on Git 2.11.0

$ man git ls-remote | grep -- --refs
No manual entry for ls-remote
epriestley@orbital ~ $ git help ls-remote | grep -- --refs
epriestley@orbital ~ $
epriestley@orbital ~ $ man git-ls-remote | grep -- --refs
epriestley@orbital ~ $

iiam