
Periodically run `git prune` on Git working copies
Open, Normal, Public

Description

See PHI497. An instance ran into an issue where the remote was complaining about too many unreachable objects:

remote: warning: There are too many unreachable loose objects; run 'git prune' to remove them.

We don't currently run git prune or git gc automatically, since skipping them has never caused issues before, but we should probably start running git gc on working copies every so often.
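For reference, a rough sketch of the kind of periodic pass this implies (the repository path is illustrative; `git gc --auto` only does work once Git's own thresholds are exceeded, so it is much cheaper than an unconditional `git gc`):

```
# Sketch: periodic maintenance on one working copy. The path is
# illustrative. "gc --auto" is a no-op unless the gc.auto loose-object
# threshold or the gc.autoPackLimit pack threshold has been exceeded.
git -C /var/repo/REPOSITORY gc --auto --quiet
```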

One possible issue with this is that git gc can take a very long time to complete on large working copies. Previously, in PHI386, an unusually large repository for the same instance took about 5 hours to git gc. However, it took less than a minute to git prune, so git prune may be safer to run regularly.
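To illustrate the cheaper option, a sketch of a prune-only pass (the path is again illustrative; the two-week expiry mirrors Git's default `gc.pruneExpire`, so objects a concurrent push may still be wiring up are left alone):

```
# Sketch: remove only unreachable loose objects, which is exactly what the
# remote warning above asks for. "count-objects -v" reports the loose
# object count before and after; "--expire=2.weeks.ago" matches Git's
# default prune grace period so in-flight pushes are not disturbed.
git -C /var/repo/REPOSITORY count-objects -v
git -C /var/repo/REPOSITORY prune --expire=2.weeks.ago
git -C /var/repo/REPOSITORY count-objects -v
```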

A possible workaround is to run these operations only in the cluster, as part of bin/remote optimize or some similar workflow.
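As a sketch of that approach (the host argument and schedule are hypothetical; `bin/remote optimize` is just the proposed entry point from above):

```
# Sketch (hypothetical scheduling): run the maintenance pass from cluster
# operations tooling on a weekly cron, per host, rather than from every
# install. The host name and schedule are illustrative only.
0 3 * * 0  bin/remote optimize repo001.example.phacility.net
```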

Event Timeline

epriestley triaged this task as Normal priority.

Previously, in PHI386, an unusually large repository for the same instance took about 5 hours to git gc.

Is that possibly just because git gc had not been run in a long time, and regular executions wouldn't accumulate so much badness?
If git prune was run afterward, I'd be wary of the prior git gc being the reason for the speediness.

That's a reasonable point. The details of PHI386 weren't very indicative one way or another since I didn't end up GC'ing the repository multiple times. I suppose I can go GC it again during the deployment window this week and see how long it takes.

alexmv added a subscriber: alexmv. Apr 4 2018, 6:17 PM
aubort added a subscriber: aubort. Aug 9 2018, 8:29 AM
epriestley moved this task from Backlog to Do Eventually on the Phacility board. Aug 10 2018, 6:06 PM

PHI860 is a close variant of this and discusses periodically running git repack. Most concerns around git repack are likely similar to concerns around gc and prune.
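For comparison, an incremental repack is the middle ground between prune and a full gc; a sketch (path illustrative):

```
# Sketch: an incremental repack packs loose objects into a new pack without
# expiring anything. "-d" removes packs and loose objects made redundant by
# the new pack; "-l" skips objects borrowed from alternates. A full
# "repack -a -d" would rewrite everything and is the expensive variant.
git -C /var/repo/REPOSITORY repack -d -l
```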

When a clustered repository node is repacking, collecting, or pruning, it may make sense for it to mark itself as "lower priority" for reads and writes. See also T10884. This would let it stay in the cluster, but shed most traffic until the operation completes.
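One possible shape for that, as a purely hypothetical mechanism (the marker file and whatever consumes it are invented for illustration; a real implementation would presumably go through the cluster routing logic referenced in T10884):

```
# Sketch (hypothetical mechanism): advertise low priority for the duration
# of the expensive operation so cluster routing prefers peer nodes, then
# restore normal priority afterward.
touch /var/repo/.maintenance-in-progress
git -C /var/repo/REPOSITORY gc --quiet
rm -f /var/repo/.maintenance-in-progress
```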