When repositories hit pull errors, stop updating them as frequently
Closed, Public

Authored by epriestley on Sep 20 2016, 12:28 AM.
Tags
None
Referenced Files
F13086489: D16575.diff
Thu, Apr 25, 12:31 AM
F13085457: D16575.diff
Wed, Apr 24, 11:49 PM
Subscribers
None

Details

Summary

Ref T11665. Currently, when a repository hits an error, we retry it after 15s. This is correct if the error was temporary/transient/config-related (e.g., bad network or an administrator setting up credentials) but not so great if the error is long-lasting (completely bad authentication, invalid URI, etc.), as it can add up to a meaningful amount of unnecessary load over time.

Instead, record how many times in a row we've hit an error and adjust backoff behavior: first error is 15s, then 30s, 45s, etc.
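As a rough illustration of the schedule described above (15s, 30s, 45s, ...), here is a minimal Python sketch. It is not the committed code (Phabricator is written in PHP); the names `error_backoff_seconds` and `PullErrorTracker` are hypothetical, and the sketch assumes the delay grows linearly with the consecutive error count and that a successful pull resets the count. The summary does not mention an upper bound, so none is applied here.

```python
BASE_RETRY_SECONDS = 15  # from the summary: the first retry happens after 15s


def error_backoff_seconds(consecutive_errors: int,
                          base: int = BASE_RETRY_SECONDS) -> int:
    """Return the retry delay after the Nth error in a row (N >= 1).

    Assumption: the delay is linear in the error count, matching the
    15s, 30s, 45s progression given in the summary.
    """
    if consecutive_errors < 1:
        raise ValueError("consecutive_errors must be at least 1")
    return base * consecutive_errors


class PullErrorTracker:
    """Hypothetical per-repository counter of errors in a row."""

    def __init__(self) -> None:
        self.consecutive_errors = 0

    def record_failure(self) -> int:
        # Each failure lengthens the next retry delay.
        self.consecutive_errors += 1
        return error_backoff_seconds(self.consecutive_errors)

    def record_success(self) -> None:
        # Assumption: a successful pull clears the streak.
        self.consecutive_errors = 0
```

Calling `record_failure()` three times in a row on one tracker yields delays of 15, 30, and 45 seconds; after `record_success()`, the next failure starts again at 15.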

Additionally, when computing the backoff for an empty repository, use the repository creation time as though it was the most recent commit. This is a good proxy which gives us reasonable backoff behavior.
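The empty-repository fallback can be sketched the same way. This is an illustrative Python sketch, not the actual PHP implementation: the function names are hypothetical, and how the resulting age maps to an update interval is deliberately left out because the summary does not specify it.

```python
from typing import Optional


def backoff_reference_epoch(last_commit_epoch: Optional[int],
                            created_epoch: int) -> int:
    """Pick the timestamp used as the "most recent activity".

    An empty repository has no commits, so its creation time stands in
    for the most recent commit, as the summary describes.
    """
    if last_commit_epoch is None:
        return created_epoch
    return last_commit_epoch


def seconds_since_activity(now_epoch: int,
                           last_commit_epoch: Optional[int],
                           created_epoch: int) -> int:
    """Age of the most recent activity, clamped at zero."""
    reference = backoff_reference_epoch(last_commit_epoch, created_epoch)
    return max(0, now_epoch - reference)
```

With this proxy, a freshly created empty repository looks recently active, while one that has sat empty for a long time looks idle.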

This required removing the CODE_WORKING messages, since they would have reset the error count. We could restore them (as a different type of message), but I think they aren't particularly useful since cloning usually doesn't take too long and there's more status information available now than there was when this stuff was written.

Test Plan
  • Ran bin/phd debug pull.
  • Saw sensible, increasing backoffs selected for repositories with errors.
  • Saw sensible backoffs selected for empty repositories.

Diff Detail

Repository
rP Phabricator
Branch
daemon2
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 13775
Build 17795: Run Core Tests
Build 17794: arc lint + arc unit

Event Timeline

epriestley retitled this revision to "When repositories hit pull errors, stop updating them as frequently".
epriestley updated this object.
epriestley edited the test plan for this revision.
epriestley added a reviewer: chad.
This revision was automatically updated to reflect the committed changes.