Diffusion Phabricator 894d0dc51bc3

(stable) When repositories hit pull errors, stop updating them as frequently
894d0dc51bc3

Description

(stable) When repositories hit pull errors, stop updating them as frequently

Summary:
Ref T11665. Currently, when a repository hits an error, we retry it after 15s. This is correct if the error was temporary/transient/config-related (e.g., bad network or administrator setting up credentials) but not so great if the error is long-lasting (completely bad authentication, invalid URI, etc.), because the retries can add up to a meaningful amount of unnecessary load over time.

Instead, record how many times in a row we've hit an error and adjust backoff behavior: first error is 15s, then 30s, 45s, etc.
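The linear schedule described above can be sketched as follows. This is an illustrative Python sketch, not the PHP from this commit: the function name `error_backoff_seconds` and the constant `BASE_RETRY_SECONDS` are hypothetical, and the 15-second step is taken only from the "15s, then 30s, 45s" wording in the summary. The summary does not say whether the delay is capped, so the sketch leaves it uncapped.

```python
# Hypothetical sketch of the linear error backoff described in the summary.
BASE_RETRY_SECONDS = 15  # delay after the first error, per the summary


def error_backoff_seconds(consecutive_errors: int) -> int:
    """Return the retry delay after `consecutive_errors` failures in a row."""
    if consecutive_errors < 1:
        raise ValueError("expected at least one consecutive error")
    # Each additional consecutive error adds one more base interval.
    return BASE_RETRY_SECONDS * consecutive_errors


if __name__ == "__main__":
    for count in (1, 2, 3):
        print(count, error_backoff_seconds(count))
```

A linear schedule like this grows slowly enough that a transient failure is still retried quickly, while a repository that has failed many times in a row is polled far less often.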

Additionally, when computing the backoff for an empty repository, use the repository creation time as though it were the most recent commit. This is a good proxy which gives us reasonable backoff behavior.
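The fallback for empty repositories might look roughly like this. It is a Python sketch under assumptions, not the commit's code: the function and parameter names are hypothetical, and it only shows which timestamp is chosen as the reference point, not how the backoff itself is derived from it.

```python
from typing import Optional


def backoff_reference_epoch(last_commit_epoch: Optional[int],
                            created_epoch: int) -> int:
    """Pick the timestamp that the update backoff is measured from.

    Hypothetical helper: an empty repository has no commits, so its
    creation time stands in for the most recent commit.
    """
    if last_commit_epoch is None:
        return created_epoch
    return last_commit_epoch
```

With this choice, a freshly created empty repository is treated like a recently active one, and an empty repository that has existed for a long time is treated like a dormant one.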

This required removing the CODE_WORKING messages, since they would have reset the error count. We could restore them (as a different type of message), but I think they aren't particularly useful since cloning usually doesn't take too long and there's more status information available now than there was when this stuff was written.
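To see why an interleaved "working" message would defeat the counter, consider this minimal Python sketch. It assumes, based only on the summary, that the count tracks consecutive messages with the same status code and restarts when the code changes. The class name, field names, and the `"error"` and `"working"` strings are placeholders, not the real schema or constants.

```python
from dataclasses import dataclass


@dataclass
class StatusMessage:
    """Hypothetical stand-in for a repository's latest status message."""
    code: str = ""
    count: int = 0

    def record(self, code: str) -> int:
        """Store a new message and return the consecutive count for its code."""
        if code == self.code:
            self.count += 1      # same code again: extend the streak
        else:
            self.code = code     # different code: the streak starts over
            self.count = 1
        return self.count


if __name__ == "__main__":
    status = StatusMessage()
    status.record("error")
    print(status.record("error"))    # second consecutive error
    status.record("working")         # a "working" message breaks the streak
    print(status.record("error"))    # the error count starts over
```

Under this assumption, writing a "working" message before every pull attempt would keep the error count at one, so the backoff would never grow.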

Test Plan:

Ran bin/phd debug pull.
Saw sensible, increasing backoffs selected for repositories with errors.
Saw sensible backoffs selected for empty repositories.

Reviewers: chad

Maniphest Tasks: T11665

Differential Revision: https://secure.phabricator.com/D16575

Details

Provenance

epriestley	Authored on Sep 20 2016, 12:01 AM
epriestley	Pushed on Sep 20 2016, 12:31 AM

Differential Revision

D16575: When repositories hit pull errors, stop updating them as frequently

Parents

rP554940b33f4f: (stable) Retain repository update cooldowns across daemon restarts

Branches

Unknown

Tags

Unknown

Tasks

T11665: repo002.phacility.net is heavily loaded

Build Status

Buildable 13777
Build 17797: Run Core Tests

Event Timeline

epriestley committed rP894d0dc51bc3: (stable) When repositories hit pull errors, stop updating them as frequently (authored by epriestley). Sep 20 2016, 12:30 AM

epriestley added a task: T11665: repo002.phacility.net is heavily loaded.

Harbormaster completed building B13777: rP894d0dc51bc3: (stable) When repositories hit pull errors, stop updating them as frequently. Sep 20 2016, 12:32 AM

Changes (5)

Path

resources/
  sql/
    autopatches/
      20160919.repo.messagecount.sql
src/
  applications/
    diffusion/
      management/
        DiffusionRepositoryStatusManagementPanel.php
    repository/
      engine/
        PhabricatorRepositoryPullEngine.php
      storage/
        PhabricatorRepository.php
        PhabricatorRepositoryStatusMessage.php
