Unify intracluster sync and Drydock working copy construction timeouts as a…

Authored by epriestley on Nov 15 2018, 4:11 PM.

Description

Unify intracluster sync and Drydock working copy construction timeouts as a repository "copy time limit"

Summary:
Depends on D19814. Ref T13216. See PHI885. For various eldritch reasons, git fetch can hang. Although we'd probably like to fix this with git fetch --require-sustained-network-transfer-rate=512KB/5s or similar, that flag doesn't exist and we don't have a reasonable way to build it.

Short of that, move toward formalizing a repository "copy time limit": the longest amount of time anything may spend trying to make a copy of this repository.

This grows out of the existing intracluster sync limit, which is effectively the same thing. Here, apply it to git clone and git fetch in Drydock working copy construction, too. A future change may make it configurable.
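As a rough illustration of the idea (not Phabricator's actual code, which is PHP), a "copy time limit" can be sketched as a hard wall-clock cap around whatever command performs the copy. The helper name `run_with_copy_time_limit` and the exception `CopyTimeLimitExceeded` are hypothetical, and a sleeping child process stands in for a hung `git fetch`:

```python
import subprocess
import sys


class CopyTimeLimitExceeded(Exception):
    """Raised when a copy command runs longer than the allowed limit."""


def run_with_copy_time_limit(argv, limit_seconds):
    """Run argv and kill it if it exceeds limit_seconds.

    Hypothetical helper for illustration only. The limit is a wall-clock
    cap on the whole command, not a minimum transfer rate.
    """
    try:
        return subprocess.run(
            argv,
            timeout=limit_seconds,  # child is killed when this expires
            check=True,             # non-zero exit raises CalledProcessError
            capture_output=True,
        )
    except subprocess.TimeoutExpired as ex:
        raise CopyTimeLimitExceeded(
            "Command killed by timeout after running for more than "
            f"{limit_seconds} seconds."
        ) from ex


if __name__ == "__main__":
    # Stand-in for a hung fetch: a child process that sleeps for 5 seconds.
    hung_command = [sys.executable, "-c", "import time; time.sleep(5)"]
    try:
        run_with_copy_time_limit(hung_command, 0.001)
    except CopyTimeLimitExceeded as ex:
        print(ex)
```

In this sketch the same limit could be passed to both the clone and the fetch call sites, which is the sense in which the two timeouts are unified. Note that `subprocess.run` only kills the direct child, so a real implementation running commands over SSH would need its own handling for the remote process.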

Test Plan:

  • Set the limit to 0.001.
  • Tried to build and lease working copies, got sensible timeout errors (see D19815).
<Activation Failed> Lease activation failed: [CommandException] Command killed by timeout after running for more than 0.001 seconds.
COMMAND
ssh '-o' 'LogLevel=quiet' '-o' 'StrictHostKeyChecking=no' '-o' 'UserKnownHostsFile=/dev/null' '-o' 'BatchMode=yes' -l '********' -p '2222' -i '********' '127.0.0.1' -- '(cd '\''/var/drydock/workingcopy-163/repo/spellbook/'\'' && git clean -d --force && git fetch && git reset --hard)'

Reviewers: amckinley

Reviewed By: amckinley

Subscribers: yelirekim, PHID-OPKG-gm6ozazyms6q6i22gyam

Maniphest Tasks: T13216

Differential Revision: https://secure.phabricator.com/D19816