Page MenuHomePhabricator

Add more detail when failing to fetch a huge repository
AcceptedPublic

Authored by amckinley on Jun 28 2018, 7:44 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Mar 21, 7:48 AM
Unknown Object (File)
Sun, Mar 10, 6:46 AM
Unknown Object (File)
Feb 12 2024, 4:15 AM
Unknown Object (File)
Jan 9 2024, 1:48 PM
Unknown Object (File)
Dec 22 2023, 11:16 AM
Unknown Object (File)
Nov 30 2023, 1:57 AM
Unknown Object (File)
Nov 25 2023, 7:34 PM
Unknown Object (File)
Nov 15 2023, 7:53 PM
Subscribers

Details

Reviewers
epriestley
Summary

When fetching a large repo via a slow internet connection, the fetch will eventually receive a kill -9 (corresponding to exit status 137). Provide a little more information when this happens.

Test Plan

Loaded a repo that failed to initialize; saw more useful info. See example output:

Screen Shot 2018-06-28 at 12.43.23 PM.png (910×1 px, 145 KB)

Diff Detail

Repository
rP Phabricator
Branch
repo-status
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 20463
Build 27795: Run Core Tests
Build 27794: arc lint + arc unit

Event Timeline

amckinley edited the test plan for this revision. (Show Details)
amckinley edited the summary of this revision. (Show Details)

Theoretically, it would probably be somewhat cleaner to encode this specifically in PhabricatorRepositoryManagementUpdateWorkflow as a timeout when doing the writeStatusMessage(), then check some 'didTimeout' property instead of looking for English text in the error message: we have to look for English text in the key error, but we could carry explicit timeout information from the update workflow.

Realistically, this doesn't matter much and the change as written improves the state of affairs, and CommandException currently doesn't have a distinct timeout flag (although it probably should) so this would require multiple changes.

This revision is now accepted and ready to land.Aug 2 2018, 3:55 PM

Particularly, it's possible that 137 came from some other source other than a timeout (maybe oomkiller, for example?) and we'll miss (fail to detect the timeout) with this approach if you translate the error message and change the default language so the message now says "El commando failidad con erroro 137". We have no other option if the subprocess is git and we force VCS subprocesses into English, but we don't currently force our own subprocesses into English since we "should" never need to preg_match() an error string when we control the string.

Yeah I’m not wild about string matching. I was thinking about just adding an ‘exitStatus’ field to ‘CommandException’ and looking for the magic exit status, but that wasn’t obvious so I just kludged this together.