Paths

Table of Contentst

Differential D21484

Fix isValidGitShallowCloneResponse
ClosedPublic
Actions

Authored by dzduvall on Oct 24 2020, 10:43 PM.

Details

Reviewers

epriestley

Group Reviewers

Blessed Reviewers

Commits

rPc04147328fa3: Fix isValidGitShallowCloneResponse

Required Signatures

L28 Phacility Individual Contributor License Agreement

Summary

Changes the heuristic method by which non-zero exit statuses from git-http-backend are found to be due to packfile negotiation during shallow fetches, etc.

Instead of checking git-http-backend stderr for a generic "hung up" error message, see if the pack-result response contains a terminating flush packet ("0000"). This should give a greater assurance that the request was handled correctly and the response is complete.

Test Plan

Run GIT_CURL_VERBOSE=1 git fetch --depth 1 https://host.example/source/repo.git HEAD to ensure it completes and includes two successful POST requests during packfile negotiation (the last one actually receives the packfile).

Diff Detail

Repository

rP Phabricator

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

dzduvall created this revision.Oct 24 2020, 10:43 PM

Herald added a reviewer: Blessed Reviewers. · View Herald TranscriptOct 24 2020, 10:43 PM

Herald added a required legal document: L28 Phacility Individual Contributor License Agreement. · View Herald Transcript

dzduvall requested review of this revision.Oct 24 2020, 10:43 PM

Herald added a subscriber: Korvin. · View Herald TranscriptOct 24 2020, 10:43 PM

epriestley mentioned this in T13584: Shallow Git clones fail under recent versions of Git.Oct 24 2020, 11:05 PM

Minor nitpicks inlines, I'll just tweak them locally as I land this.

(Since this wasn't submitted with arc, the Git "Author" information has been lost and the commit will be incorrectly attributed to me as an author. If you submit further patches, you can use arc to retain authorship information if you'd like.)

src/applications/diffusion/controller/DiffusionServeController.php
937	For clarity and consistency in this codebase, prefer explicit parentheses when using the result of an expression as a value. That is, this: $x = ($y > $z); ...is preferable to: $x = $y > $z; This should be a lint rule, but this change didn't come through `arc` and missed lint anyway, so it's sort of moot. The existing code also incorrectly doesn't follow this convention, and returns an expression later on (`return x && y` instead of `return (x && y)`).
937	For correctness, prefer `===` over `==`. This should also be a lint rule, but would be ambitious at this point, see also "Increase Strictness" in T2312.
937	`$stdout` may be fewer than 5 characters long. In PHP 7 and newer, `substr()` returns `""` in this case. In versions older than PHP 7, it returns `false`. It seems at least plausible that this may emit a warning in some future version of PHP (or possibly a manufactured version of PHP under T2312). Even if it doesn't, the ambiguity of the input and variance of behavior across PHP versions is probably worth avoiding by either testing for the string length before taking a slice of it or using `substr_compare()` instead, where the semantics of out-of-bounds ranges are less ambiguous.
937	The byte stream has "<length><raw bytes>" sections and may naturally contain `\n0000` outside of a flush packet (that is, none of these bytes are guaranteed to be escaped if they appear elsewhere in the protocol). I think this is still a better approach than matching error text on `stderr`. Phabricator can parse significant chunks of the Git protocol to some degree now (see D20381; this capability was built after the code here was written) and we could imagine pursuing a full-power version of this change where we parse the entire protocol. I intend to do this eventually, but I think this change is a fine stopgap in the meantime.
937	There is no guarantee that the frame prior to the 0-length frame terminates with "\n". Locally, if I fetch a `HEAD` I already have (`git fetch --depth 1 origin HEAD`), I get this over the wire (I've added visual newlines for clarity; newlines in the raw wire bytes are represented with `\r` and `\n`): Expires: Fri, 01 Jan 1980 00:00:00 GMT\r\n Pragma: no-cache\r\n Cache-Control: no-cache, max-age=0, must-revalidate\r\n Content-Type: application/x-git-upload-pack-result\r\n \r\n 0034 shallow 2d34f0e1ea69324a12f7e4a3f277b13f104370a9 0000 The documentation has examples with a reflist frame that does terminate with `\n`, but the frame format in general is just `<length><stuff>`, and not all frames terminate like this. A check against just `0000` (versus `\n0000`) seems like the simplest adjustment.

This revision is now accepted and ready to land.Oct 30 2020, 8:32 PM

Closed by commit rPc04147328fa3: Fix isValidGitShallowCloneResponse (authored by epriestley). · Explain WhyOct 30 2020, 8:46 PM

This revision was automatically updated to reflect the committed changes.

epriestley added a commit: rPc04147328fa3: Fix isValidGitShallowCloneResponse.

In D21484#272642, @epriestley wrote:

Minor nitpicks inlines, I'll just tweak them locally as I land this.

(Since this wasn't submitted with arc, the Git "Author" information has been lost and the commit will be incorrectly attributed to me as an author. If you submit further patches, you can use arc to retain authorship information if you'd like.)

Thanks for the review/land! I'll be sure to use arc next time.

FWIW, I've submitted a patch to git-upload-pack to correct the behavior, and someone suggested in the review that Phabricator/Diffusion might be better off ignoring the exit code as the client can handle any truncation on its end (since the protocol contains explicit line length headers). In testing the behavior, I noticed that Apache/mod_cgi indeed doesn't care about git-http-backend's non-zero exit so perhaps that's why this hasn't been an issue for most folks—I was certainly surprised it hadn't been more widely reported.

Here's the start of the thread.

https://lore.kernel.org/git/20201030090902.GA3268509@coredump.intra.peff.net/T/#t
https://lore.kernel.org/git/20201030223504.45978-1-dan@mutual.io/T/#u

Ah, thanks for the pointers!

I'm generally pretty hesitant to ignore CLI exit codes because they can signal a whole host of "obviously not our problem" conditions, like this recent report on Discourse:

https://discourse.phabricator-community.org/t/svn-repository-does-not-import/4074/5

The user appears to have some kind of linker error in (a copy of?) their svn binary in their third-party-container environment. This sort of thing is super easy to identify when Phabricator is strict about errors, but is can sometimes be a crazy wild goose chase (since there's basically zero chance I'll be able to reproduce the issue) if it isn't.

The bright side is that the number of cases where we currently need to trust a binary that exited with an error is very small, something like ~3-5 cases across all of Arcanist and Phabricator. Some of them are very narrow, like this one:

// NOTE: Between versions 2.1 and 2.1.1, Mercurial changed the behavior
// of "hg pull" to return 1 in case of a successful pull with no changes.
// This behavior has been reverted, but users who updated between Feb 1,
// 2012 and Mar 1, 2012 will have the erroring version. Do a dumb test
// against stdout to check for this possibility.
// See: https://github.com/phacility/phabricator/issues/101/

Testing that stderr or stdout "look right" to discard the error is gross (and, as here, often fragile) but I think it's better on the balance than just ignoring the error entirely: users have broken binaries or systems far more often than these tests legitimately break because of changes in the underlying tool.

(Philosophically, see also Command Line Exit Codes, perhaps.)

Revision Contents
Changeset List

Path

Size

src/

applications/

diffusion/

controller/

DiffusionServeController.php

20 lines

Diff 51141

View Options

src/applications/diffusion/controller/DiffusionServeController.php

Fix isValidGitShallowCloneResponseClosedPublicActions