
Switching imported repository to hosted, then cloning over HTTP in Phacility production fails
Closed, Resolved, Public

Description

A user reports an issue with:

  • importing a repository from GitHub on a Phacility instance;
  • switching it from imported to hosted;
  • cloning over HTTP.

I can reproduce this in production, but not in development:

git clone https://test-ji2ubpx2swcl.phacility.com/diffusion/POEMS/poems.git poemhttp1
Cloning into 'poemhttp1'...
error: RPC failed; result=22, HTTP code = 500
fatal: The remote end hung up unexpectedly

Neither the web nor the repo nodes currently log anything relevant.

Our logging around clone issues is generally rather deficient, so I'm going to start by improving that.

Event Timeline

Logging for this is now in production, but the initial data is totally useless:

mysql> select * from repository_pullevent;
+----+--------------------------------+--------------------------------+------------+--------------------------------+---------------+----------------+------------+------------+------------+
| id | phid                           | repositoryPHID                 | epoch      | pullerPHID                     | remoteAddress | remoteProtocol | resultType | resultCode | properties |
+----+--------------------------------+--------------------------------+------------+--------------------------------+---------------+----------------+------------+------------+------------+
|  1 | PHID-PULE-to3l26kkvy6grc3eym6d | PHID-REPO-xupsohlap33awie3agus | 1454197482 | PHID-USER-votbrdt4fwqfwyn3q7c2 |    XXXXXXXXXX | http           | wild       |        200 | null       |
+----+--------------------------------+--------------------------------+------------+--------------------------------+---------------+----------------+------------+------------+------------+
1 row in set (0.00 sec)

Slightly more useful:

{"response.message":"Error 1: fatal: protocol error: bad line length character: \u001f\ufffd\b\n"}
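Read against the gzip file format, the three escaped characters in that message look like the start of a gzip stream: `\u001f` is byte 0x1f, `\ufffd` is the replacement character a UTF-8 decoder substitutes for an invalid byte such as 0x8b, and `\b` is byte 0x08. The Python sketch below is only an illustration, not Phabricator or git code. `pkt_line_length` is a hypothetical, simplified stand-in for a parser that expects a 4-hex-digit length prefix, and the payload is made up.

```python
import gzip

HEX_DIGITS = b"0123456789abcdefABCDEF"


def pkt_line_length(data: bytes) -> int:
    """Hypothetical helper: parse a 4-hex-digit pkt-line length prefix."""
    prefix = data[:4]
    if len(prefix) != 4 or any(byte not in HEX_DIGITS for byte in prefix):
        raise ValueError(f"bad line length character: {prefix!r}")
    return int(prefix.decode("ascii"), 16)


# Made-up request body: one pkt-line of 50 bytes (0x32), length prefix included.
payload = b"0032want " + b"0" * 40 + b"\n"

# The same body as a client would send it with Content-Encoding: gzip.
compressed = gzip.compress(payload)

# gzip magic number (1f 8b) followed by the deflate method byte (08).
head = compressed[:3]

# What those bytes look like once forced through a lossy UTF-8 decode.
as_text = head.decode("utf-8", errors="replace")
```

Parsing `payload` succeeds, while parsing `compressed` fails on the very first bytes, which is the shape of failure the logged message suggests.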

Alright, I made some progress on this.

In production, the failing request is being sent gzipped with Content-Encoding: gzip. I'm not sure why git chooses to gzip in production and not in development. I can see no obvious difference between the server responses. The trigger may actually be the request exceeding an arbitrary 1KB threshold:

remote-curl.c
...
	} else if (use_gzip && 1024 < rpc->len) {
...
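The quoted condition can be sketched in Python to show what a size-dependent trigger would look like. This mirrors only the one comparison quoted above, not the rest of git's logic; `encode_request_body` and `GZIP_THRESHOLD` are hypothetical names chosen for illustration.

```python
import gzip

# Mirrors the "1024 < rpc->len" comparison quoted above.
GZIP_THRESHOLD = 1024


def encode_request_body(body: bytes, use_gzip: bool = True):
    """Hypothetical helper: gzip the body only when it exceeds the threshold."""
    headers = {}
    if use_gzip and GZIP_THRESHOLD < len(body):
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return headers, body


# A body of exactly 1024 bytes is sent as-is; one more byte triggers gzip.
small_headers, small_body = encode_request_body(b"x" * 1024)
large_headers, large_body = encode_request_body(b"x" * 1025)
```

Under this reading, two otherwise identical setups could behave differently purely because one request body crosses the threshold and the other does not. Whether that accounts for the production/development difference here is not confirmed.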

When we receive a gzipped request, we aren't decoding it correctly.
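A minimal sketch of the decoding step a server would need before handing such a body on to git, assuming the body arrives as raw bytes alongside a header dictionary. This is not the actual Phabricator change (which is not shown here); `decode_request_body` is a hypothetical name.

```python
import gzip


def decode_request_body(headers: dict, raw_body: bytes) -> bytes:
    """Hypothetical helper: undo Content-Encoding: gzip on a request body."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value for name, value in headers.items()}
    encoding = normalized.get("content-encoding", "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(raw_body)
    # No recognized encoding: pass the body through untouched.
    return raw_body


# Made-up body, sent once compressed and once plain.
original = b"0032want " + b"0" * 40 + b"\n"
decoded = decode_request_body({"Content-Encoding": "gzip"}, gzip.compress(original))
passthrough = decode_request_body({}, original)
```

Both calls yield the original bytes, so whatever consumes the body downstream sees the same input regardless of how the client chose to encode it.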

I merged the fixes to stable and upgraded web001 to fix the immediate issue. The UTF-8 fix technically needs to go to the repo hosts as well, but we don't need it there immediately, since it only helped in identifying and debugging the gzip issue.