Page MenuHomePhabricator

Fix handling of gzip in VCS responses
ClosedPublic

Authored by epriestley on Feb 12 2016, 12:25 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 31, 3:17 PM
Unknown Object (File)
Mon, Dec 30, 12:44 AM
Unknown Object (File)
Sun, Dec 29, 1:37 AM
Unknown Object (File)
Thu, Dec 26, 8:04 PM
Unknown Object (File)
Thu, Dec 26, 7:09 AM
Unknown Object (File)
Mon, Dec 23, 12:44 AM
Unknown Object (File)
Mon, Dec 23, 12:43 AM
Unknown Object (File)
Mon, Dec 23, 12:43 AM
Subscribers
None

Details

Summary

Fixes T10264. I'm reasonably confident that this is the chain of events here:

First, prior to 8269fd6e, we would ignore "Content-Encoding" when reading inbound bodies. So if a request was gzipped, we would read a gzipped body, then give git-http-backend a gzipped body with "Content-Encoding: gzip". Everything matched normally, so that was fine, except in the cluster.

In the cluster, we'd accept "gzip + compressed body" and proxy it, but not tell cURL that it was already compressed. cURL would think it was raw data, so it would arrive on the repository host with a compressed body but no "Content-Encoding: gzip". Then we'd hand it to git in the same form. This caused the issue in 8269fd6e: handing it compressed data, but no "this is compressed" header.

To fix this, I made us decompress the encoding when we read the body, so the cluster now proxies raw data instead of proxying gzipped data. This fixed the issue in the cluster, but created a new issue on non-cluster hosts. The new issue is that we accept "gzip + compressed body" and decompress the body, but then pass the original header to git-http-backend. So now we have the opposite problem from what we originally had: a "gzip" header, but a raw body.

To fix this, we could do two things:

  • Revert 8269fd6e, then change the proxy request to preserve "Content-Encoding" instead.
  • Stop telling git-http-backend that we're handing it compressed data when we're handing it raw data.

I did the latter here because it's an easier change to make and test, we'll need to interact with the raw data later anyway, to implement repository virtualization in connection with T8238.

Test Plan

See T10264 for users confirming this fix.

Diff Detail

Repository
rP Phabricator
Branch
httpgz
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 10659
Build 13089: Run Core Tests
Build 13088: arc lint + arc unit

Event Timeline

epriestley retitled this revision from to Fix handling of gzip in VCS responses.
epriestley updated this object.
epriestley edited the test plan for this revision. (Show Details)
epriestley added a reviewer: chad.

I created this mess in the first place by not remembering that we pass Content-Encoding to git-http-backend, although in retrospect we must do this and it couldn't work if we didn't.

chad edited edge metadata.
This revision is now accepted and ready to land.Feb 12 2016, 4:05 PM
This revision was automatically updated to reflect the committed changes.