
git unable to pull after upgrading to latest
Closed, Resolved (Public)

Description

I upgraded our Phabricator install last night (commit 1e4be36484af6). Everything works fine on the web side, but git is now unable to connect to the server.
This is the error message we get:

reuben@devworkstation1:~/konekt-rest-api$ git pull
Username for 'https://phabricator.konektdata.com': reuben
Password for 'https://reuben@phabricator.konektdata.com':
fatal: unable to access 'https://phabricator.konektdata.com/diffusion/API/konekt-rest-api.git/': GnuTLS recv error (-9): A TLS packet with unexpected length was received.

We're using a self-signed cert, but it's added to the system cert store and that hasn't changed. This error occurs even when I turn off cert checking in git.

Event Timeline

rbalik raised the priority of this task to Needs Triage.
rbalik updated the task description.
rbalik added a project: Diffusion.
rbalik added a subscriber: rbalik.

In case it's helpful, these are my apache logs when trying to do a pull:

104.130.174.45 - - [18/Mar/2015:17:25:05 +0000] "GET /diffusion/API/konekt-rest-api.git/info/refs?service=git-upload-pack HTTP/1.1" 401 2384 "-" "git/1.9.1"
104.130.174.45 - - [18/Mar/2015:17:25:09 +0000] "GET /diffusion/API/konekt-rest-api.git/info/refs?service=git-upload-pack HTTP/1.1" 401 555 "-" "git/1.9.1"
104.130.174.45 - reuben [18/Mar/2015:17:25:09 +0000] "GET /diffusion/API/konekt-rest-api.git/info/refs?service=git-upload-pack HTTP/1.0" 200 1179 "-" "git/1.9.1"

Oh, and one other change I had made was enabling local file storage. I turned that off just to be sure and it's still broken.

Thanks for any help you can give me on this. Phabricator is pretty central to our operation (and thanks for building this awesome tool), so the sooner I can get it resolved, the better. I may just have to roll back, but we'll see.

One more update on this. Sorry for spamming you guys.

It appears to still work in SourceTree for Windows, but fails with git on Linux. Have you guys ever seen anything like that before?

What version of git are you using locally, and did you compile it yourself?

From a quick Google search, it seems there was a broken version of GnuTLS.

It's git 1.9.1, installed from the Ubuntu package.
I saw some of those pages too. Some people say that recompiling against a different library makes the error go away, but if that were the problem, it seems odd that it would only start happening now.

That repo only supports ssh but I'm doing https. Can you enable https for it?

It's not just pulls, btw. Any git server operation (clone, fetch, push, etc.) fails.

What error do you get if you try this?

git clone https://www.pcwebshop.co.uk/

Different one:

Cloning into 'www.pcwebshop.co.uk'...
fatal: unable to access 'https://www.pcwebshop.co.uk/': server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

That's one we've seen and it's fixed by simply adding the cert to the store.

Hmm, try this one?

git clone https://secure.phabricator.com/diffusion/GITTEST/git-test.git

I changed it to require authentication, just in case that's the issue -- set a VCS password in Settings > VCS Password and try again?

Hmm looks like that works fine too. Weird problem.

I'm assuming the current secure.phabricator.com install is at HEAD?

Yeah, I'm not sure what's up. We're slightly ahead of you (rPb7fa55ff) but basically running the same code.

I get a similar error against your server on OSX:

$ git -c http.sslVerify=false clone https://phabricator.konektdata.com/diffusion/API
Cloning into 'API'...
Username for 'https://phabricator.konektdata.com': 
Password for 'https://phabricator.konektdata.com': 
fatal: unable to access 'https://phabricator.konektdata.com/diffusion/API/': SSLRead() return error -9806

...but it works fine from this machine (Amazon Linux) -- 403 is the expected result, because I don't have valid credentials:

$ git -c http.sslVerify=false clone https://phabricator.konektdata.com/diffusion/API
Cloning into 'API'...
Username for 'https://phabricator.konektdata.com': 
Password for 'https://phabricator.konektdata.com': 
fatal: unable to access 'https://phabricator.konektdata.com/diffusion/API/': The requested URL returned error: 403

...although I can get a similar error from a cluster host (Ubuntu 14):

$ git -c http.sslVerify=false clone https://phabricator.konektdata.com/diffusion/API
Cloning into 'API'...
Username for 'https://phabricator.konektdata.com': 
Password for 'https://phabricator.konektdata.com': 
fatal: unable to access 'https://phabricator.konektdata.com/diffusion/API/': GnuTLS recv error (-9): A TLS packet with unexpected length was received.

My best guess is that this is a config issue on your server that's interacting badly with some clients (some Googling suggests that clients mitigating POODLE may be causing issues like this?), but I'm not sure exactly what the issue is and I haven't been able to turn anything very helpful up by searching. The openssl dump for your server (openssl s_client -connect phabricator.konektdata.com:443) looks similar to ours, too.

A possible POODLE-related issue is that SSLv3 is enabled on your server:

[Two screenshots: Screen_Shot_2015-03-18_at_9.10.36_PM.png, Screen_Shot_2015-03-18_at_9.11.12_PM.png]

This is bad independent of the functional bug.

Ok, it looks like we're rejecting SSLv3 now. Still no success with git though.

So the instructions on this page (adjusted for 1.9.1) fixed my problem: http://askubuntu.com/questions/186847/error-gnutls-handshake-falied-when-connecting-to-https-servers/187199#187199

It's a little annoying to have to rebuild git on every affected machine, and it doesn't answer the question of why our server is doing this, but I guess it'll do for now.

To the new subscribers, are you also having issues? Do you have additional information to add?

@chad I'm having the same issues after updating 2 days ago. What I find strange is that this is only an issue with phabricator. No issues with GitHub and there was no change to the git installed on the server (that worked fine so far). I checked to see if there are any permission issues with phd-user and vcs-user, but there are none.

Sorry for having nothing of substance to add.

This is becoming more of a pain for us because one of our devs is checking in from a Mac, and there doesn't seem to be a client for OSX that doesn't hit a similar error.
I'll update again if he finds something that works. If anyone has any other suggestions, please let me know.
It seems like Phabricator wouldn't have much to do with the TLS connection closing early (which some posts on the GnuTLS board suggest is the cause), but nothing else coincides with this breaking. I would have guessed it was an Apache thing, but it's a mystery at the moment.

All the Phabricator devs use Macs, and at least we haven't seen this issue.

Would anyone with a repro case be up for running git bisect on this for us? Even just identifying the commit might shed some light. Daemon refactors, maybe?
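
For anyone who hasn't bisected before, here is a minimal sketch of the mechanics. It runs against a throwaway repository, not a Phabricator checkout: the file name, the commit messages, and the "BUG" marker are all made up for illustration, and the automated check stands in for what would really be a manual test (try a git pull from an affected client, then mark the commit with git bisect good or git bisect bad).

```shell
#!/bin/sh
set -e

# Build a throwaway repository with five commits.
# Commit 3 adds a "BUG" marker and stands in for the regression.
demo=$(mktemp -d)
cd "$demo"
git init -q .
for i in 1 2 3 4 5; do
    echo "change $i" >> file.txt
    if [ "$i" -eq 3 ]; then
        echo "BUG" >> file.txt
    fi
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "commit $i"
done

# Start bisecting: HEAD is known bad, the root commit is known good.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null

# The check exits 0 (good) when the marker is absent, 1 (bad) when present.
git bisect run sh -c '! grep -q BUG file.txt' >/dev/null

# When bisect finishes, refs/bisect/bad points at the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
subject=$(git log -1 --format=%s "$first_bad")

# Return the working tree to where it was before bisecting.
git bisect reset >/dev/null 2>&1
echo "first bad commit: $subject"
```

Against a real install, the known-bad end would be the upgraded commit and the known-good end whatever commit was deployed before the upgrade.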

I'd be happy to help out there, though I may need a bit of handholding.
Is there any risk of us losing data due to rolling back to a previous schema?

Try reverting rP81d8898 with:

$ git revert 81d8898

...and see if the issue still reproduces? That's the only thing in recent history that I can think of which could plausibly interact here.

Whoa, that was it!
I reverted and it works now!

Okay. There are two relevant changes in that diff, both in src/aphront/sink/AphrontPHPHTTPSink.php. Can you try the following?

  • Undo the revert first (git reset --hard origin/master).
  • Does commenting out the flush() which was added on line 28 fix it, without other changes?
  • Does making isWritable() on lines 31-33 always return true; fix it, without other changes?
  • If neither of those is sufficient on its own, does making both changes fix it?
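
The "comment out one line, with no other changes" step can be scripted so it is easy to repeat. This is only a sketch: comment_out_line is a hypothetical helper, and the demo runs on a stand-in temporary file with invented contents, not on the real AphrontPHPHTTPSink.php.

```shell
#!/bin/sh

# Hypothetical helper: comment out one line of a PHP file by line number.
# Usage: comment_out_line FILE LINE
comment_out_line() {
    file=$1
    line=$2
    tmp=$(mktemp)
    # Work from a copy and write back into the original file,
    # so the original keeps its permissions.
    cp "$file" "$tmp" &&
        sed "${line}s|^|// |" "$tmp" > "$file" &&
        rm -f "$tmp"
}

# Stand-in file for the demo (contents are invented).
sample=$(mktemp)
printf '%s\n' '<?php' 'echo $data;' 'flush();' > "$sample"

comment_out_line "$sample" 3
patched=$(sed -n '3p' "$sample")
echo "$patched"
```

In the real checkout, git diff shows exactly what changed, and git checkout -- src/aphront/sink/AphrontPHPHTTPSink.php puts the file back before trying the next experiment.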

Getting rid of the flush seems to fix it on its own.

BTW, what server software are you running here? Ours is on Apache. I wonder if that might explain the difference.

We use nginx here, but Apache in the Phacility cluster and I use Apache locally, and I can't reproduce in either of those environments.

Gotcha. Need any more info from me for this?

No, it appears that we can remove the flush() call with no general loss of functionality.

epriestley triaged this task as Normal priority.