
Does ELB/ALB need extra configuration to proxy git ssh?

Asked by joshma on Dec 1 2016, 1:51 AM.

Details

We currently use an AWS ALB (the newer load balancer) to terminate SSL for HTTPS. Because the same hostname is used for git, the load balancer forwards git's SSH traffic as well:

22 (TCP) forwarding to 22 (TCP)

  • idle timeout: 3600 seconds
  • cross-zone load balancing: enabled

We use Jenkins for builds, and git fetches fail sporadically. My current guess is that when we trigger a bunch of parallel builds, something goes wrong with connection handling:

ERROR: Error fetching remote repo 'origin'
09:35:21 hudson.plugins.git.GitException: Failed to fetch from git@ourphabricator.com:diffusion/AST/aurelia-staging.git
09:35:21     at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:797)
09:35:21     at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1051)
09:35:21     at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1082)
09:35:21     at hudson.scm.SCM.checkout(SCM.java:495)
09:35:21     at hudson.model.AbstractProject.checkout(AbstractProject.java:1278)
09:35:21     at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:604)
09:35:21     at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
09:35:21     at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
09:35:21     at hudson.model.Run.execute(Run.java:1720)
09:35:21     at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
09:35:21     at hudson.model.ResourceController.execute(ResourceController.java:98)
09:35:21     at hudson.model.Executor.run(Executor.java:401)
09:35:21 Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress git@ourphabricator.com:diffusion/AST/aurelia-staging.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
09:35:21 stdout: 
09:35:21 stderr: ssh_exchange_identification: Connection closed by remote host
09:35:21 fatal: Could not read from remote repository.

I suspected the idle timeout at first, but I bumped it to 3600s and we're still seeing the error. Has anyone run git over SSH behind ELB/ALB successfully? Is there anything else that needs to be configured?
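One way to test the parallel-builds guess without waiting for Jenkins is to open many TCP connections at once and see how many receive the SSH banner. The following is only a rough sketch: the function names, the `hold` parameter, and the host in the usage comment are made up for illustration, and it only reads the first bytes the server sends (it does not authenticate).

```python
import socket
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def probe_banner(host: str, port: int = 22, timeout: float = 5.0, hold: float = 1.0) -> str:
    """Open one TCP connection and classify what the server does first."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # An SSH server normally sends its "SSH-2.0-..." banner first.
            data = sock.recv(256)
            if data.startswith(b"SSH-"):
                # Keep the unauthenticated connection open briefly so the
                # probes overlap in time, like a burst of parallel builds.
                time.sleep(hold)
                return "banner"
            # An empty read means the peer closed before identifying itself,
            # which the ssh client reports as ssh_exchange_identification.
            return "closed" if not data else "other"
    except OSError:
        return "error"  # refused, reset, unresolvable, or timed out


def probe_many(host: str, port: int = 22, count: int = 20,
               timeout: float = 5.0, hold: float = 1.0) -> Counter:
    """Run `count` probes concurrently and tally the outcomes."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [pool.submit(probe_banner, host, port, timeout, hold)
                   for _ in range(count)]
        return Counter(f.result() for f in futures)


# Hypothetical usage (host name is a placeholder):
#   print(probe_many("phab.example.com", 22, count=50))
```

If the "closed" count grows as `count` goes up, the drops are tied to concurrency rather than to idle time. Running the same probe directly against the instance, bypassing the load balancer, would help show which hop is closing the connections.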

Answers

joshma
Updated 2,950 Days Ago

Empirically, setting MaxStartups 1024 in /etc/ssh/sshd_config seems to help, but we still see errors when a bunch of builds start at once:

ERROR: Error fetching remote repo 'origin'
00:00:00.788 hudson.plugins.git.GitException: Failed to fetch from git@phab-host.com:diffusion/R/our-repo.git
00:00:00.788 	at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:797)
00:00:00.788 	at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1051)
00:00:00.788 	at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1082)
00:00:00.788 	at hudson.scm.SCM.checkout(SCM.java:495)
00:00:00.789 	at hudson.model.AbstractProject.checkout(AbstractProject.java:1278)
00:00:00.789 	at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:604)
00:00:00.789 	at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
00:00:00.789 	at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:529)
00:00:00.789 	at hudson.model.Run.execute(Run.java:1720)
00:00:00.789 	at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
00:00:00.789 	at hudson.model.ResourceController.execute(ResourceController.java:98)
00:00:00.789 	at hudson.model.Executor.run(Executor.java:401)
00:00:00.790 Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress git@phab-host.com:diffusion/R/our-repo.git +refs/heads/*:refs/remotes/origin/*" returned status code 128:
00:00:00.790 stdout: 
00:00:00.790 stderr: ssh_exchange_identification: Connection closed by remote host
00:00:00.790 fatal: Could not read from remote repository.
00:00:00.790 
00:00:00.790 Please make sure you have the correct access rights
00:00:00.790 and the repository exists.

The output does look different, though: it now ends with the "Please make sure you have the correct access rights" hint.
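For context on why MaxStartups matters here: sshd accepts it either as a single number or as start:rate:full, and begins refusing new connections at random once the number of unauthenticated connections reaches start. The sketch below mirrors that rule as I understand it from OpenSSH; the function name is made up, and the 10:30:100 defaults are an assumption about the sshd in use (older releases used different defaults).

```python
def drop_probability(startups: int, begin: int = 10, rate: int = 30, full: int = 100) -> float:
    """Approximate chance (0.0 to 1.0) that sshd refuses a new connection
    while `startups` unauthenticated connections are already pending.

    Models "MaxStartups begin:rate:full"; the defaults are assumed, not
    read from any real configuration.
    """
    if startups < begin:
        return 0.0  # below the threshold nothing is dropped
    if startups >= full or rate == 100:
        return 1.0  # at or above the ceiling everything is dropped
    # Between begin and full the drop chance rises linearly from rate% to 100%
    # (integer arithmetic, as in the C implementation).
    percent = (100 - rate) * (startups - begin) // (full - begin) + rate
    return percent / 100


# A single value such as "MaxStartups 1024" behaves like begin == full == 1024:
# no random drops below 1024 pending connections, all dropped at 1024 or more.
for pending in (5, 10, 55, 100):
    print(pending, drop_probability(pending))
```

Under these assumed defaults, a burst of a few dozen simultaneous fetches is enough to make some connections fail before the banner exchange, which would match the ssh_exchange_identification error. It does not explain failures that persist after raising the limit to 1024.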

epriestley
Updated 2,950 Days Ago

I don't know of any other magic config options.

Since you're still seeing ssh_exchange_identification, I think sshd is still dropping your connections for some reason. I wouldn't expect that error to come from idle timeouts or from anything we do beyond sshd, except maybe the initial auth step.

You could try running sshd in the foreground with -d -d -d, which might give you more insight (there may be equivalent sshd_config options for the normal logs, but I don't know them offhand). Be careful about stopping sshd or running it in the foreground if you rely on SSH to administer the machine, though.
