
Phacility does not currently offer instances in the Sydney region
Closed, Invalid (Public)

Description

We are experiencing slow cloning of large repositories on Phacility. We are based in Australia and have a 1Gbps down / 1Gbps up symmetric internet connection.

Reproduction steps:

  • Create a new repository on a Phacility instance (for this test, I created the test instance test-fiycs7awfp4z.phacility.com).
  • Obviously, set up your SSH keys and everything.
  • Wait like, 5 minutes apparently for Phacility to actually create the new repo before you can clone it.
  • Use the attached script in the repository to generate some large commits with big files.
  • For comparison purposes, optionally push the repository to GitHub over SSH. I have pushed the same repo to both the Phacility instance above and GitHub here: https://github.com/hach-que/BigRepo.
  • Push the repository to Phacility over SSH.
  • Clone a new copy from Phacility over SSH and observe the slow speeds.
  • For comparison, do the same with the GitHub repository, over both SSH and HTTPS.

Actual results:
Keep in mind these results are from Australia. If you are not based in Australia, you should probably spin up an EC2 instance in the Sydney region to do comparison tests.

Test repository size based on push: 400.13 MiB
Actual repository size that I was attempting to clone prior to filing this report: 1.2 GiB

Push speed (not important as we don't often push large files, but do need to clone them; here for informational purposes):

Phacility (SSH)      GitHub (SSH)
8.31 MiB/s (wat?)    2.46 MiB/s

Pull speed:

Phacility (SSH)    GitHub (SSH)    GitHub (HTTPS)
99 KiB/s           98 KiB/s        3.87 MiB/s

I did notice that GitHub HTTPS pull got faster as the transfer went, starting out at 64KiB/s and gradually accelerating over the whole transfer up to around 2-5MiB/s. This was not the case for SSH, which roughly stayed around the same speed.

Additionally for reference, here are some results from speedtest.net which show the connectivity speed:

            Melbourne Server    California Server
Download    94.46 Mbps          20.77 Mbps
Upload      93.51 Mbps          96.65 Mbps

This demonstrates the actual internet connection to either region is not the limiting factor in clone speed.
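To put the speedtest figures and the clone speeds in the same units, here is a small conversion sketch. It assumes Mbps means 10^6 bits per second and MiB/KiB are binary units; the function names are made up for illustration.

```shell
# Convert megabits per second (10^6 bits/s) to MiB/s (2^20 bytes/s).
mbps_to_mibps() {
    awk -v mbps="$1" 'BEGIN { printf "%.2f\n", mbps * 1000000 / 8 / 1048576 }'
}

# Convert KiB/s (2^10 bytes/s) to megabits per second.
kibps_to_mbps() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib * 1024 * 8 / 1000000 }'
}

mbps_to_mibps 20.77   # California download -> 2.48
mbps_to_mibps 94.46   # Melbourne download  -> 11.26
kibps_to_mbps 99      # Phacility SSH clone -> 0.81
```

By this conversion, the 99 KiB/s SSH clone uses under 1 Mbps, far below either measured download speed.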

Expected results:
I expected Phacility to serve repository data at least as fast as GitHub. For some reason HTTPS appears to be much faster on GitHub, but Phacility doesn't offer HTTPS cloning. The difference is pretty drastic too: we're talking a couple of minutes versus hours of clone time.
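The "minutes versus hours" gap can be checked with simple arithmetic from the sizes and rates reported above. This is a rough estimate that assumes a constant transfer rate, which the HTTPS clone did not actually have; `clone_time` is a hypothetical helper name.

```shell
# Estimate clone time for SIZE_MIB transferred at RATE_KIBPS (constant rate assumed).
clone_time() {
    awk -v size="$1" -v rate="$2" 'BEGIN {
        s = int(size * 1024 / rate)   # total seconds, truncated
        printf "%dh %02dm %02ds\n", int(s / 3600), int((s % 3600) / 60), s % 60
    }'
}

clone_time 400.13 99        # test repo over SSH at 99 KiB/s          -> 1h 08m 58s
clone_time 400.13 3962.88   # test repo over HTTPS at 3.87 MiB/s      -> 0h 01m 43s
clone_time 1228.8 99        # 1.2 GiB repo over SSH at 99 KiB/s       -> 3h 31m 50s
```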

I can't explain why HTTPS got faster though. This was the first clone, and the data is random, so it's unlikely that caching or cached requests played any role here. It's possible that Git's HTTPS protocol is just naturally faster at transferring large files, but that also seems unlikely.

Attached script:

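#!/bin/bash

# Create 20 commits; each one overwrites ten 2 MiB files with random data.
for ((a = 0; a < 20; a++)); do
    for ((i = 0; i < 10; i++)); do
        # Random data does not compress or delta well, so every commit adds ~20 MiB.
        dd if=/dev/urandom of="$i.bin" bs=1048576 count=2
        git add "$i.bin"
    done
    git commit -m "Change binary files (commit #$a)"
done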
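As a sanity check on what the script should produce, the sketch below computes the expected payload and object count. It assumes every random file yields a unique blob and that each commit adds one commit object and one flat root tree; the function names are illustrative only.

```shell
# Expected raw payload in MiB: commits * files per commit * MiB per file.
expected_payload_mib() {
    echo $(( $1 * $2 * $3 ))
}

# Expected Git objects: per commit, one commit + one tree + one blob per file.
expected_objects() {
    echo $(( $1 * ($2 + 2) ))
}

expected_payload_mib 20 10 2   # -> 400
expected_objects 20 10         # -> 240
```

These match the 240 objects and roughly 400 MiB that Git reports in the raw output; the extra 0.13 MiB is presumably pack and object overhead.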

Command raw output for reference:
Push GitHub SSH:

jrhod@DESKTOP-4MQ2MPG  /d/Projects/big-repo (master)
$ git push git@github.com:hach-que/BigRepo.git master:master
Counting objects: 240, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (240/240), done.
Writing objects: 100% (240/240), 400.13 MiB | 2.46 MiB/s, done.
Total 240 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), done.
To github.com:hach-que/BigRepo.git
 * [new branch]      master -> master

Push Phacility SSH:

jrhod@DESKTOP-4MQ2MPG  /d/Projects/big-repo (master)
$ git push ssh://test-fiycs7awfp4z@vault.phacility.com/diffusion/1/big-repo.git master:master
# Push received by "web.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "repo007.phacility.net"...
# Acquired read lock immediately.
# Device "repo007.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "repo007.phacility.net".
Counting objects: 240, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (240/240), done.
Writing objects: 100% (240/240), 400.13 MiB | 8.31 MiB/s, done.
Total 240 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), done.
# Released cluster write lock.
To ssh://vault.phacility.com/diffusion/1/big-repo.git
 * [new branch]      master -> master

Pull GitHub HTTPS:

jrhod@DESKTOP-4MQ2MPG  /d/Projects
$ git clone https://github.com/hach-que/BigRepo big-repo-github
Cloning into 'big-repo-github'...
remote: Counting objects: 240, done.
remote: Compressing objects: 100% (239/239), done.
remote: Total 240 (delta 1), reused 240 (delta 1), pack-reused 0
Receiving objects: 100% (240/240), 400.13 MiB | 3.87 MiB/s, done.
Resolving deltas: 100% (1/1), done.

Pull GitHub SSH:

jrhod@DESKTOP-4MQ2MPG  /d/Projects
$ git clone git@github.com:hach-que/BigRepo.git big-repo-github-2
Cloning into 'big-repo-github-2'...
remote: Counting objects: 240, done.
remote: Compressing objects: 100% (239/239), done.
remote: Total 240 (delta 1), reused 240 (delta 1), pack-reused 0
Receiving objects: 100% (240/240), 400.13 MiB | 98.00 KiB/s, done.
Resolving deltas: 100% (1/1), done.

Pull Phacility SSH:

jrhod@DESKTOP-4MQ2MPG  /d/Projects
$ git clone ssh://test-fiycs7awfp4z@vault.phacility.com/diffusion/1/big-repo.git big-repo-phacility
Cloning into 'big-repo-phacility'...
# Fetch received by "web.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster read lock on "repo007.phacility.net"...
# Acquired read lock immediately.
# Device "repo007.phacility.net" is already a cluster leader and does not need to be synchronized.
# Cleared to fetch on cluster host "repo007.phacility.net".
remote: Counting objects: 240, done.
remote: Compressing objects: 100% (239/239), done.
remote: Total 240 (delta 1), reused 240 (delta 1)
Receiving objects: 100% (240/240), 400.13 MiB | 99.00 KiB/s, done.
Resolving deltas: 100% (1/1), done.

Event Timeline

hach-que updated the task description.
hach-que updated the task description.
hach-que renamed this task from "Slow cloning on Phacility" to "Slow cloning over Phacility SSH for large repository". Jul 3 2017, 6:25 AM
hach-que updated the task description.

Final statistics are in after letting it clone overnight. Let me know if you need any more information to diagnose speed issues.

It looks like it's the same speed?

It is for SSH - but compared with GitHub HTTPS it's much slower.

So probably, the actionables here are "Allow HTTPS in Phacility"?

That's the obvious solution, but I'm not sure it's practical given Phacility's infrastructure.

In any case, I didn't want to prescribe solutions; I figured I'd report the slow speeds and let the Phacility engineers come up with the appropriate fix. Given my tested internet connection speed, I don't think it's unreasonable to be concerned about the performance.

Have you tried changing SSH options on your end, such as disabling compression or using a weaker encryption cipher (such as arcfour; turning encryption off entirely can't be done without the server's agreement)? This can make a significant difference to SSH transfer speeds, and may work as a decent workaround for now.
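One way to try these SSH options for a single clone is Git's `GIT_SSH_COMMAND` environment variable. This is only a sketch: the default cipher below is an assumption (check `ssh -Q cipher` for what your client supports, and the server must accept it too), and both function names are hypothetical.

```shell
# Build an ssh command line with compression off and an explicit cipher.
# The default cipher is an assumption, not a recommendation.
ssh_command_for_cipher() {
    cipher="${1:-aes128-ctr}"
    printf 'ssh -o Compression=no -c %s\n' "$cipher"
}

# Clone URL into DEST using the tuned ssh command; the cipher argument is optional.
clone_with_tuned_ssh() {
    url="$1"
    dest="$2"
    GIT_SSH_COMMAND="$(ssh_command_for_cipher "${3:-}")" git clone "$url" "$dest"
}

# Usage, with the test repository from this report:
#   clone_with_tuned_ssh ssh://test-fiycs7awfp4z@vault.phacility.com/diffusion/1/big-repo.git big-repo-tuned
```

Comparing the reported speed with and without these options would show whether client-side SSH settings matter here at all.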

I can try checking if SSH compression is enabled later today (but I doubt it; I'm just using the defaults). Keep in mind the upload speed to Phacility is 8 MiB/s, it's just the download speed that's 100 KiB/s.

Also, have you tried asking around the Git community? I was always under the impression that cloning over SSH should be faster than over HTTPS, or that the two possibly use the same protocol underneath.

epriestley triaged this task as Wishlist priority. Jul 27 2017, 2:17 PM
epriestley added a subscriber: epriestley.

What is the network speed of transferring a similar file (e.g., a 400MB file from /dev/urandom) via ssh cat <file>?
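A raw SSH transfer like this can be timed without Git involved. The sketch below counts the bytes arriving on stdin and divides by elapsed wall-clock time; it uses one-second resolution, so it is only meaningful for transfers lasting more than a few seconds. The host and file path in the usage comment are placeholders, and `measure_stream` is a made-up name.

```shell
# Read stdin to the end, then report byte count, elapsed seconds, and MiB/s.
measure_stream() {
    start=$(date +%s)
    bytes=$(wc -c | tr -d ' ')
    end=$(date +%s)
    elapsed=$((end - start))
    # Avoid dividing by zero for very short transfers.
    if [ "$elapsed" -lt 1 ]; then
        elapsed=1
    fi
    awk -v b="$bytes" -v s="$elapsed" \
        'BEGIN { printf "%d bytes in %ds = %.2f MiB/s\n", b, s, b / s / 1048576 }'
}

# Usage (placeholder host and path; needs a server where you have shell access):
#   ssh user@example-host 'cat /tmp/random-400mb.bin' | measure_stream
```

If a plain `ssh cat` from a US host is also stuck near 100 KiB/s, the bottleneck is likely the SSH path over that distance rather than Git or Phacility specifically.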

"Phacility doesn't offer HTTPS cloning"

Why do you believe this to be the case?

epriestley renamed this task from "Slow cloning over Phacility SSH for large repository" to "Phacility does not currently offer instances in the Sydney region". Jul 27 2017, 2:18 PM
epriestley edited projects, added Feature Request; removed Bug Report.

When you click Clone in the Phacility UI on a repository, it doesn't show any HTTPS URLs. It's possible it works if you copy the URL from the address bar, but the UI in Phacility itself doesn't give any kind of indication that it will work.

Oh, I didn't even expect that option to be configurable on Phacility given it's a security related setting. I'll turn it on and do some speed tests next week to see if I get any measurable difference in cloning.

Doesn't look like the repository will even attempt to clone over HTTPS:

$ git clone https://redpoint.phacility.com/source/minute-of-mayhem-ue4.git
Cloning into 'minute-of-mayhem-ue4'...
error: RPC failed; HTTP 504 curl 22 The requested URL returned error: 504 GATEWAY_TIMEOUT
fatal: The remote end hung up unexpectedly

Yet it works fine over SSH, albeit really slowly.

We'll consider offering instances in the Sydney region in the future, but this isn't really a valid feature request or bug report. Feel free to continue discussion on Discourse.