IMPORTANT: I'm still running the clone speed tests for this report (they take hours). I'm saving this draft now because it already contains a lot of information and I didn't want to lose it; I'll post a comment on this issue once the final clone speeds are in.
We are experiencing slow cloning of large repositories on Phacility. We are based in Australia and have a symmetric 1Gbps down / 1Gbps up internet connection.
**Reproduction steps:**
- Create a new repository on a Phacility instance (for this test, I created the test instance test-fiycs7awfp4z.phacility.com).
- Obviously, set up your SSH keys and everything.
- Wait about 5 minutes, apparently, for Phacility to actually create the new repo before you can clone it.
- Run the attached script inside a local clone of the repository to generate some large commits with big files.
- For comparison purposes, optionally push the repository to GitHub over SSH. I have pushed the same repo to both the Phacility instance above and GitHub here: https://github.com/hach-que/BigRepo.
- Push the repository to Phacility over SSH.
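To make the pull comparison repeatable, the clone step can be wrapped in a small timing helper. This is only a sketch: `timed_clone` is a hypothetical helper name, the destination directory names are made up, and the Phacility clone URL is assumed to be the same one used for the push in the raw output of this report.

```shell
#!/bin/bash
# Hypothetical helper: clone a repository and print how long it took,
# in whole seconds.
#   $1 = clone URL, $2 = destination directory
timed_clone() {
  local start end
  start=$(date +%s)
  git clone --quiet "$1" "$2" || return 1
  end=$(date +%s)
  echo "$1: $((end - start))s"
}

# Example invocations, left commented out because each one transfers
# roughly 400 MiB. URLs are taken from this report; directory names are
# assumptions.
# timed_clone https://github.com/hach-que/BigRepo big-repo-github-https
# timed_clone git@github.com:hach-que/BigRepo.git big-repo-github-ssh
# timed_clone ssh://test-fiycs7awfp4z@vault.phacility.com/diffusion/1/big-repo.git big-repo-phacility
```

Dividing the pushed size (400.13 MiB) by the reported seconds gives an average rate that can be compared against the rate Git prints itself.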
**Actual results:**
Keep in mind these results are from //Australia//. If you are not based in Australia, you should probably spin up an EC2 instance in the Sydney region to do comparison tests.
Test repository size based on push: 400.13 MiB
Push speed (not important as we don't often push large files, but do need to clone them; here for informational purposes):
| Phacility (SSH) | GitHub (SSH) |
| 8.31 MiB/s (//wat?//) | 2.46 MiB/s |
Pull speed:
| Phacility (SSH) | GitHub (SSH) | GitHub (HTTPS) |
| TBA | 100 KiB/s | 3.87 MiB/s |
I did notice that the GitHub HTTPS pull got faster as the transfer went on, starting out at 64 KiB/s and gradually accelerating over the whole transfer up to around 2-5 MiB/s. This was not the case for SSH, which stayed at roughly the same speed throughout.
Additionally for reference, here are some results from speedtest.net which show the connectivity speed:
| Melbourne Server | California Server |
| 94.46 Mbps down | 20.77 Mbps down |
| 93.51 Mbps up | 96.65 Mbps up |
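This demonstrates that the raw internet connection to either region is not the limiting factor in clone speed: even the slower California figure (20.77 Mbps down) is far above the 100 KiB/s seen when cloning from GitHub over SSH.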
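For a like-for-like comparison, the speedtest.net figures (megabits per second) can be converted to the unit Git reports (MiB/s). The sketch below assumes 1 Mbps = 1,000,000 bits/s and 1 MiB = 1,048,576 bytes; `mbps_to_mibs` is a hypothetical helper name.

```shell
#!/bin/bash
# Convert a link speed in megabits per second (as reported by speedtest.net)
# to mebibytes per second (as reported by git).
mbps_to_mibs() {
  awk -v mbps="$1" 'BEGIN { printf "%.2f\n", mbps * 1000000 / 8 / 1048576 }'
}

mbps_to_mibs 94.46   # Melbourne server, down -> 11.26
mbps_to_mibs 20.77   # California server, down -> 2.48
```

Both ceilings are well above the 100 KiB/s (roughly 0.10 MiB/s) GitHub SSH clone rate.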
**Expected results:**
I expected Phacility to be on par with GitHub. I know 100 KiB/s vs 38 KiB/s (**what we currently see on our real repo; I will update this with results from the reproduction test repo soon**) might not seem like a huge difference, but it's the difference between a clone taking about 1 hour and taking ~3 hours.
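The 1 hour vs ~3 hours estimate can be checked with a quick calculation. This sketch assumes a transfer the size of the test repository (400.13 MiB) at a constant rate; the size of our real repo is not stated here, and `clone_minutes` is a hypothetical helper name.

```shell
#!/bin/bash
# Estimate clone duration in whole minutes at a constant transfer rate.
#   $1 = repository size in MiB, $2 = transfer speed in KiB/s
clone_minutes() {
  awk -v mib="$1" -v kibs="$2" 'BEGIN { printf "%.0f\n", mib * 1024 / kibs / 60 }'
}

clone_minutes 400.13 100   # GitHub SSH rate       -> 68
clone_minutes 400.13 38    # rate on our real repo -> 180
```

That works out to roughly 68 minutes versus 180 minutes.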
Additionally, I'm not sure what's going on with GitHub's HTTPS there - I can't explain why it gradually got faster. This was the first clone, and the data is random, so it's unlikely that caching or cached requests played any role here. It's possible that Git's HTTPS protocol is just naturally faster at transferring large files, but that also seems unlikely.
**Attached script:**
```
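#!/bin/bash
# Create 20 commits; each one rewrites ten 2 MiB files with random data.
for ((a = 0; a < 20; a++)); do
  for ((i = 0; i < 10; i++)); do
    # Random data barely compresses or deltas, so each commit adds ~20 MiB.
    dd if=/dev/urandom of="$i.bin" bs=1048576 count=2
    git add "$i.bin"
  done
  git commit -m "Change binary files (commit #$a)"
done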
```
**Command raw output for reference:**
Push GitHub SSH:
```
jrhod@DESKTOP-4MQ2MPG /d/Projects/big-repo (master)
$ git push git@github.com:hach-que/BigRepo.git master:master
Counting objects: 240, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (240/240), done.
Writing objects: 100% (240/240), 400.13 MiB | 2.46 MiB/s, done.
Total 240 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), done.
To github.com:hach-que/BigRepo.git
* [new branch] master -> master
```
Push Phacility SSH:
```
jrhod@DESKTOP-4MQ2MPG /d/Projects/big-repo (master)
$ git push ssh://test-fiycs7awfp4z@vault.phacility.com/diffusion/1/big-repo.git master:master
# Push received by "web.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "repo007.phacility.net"...
# Acquired read lock immediately.
# Device "repo007.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "repo007.phacility.net".
Counting objects: 240, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (240/240), done.
Writing objects: 100% (240/240), 400.13 MiB | 8.31 MiB/s, done.
Total 240 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), done.
# Released cluster write lock.
To ssh://vault.phacility.com/diffusion/1/big-repo.git
* [new branch] master -> master
```
Pull GitHub HTTPS:
```
jrhod@DESKTOP-4MQ2MPG /d/Projects
$ git clone https://github.com/hach-que/BigRepo big-repo-github
Cloning into 'big-repo-github'...
remote: Counting objects: 240, done.
remote: Compressing objects: 100% (239/239), done.
remote: Total 240 (delta 1), reused 240 (delta 1), pack-reused 0
Receiving objects: 100% (240/240), 400.13 MiB | 3.87 MiB/s, done.
Resolving deltas: 100% (1/1), done.
```
Pull GitHub SSH:
```
```
Pull Phacility SSH:
```
```