
Large (~750MB) git push fails over SSH
Open, Wishlist, Public

Description

Pushing a large commit to a Git repository hosted on Phabricator fails with the errors shown below.

Similar errors seem to have occurred for others in T4801, although with much smaller commits.

Reproduction

  • Create a new instance of Phabricator.
  • Create a Git repo.
  • Create a large commit.
    • In my example I've used the android-studio-bundle for Windows due to its large size. It's available at this URL.
    • This has been broken up into 50MB chunks with 7zip.
  • Push over HTTP. This produces the following error:
git push phaby master
Counting objects: 36, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (36/36), done.
Writing objects: 100% (36/36), 1.63 GiB | 9.47 MiB/s, done.
Total 36 (delta 0), reused 0 (delta 0)
remote: fatal: early EOF
error: unpack failed: unpack-objects abnormal exit
To https://test-szm42ymuboj3.phacility.com/diffusion/2/bigtest.git
 ! [remote rejected] master -> master (unpacker error)
error: failed to push some refs to 'https://test-szm42ymuboj3.phacility.com/diffusion/2/bigtest.git'
  • HTTP has scalability limits (T4369), so use SSH instead.
  • Push over SSH. This produces the following error:
git push phaby master
# Push received by "web.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "repo002.phacility.net"...
# Acquired read lock immediately.
# Device "repo002.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "repo002.phacility.net".
Counting objects: 36, done.
Delta compression using up to 4 threads.
Connection to vault.phacility.com closed by remote host.
fatal: The remote end hung up unexpectedly
Compressing objects: 100% (36/36), done.
error: failed to push some refs to 'ssh://test-szm42ymuboj3@vault.phacility.com/diffusion/2/bigtest.git'
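
The reproduction steps above can be approximated without the original download. The following is a minimal sketch, assuming that random data is an acceptable stand-in for the 7zip chunks. `make_chunks` and its arguments are hypothetical names, not part of Git or Phabricator, and the chunk count in the commented example is an assumption chosen only to land near the size shown in the transcripts.

```shell
#!/bin/sh
# Hypothetical helper: write COUNT files of SIZE_BYTES random bytes into DIR.
# Random data does not compress, so the pushed pack stays close to
# COUNT * SIZE_BYTES in size.
make_chunks() {
    dir=$1 count=$2 size_bytes=$3
    mkdir -p "$dir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        head -c "$size_bytes" /dev/urandom > "$dir/chunk_$i.bin" || return 1
        i=$((i + 1))
    done
}

# Example (commented out: it writes about 1.7GB and then pushes it).
# Assumes it is run inside a clone with a remote named "phaby" and a
# branch named "master", as in the transcripts above.
# make_chunks . 33 52428800     # 33 chunks of 50MB (50 * 1024 * 1024 bytes)
# git add .
# git commit -m 'Add large random chunks.'
# git push phaby master
```

Smaller counts or sizes can be passed to check that the setup works before attempting the full push.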

Versions

Local Instance

phabricator 58c857a6816d44687b12a51a86331e52c7a20005 (Thu, Jan 19)
arcanist ade25facfdf22aed1c1e20fed3e58e60c0be3c2b (Thu, Jan 5) 
phutil 9d85dfab0f532d50c2343719e92d574a4827341b (Thu, Jan 12)

Phacility
This was also tested on a Phacility test instance on 2017-02-01.

phabricator 2604c5af55f654d36f8db2f080b96486c4572216 (Fri, Jan 27) (branched from 1be3ef02276812296c01e41122f19d6ea8077f81 on origin) 
arcanist 9503b941cc02be637d967bb50cfb25f852e071e4 (Sat, Jan 7) (branched from ade25facfdf22aed1c1e20fed3e58e60c0be3c2b on origin) 
phutil 10963f771f118baa338aacd3172aaede695cde62 (Fri, Jan 13) (branched from 9d85dfab0f532d50c2343719e92d574a4827341b on origin) 
libcore 9dcb27732cd2718924efe5102d3ce3554a06ede1 (Sat, Jan 21) 
services 982d8a71bfc6178744eb670f96c00d9420a50a99 (Nov 18 2016) (branched from b5cef1ac31ffa392b3562f9a0bbefc238f212430 on origin)

Event Timeline

  • For HTTP, this is broadly expected in some sense until T4369 is resolved.
  • For SSH, this is not expected, but the linked file is 1.75GB. Did you encounter this in a more realistic scenario, with the size of that file chosen just for demonstration purposes, or are you trying to store ~1GB binary assets in a Git repository? Git is generally not well suited to serve as a binary asset store. Git LFS (T7789) or possible future changes like T11367 are likely better fits for this use case.

I cannot immediately reproduce this for a 64MB file over SSH:

epriestley@orbital ~/dev/scratch/large-push $ head -c67108864 /dev/urandom > random_64MB.data
epriestley@orbital ~/dev/scratch/large-push $ ls -alh random_64MB.data 
-rw-r--r--  1 epriestley  staff    64M Feb  2 06:57 random_64MB.data
epriestley@orbital ~/dev/scratch/large-push $ git add random_64MB.data 
epriestley@orbital ~/dev/scratch/large-push $ git commit -m 'Add 64MB of random data.'
[master (root-commit) f7a748f] Add 64MB of random data.
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 random_64MB.data
epriestley@orbital ~/dev/scratch/large-push $ git push origin master
# Push received by "web.phacility.net", forwarding to cluster host.
# Waiting up to 120 second(s) for a cluster write lock...
# Acquired write lock immediately.
# Waiting up to 120 second(s) for a cluster read lock on "repo004.phacility.net"...
# Acquired read lock immediately.
# Device "repo004.phacility.net" is already a cluster leader and does not need to be synchronized.
# Ready to receive on cluster host "repo004.phacility.net".
Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 64.02 MiB | 780.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
# Released cluster write lock.
To ssh://vault.phacility.com/diffusion/1/test.git
 * [new branch]      master -> master
epriestley@orbital ~/dev/scratch/large-push $

If the minimum push size required to trigger this behavior is significantly larger than that, this is still something we should look into and fix, but it will be difficult for us to prioritize because Git is inherently a poor fit as a binary asset store.

The linked file was used to test this bug in a repeatable way. Based on some internal testing, my (very rough) estimate is that this occurs with commits larger than about 750MB. I haven't tested below this size, but more debugging could be done to find the limit and to confirm that it's related to commit size rather than some other metric such as time.
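
One way to narrow down that limit is to bisect on push size. The following is a rough sketch, assuming that failure is monotonic in size (every push at or above some threshold fails and every smaller one succeeds), which has not been confirmed here. `bisect_push_limit`, `try_push`, and the `REMOTE_URL` variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: find the smallest size (in MB) at which a probe fails.
# Usage: bisect_push_limit PROBE LOW_MB HIGH_MB
#   PROBE    command called with a size in MB; exit status 0 means success
#   LOW_MB   a size known to succeed
#   HIGH_MB  a size known to fail
bisect_push_limit() {
    probe=$1 low=$2 high=$3
    # Invariant: a push of $low MB succeeds and a push of $high MB fails.
    while [ $((high - low)) -gt 1 ]; do
        mid=$(( (low + high) / 2 ))
        if "$probe" "$mid"; then
            low=$mid
        else
            high=$mid
        fi
    done
    echo "$high"
}

# Hypothetical probe: commit SIZE_MB of random data in a fresh scratch
# repository and push it to its own branch on $REMOTE_URL (assumed to be set
# by the caller). A fresh repository per probe keeps earlier attempts from
# inflating the size of later pushes.
try_push() {
    size_mb=$1
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" &&
        git init -q &&
        head -c "$((size_mb * 1048576))" /dev/urandom > probe.data &&
        git add probe.data &&
        git commit -q -m "Probe: ${size_mb}MB of random data." &&
        git push -q "$REMOTE_URL" "HEAD:refs/heads/probe-${size_mb}mb"
    )
    status=$?
    rm -rf "$repo"
    return "$status"
}

# Example (commented out: each probe performs a real push).
# REMOTE_URL=... bisect_push_limit try_push 64 1750
```

Recording the wall-clock duration of each probe alongside its size would help separate a size limit from a timeout, since the two are otherwise correlated on a fixed-bandwidth link.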

We originally encountered this on our local Phabricator instance with an internal project while committing actual working data. Since we couldn't find a way around it, our solution was to move to a Git repo not hosted on Phabricator until this is resolved.

Agreed that using Git as a binary store isn't its ideal use case. We would be excited to make use of T7789 or T11367, as either would eliminate the need for our current workarounds.

epriestley renamed this task from Large GIT push fails over SSH or HTTP to Large (~750MB) git push fails over SSH. Feb 2 2017, 7:45 PM
epriestley triaged this task as Wishlist priority.
epriestley added a project: Diffusion.