We may be able to implement T8092 by proxying the protocol, without needing to embed an implementation of Git. We do this to some degree in Mercurial and SVN already, with success. Although this is complex, it's potentially much less complex than embedding a Git implementation.
|Open||None||T8090 Allow Harbormaster to perform change handoff in a defensible way|
|Open||None||T8092 Evaluate the viability of virtualizing Git refs in hosted repositories|
|Open||None||T8093 Evaluate virtualizing Git refs by proxying the protocol|
|Open||None||T4369 Phabricator HTTP repository hosting has fairly severe scalability limits|
Mentioned In
- T13584: Shallow Git clones fail under recent versions of Git
- T13278: Improve repository Staging Areas
- T13277: In repositories, realign "Track Only", "Autoclose", and "Publish/Notify" toward "Permanent Refs"
- T10691: Support GitHub-like forking of repositories
- T8089: Unprototype Harbormaster (v1)
Mentioned Here
- T4369: Phabricator HTTP repository hosting has fairly severe scalability limits
- T8092: Evaluate the viability of virtualizing Git refs in hosted repositories
exciting new wire protocol
My plan for now is to do v1 support only, since: (a) we'll need v1 for 15 years anyway for everyone running Ubuntu 3 on original Xbox hardware in their corporate enterprise cluster; and (b) I can't immediately trick my git into v2 anyway; and (c) it looks easier.
The v1 protocol looks like it's pretty one-shot and straightforward: whether we're running upload-pack or receive-pack, the server immediately sends a complete list of refs to the client when the client connects. This is sort of a weird way for the protocol to work for 10+ years (?), also considering that this is the "smart" protocol, but it makes our job easier, since it looks like we can (as a starting point, at least) just parse the first few frames of the protocol, delete/rewrite some refs, and then drop into passthru mode.
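As a rough sketch of the "parse the first few frames, delete some refs, drop into passthru" idea: the v1 advertisement is a sequence of pkt-lines (four hex digits of length, then the payload) terminated by a flush-pkt `0000`, with the capabilities riding after a NUL byte on the first ref line. The Python below is illustrative only and is not the Phabricator implementation. The names `read_pkt_lines`, `encode_pkt_line`, `filter_advertisement`, and the `is_hidden` callback are all hypothetical, and it assumes the whole advertisement is already buffered and contains only ref lines (no HTTP `# service=` preamble, no `shallow` lines).

```python
FLUSH = b"0000"  # flush-pkt: a length of zero, no payload


def encode_pkt_line(payload):
    # The 4-digit hex length counts its own 4 bytes plus the payload.
    return b"%04x" % (len(payload) + 4) + payload


def read_pkt_lines(data):
    """Return (payloads up to the first flush-pkt, remaining bytes)."""
    pos = 0
    payloads = []
    while pos < len(data):
        length = int(data[pos:pos + 4], 16)
        if length == 0:
            pos += 4  # consume the flush-pkt and stop
            break
        if length < 4:
            raise ValueError("invalid pkt-line length")
        payloads.append(data[pos + 4:pos + length])
        pos += length
    return payloads, data[pos:]


def filter_advertisement(data, is_hidden):
    """Drop hidden refs from a v1 ref advertisement (sketch only)."""
    payloads, rest = read_pkt_lines(data)
    caps = None
    kept = []
    for payload in payloads:
        line = payload.rstrip(b"\n")
        if b"\0" in line:
            # Capabilities are attached to the first ref line only.
            line, caps = line.split(b"\0", 1)
        sha, ref = line.split(b" ", 1)
        if not is_hidden(ref):
            kept.append((sha, ref))

    out = b""
    for index, (sha, ref) in enumerate(kept):
        payload = sha + b" " + ref
        if index == 0 and caps is not None:
            # Re-attach capabilities to whichever ref is now first.
            payload += b"\0" + caps
        out += encode_pkt_line(payload + b"\n")
    # Anything after the flush-pkt is passed through untouched.
    return out + FLUSH + rest
```

One wrinkle this surfaces: if the first advertised ref is one of the hidden ones, the capabilities have to be moved onto the first surviving ref, otherwise the client never sees them. If every ref is hidden, this sketch emits a bare flush-pkt.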
This will just hide the refs from the client. A "malicious" client could still use want commands to fetch the underlying commits. However, this is fine: we aren't planning to treat different views of the same repository as having different permissions.
The want/have stuff seems ref-independent, so editing the initial list of refs looks like it fixes the whole read pathway with no other changes.
The "push" part is a little messier since the client sends what it's pushing, then sends PACK data, then the server acknowledges what was written. We need to parse all of that so we can rewrite refs in the first part (client thinks it's pushing A, tell the server it's pushing secret/A) and the last part (server acknowledges a write to secret/A, we tell the client the server acknowledged a write to A).
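A minimal sketch of the two rewrites on the push path, operating on individual pkt-line payloads. In v1, each client command is `<old-sha> <new-sha> <refname>` (with capabilities after a NUL on the first command), and each line of the server's status report is `ok <refname>` or `ng <refname> <reason>`, preceded by an `unpack` line. The prefix mapping here (`refs/heads/` to `refs/heads/secret/`) and all function names are assumptions for illustration; the real mapping would be whatever the virtualization scheme decides, and this ignores the case where the status report is wrapped in a side-band channel.

```python
# Hypothetical mapping: the client's view lives under PUBLIC_PREFIX,
# the on-disk refs live under SECRET_PREFIX.
PUBLIC_PREFIX = b"refs/heads/"
SECRET_PREFIX = b"refs/heads/secret/"


def to_server_ref(ref):
    if ref.startswith(PUBLIC_PREFIX):
        return SECRET_PREFIX + ref[len(PUBLIC_PREFIX):]
    return ref


def to_client_ref(ref):
    if ref.startswith(SECRET_PREFIX):
        return PUBLIC_PREFIX + ref[len(SECRET_PREFIX):]
    return ref


def rewrite_command(payload):
    """Rewrite the refname in one client push command payload."""
    body = payload.rstrip(b"\n")
    caps = None
    if b"\0" in body:
        # Only the first command carries capabilities.
        body, caps = body.split(b"\0", 1)
    old, new, ref = body.split(b" ", 2)
    out = b" ".join([old, new, to_server_ref(ref)])
    if caps is not None:
        out += b"\0" + caps
    return out + b"\n"


def rewrite_status(payload):
    """Rewrite the refname in one server status-report payload."""
    parts = payload.rstrip(b"\n").split(b" ", 2)
    if parts[0] in (b"ok", b"ng") and len(parts) >= 2:
        parts[1] = to_client_ref(parts[1])
        return b" ".join(parts) + b"\n"
    # "unpack ok" and anything unrecognized passes through unchanged.
    return payload
```

The PACK data between the two parts does not need to be parsed for this, only skipped over, since the refnames appear only in the command list and the status report.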
When there are no refs in a repository, the server does not appear to send a capabilities frame:
! git-upload-pack -- '/Users/epriestley/dev/core/repo/local/12/'
< Write [4 bytes]
< 30303030 0000
> Read [4 bytes]
> 30303030 0000
_ <End of Session>
This makes our job a lot easier but also is absolutely bananas?
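For reading the trace above: the hex bytes `30303030` are just the ASCII characters `0000`, which is a flush-pkt (a pkt-line with length zero). So the entire advertisement for the empty repository is a single flush with no ref lines and nothing to carry capabilities. A tiny Python check, with `is_empty_advertisement` as a hypothetical helper name:

```python
def is_empty_advertisement(data):
    # An advertisement that starts with a flush-pkt lists no refs at all.
    return data[:4] == b"0000"


# The four bytes from the trace, decoded from hex.
raw = bytes.fromhex("30303030")
```

Here `raw` is `b"0000"`, so `is_empty_advertisement(raw)` is true and a proxy could pass this case through without any rewriting.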