
Implement replication of repositories to remote URIs
Closed, Resolved (Public)

Description

An idea that seems useful: Phabricator can currently pull from other repositories, but it would also be ideal to be able to push to other repositories when Phabricator is hosting them.

This has various benefits:

  • Users can push their hosted repositories to previous locations like GitHub
  • Users can back up repositories to other machines (useful when scaling in AWS)

Event Timeline

hach-que raised the priority of this task from to Needs Triage.
hach-que updated the task description.
hach-que added a project: Diffusion.
hach-que added subscribers: hach-que, epriestley.

Yeah, I think we want this ourselves (to mirror to GitHub) and it gives us an easy answer to backups.

Well, an easy-ish answer, at least. I'm still on the fence about building a "put every repository in S3 automatically every night" feature, since I believe many installs will never configure any sort of backups if it takes more than one click.

Is there any reason both features can't be implemented? One mechanism would use git push to mirror to other Git repositories, and the other would upload each Git object in the .git directory to S3 as objects are added or changed. I know that a mechanism to store a copy of the Git repository in S3 would make some people a lot more comfortable than "it's sitting on a web box" (which is the case at the moment).
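
For illustration, the git push mirroring idea could look roughly like the sketch below. This is not Phabricator's implementation; the function names and the example URI are hypothetical, and it assumes the git binary is on the PATH and that credentials for the remote are already configured.

```python
import subprocess
from typing import List


def mirror_push_command(remote_uri: str) -> List[str]:
    """Build the argv for mirroring every ref to remote_uri (hypothetical helper)."""
    # `git push --mirror` pushes all refs under refs/ and also removes refs
    # on the remote that no longer exist locally.
    return ["git", "push", "--mirror", remote_uri]


def mirror_repository(repo_path: str, remote_uri: str) -> None:
    """Run the mirror push from inside the hosted repository's working copy."""
    # check=True raises CalledProcessError if git exits with a non-zero status.
    subprocess.run(mirror_push_command(remote_uri), cwd=repo_path, check=True)
```

Something like mirror_repository("/var/repo/example", "git@github.com:example/example.git") would then run after each accepted push; both arguments here are placeholders.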

Oh, definitely -- we should 100% implement this one. I'm less sure whether we should additionally implement the S3 thing. I'd probably want to do it as tarball backups rather than sync, though, to cover the "I force-pushed and destroyed master, then force gc'd the source of truth" kind of case, since anyone who knows not to do that should also know to configure real backups, maybe.

We could just prevent deletions from syncing; it would mean that S3 never gets garbage collected, but it's reasonably low-cost and if someone wanted to garbage collect the S3 bucket they could just delete it and let Phabricator resync the whole thing.
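
A minimal sketch of that "never sync deletions" rule, assuming the local object names and the keys already in the bucket can each be listed as a set of strings. The function name and the example object names are hypothetical, and no real S3 calls are made here.

```python
from typing import Set, Tuple


def plan_append_only_sync(local_objects: Set[str],
                          remote_objects: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_upload, to_delete) for an append-only sync."""
    # Upload anything that exists locally but is missing from the bucket.
    to_upload = local_objects - remote_objects
    # Deletions are never propagated, so objects removed locally
    # (for example by a gc) stay in the bucket.
    to_delete: Set[str] = set()
    return to_upload, to_delete


uploads, deletes = plan_append_only_sync({"aa11", "bb22"}, {"bb22", "cc33"})
```

Here "cc33" exists only in the bucket, and the plan leaves it there, which is what keeps a destructive local change from destroying the copy.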

Alternatively we could also use git bundle to create bundles and store them in S3, instead of using tarballs (advantage being that we can store incremental bundles frequently and full bundles occasionally).
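
As a rough sketch of the full versus incremental bundle split, the two git bundle invocations might be built as below. The helper names, file names, and revision names are hypothetical; uploading the resulting files to S3 is left out.

```python
from typing import List


def full_bundle_command(bundle_path: str) -> List[str]:
    """argv for a bundle containing every ref in the repository."""
    return ["git", "bundle", "create", bundle_path, "--all"]


def incremental_bundle_command(bundle_path: str,
                               since_rev: str,
                               ref: str) -> List[str]:
    """argv for a bundle holding only commits on ref that are not reachable from since_rev."""
    # since_rev would be the tip recorded when the previous bundle was made.
    return ["git", "bundle", "create", bundle_path, f"{since_rev}..{ref}"]


full = full_bundle_command("repo-full.bundle")
incremental = incremental_bundle_command("repo-inc.bundle", "last-backup", "master")
```

An incremental bundle can only be applied to a repository that already has since_rev, so restoring means starting from the most recent full bundle and applying the incremental ones in order.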

epriestley triaged this task as Normal priority.
epriestley edited this Maniphest Task.