Commit into repository directly from differential
OpenPublic

Description

Just opening up a conversation about how to do this. This would be a really nice feature for deffirential.

From Facebook Tasks:

D255317 fixes (or at least should fix) the latest branch, which means we can get a working build out for testing.

Right now though, we need to get it reviewed, committed, and requested for merging in to latest.

I'd like to be able to request a diff to be merged, prior to it actually being committed. This is a valid thing to do as long as we're not using shadow branches in production (and only for testing), though I guess that might be debatable.

tuomaspelkonen assigned this task to epriestley.Via Old WorldMay 26 2011, 11:57 PM
epriestley added a comment.Via Old WorldMay 28 2011, 3:03 PM

I definitely want to explore this, let me get back to this task once I've cleared some other stuff off my plate.

Girish added a comment.Via Old WorldMay 28 2011, 6:17 PM

@tuomas, why is this in phabricator, I think we might be able to do this in our shadow branch + merge tool. Since phabricator doesn't have release management, I don't think it makes sense here.

Unless we are planning on building release management in phabricator.

epriestley added a comment.Via Old WorldMay 30 2011, 10:50 AM

I think there are two separate things going on here:

  • 1) Merging unreviewed code into the release prior to it being reviewed and landing in trunk. I think this is a Facebook policy issue and mostly a change to the merge tool, e.g. one possible implementation is that you specify "git://dev103.snc6/home/epriestley/www/:13f813a90af9f231" to merge instead of "rE12345" and the existing code which does the cherry-pick just picks the patch out of a random remote instead of out of trunk. I personally think this feature isn't a great idea, but you can choose to pursue it or not outside of the context of Differential/Phabricator. This is more in line with the task description.
  • 2) Allowing Phabricator to perform commits once code has been accepted, similar to how GitHub pull requests have a button that says "merge". I think this is a great idea -- especially for GitHub open source projects like Phabricator itself -- and definitely want to pursue it. This is more in line with the task title.

There's a little overlap here -- if we have (2), the implementation of (1) could just be a button that says "merge this unreviewed code straight into the release right now" that uses the same channel.

I think we'll end up implementing (2) something like this:

  • Add the ability to allocate "working copy pools", which basically say "on this machine, run up to 8 working copy operations at once and store the working copies in /var/repos/" or whatever. This is essentially generic infrastructure for asynchronous operations on working copies.
  • Build unit testing / linting on top of working copy pools. One thing that makes this hard in the Phabricator case is that it may depend on the versions of multiple repositories (libphutil, arcanist, phabricator).
  • Build merging on top of working copy pools.
  • (Maybe?) Build Sandcastle on top of working copy pools -- this is easy for Phabricator but tricky for Facebook because of the huge size of the repository.
  • (Eventually?) Perflab and such could be built on top of working copy pools.

There's some overlap here with other Facebook efforts so I should probably touch base with whoever is doing asynchronous unit testing (slawekbiel?) -- can you point them here if you know who it is?

epriestley triaged this task as "Normal" priority.Via Old WorldMay 30 2011, 10:54 AM

I'm probably at least a couple of weeks out from making any moves on this stuff and all the arc binary patch things are blockers here anyway. I'm mostly focused on feedback from a couple of other installs for the moment, which is primarily still setup/install issues and some Maniphest stuff.

Girish added a comment.Via Old WorldMay 30 2011, 9:25 PM

@evan, yea, the (2) makes complete sense. You should take a look at Gerrit, this does exactly the same thing. The idea is, if you can apply the diff on top of the current HEAD without any conflicts, then you should merge it. If there are conflicts, you send it back to the author who will restart the process again after merging the conflicts in (and getting another approval).

I think doing this with SVN will be tricky, because you can't easily fake the author in SVN. In git, this is much easier to do.

toulouse added a subscriber: toulouse.Via Old WorldJun 13 2011, 2:37 AM
btrahan added a project: Differential.Via Old WorldMar 5 2012, 3:54 AM
Korvin added a subscriber: Korvin.Via WebMar 9 2012, 12:46 AM
edward added a subscriber: edward.Via WebMar 20 2012, 7:36 PM
edward added a comment.Via WebMar 20 2012, 7:45 PM

The current iteration of the tool I'm working on (I've called it "releeph") will be able to pick git commits (with git-cherry-pick) into a release branch. I have a daemon that does this in the background, reporting pick failures back up to the releeph tool's database.

It's possible that instead of sourcing the patch from an existing git commit, it could take the patch from Differential and apply-and-commit it.

If "trunk" was then considered to be the same sort of thing as a release branch, it could take patches from Differential and apply them to trunk as well.

joe added a subscriber: joe.Via WebMay 21 2012, 2:03 AM
Korvin added a comment.Via WebMay 24 2012, 8:48 PM

@edward, would that preserve blame?

epriestley added a comment.Via WebMay 24 2012, 8:50 PM

cherry-pick preserves author/blame, yes.

edward added a comment.Via WebMay 30 2012, 11:04 PM

I'd like to factor some stuff out of my project and create a more general differential-revision committing service. Ideas for this would involve:

  • creating a request-to-commit, that points at a differential revision, along with some repository-specific information about where to commit it (e.g. a git branch, or a subversion directory)
  • maintaining some lockable working-copies for repositories where committing can occur
  • recording whether a commit was successful, or how and why it failed, probably alongside the request-to-commit

Differential-revisions can be committed to more than one place. The Facebook release process is like this, where we cherry-pick differential-revisions that have been made to trunk, to the release branch (except that we refer to them by their trunk revision number, when most of the time we actually care about the differential revision id, as that is where all the discussion occurred).

Patches to release branches would be submitted to differential, and then committed automatically from there.

Again, from a release tool perspective, it is useful to group differential-revisions together to form whole features. If one of a set doesn't commit, we shouldn't commit the other ones.

Committing fails if there are conflicts, either when committing a brand new differential revision to trunk, or picking it across to a release branch. Given that a revision can be committed to multiple places, I think it would be better to store those errors outside of differential (but still be able to render it there) as I don't think it fits in with the linear-series-of-actions that differential renders.

Committing to a "branch" is a bit nebulous -- I'd currently just refer to a branch using a string, and rely on the VCS-specific implementation of the thing-that-commits to understand that string in the context of the VCS type.

If a differential revision hasn't been committed anywhere yet, it could be applied to the trunk/master with arc patch.

If a differential revision has one existing commit (the release branch use case), then git and hg (and Subversion, with git-svn) can use that commit ID to do a cherry-pick / graft.

The mutable-working-copy-pools that @epriestley mentions will be very big for a lot of Facebook repositories (and probably most non-trivial projects) so efficiently keeping them up to date is important if the committing tool should be fast. One current implementation I have mutates a PhabricatorRepository's local-path to point at the path of a writable copy, and locks it with flock() for the duration of the process. If Drydock is here to solve this problem too then I'd prefer to use that, especially if it can re-use the writable repositories and get them back up-to-date, prior to committing, nice and quickly.

edward added a subscriber: epriestley.Via WebMay 30 2012, 11:04 PM
emiraga added a subscriber: emiraga.Via WebJun 2 2012, 2:06 AM
epriestley added a comment.Via WebJun 4 2012, 2:50 PM

Yeah, the rough flow of work looks something like this in my imagination:

  • A task is inserted into the task queue that says "commit D123 to repository X on path/branch Y, yadayada".
  • A worker pulls the task and makes a Drydock request for a write lock on a working copy of "X".
  • Drydock has blueprints to allocate machines, disk space, pull working copies and update them, and has resource locking mechanisms. Magic happens behind the scenes and the working copy is allocated somewhere (reusing an existing one, if possible; building a new one if not) and leased to the worker. It gets an object which has a command execution API on the resource.
  • The worker attempts the commit, either succeeds or fails, and writes results somewhere. It releases the lease.
  • Drydock can now lease the working copy to another process. If nothing requests a lease for long enough, the resource gets garbage collected eventually.

Currently Drydock can only do like 20% of the "magic" step though, and does that 20% in a relatively opaque, fragile way. I think this is just because I need to make it work and not because it's a crazy astronaut nonsense abstraction, although we'll see. Provided it does work, the recipe looks nearly identical for CI/async test/build/sandcastle stuff:

  • A task is inserted into the task queue to <do something> on <repository X at path Y branch Z etc>.
  • A worker pulls the task and requests a <read/write/nonexclusive> lease on <that resource>.
  • Drydock does <black magic> and gives the worker a capability object.
  • The worker does <something> and records results <somewhere>.
  • The worker <releases/retains/sets the duration of> the lease.
dereckson added a subscriber: dereckson.Via WebJul 6 2012, 1:56 PM
epriestley added subscribers: ptarjan, orange.Via Old WorldAug 17 2012, 4:16 PM

◀ Merged tasks: T1669.

epriestley removed a subscriber: orange.Via WebAug 17 2012, 4:16 PM
epriestley edited this Task.Via Old WorldNov 5 2012, 4:58 PM
ipalaus added a subscriber: ipalaus.Via WebNov 5 2012, 5:03 PM
arturmartins added a subscriber: arturmartins.Via WebNov 6 2012, 12:09 PM
btrahan removed a project: Differential.Via Old WorldDec 4 2012, 4:10 PM
btrahan added a project: Differential.
epriestley edited this Task.Via Old WorldDec 18 2012, 11:24 PM
AnhNhan added a subscriber: AnhNhan.Via WebJan 7 2013, 8:57 PM
wederbrand added a subscriber: wederbrand.Via WebJan 21 2013, 11:47 AM
jevripio added a subscriber: jevripio.Via WebJan 21 2013, 10:51 PM
frozendevil added subscribers: ifraimow, frozendevil.Via WebFeb 26 2013, 8:08 AM
colegleason added a subscriber: colegleason.Via WebJul 9 2013, 11:49 PM
avivey added a subscriber: avivey.Via Old WorldJul 10 2013, 4:56 PM
emiraga removed a subscriber: emiraga.Via Old WorldJul 10 2013, 4:57 PM
edward removed a subscriber: edward.Via Old WorldJul 22 2013, 4:49 PM
asherkin added a subscriber: asherkin.Via Old WorldSep 9 2013, 5:17 PM
spicyj added a subscriber: spicyj.Via WebOct 9 2013, 12:29 AM
epriestley edited this Task.Via LegacyOct 9 2013, 1:00 AM
brent added a subscriber: brent.Via WebOct 13 2013, 1:00 AM
avivey edited this Task.Via LegacyNov 3 2013, 1:06 AM
avivey edited this Task.Via LegacyNov 13 2013, 6:07 AM
dctrwatson added a subscriber: dctrwatson.Via WebDec 31 2013, 7:20 PM
wotte added a subscriber: wotte.Via WebJan 17 2014, 11:37 PM
hwinkel added a subscriber: hwinkel.Via WebFeb 17 2014, 12:30 PM
epriestley edited this Task.Via LegacyApr 8 2014, 12:41 PM
epriestley edited this Task.Via LegacyApr 8 2014, 3:05 PM
turadg added a subscriber: turadg.Via WebApr 18 2014, 3:22 PM
roylou added a subscriber: roylou.Via WebApr 28 2014, 3:00 PM
sascha-egerer added a subscriber: sascha-egerer.Via WebMay 5 2014, 12:42 PM
pburka added a subscriber: pburka.Via WebMay 12 2014, 8:11 PM
epriestley added a subscriber: igorgatis.Via WebJun 3 2014, 12:43 PM

◀ Merged tasks: T5252.

epriestley added a subscriber: lianghu.Via WebJun 27 2014, 2:30 PM

◀ Merged tasks: T5498.

ndbroadbent added a subscriber: ndbroadbent.Via WebJul 9 2014, 12:19 AM
joshuaspence added a subscriber: joshuaspence.Via WebJul 9 2014, 1:15 PM
epriestley changed the visibility of this Task from "All Users" to "Public (No Login Required)".Via WebJul 28 2014, 2:32 PM
vm added a subscriber: vm.Via WebAug 11 2014, 6:24 PM
halfsea awarded a token.Via WebAug 18 2014, 10:31 PM
epriestley added a subscriber: tomm.Via WebAug 22 2014, 12:45 AM

◀ Merged tasks: T5944.

qgil added a subscriber: qgil.Via WebSep 3 2014, 8:51 PM
dctrwatson removed a subscriber: dctrwatson.Via WebSep 3 2014, 9:04 PM
ipalaus removed a subscriber: ipalaus.Via WebOct 2 2014, 1:01 PM
lucaro added a subscriber: lucaro.Via WebOct 20 2014, 3:50 PM
colmdoyle added a subscriber: colmdoyle.Via WebOct 20 2014, 8:56 PM

Any sign of this In upstream?

papapasan added a subscriber: papapasan.Via WebNov 1 2014, 11:13 AM
jcarrillo7 added a subscriber: jcarrillo7.Via WebNov 16 2014, 12:29 AM
dereckson added a comment.Via WebNov 27 2014, 8:35 AM

I tried the Land to GitHub feature and got the following issue:

  • It creates /var/repo/....__WORK folder, populated with the master branch content.
  • It shows a Permission denied (publickey) / fatal: Could not read from remote repository. public key denied during the git fetch step.

Repository credentials are stored in passphrase and locked. On the GitHub side, it's set as deploy key. But then, I see in D7555 some OAuth and API token generation code?

avivey added a comment.Via WebNov 29 2014, 11:05 PM

@dereckson - try setting the remote URI if the repository to either https:// or git://, instead of ssh://github.com/..., and remove the __work dir.
In https://secure.phabricator.com/diffusion/P/browse/master/src/applications/differential/DifferentialGetWorkingCopy.php, it uses the repo's remote URI as the one to fetch from, but it doesn't support authentication.

This whole feature is "experimental" and not really supported right now (A real implementation is blocked by T2015); In fact, it shouldn't work at all for most installs.

If you're willing to hack it locally, maybe remove the "remote set-uri origin" part; It will then try to fetch locally from your copy, instead of from github again.

The OAuth generation in D7555 is only used for the push flow, so github sees the right user trying to actually push (The user that clicked the button); It's not used for cloning/fetching.

dereckson added a comment.EditedVia WebNov 30 2014, 12:25 AM

Thank you. I've tested on a small personal installation, and indeed that works (the repo were configured with git:// URL). That's also really interesting to be able to close the revision directly in the code review app, then be able to directly move to another element of the interface. So, the feature is welcome.

Yet, I'm not in an hurry to deploy that in larger installations. To use different authentication workflows according the feature or requesting every committer to link a GitHub account isn't desirable. So in the future implementation, where, if I understand well the Drydock roadmap, workflows will be configurable, it would be nice if we can choose the different strategies to do the landing (I see three ways: deploy keys, machine account's ssh key, OAuth token).

Finally, there is an UTF-8 issue, a → character in the commit were mangled during the operation. But that will have to be tested on final implementation.

avivey added a comment.Via WebNov 30 2014, 3:37 AM

The way to customize the flow (When ready) would be to write a custom LandingStrategy class, that works in the exact way that matches your setup.

You might be interested in hosting the repository on Phabricator, and mirroring to GitHub (As this install does) - that way users only need Phabricator credentials for everything.

Regarding the UTF-8 issue - it may be related related to T4045; Does arc patch for the same diff shows the same issue?

dereckson added a comment.Via WebNov 30 2014, 8:45 AM

Indeed, the mirroring makes more sense.

arc patch applies it correctly.

ndbroadbent removed a subscriber: ndbroadbent.Via WebNov 30 2014, 1:19 PM
nickz added a subscriber: nickz.Via WebDec 17 2014, 7:11 PM
angerman added a subscriber: angerman.Via WebJan 29 2015, 2:47 PM
dpepper removed a subscriber: dpepper.Via WebJan 30 2015, 7:03 PM
chad moved this task to v2 (UI/Mobile) on the Differential workboard.Via WebWed, Mar 4, 5:04 PM
epriestley moved this task to Future Work on the Differential workboard.Via WebWed, Mar 4, 5:09 PM

Add Comment