Page MenuHomePhabricator

Doesn't seem possible to integrate code review into git flow branching model
Closed, ResolvedPublic

Assigned To
Authored By
cpa199
Sep 10 2014, 1:53 PM
Tags
None
Referenced Files
F201534: pasted_file
Sep 10 2014, 3:00 PM
F201528: pasted_file
Sep 10 2014, 3:00 PM
F201530: pasted_file
Sep 10 2014, 3:00 PM

Description

For the life of me I cannot find a suitable way to integrate the Phabricator code review workflow into the gitflow branching model without squashed commits, and trust me I really want to as I love the tool, however this is causing me real issues. In this branching model it is normal for multiple commits and pushes to take place on a single feature branch and for multiple people to work on a single feature branch before merging it back in (making pushes a necessity, let alone for backup purpose). Phabricator just doesn't seem to have a way to deal with this using differential or audit.

Using "arc diff develop" works wonderfully to create the initial review vs the develop branch but I note that if I turn on immutable history (which you have to otherwise it thinks there will only be a single commit) there is no way to update this existing code review with changes made as a results of comments from what I can see. I'd be very happy to be wrong, but if I'm right then immutable diffs are fairly unusable in reality as you'd have to create a whole new code review for any update commits/pushes.

It also seems that Audit is based on single commits, meaning I can't create an Audit that is a diff between all commits in my branch and develop. Again I hope I'm wrong, but if not then it makes post push Audit review workflow unsuitable as a solution too for this model.

I know I've raised something related in T6019: Oddity with git branch/merge when using arc diff, however even the advice in there hasn't got me much closer to a possible solution. I'd love any further ideas/advice!

Event Timeline

cpa199 raised the priority of this task from to Needs Triage.
cpa199 updated the task description. (Show Details)
cpa199 added a subscriber: cpa199.

If squashing is the only way to do this then I will still consider it, it's just that it's not always desirable to lose all commit history on a feature branch that has been going a little while and force pushing (which is the only way to update remote with the squashed commit) is something I'd prefer not to recommend doing as people who get into the habbit of it can mistakenly use it and can end up doing this (https://news.ycombinator.com/item?id=6713742)

So, I just want to try to understand your question. You want immutable history for local work and logging, but you want to pretend its squashed for the purpose of review?

It wouldn't have to pretend it was squashed if differential understood that diffs could be additive instead of replacements of each other. If I could add another commit (or more than one) that differential knew was an update to a review, and thus the diff display in the review knew that the commit was additional changes. At the moment when you do this anyway (you have to have immutable turned off) it really messes up the diffs view. I wonder if I can create any screenshots of what it does at the moment to illustrate. I'll take a look.

I've nearly solved the problem during getting you those screenshots, teddy bear effect going for me there! Though it's not quite perfect it's nearer now.
I noticed that if you leave immutable off and do your updates via "arc diff develop --update D8" rather than "arc diff --update D8" it now correctly treats the commits additively in the diff view of the review. I assume that is because it creates a whole diff each time you do an update, so giving it the same starting point makes this work. If I'm guessing correctly then I wonder if develop changes between the original review and the update would this still break the diff view as the diffs would effectively have different baselines as develop code would be different in each case? If this is the case then even doing it this way will be a problem I think.

The bit that isn't quite perfect still when doing it like this is that when I merge in via git flow feature finish it correctly auto-closes the review but does this due to two commits rather than one, the last of which is displays the wrong diff. Screenshots below:

Two closing commits

pasted_file (540×1 px, 92 KB)

Diff correct between base and first closing commit

pasted_file (1×1 px, 155 KB)

Diff incorrect between base and last closing commit. Clearly this is just the final change I made rather than the diff between base and all changes.

pasted_file (773×1 px, 111 KB)

Audit is based on single commits, and not suitable for this workflow. T5722 discusses improvements here, but they are very low on our priority list.

via "arc diff develop --update D8" rather than "arc diff --update D8" it now correctly treats the commits additively

Broadly, yes. arc diff is never additive. To update a revision, you need to submit all of the commits which contain the changes.

In general, arc diff develop will do this. The first time you run it, it will submit, say, commits A and B (on your local "feature1" branch -- the differences between your branch and the develop branch). If you run it again later, it will submit commits A, B, C and D this time. It doesn't need to be additive because you do the addition locally, by selecting the entire relevant range each time. (There a variety of technical and product reasons why we want to know the entire range at arc diff time.)

You generally should not need to --update D8 explicitly; arc diff should be able to figure out what you mean by examining the commit and tree hashes. It's expected that, configuration permitting, arc diff develop will always do what you want (create or update a revision using the changes between the develop branch and HEAD).

You can use arc which develop to understand what arc diff develop will do (which commits it will select, and which revision it will create or update), and why it will do that.

two commits rather than one, the last of which is displays the wrong diff

This is a bug, see T4453.

Broadly, we encourage a pre-publish, squash-based workflow, where changes are not published to the remote before they have been accepted. This resolves the --force issue. The narrative is that only reviewed code leaves the developer's machine.

If you don't want to do this (you don't want to squash, or you do want to publish unreviewed code), setting history.immutable to true and running arc diff develop each time should work as you expect, except that the final update won't reflect the correct changes (at least, some of the time).

When I set immutable to true and ran "arc diff" for the second time it won't update the review. However if I use "arc diff develop" it correctly updates, so that was the mistake I made when trying to update a review in immutable mode (the error was something like can't update a non-existent diff, I can't remember exactly). I just didn't get that you needed to use the same base point for the updating call.

I do think there may still be an issue here if develop has changed between my starting of a review and the updated changes as the baseline will have changed if any files I've changed also changed in develop. Am I right? If so this kind of thing is actually quite common in large teams when focussing on certain areas of a product prior to release for example in this model.

Your point about only reviewed code leaving a machine is a fair one in some circumstances however does that not mean in large teams this means huge numbers of reviews (as you don't want to keep code un-pushed and backed up for any length of time) and lots of context switching so as to unblock people all the time? I understand that arc diff actually provides backup via patch however that would mean there will be lots of reviews all the time.

Thanks for pointing out the bug too.

I do think there may still be an issue here if develop has changed between my starting of a review and the updated changes as the baseline will have changed if any files I've changed also changed in develop. Am I right?

When you type arc diff develop, it actually means "compute the merge-base of develop and HEAD, then diff against that". This should always produce the correct changes.

If you've merged develop into your local branch between invocations of arc diff, the changes will be correct but the selected commit range will start with the most recent merge commit. For example, if your branch has A, B, <Merge>, C, D, the update will include <Merge>, C, D. This is correct, because the diff between the parent of A and HEAD includes changes made to develop while you've been working on your branch, which is not what you want. Meanwhile, the diff between the develop-side parent of <Merge> and HEAD includes all local changes only, which is what you want.

huge numbers of reviews (as you don't want to keep code un-pushed and backed up for any length of time)

Yes -- we think this is a very good thing, and works well at scale.

https://secure.phabricator.com/book/phabflavor/article/recommendations_on_revision_control/
https://secure.phabricator.com/book/phabflavor/article/writing_reviewable_code/

I believe Facebook is somewhere in the millions of reviews. We perform one review per commit (and one commit per idea) in this upstream.

lots of context switching so as to unblock people all the time

Authors should not be blocked by review. A specific change will be, but they can work on something else while awaiting review (either a different feature, or the next piece of whatever they're building).

At least personally, I try to break changes into small pieces which I can write, test, and send for review in no more than a couple of hours (and often much less), then look at reviews and other things (and, say, grab a coffee or some food) in between changes. Review is a bit of a context switch, but it fits into my schedule at times when I'm maintaing a low amount of context mentally, so the switch is cheap. It's not an interrupt.

Thanks for the clarification, looks that once that bug is fixed then the whole integrated with gitflow workflow will work.

I get your point on one review per commit, I'm thinking I'll document this as an alternate way to working and give the option to users between classic gitflow with feature review and much smaller blocks with review. When I'm next working on a project I'll see if I can introduce the methodology and see how it works for myself, not that I don't trust you :P
I do however have to cater for projects that will want to continue existing methodologies as this isn't just for my projects, hence the real need to make things work the way I have been trying. Looks like I'll be good on that now too though, so thanks for your help.

I wonder if it is worth updating the Arc Diff user guide to explicitly say that updates to diffs with user defined commit ranges also need to include the same range? It might be obvious to some but I blindly followed the instructions (I know, my fault for not trying to really understand the details of how it worked) and they say:

To update a revision, just do the same thing:

$ nano source_code.c  # Make more changes.
$ git commit -a       # Commit them.
$ arc diff            # This prompts you to update revision information.

Note, just arc diff. This is also immediately after the bit that talks about creating the diff from a specific commit.

I filed T6072 to take a look at refreshing some of the documentation and behavior to cover this and other cases, but the topic is generally a complex one and it's difficult to balance the flexibility we provide (i.e., support for many different workflows) with simple documentation (which can not just describe every possible workflow and remain simple).

cpa199 claimed this task.

Thanks, I've resolved this and I do understand it's difficult with the docs.