Page MenuHomePhabricator

`arc land` creates unnecessary local `master` branch in Git
Closed, DuplicatePublic

Description

In my workflow, I don't ever use or have a local master branch. I create new branches with git checkout -b my-feature-branch origin/master. After I've created a revision on this branch, executing arc land finishes with the active branch at a newly created master branch, which I have no use for. I find a local master branch annoying to keep up to date -- I may as well just branch from origin/master.

Feature request: arc land finishes with the active branch pointing to origin/master, or at least it finishes this way if there is no local master branch before executing arc land.

Event Timeline

dcsommer raised the priority of this task from to Needs Triage.
dcsommer updated the task description. (Show Details)
dcsommer added a project: Arcanist.
dcsommer added a subscriber: gregprice.
dcsommer added a subscriber: dcsommer.

I'd like to see reports from at least a few other users before pursuing this. I believe this workflow is highly unorthodox, and accommodating it would require special casing almost every step of arc land, make the workflow more complex for everyone else, and replace some errors with potentially confusing behaviors under normal use.

This workflow also doesn't seem very good, since there's no neutral location where we can leave you and you'll have to go manually delete the local branch eventually. Surely it's easier to keep a local master and just ignore it?

What are the specific costs that having a local master imposes on you? My expectation is that you don't need to keep it up to date as long as you don't branch from it. Am I misunderstanding your workflow, or missing something about how git works?

Particularly, I would expect the total cost of ignoring master to be that git branch has one extra line. At least for me, git branch is useless as a routine command anyway:

phabricator/ $ git branch
  apolicy6
  appsearchx
  arcpatch
  arcpatch-D13276
  arcpatch-D8205
  arcpatch-D9458
  arcpatch-D9760
  arcpatch-D9760_1
  arcpatch-D9928
  arcpatch_1
  arcpatch_2
  asub
  auditgroup
  auth1
  auth9
  authprovmig
  batch
  board3never
  boardlink
  cachcex
  cache7
  calcss
  calres
  cfield1
  cfields1
  chat4
  cleanupn
  comp
  compontent
  compositequery
  contract
  cprofile1
  crop1
  cscroll
  csp
  csrfemail
  cstorage
  ctokencurltemp
  dagtree
  dash6
  datecontrolsearch
  ddblue3
  ddhost1
  ddx1
  derp
  detectzoom
  div1
  div4
  draggy2
  dropdown
  ducks
  dui1
  dui2
  dui3
  dutf1
  dx10
  dx11
  dx12
  dx13
  dx14
  dx15
  dx16
  dx17
  dx18
  dx19
  dx20
  dx21
  dx22
  dx23
  dx24
  dx25
  dx26
  dx27
  dx28
  dx29
  dxn4
  dxn5
  dxn6
  dxn7
  envfix
  feature
  filename
  fixie
  ftesty
  fullcomm
  fullscreen
  fxform1
  fxform10
  fxform11
  fxform12
  fxform2
  fxform3
  fxform4
  fxform5
  fxform6
  fxform7
  fxform8
  fxform9
  generators
  ghost9
  globalslug
  hand4
  hand4x
* harbor2
  harbor3x
  hashalg1
  hashalg2
  hashalg3
  hashalg4
  hashalg5
  hbody3
  hcolor1
  heraldscreen2
  highlightlimit
  host3x
  hrulebreak
  hunk1
  hunk2
  hunk3
  hunk4
  hunk5
  hunk6
  hunk7
  hunk8
  hunksey
  icount
  idrevision
  infouri
  inst2
  lock2x
  logic17
  maction1
  mailkey
  master
  menustuff
  mfact
  mlist2
  mobilemenu
  morehookdata
  need4x
  newfromdownlaod
  newfromdownload
  nosite
  offbyone
  owners1x
  paccount
  paccountx
  param1x
  pass1
  perf2b
  perf2x
  pheader
  phor3
  phox1
  phriction1
  policy1
  pquery1
  projnotify1
  projsub12
  protoproxy1
  psub
  ptype
  pub4
  publish1
  pushreview
  qperf
  quackers
  readyou
  redirectmore2
  rel11
  remarkuporder
  repodoc
  repoq
  resolv8
  rm6icon
  scroll9
  search1
  search10
  search11
  search2
  search3
  search4
  search5
  search6
  search7
  search8
  search9
  servicelog
  settings1
  setupwarn1
  shortcut2
  slimmer
  sound1
  spaces14
  sparser
  sscrit
  sshkey
  stable
  storedvalue
  summary
  taskedge2
  tasksub
  texception
  tiles1
  tmp.differential
  tok6x
  twitter
  uni16
  uni16x
  uni32x
  upload
  utfx12
  utfx18
  visibility2
  woff
  words2
  xlate1
  xlate2
  xparser1
  xparser2
  xparser3
  xparser4
  xparser5
  xpolicy1
  xpolicy2
  xpolicy3
  xpolicy4
  xpolicy5
  xpolicy6
  xss

If this is the issue you're running into, you might want to try arc branch instead if you aren't already aware of it.

@epriestley I very much doubt most users of phabricator have that many branches. Personally, I restrict my branches to code I plan to work on in the ~1 month time scale. The rest I create differential revisions for and then delete locally. I can arc patch them back in later. I would say the fact that git branch is useless for you is probably an indication that you aren't using branches in the expected manner.

One issue creating branches based off a local master is that master easily diverges from origin/master. For example, you start on master, as by default. You forget to create a new branch before beginning a new project. So you have a commit A on master. Now you remember you want to work on something else, so you git checkout -b branch-for-A so you can come back to it later. But now master is polluted and can't be branched from without a git reset --hard HEAD~ on the local master branch first. If work had started on origin/master, then git checkout -b branch-for-A would have left everything pristine and in sync.

In practice, the local master can easily get out of sync with origin/master. It is always better to create branches based off origin/master rather than a local master.

The feature request is quite small: switch to origin/master and delete the just-created master branch (as well as the feature branch, as normal) at the end of arc land if there was no master branch before the command started. I haven't looked at the implementation of arc land, but I'm curious why it uses a master branch at all. Why can't it just do git push origin HEAD:master under the covers?

I'll ask around and see if more people would benefit from this.

Would this solve your problem?

.git/hooks/pre-commit
#!/bin/sh

# This is 'arc' doing something, let it do its thing.
if [ "$ARCANIST" != "" ]; then
  exit 0;
fi

if [ $(git symbolic-ref HEAD 2>/dev/null) == "refs/heads/master" ]; then
  echo "Oops! You're about to commit to master and you prefer not to do that."
  exit 1
fi

I run into similar issues with our mercurial repositories, but I attack this problem as a training issue - for a smaller engineering department this works for me.

What I see is that it's very easy to make commits on local master causing the master bookmark to advance beyond upstream - I'm not extremely familiar with Git but this sounds like what @dcsommer is describing. In Mercurial it's easily fixable but it trips up the "commit-to-trunk" workflow which most of our developers have been doing for years. For developers who are not used to working with branches, this becomes confusing as they perceive the error coming from Phabricator/Arcanist because that's where they run into roadblocks -- "I made a commit already why won't arc let me land/push".

I would like to look into making hg/arc alias/hook which would throw up if attempting to commit directly to master (exactly what @epriestley has above), but I haven't had a chance yet to dive into the topic. I think it would work best as a Mercurial hook that prevents advancing master unless explicitly through hg book - though it might cause arc land to throw up too depending how it works internally.

I would also be interested to know how prevalent "commit-to-trunk" type of workflow is.

Simply put, I don't see the purpose of master as a branch if you never commit to it. It's just clutter and an implementation detail of arcanist.

In my workflow, master isn't helpful either. I usually just

  1. make my change in master
  2. either arc-diff it or branch it away for later (using "git branch foo")
  3. reset master to origin/master

When pushing with git-push, I skip master completely (I type "git push origin HEAD:master"). When using arc-land, I would have liked it better if it did something similar, instead of making another stop in master.

Some team members with a habit of unclean master were surprised by this behaviour too at the time.

The broad issue with making this process have fewer side effects is that it potentially makes it much harder for less experienced users to understand what's happening and what state they're in. My experience at Facebook was that a majority of engineers learned as little about Git as they could in order to do their jobs, and I would not have expected many of them to understand concepts like a "detached HEAD". I'd guess less than 20% knew about or were comfortable with git reflog.

My attempt with arc is to make it as friendly for less-experienced users as possible. My thinking is that advanced users know what they're doing and can more easily make adjustments (like shell aliases and pre-commit hooks) to fine-tune their workflow. Novice users don't have the same set of tools available, and can not move forward if it puts them in a state they do not understand. As a novice user, having arc seem to totally break your work is a really negative and upsetting experience -- and a much worse experience than an experienced user being inconvenienced with an annoying or unnecessary effect which they fully understand and can work around and recover from.

Despite this goal, arc is already very, very hard for users to use. Users routinely have difficulty understanding what it is doing and why, predicting its behavior, getting it to do what they want, etc. Here's an example from a few minutes ago from Twitter:

https://twitter.com/jasonbrennan/status/629279006467203072

I don't know what this user hit, but my guess is that arc did something they didn't expect and they have no idea what to do about it, and that this is extremely frustrating for them.

One workflow discussed here is the same as the workflow in T3855: if you're on master with local commits and run arc land, it should work. This is better for new users and also better for experienced users.

Other workflows discussed here are better for experienced users but I think they make the effects of arc much less understandable, consistent, predictable, or some combination. I am very hesitant to move in this direction given how frequently new users have difficulty with arc today. I think one of the reasons arc is so difficult to use is that I accommodated too many of these advanced workflows earlier in the development process; it is much harder to remove complexity than it is to add it.

Ultimately, users come into arc with a hugely variant set of expectations and assumptions and workflows. As often as possible, I'd like arc to do something that makes sense to novices and isn't frustrating or confusing for them. I think it currently does this far too infrequently, and changes in this vein which cater to very experienced users at the cost of novice users make arc worse on the balance.

Simply put, I don't see the purpose of master as a branch if you never commit to it. It's just clutter and an implementation detail of arcanist.

This is a reasonable approach to take, but it sounds like you're an experienced Git user and this behavior is just slightly annoying, and that spending a few moments installing a pre-commit hook to prevent accidental commits would mitigate the majority of this annoyance.

Suppose I make changes to accommodate this workflow, and X experienced users worldwide are slightly less annoyed by the behavior of arcanist as a result and it saves them several minutes, but it causes Y novice users to have a really frustrating experience with arc that's so negative they feel like they have to tweet about it. If you knew this was the result, would you still advocate this change for all values of X and Y?

I'm not sure the detached HEAD state would be more confusing to inexperienced engineers. If they are only using arc, they will never need to know. If they accidentally commit to master the inexperienced engineers later run into problems as described by @cspeckmim in https://secure.phabricator.com/T9082#130289. So both detached HEAD and finishing on master could be confusing for inexperienced engineers.

Thinking about inexperienced engineers is a good approach though, so I'd be interested to hear what pitfalls either option presents.

I'm going to merge this into T9537, which has more discussion and an additional use case that I find slightly more compelling as a motivator (numerous targets in the remote).