Page MenuHomePhabricator

Differential Revision's repository is reassigned when the commit is pushed to a separate repository
Open, Needs TriagePublic

Description

For minor version releases of the software we release we clone the repository as a version branch. This process is a holdover of using mercurial years back when branching wasn't a solid workflow process. On the server that hosts the repository we have push hooks to prevent pushing commits to version repos which haven't also been pushed to newer version repos (ex: cannot push a commit to version 6.1 if it's not in 6.2).

Due to this setup, when working on patch versions (6.1.x) the code will be modified and tested on the 6.1 repo, and have the Differential Revision published on that repo. However when ready to land, we arc land --hold on the 6.1 repo, merge the commit into 6.2 and push upstream into 6.2, followed by pushing 6.1 upstream. As a result of that push, the Differential Revision will have it's repository field updated to the 6.2 repo. The second push into 6.1 upstream does not update the Differential Revision.

In this situation it would convenient to have a way to turn off this behavior, though I'm not sure how that would work - whether it would need to be an option on the originating repo to "lock-in" the revisions, on the other repos to "not-steal" revisions, or an arc flag even. The current behavior is undesirable for our workflow as all revisions end up on the repo for the current/stable version - looking up what reviews have taken place for work developed on a released version requires looking at revisions by date-range and checking the list of commits to see if they exist in version repos - then manually switching the repository field back to the right repo.

Event Timeline

cspeckmim raised the priority of this task from to Needs Triage.
cspeckmim updated the task description. (Show Details)
cspeckmim added a subscriber: cspeckmim.

Hmm maybe I'm repeating myself, I think this could be resolved by T7773.

Can you disable Autoclose in the version-branch repository? Differential(Select a Repository)Edit RepositoryEdit ActionsAutoclose? Or do you want these commits to associate, just more weakly than they currently do?

Commits from all version repositories appear to associate correctly. When the change is first developed it is on the prior version repository and creation of the revision properly associates it (through arc diff). But when doing our land process, it looks like since the repository which receives the commit first is the current version branch instead of the prior version branch, the revision's repository field is automatically changed to the current version's repository and closed before the commit makes it up to the prior version branch.

When doing a query against revisions you can search by repository, which I planned to use in order to create documentation of all reviews for a patch release. However in the current state, the only revisions which appear in a search result for prior-version repository are the active ones. All the closed ones had been associated to the current-version repository. I can look through the closed revisions and see the associated commits, and if any of the commit IDs have the prior-version repo callsign then I update the repository field. This starts to become unwieldy however.

Here's a screenshot of the details for a revision which is already closed. The commit appears in three different repositories (rMS, rMJENAD, rMJLOGVIEW). The rMS repo is the current version repo which version repos are branched from. The code change and revision were developed on the rMJENAD repo which is the prior-version. I highlighted the repo field - for this revision I have already corrected it to be rMJENAD, but when the commit was pushed first to rMS (followed by push to rMJENAD), this field automatically switched to be rMS.

revision details.png (538×1 px, 117 KB)

Right -- can you disable Autoclose on the version-branch repository to work around this? Disabling it should stop commits to it that repository from associating with revisions, so the repository will never "steal" the revision (basically, Differential will just ignore the commit, even though it will still show up in Diffusion).

It doesn't sound like any of the things you want to do depend on this association, so just disabling the association for the version-branch repositories might be good enough?

I can try this, but wouldn't I want to disable auto-close on the current-version branch? The majority of work is being done on the current-version, but a significant amount of work is being done on prior-version in-tandem. Or are you referring to the prior-version having it's Auto-close turned off, or both? Unfortunately I won't be able test any of this till next week.

I may be misunderstanding, but here's what I think is happening from your description:

  • You perform work in repository "A", but sometimes copy commits from "A" to other repositories ("B", "C", etc).
  • When you copy commits to B, C, etc., Differential recognizes them and closes revisions.
  • You don't want the copies in repositories B, C, etc., to cause this to happen.
  • An acceptable solution is for Differential to completely ignore commits in B, C, etc.

If this is the case, turn off Autoclose in B, C, etc.

Yes that is a more succinct description. Using that same description, repository 'B' would be where primary development happens, and hence more Differential revisions are created. Ideally revisions on repository 'B' would still auto-close. I was hoping for an option that had similar behavior but was a little inverse -> instead of configuring auto-close for repositories 'B', 'C', etc. there would be some option on 'A' to ignore commits on 'B', 'C', etc.

I will try turning off the auto-close feature on the main repository for the next revision closure on the 'A' repository.

I turned off auto-close on the main repository, and revisions pushed to 'A' no longer associate with the main repository. However none of the revisions for the main repository auto-close.

@epriestley: I'm considering looking at solutions for this, as this is a minor-but-frequent nuisance as a result of our development workflow. I think my approach will be to check during Autoclose that the commit seen is in the same repository associated with the revision before continuing. My gut tells me this is not a change you want upstream - and that if there is a plan for this, the change might be a bit more complex than what I can stomach at the moment. I would like to get an idea of what you have in mind however, that I can keep an eye out while digging around. Below I'm trying to better verbalize our workflow process that results in us seeing these problems.


The closest description I've found for our workflow is from this article:
http://paulhammant.com/2013/04/05/what-is-trunk-based-development/

This is the background of our actual development:

  1. Our products are desktop/server applications which means we support several years worth of older versions.
  2. All development for a "current" version happens in the stable repository
  3. We don't use Mercurial's branch/bookmark features for anything other than local development work
  4. When branching for release, we clone the repository for the minor version (ex: release-6.6 will be branched soon off stable as we get nearer to finishing all major work on it)
  5. Continued work and fixes for a patch version are developed on the release-x.x branch which we then enforce merging into into stable prior to allowing pushed upstream to release-x.x -- this is a means of enforcing that a bug we need fixed in a patch version is also fixed upstream for newer versions.
  6. After a branch, there is still a significant amount of work being put into the release-x.x version, and typically remains that way until the next minor version release (ex: 6.5.0, 6.5.1, 6.5.2... 6.5.7 work is all done in the release-6.5 repository).

Differential revisions created for work on release-x.x properly associate to the release-x.x repository in Diffusion, however due to step 5 it looks like Diffusion sees the commit first in stable then automatically re-associates the revision with the stable repository. Normally we would regularly ignore this as we've not had much reason to want to list all revisions on non-stable, however we currently are managing work in stable, release-6.5, release-6.5-oem, and soon branching a release-6.6 -- which we're anticipating some work being done in all 4 branches concurrently for at least a short while. Particularly with the release-6.5-oem release we're interested in being able to report the code review coverage. Additionally, in the event of all these changes, the path of a fix in 6.5 will need merged through: release-6.5 -> release-6.6 -> stable, with cherry picking for release-6.5-oem.

At the moment we're manually adjusting the associated repository for the release-6.5-oem version after they've closed. Turning off Autoclose for stable would fix this issue right now, however that means developers working in stable would need to manually close their revisions -- unfortunately they are the majority, but the work happening in release-6.5-oem is not insignificant; ~6 revisions a day which is not much to manually do, but because this happens automatically behind the scenes it's easily forgotten.

The major difference between our workflow and the "Trunk-Based Development" is that in TBD bugfixes are developed in`stable` and then cherry-picked back into release-x.x. As for reasons why we clone repositories instead of using branches, this is largely due to the state of Mercurial's branching model being unstable/not-robust back when we switched from SVN, and works fairly well for us. According to this dated article (which is typically one of the first results when duck-searching for Mercurial Branching), the Mercurial development team uses this same model:
http://stevelosh.com/blog/2009/08/a-guide-to-branching-in-mercurial/

I think that change is tentatively within the realm of reason, but I have some concerns.

By default, the proposed change (abort if repositories differ) will make things appear to fail silently for users who do not expect or want the behavior. If we did pursue this, I'd like to continue forward and just write a different type of relationship (something more like a mention) instead of an autoclose, so that users could look at the commit and revision and see a transaction like "alincoln pushed this commit to a different repository, as rXYZqqqq.". Generally, silent aborts raise our support costs because users who do not expect them usually can't guess the rule, and we have to rule them out manually if users do hit a real bug.

I also think this use case is very rare, and that this feature might essentially be a feature for only your install. I was previously aware of only one other install using copy-the-repo-based branching -- they were also using Mercurial, and I believe were motivated to use this model partially by Fogcreek's Kiln. I think they've moved partially or fully away from this model (at least, I haven't heard of any more issues related to it), although I'm not sure. This workflow is essentially unheard of in Git or Subversion and feels like it holds no advantage over other workflows in a modern environment -- would you still choose this approach in 2015, with modern Mercurial branches/bookmarks? Would you still want to use this approach in 2020? That said, I can understand that the costs of this approach may be smaller than the cost of switching to a new model, and that this model isn't actively painful.

The strongest argument for inclusion that I can come up with is that our current behavior when multiple copies of a commit are pushed to different repositories ("the first one wins, the others are ignored" -- I think? You may actually know better than I do) is fairly arbitrary and I think there's a reasonable argument that it's not very good, particularly the "ignore" part. That said, I think it's arguably getting the "right" result in some sense even here: the commit is appearing on stable first, and it's not crazy to look at that externally and say "this is the first place it showed up and authoritative, other copies are copies of this". If we did refine this rule to not-ignore those subsequent pushes I'm not sold on "first wins" being less natural than "same repository as diff came from wins". (Developing a patch which will hit stable first in release-6.5 also seems a little weird to me, but that's neither here nor there.)

Maybe a better way to look at this is to try to find other approaches for the problems the behavior creates, but I'm not sure what those problems are. There's no way to search for revisions by repository right now, anyway, right? Is this just a problem of revisions closing too soon? I'm not sure I understand the coverage issue, either: how does the repository for a revision impact your ability to report coverage?

I think the change you want is very easy to implement as a patch (totally untested):

diff --git a/src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php b/src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php
index c93810c..9c5ccb8 100644
--- a/src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php
+++ b/src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php
@@ -185,6 +185,12 @@ abstract class PhabricatorRepositoryCommitMessageParserWorker
         $should_close = ($revision->getStatus() != $status_closed) &&
                         $should_autoclose;
 
+        // Don't close revisions associated with a different repository.
+        if (($revision->getRepositoryPHID() !== null) &&
+            ($revision->getRepositoryPHID() !== $repository->getPHID())) {
+          $should_close = false;
+        }
+
         if ($should_close) {
            $commit_close_xaction = id(new DifferentialTransaction())
             ->setTransactionType(DifferentialTransaction::TYPE_ACTION)

...but I think this patch increases the spookiness of the interaction overall if not coupled with additional changes to publish some replacement relationship.

I'm not exactly sure what problems this creates, but some of them might be mitigated with a Herald rule like this, conceivably:

When:
[ Repository ][ is ][ release-6.6.x ]
Take actions:
[ Add projects ][ release-6.6.x ]

Then operate on projects instead of repositories? I don't know if that helps with anything or not. It seems like it should have no impact on the coverage stuff, but maybe the connection there is just different than I'm imagining.

This workflow is essentially unheard of in Git

In my mind I imagined it similar to GitHub's forking, with the exception that the Pull Request happens a little differently and is seen first in the upstream repository, unless that is specifically what you're referring to here.

... would you still choose this approach in 2015, with modern Mercurial branches/bookmarks?

I want to say that we would go with branches/bookmarks. I think my hesitations in going all-in are based around my narrow perspective -- the largest hurdle in getting the developers at my company moving over to Phabricator was learning about local branch/bookmark development that I would be wary of trying to push out working with multiple heads, a mix of public and private.

... and feels like it holds no advantage over other workflows in a modern environment

All-in-all I can't really argue for the workflow other than "pre-existing" and "comfort". In fact, we used to clone for every patch version, meaning release-6.5.1, release-6.5.2, etc. which meant a lot of merging. We've moved off that a few years ago and hopefully someday we'll move off cloning forever.

a different type of relationship (something more like a mention)

I'm fond of this idea. Would this mean the repository field would be a multiple-value instead of single-value?

There's no way to search for revisions by repository right now, anyway, right?

This is possible right now, see "Repository" search field:
https://secure.phabricator.com/differential/query/advanced/

Screen Shot 2015-10-13 at 10.21.32 PM.png (1×2 px, 147 KB)

This is how I initially discovered the repository field and consequently found this behavior.

how does the repository for a revision impact your ability to report coverage?

Since our repositories correspond to a version, being able to search by repository gives us an instant listing of all the reviews done for that version. We typically don't do this, but in the case of an OEM version we might.

I'm not exactly sure what problems this creates, but some of them might be mitigated with a Herald rule like this...

DAMNIT! - had I thought to do this I would have! I think when I first poked around with revisions and being able to search by repository I had just locked that solution into my head, then only later learned about Projects/Herald and never put it together. I'm going to spend an afternoon putting together some Projects and a Conduit script to update the current stuff (assuming that's do-able, I only started with Conduit the other day).

Thank you for the response.