Page MenuHomePhabricator

"Explain Why" isn't properly explaining why revisions were closed
Open, NormalPublic

Description

(Just observed on Phacility)

I just pushed a series of commits on master to our Git repo in Diffusion. Phabricator closed an unrelated Differential because of one of these commits. That differential was on a topic branch in my local repo which is not even in Diffusion.

Looks like Phabricator is also confused as to why this differential was closed:

PS: The offending commit message is simply [Runtime] Added URL::Parse() API.

Event Timeline

Is it OK if I look at some of the data on your instance to diagnose this? I expect to look at this information:

  • Commit hash, tree hash, and commit message of rHYPERce186613.
  • Local commit metadata of the diffs associated with whatever revision is pictured in that sceenshot.
  • Edge and revision metadata connected to rHYPERce186613 so that I can figure out which revision is pictured in the screenshot (or just tell me the revision number).

Feel free to look at this information and anything else in the Hyper instance you need to.

Here are the commit hash and tree hash of the commit:

$ git log -n 1 --format='%H %T' ce1866
ce186613... 7cea432c...

I examined repository_commit to identify the PHID of this commit, then followed the edge to the revision. I selected the diffs attached to the revision. I see five of them. The last one is the automatic diff created by Diffusion, so I looked at the second-most recent diff.

The second-most recent diff has five commits:

e1532b6a WIP ...
a6114bd2 Fix ...
a53667c8 Fix ...
baaa69d7 Fix ...
ce186613 [Runtime] Added ...

The last commit is a match for the commit which closed the revision (the tree hash is also identical).

The expected behavior is that when we discover a new commit, we search for any open revision which has an active diff with a commit that has the same commit hash or tree hash as one of the discovered commits. In this case, it appears that this revision did.

One possible explanation is that you might have made an overbroad update before splitting it off onto a new topic branch? Does the existence of this change as part of the revision make sense given whatever context you have about how the revision was updated? You can review the range of commits that were uploaded and the total effective change by selecting "Base vs Diff 4" (i.e., the second most-recent diff) in the "Revision Update History" panel in Differential. This should show that ce186613 was part of the revision at the time it was closed.

The "Explain Why" not explaining this is a bug. I'll retarget this to fix it.

See also vaguely related T4453.

epriestley renamed this task from Diffusion closed an unrelated differential task after pushing a commit to "Explain Why" isn't properly explaining why revisions were closed.May 30 2016, 1:55 PM
epriestley triaged this task as Normal priority.

Here and in one other report, revisions which are closed by matching a commit or tree hash with a discovered commit are not explaining this in "Explain Why". "Explain Why" is expected to be able to provide details about this:

$body_why = pht(
  'This commit and the active diff of %s had the same %s hash '.
  '(%s) so we linked this commit to %s.',
  $diff_link,
  $hash_type,
  $revision_match_data['matchHashValue'],
  $diff_link);

At a glance, that should be $body_why[], so maybe this just never actually worked or got broken in some unrelated change.

The expected behavior is that when we discover a new commit, we search for any open revision which has an active diff with a commit that has the same commit hash or tree hash as one of the discovered commits. In this case, it appears that this revision did.

Say I have a local topic branch off commit T on master with a number of commits: A, B, C. I called arc diff while on commit A and C but not B.

First question

I was under the impression Differential was only storing patches (and the associated base commit). So in this example, Differential would have:

  • Commit T (which is already in Diffusion by definition)
  • Patch T to A but not the commit A itself
  • Patch T to C but not the commit C itself

Are you saying that what's in Differential is actually the A, B and C commits? If so, Differentials are kind of like remote topic branches...

Second question

If push to master any of the A, B or C commit for whatever reason, it will close the Differential, even if the Differential is still in review or the pushed commit is not the latest one in the Differential?

When you run arc diff, you provide (explicitly or implicitly) a commit range. Phabricator stores:

  • a patch, containing the changes from the base of that range to the tip of that range; and
  • metadata about all the local commits in that range.

Your ABC example depends entirely on what your "base" rule is, and/or what arguments you provided to arc diff.

You can preview the commit range by running arc which first, with the same arguments you are about to run arc diff with.

If you were on C, and ran arc diff master (implicitly or explicitly), that would generate a diff showing changes between master and C (including C), with A, B and C as local metadata. The local metadata would include B, and the patch (and thus visible changes) would also include B.

If you were on C and ran arc diff HEAD^ (implicitly or explicitly), that would generate a diff showing only the changes in C, with only C as local metadata. The patch (and thus visible changes) would include only C.

You can see which commits are in the range by examining the "Local Commits" table in Differential:

You can examine this value historically by using the "Revision Update History" to compare "Base" vs the diff you want to examine information for.

If you push any commit with the same commit hash as any value in column 1, or a commit with the same tree hash as any value in column 2, we interpret that as landing the revision.

If push to master any of the A, B or C commit for whatever reason, it will close the Differential, even if the Differential is still in review

Yes.

or the pushed commit is not the latest one in the Differential?

No, sort of. Only the active diff on each revision is examined. That diff may have associated metadata for zero or more commits. If any of them match, the revision is closed.

If the local commits are "A, B, C", pushing only A or only C (or B) is sufficient to close the revision.

But if you make a revision with "A ,B, C", update it with a diff containing different commits ("D, E"), then push "A, B, C", that won't close it.

Got it, thanks for taking the time to explain. When you say "metadata" for the commits, do you mean the commits are also in Diffusion (even if invisible)? If so, wouldn't this behavior affect Herald / Harbor Master?

For instance:

  • I create a topic branch off master with 1 commit, call arc diff (with default args, so off master as well), the diff is reviewed and approved.
  • Then instead of doing arc land (which rebases by default), I FF merge and push (*).
  • Because that commit hash has not changed once on master, Herald doesn't run the rules again and Harbor Master doesn't fire either.

(*) I never do this in practice, but for the sake of the argument, let's I do. Something kind of like that must have happened when I moved commits around on my topic branches, leading to the issue reported in this very task.

The commits are not in Diffusion. Here's a real example of the metadata we store for a commit:

{
   "23375b9732399900247188a08c81288839fc3208" : {
      "commit" : "23375b9732399900247188a08c81288839fc3208",
      "time" : "1425765276",
      "authorEmail" : "git@epriestley.com",
      "author" : "epriestley",
      "parents" : [
         "7705f452d238f85d7e3f39391ea0701696679236"
      ],
      "tree" : "b08101fecb4c56073e01b8776745029da3b7521c",
      "summary" : "WIP",
      "message" : "WIP"
   }
}

The actual stored metadata will depend on which VCS, client version, extensions, fields, etc., you're using.

This metadata does not affect Herald or Harbormaster. It's just a JSON dictionary of metadata attached to the DifferentialDiff object.

This data is stored in the differential_diffproperties table.

Only metadata is stored (and only a subset of possible metadata) -- the actual changes in the commit are not.

Basically, the metadata is sufficient to generate the "Local Commits" table and little else.

After D15989, this dialog should properly include information about tree hash and commit hash matches. It could still be more clear, but appears to at least be working as intended now.

This could still be improved, but the immediate confusion we encountered appears fixed.

eadler added a project: Restricted Project.Aug 5 2016, 5:23 PM

Caught another case of this: