Page MenuHomePhabricator

Detect file renames in incremental diff display
Open, LowPublic

Description

My general workflow when reviewing code is (1) request changes with inline comments, (2) wait for new revision, (3) load incremental diff view with inline comments, (4) examine the changes made in response to each comment.

This breaks down when a new file gets renamed between revisions because Differential considers the two files to be fully independent. Incremental diff view is not able to show me any of the changes between the two versions. Using an external diff tool is a poor substitute because it doesn't allow me to see my inline comments next to the changes.

It would be awesome if Differential could detect this case and show a normal incremental diff.

Event Timeline

We currently rely on the VCS to determine when files are moved. There's at least one case where the VCS clearly can't provide anything useful in this regard:

  • Diff 1 adds A.cpp;
  • Diff 2 renames A.cpp to B.cpp in response to reviewer feedback.

In this case, we can't get anything useful out of the VCS in most reasonable circumstances. So implementation probably looks like:

  • Implement Git's moved/copied file heuristics (or broadly similar moved file heuristics) in Phabricator.
  • Then do the other bits.

(This is clearly desirable, but has relatively high complexity and fairly low impact.)

You don't necessarily need to implement rename detection yourself. If you have the full git commit for each revision, you can just run "git diff HASH1 HASH2" and look for renames. If you don't have the full commit, you could try "arc patch"ing each revision then running diff on those.

I suspect git diff HASH1 HASH2 will not yield desirable results in Mercurial or Subversion working copies.

There are a bunch of ways we can fake this for a subset of cases, but they'll all break down in a large array of other common cases (they'll get us 30% of the way there, not 95%).

We don't currently have the ability to perform working copy operations from Phabricator; it's on the roadmap but complicated in the case of large, active repositories (like Facebook).

chad changed the visibility from "All Users" to "Public (No Login Required)".Jul 3 2015, 4:43 AM

Mercurial does have some support for detecting moves, though I believe it requires analyzing a working directory's changes, and can't operate on specified revisions already committed, so I'm not sure of the value here.

$ hg help addremove
hg addremove [OPTION]... [FILE]...

add all new files, delete all missing files

    Add all new files and remove all missing files from the repository.

    New files are ignored if they match any of the patterns in ".hgignore". As
    with add, these changes take effect at the next commit.

    Use the -s/--similarity option to detect renamed files. This option takes
    a percentage between 0 (disabled) and 100 (files must be identical) as its
    parameter. With a parameter greater than 0, this compares every removed
    file with every added file and records those similar enough as renames.
    Detecting renamed files this way can be expensive. After using this
    option, "hg status -C" can be used to check which files were identified as
    moved or renamed. If not specified, -s/--similarity defaults to 100 and
    only renames of identical files are detected.

    Returns 0 if all files are successfully added.

Then roughly it might look like

  1. Apply Diff 1 with --no-commit
  2. Run hg addremove -s
  3. Parse output of hg status --change to read whether status is A/R/M for the files.
  4. Finalize commit, repeat to step 1 for Diff 2..
  5. Strip all commits?
eadler added a project: Restricted Project.Sep 15 2016, 6:11 PM