Page MenuHomePhabricator

Subversion directory anchoring rules are unclear and underpowered
Open, Needs TriagePublic

Description

Consider D14294, where the final diff has two lines added.

If you look at the first incremental diff (https://secure.phabricator.com/D14294?vs=on&id=34503), it correctly shows only the single line added.

If you look at the second incremental diff (https://secure.phabricator.com/D14294?vs=34503&id=34504), it incorrectly shows both lines as added instead of showing just the second line.

Unless I'm misunderstanding, I should be able to get a diff which just shows frisee, but all the diffs I seem to be able to generate with that line also include peanut butter.

Event Timeline

epriestley claimed this task.
epriestley added a subscriber: epriestley.

When we generate a synthetic diff between diffs "A" and "B", we just use diff and do not apply any move or rename detection on top of it.

The diff is correct, it just doesn't recognize that you moved the file between the diffs. Neither diff "A" nor diff "B" independently encode this move, and no VCS tracks this particular kind of move (a new, uncommitted file moved around locally).

We could theoretically apply a git-like move/rename detection algorithm to diffs between diffs, but this would sometimes produce a different result than what you expected (i.e., different than the sequence of svn operations you performed). The svn operations you performed locally on uncommitted files are lost forever. For example, if you do this:

$ svn add x.txt
$ svn mv x.txt y.txt
$ svn mv y.txt z.txt

...there is no command we can run which produces a diff showing what you did, and as far as Subversion is concerned this sequence of actions is the same as this one:

$ mv x.txt z.txt
$ svn add z.txt

I don't have any plans to try to apply move/rename heuristics between diffs. This would be complex and I think it would rarely be especially valuable.

Hmmm, I didn't actually move any files. Maybe this is a derivate of the issue in T9188 then?

I ran the following:

  1. svn checkout 'svn+ssh://sshannin@secure.phabricator.com/diffusion/SVNTEST/dir2' dir2
  2. cd dir2
  3. pico yums (add first line)
  4. svn add yums
  5. arc diff
  6. pico yums (add second line)
  7. arc commit

I agree with you that the moving is a whole different game, but none of that was going on here.

With that in mind, I hope it's not too forward of me to reopen this. Perhaps it should instead be closed as a dupe of T9188? Or maybe they're not really related.

Aah, okay, I misunderstood.

Before digging into this in more detail, I'd expect putting a .arcconfig file at dir2/ to fix this: at least at one point, this would cause arc to anchor all changes from subdirectories to be relative to dir2/. In the real-world scenario where you've hit this problem, do you have an .arcconfig file in some parent directory?

Ahh, looks like not. We'd been slowly phasing out .arcconfigs since the removal or arcanist projects.

Adding one is probably the easiest workaround for now.

This is probably similar to or identical to T9188, but I'd imagine we might "fix" that by just showing a warning ("These diffs come from different roots, put an .arcconfig at the root of your project if you want a stable root directory.") rather than trying to adjust the paths (which may not always be possible, in the general case).

There's a broader version of this problem where diff 1 is from phabricator/ and gets feedback like "This looks good, but I think a better fix is in the underlying library." and diff 2 is from libphutil/. The intra-diff diff is totally meaningless since the sources of the changes are entirely different libraries. This is usually more obvious from context than the SVN case, but we can't plausibly do anything about it except show the user some kind of warning.

Seems like that didn't quite get it yet (see D14301). Below is my transcript:

seth@phab-vm:~$ svn co svn+ssh://sshannin@secure.phabricator.com/diffusion/SVNTEST/dir2 dir2
A    dir2/lines
A    dir2/yums
Checked out revision 4.
seth@phab-vm:~$ cd dir2
seth@phab-vm:~/dir2$ pico .arcconfig
seth@phab-vm:~/dir2$ svn add .arcconfig 
A         .arcconfig
seth@phab-vm:~/dir2$ pico yums 
seth@phab-vm:~/dir2$ arc diff


    You have not specified any reviewers. Continue anyway? [y/N] y

Linting...
No lint engine configured for this project.
Running unit tests...
No unit test engine is configured for this project.
 SKIP STAGING  Phabricator does not support staging areas for this repository.
Created a new Differential revision:
        Revision URI: https://secure.phabricator.com/D14301

Included changes:
  A       .arcconfig
  M       yums
seth@phab-vm:~/dir2$ pico yums 
seth@phab-vm:~/dir2$ arc commit
Usage Exception: Unable to identify the revision in the working copy. Use '--revision <revision_id>' to select a revision.
seth@phab-vm:~/dir2$ arc commit --revision D14301


    Revision 'D14301: another incremental test for svn' has not been
    accepted. Commit this revision anyway? [y/N] y

Committing 'D14301: another incremental test for svn'...
Adding         .arcconfig
Sending        yums
Transmitting file data ..
Committed revision 5.
Done.
seth@phab-vm:~/dir2$

As with previous diff, I added one line the first time I edited yums and another line the second time.

It looks like it still has trouble properly rooting the files. Maybe a good way forward here is to always use their absolute path in the repo instead of trying to tie them to some subdir?

It looks like it still has trouble properly rooting the files. Maybe a good way forward here is to always use their absolute path in the repo instead of trying to tie them to some subdir?

We don't do that because in some reasonable setups it would show every file as /repo/projects/tfb/www/trunk/main/production/master/path/to/file.c, which would make the experience much worse overall than occasionally getting diffs-of-diffs wrong.

Yeah. gotcha. At risk of suggestion solutions and vs. describing problems, certainly the way I think about it is:

  1. What is the common prefix of all these file changes (i.e. /repo/projects/tfb/www/trunk/main/production/master/) aka where in the tree did I invoke arc diff?
  2. What are the sub-paths w.r.t. the prefix of each file changed (i.e. path/to/file.c)?

This is seems close to what phab already does for the most part (it presumably knows the absolute path of the files in the repo since it can point to them in diffusion). It just seems to be inconsistent about where it uses which and ends up comparing the absolute and relative paths at a commit time. Maybe if things were always phrased in terms of the absolute path (and you can strip out the prefix when displaying each file path) it would make things simpler? I don't know.

Let me know if there's anything information I can provide or test for you that helps here.

epriestley renamed this task from wrong incremental change with subversion commit to Subversion directory anchoring rules are unclear and underpowered.Oct 24 2015, 3:11 PM
epriestley removed epriestley as the assignee of this task.