A common feature request is for inline comments to move forward across diffs as a revision is updated.
For example, if a user leaves a comment on line 10 of "example.c" in diff 1, it's desirable for that comment to still appear on line 10 of "example.c" in diff 2.
Why is this hard?
In the general case, it is impossible to port inline comments forward.
Diff 1 and Diff 2 can affect completely different files. In a more extreme case, the files they affect can be in completely different repositories.
For example, Diff 1 might affect "example.js", making a behavioral change. After discussion, perhaps a decision is made to document the behavior instead of changing it, so Diff 2 affects only "feature.md". We clearly can't bring comments on "example.js" forward to "feature.md".
In many common cases, it is difficult to port inline comments forward.
The lines may not exist in updates. For example, Diff 1 may change a behavior in "example.js". A user makes a comment on line 115, in the middle of the changed lines. After discussion, a decision is made to simply remove the entire feature. In Diff 2, lines 50-300 are deleted and the total file is now only 55 lines long. There is no great place to put this inline, and implementing a heursitic to place it at the beginning of the removed block (which is only arguably the right behavior) is nontrivial.
The lines may also move or change between diffs, including moves or changes caused by rebases. It's not clear how well we can track lines across arbitrary changes with additional intermediate changes, especially if diffs are missing context. This is likely difficult and complex to do well. In extreme cases, it's ambiguous: if a line is replaced with three lines, and none have the same text as the original line, we can not know which of the new lines is the "new" version of the old line. We are unlikely to be able to guess nearly as well as a human can, either, which will make the result seem wrong.
Why not just do a best effort?
We could try our best, get it right in the easy cases and accept suboptimal results in the harder cases.
However, we're worried this feature will be confusing and frustrating if the behavior is inconsistent and hard to predict. If inlines sometimes port forward and sometimes disappear, that's potentially a very bad user experience.
We're also worried that this feature will frequently produce results which a human thinks are wrong (because the human reviewer understands the underlying document), even though a computer can not reasonably figure it out. For example, Diff 1 might have a change like this:
+ if (call(a, b, c)) {
Diff 2 might look like this:
+ var = call(a, b, c); + if (var) { + log(var);
Our heuristic might identify the new line 1 as the most similar to the old line 1, but a human reviewer would likely prefer the new line 2, which retains the "if" statement. For humans, the "if" statement is probably a stronger indicator of the line being "the same", even though the call has more similar text. We can't know about this rule without encoding a lot of programming language details into the heursitic.
But, worse than that, humans would actually choose a new line based on the natural language text content of the inline comment. For example, if the text is "consider using a temporary variable to make this clause easier to read", humans would choose line 2. If the text is "the parameter b should actually be q", humans would choose line 1. In 2015, we can not possibly hope to get this right.
So this potentially leaves us with a feature which appears to human users like it frequently produces the wrong results.
We are very hesitant to build features like this. They make the software seem bad because it feels janky and buggy, and they create a high support burden for administrators and for the upstream: users believe the program behavior is wrong, and file bugs about it. Because the underlying algorithms for comment placement will necessarily be complex, it will not be easy for users to intuitively understand why comments are placed where they are, nor will it be easy for us to explain the placement.
Moving Forward
Implementing this feature is mostly about coming up with ways to mitigate the unpredictability, so the user experience is more understandable and consistent and the feature feels intentional rather than random and wrong.
Some ways to do this might include:
- We can create some sort of "appendix" on changesets where inline comments end up if we can't figure out where they go. This guarantees that they show up somewhere, so they never disappear into thin air.
- Similarly, we can create some sort of appendix for the diff as a whole, where comments which don't port forward into any of the files end up.
- We might be able to have some kind of explanation of why comments were placed where they are, but it's hard for me to imagine what this might look like.
- We could let users move comments which get placed "wrong", although this is a lot of work.
- We can visually make it clear that forward-ported inlines are forward-ported.
Blockers
- These changes are substantial and it most likely depend on completing T1460 first or concurrently, as that also implies a number of UI changes to how inlines work.
- We should also fix outstanding inline bugs before making inlines more complex: T6464, T3669 (I have some plans for this as part of T1460), T4999.
- Some amount of T2009 represents a technical blocker (particularly, we need to get test coverage for the giant ball of heuristics we're going to produce) and may have to happen first or concurrently, too.