Page MenuHomePhabricator

Improve prose diffs (was: description changes don't generate usable diffs)
Closed, ResolvedPublic


Remaining concerns:

  • Show Old / Show New: In the web UI, have a way to get the raw old and raw new text, in case you want to revert things or copy some of it or whatever else.
  • Mail Section Titles: In mail, all sections have an "EDIT DETAILS" header, but they should have per-field headers instead ("CHANGES TO SUMMARY", "CHANGES TO TEST PLAN" or similar).
  • Paste should Code-Diff: Paste got converted here somewhat accidentally, but is probably better as a code diff. The old behavior wasn't really particularly great (it was "half-a-code-diff", more or less) but it should probably be swapped to become more of a pure code diff. See some additional context in T11743. Prose diffing ".txt" might be OK.
  • Config should Code-Diff?: I think we have at least one "JSON BLOB" thing in Config (and maybe in Almanac?). These would be better as code diffs.
  • Plain Text Mail?: ~~Maybe try to do something with plain text mail, if we can figure out something reasonable. (No apparent interest in this or suggestions for "something reasonable", and I don't have any real ideas myself.)

Original Description

As a user, when I get an email that remarks:

Alice edited the task description

I usually want to see what the change was. It's currently a complete mystery, that requires me to:

  1. open the page in a browser,
  2. scroll down to the line where the edit was made
  3. click on "(Show details)"

There are 3 possible ways to solve this. In the email notification:

Thank you! w(O>O)w

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Yes. I think this has two general components:

  • Embed diffs in email.
  • Linking directly to changes from email.

On embedding diffs, I want to improve them before we embed them, because the current diffs are often quite bad. T3353 discusses improving them, which mostly boils down to wanting diffs of human-readable prose text to look more like this:

Currently, our algorithm produces a reasonable result in only a fairly small number of cases, and results are particularly bad when a large paragraph has a small number of changes.

For example, here I've added only one word ("hyperspace") but the algorithm has butchered the diff:

I want to fix that before putting it in email. It doesn't have to be perfect, but it shouldn't be horribly broken in like 60% of common cases.

Fixing this will fix the web UI diffs, and also allow us to put reasonable diffs in mail.

Separately, we can improve linking transactions to comments. This should wait for changes in T10694 to complete (particularly, design on D15884) but I expect them to happen shortly.

Of the two link suggestions, linking directly to the transaction is slightly tricky for various reasons. Probably slightly easier is linking to the diff dialog, and improving that page so it renders better in standalone mode (particularly, links back to the object). Basically, this would put these "(Show Detail)" links into mail:

When clicked, you'd get to a similar page to where you end up today, but it would have more context about the object:

For example, the crumbs would show the parent object and the button would have appropriate text like "Return to object" or similar (or maybe we'd make this page fully render as a complete page instead of as a dialog).

We can do these separately. I'd estimate:

  • Rough cut of prose diffs + prose diffs in email: 4 hours.
  • Detail links on edits and better destination pages: 1 hour.
NOTE: I'd expect both of these mail changes to only affect HTML mail, not plain text mail. I don't have any ideas for a reasonable way to render prose diffs in plain text email, and I think the "Details" links would add too much clutter in plain text email since they have to render as instead of (View Details). After D15885, HTML mail will be the default.
NOTE: This prose diff estimate is upgrading the current "works maybe 30% of the time" terrible garbage algorithm to a new "works 90%-95% of the time" reasonably usable algorithm, not a "works 99% of the time" really nice algorithm. You should expect that there may still be significant limitations to prose diffs where output is not as good as a human could do, e.g. edits to tables and other complex embedded elements probably won't render ideally on the first pass. We can improve this over time but I don't expect to be able to get it perfect in 4 hours, just much less bad.
urzds added a comment.May 18 2016, 2:24 PM

I think the "Details" links would add too much clutter in plain text email

Could you maybe extend the "TASK DETAILS" part of the email and link to the transaction via an #anchor? That would be sufficient for me.

We can't easily link to an #anchor from email because the transactions that render in email are different from the transactions that render on the web. For example, T10493 is a case that specifically shows a transaction in email but not on the web. Since it doesn't exist on the web, we can't easily generate an anchor to the corresponding transaction because no web version of that transaction exists. This behavior is not unique to that transaction type. Anchor generation is also more complex than this for other reasons (for example, transaction are grouped, aggregated and reordered at display time, and these changes affect which anchors are available).

This is tractable (we can send the user to some link which looks up the "most similar" anchor to an email transaction and then redirects them) but significantly more complex than linking directly to details.

A group of transactions may also have an arbitrarily large number of transactions, and an arbitrarily large number of detail links (tasks may have an unlimited number of remarkup custom fields, all of which may be edited simultaneously). Any approach we take needs to accommodate this. Adding a single link to "TASK DETAILS" does not.

epriestley moved this task from Backlog to The Queue on the Prioritized board.Jun 6 2016, 10:31 PM
epriestley triaged this task as Normal priority.
epriestley updated the task description. (Show Details)

This is pretty rough (particularly, the actual diff part), but wired up everywhere. HTML email includes a link:

This link takes you to the change details, directly:

The details are also embedded in HTML mail:

Text mail is unchanged. D16063 discusses some ideas for improving text mail, but none of them seem very good. If you have thoughts on how it could work, see that for what I came up with and let me know if you have something better.

The change details in the web UI also use the new prose diffing:

The actual quality of the prose diff is very uneven right now. This particular edit is an example where it does very poorly. It does better with some other types of edits:

Next, I'll work on identifying and refining the behavior for cases where it does fairly poorly (the diff is technically correct, but a human would prefer that it did not detect that the "a" in "says" and "remarks" was technically retained across the edit). Once I've nailed down everything I can find, we can start collecting weird cases it still misses and I can see what we can do to improve them.

The old diff was frequently so bad that I think this is essentially always better than it was, even given the current cases where it doesn't do a great job of producing human-readable diffs, but I expect to be able to improve it to get pretty good output about 90-95% of the time.

Diffs should be less choppy now, here's the improved version of the above change:

Diffs should be less choppy now, here's the improved version of the above change:

Very nice! Thank you for your work in this area.

I'm sure there are still some cases where we don't do a particularly good job, but let me know when you run into them.

Some other stuff I'm thinking about:

  • Make unchanged text grey to de-emphasize it, particularly in email?
  • Possibly fold/hide large chunks of unchanged text, particularly leading and trailing text, and particularly in email?
  • In the web UI, have a way to get the raw old and raw new text, in case you want to revert things or copy some of it or whatever else.
  • In mail, all sections have an "EDIT DETAILS" header, but they should have per-field headers instead ("CHANGES TO SUMMARY", "CHANGES TO TEST PLAN" or similar).
  • Maybe try to do something with plain text mail, if we can figure out something reasonable.

It would also be nice to develop an inline version of this for title changes like this:

T9233 is vaguely related.

This one feels a bit un-human-like, although it might be tricky to fix:

Oh, one more thing:

  • Paste got converted here somewhat accidentally, but is probably better as a code diff. The old behavior wasn't really particularly great (it was "half-a-code-diff", more or less) but it should probably be swapped to become more of a pure code diff.
  • We currently use white-space: pre-wrap to handle linebreaks in prose diffs, but at least one client (Airmail) doesn't handle that properly. We could use explicit <br /> to improve behavior.
epriestley updated the task description. (Show Details)Jun 10 2016, 3:57 PM

Large diffs like wiki pages should render better in email now:

Catching a couple more funky diffs:

This one looks like an excessive smoothing after a comma was removed:

This one is okay but it's a single deleted space that would be nice to mark as diffed less aggressively:

epriestley renamed this task from Email notification for "[user] edited the task description" does not contain a link to the edit to Improve prose diffs (was: description changes don't generate usable diffs).Jul 16 2016, 1:51 PM
epriestley removed a project: Mail.
epriestley updated the task description. (Show Details)Nov 7 2016, 11:09 PM

We still do a pretty questionable job on the first case in T7643#180264, where a comma is removed from the middle of a sentence.

I fiddled with this for a while and wasn't able to make much progress by just tweaking the existing algorithm. I have some ideas for changing how the algorithm works a little bit that I think may improve this case, and also make the algorithm a bit easier to debug. I'm going to take a shot at these improvements and see how far I get.

That seemed to just work? Slightly disconcerting.

chad added a comment.Nov 10 2016, 8:49 PM

unit tests ftw

epriestley moved this task from The Queue to Paused on the Prioritized board.Nov 10 2016, 8:57 PM

I believe this is now in reasonable shape, in the sense that the overwhelming majority of prose inputs I'm aware of produce reasonable, human-readable diffs. They aren't perfect in all cases and aren't necessarily the diffs that a human would produce, but they're useful and fairly sensible.

If you're aware of cases which we still get wrong, please let us know (after updating to include D16839, which should fixed several of the cases we did the worst on). I imagine we'll also continue tweaking the algorithm over time as weird cases arise.

If there don't seem to be any major remaining quality issues for a little while, I'll consider this resolved and move remaining work to followups. Particularly:

  • Paste (and likely Config/Almanac?) should go back to code diffs at some point, but prose diffs usually aren't too bad for them.
  • We don't link to the comment/transaction in mail, because this is difficult in the general case. This is discussed in greater detail is T11529. (This isn't really specific to prose diffs, anyway.)
epriestley closed this task as Resolved.Nov 16 2016, 5:15 PM
  • D16853 tweaked one issue that I caught.
  • T11881 is a followup for restoring code diffs in a small set of scenarios.
  • I haven't seen other issues for a while, so I'm going to consider this resolved until they crop up. If you do run into cases where the new algorithm isn't doing a great job, let us know.

@epriestley One issue that is still open is adding a change to the beginning and to the end of a large prose text. The diff of that becomes unusable. This is still reproducable with the current stable branch )(fc71a7e92dc26e0d93ec33d709391411d2bcd827 (Mon, Nov 14)). For details / reproduction steps please see T11825.

Ah. I believe I fixed a simpler version of that case earlier, but not the specific case you describe. D16881 should fix that case.

Note that the algorithm gives up eventually, because computing a diff would be too expensive. In the general case, adding "a" at the beginning and "z" at the end of a very large block of unchanged text will eventually hit the internal limits and render an "everything changed, good luck" diff. However, the amount of text required to hit that limit should now be more reasonable (substantially more than 2,000 words of lorem ipsum):

@epriestley thank you very much for the fast update and fix. It is working now with my example, and should be much better with a lot of pages.

I am a bit worried however about the 128 paragraph limit you mention in D16881. Though this should indeed cover a lot of pages, it explicitly leaves out those for which this issue is the most severe - very long pages :-) I did a quick check in our wiki and I find a lot of pages with more than 128 newlines, e.g. tutorial pages or pages with lots of lists (meeting minutes, check lists, etc).

Some examples that still trigger the problem:

PageByte CountWord CountTotal linesNon-empty Lines
Some meeting notes5976890212147
Phabricator Starmap132732019479259

With the current approach I guess there will always be a limit. I don't know about the resource consumption of the algorithm and whether the default limit could be increased. But maybe it makes sense to have the default user-configurable? I.e. I feel multiplying it by two would be a good initial value for our setup.

(Please let me know if I should open a new task for this topic (or reopen T11825).

Feel free to file a new task when you encounter actual problems with real text and we can look at adjusting the limit, adding an additional block-level difference phase, rewriting PhutilEditDistanceMatrix in C (see T2312), or other approaches. We can make a better decision based on concrete examples of where the current solution is falling short than by guessing where issues might arise.

I think we are unlikely to add an option for two reasons:

  • I am generally hesitant to add options, for reasons discussed in T8227.
  • Because runtime complexity is explosive, I worry it would be difficult for users to select a value for this option which is low enough to avoid performance issues.

If you are confident that adjusting the limit is the best solution for you, you can modify the constant in src/utils/PhutilProseDifferenceEngine.php, near line 14, by changing the value passed to setMaximumLength(...).

Twilight removed a subscriber: Twilight.Jul 16 2017, 2:09 PM
joshuaspence added a subscriber: joshuaspence.EditedNov 21 2018, 5:44 AM

I do have a real-life example where the prose diff engine rendered a suboptimal diff. I was able to fix it by changing the maximum length of the edit distance matrix from 128 to 256.