
Make prose diff algorithm more iterative, to improve prose diffs for (among other things) removed commas
ClosedPublic

Authored by epriestley on Nov 10 2016, 8:35 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T7643. This is a little hard to explain, but previously we would do this:

  • Diff paragraphs.
  • For each different paragraph, diff sentences.
  • For each different sentence, diff characters.

Now, we do this:

  • Diff paragraphs.
  • Collect all the identical, purely added, and purely removed paragraphs and set them aside. We know we should have good diffs for these already.
  • What's left over is sequences of removed/added/changed paragraphs, which we may not have great diffs for yet. Smush these together into big diff blocks.
  • Now, for these blocks, diff sentences.
  • Repeat all of that to diff characters.
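The steps above can be sketched roughly as follows. This is a minimal Python illustration, not the libphutil implementation: the splitter functions, the `LEVELS` list, and `prose_diff` are hypothetical names, the sentence and paragraph splitting rules are simplified assumptions, and Python's `difflib.SequenceMatcher` stands in for whatever diff engine the real code uses. It relies on the fact that a `replace` opcode from `SequenceMatcher` already groups a run of adjacent removed/added/changed pieces into one block.

```python
import re
from difflib import SequenceMatcher

# Hypothetical splitters, coarsest first. Each returns pieces that
# concatenate back to the original input text.
def paragraphs(text):
    return re.findall(r".+?(?:\n\n+|\Z)", text, flags=re.S)

def sentences(text):
    return re.findall(r".+?(?:[.!?]+\s+|\Z)", text, flags=re.S)

def characters(text):
    return list(text)

LEVELS = [paragraphs, sentences, characters]

def prose_diff(old, new, levels=LEVELS):
    """Return a list of (op, text) pairs, where op is '=', '-' or '+'."""
    split, finer = levels[0], levels[1:]
    old_parts, new_parts = split(old), split(new)
    matcher = SequenceMatcher(None, old_parts, new_parts, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_text = "".join(old_parts[i1:i2])
        new_text = "".join(new_parts[j1:j2])
        if tag == "equal":
            # Identical pieces: set aside, the diff is already good.
            result.append(("=", old_text))
        elif tag == "delete":
            # Purely removed pieces: set aside.
            result.append(("-", old_text))
        elif tag == "insert":
            # Purely added pieces: set aside.
            result.append(("+", new_text))
        elif finer:
            # A run of changed pieces, joined into one block and
            # re-diffed at the next finer level.
            result.extend(prose_diff(old_text, new_text, finer))
        else:
            # Finest level reached: report the change as-is.
            result.append(("-", old_text))
            result.append(("+", new_text))
    return result

if __name__ == "__main__":
    old = "Hello, world. Bye.\n\nSame paragraph."
    new = "Hello world. Bye.\n\nSame paragraph."
    for op, text in prose_diff(old, new):
        print(op, repr(text))
```

In this sketch, only the blocks that are still ambiguous at one level get handed down to the next, so a removed comma shows up as a single-character removal rather than a whole rewritten sentence.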

This seems to pass all the existing unit tests, and it also passes new unit tests that I was previously unable to get passing by fiddling with things without changing the algorithm.

Test Plan

Passed existing unit tests. Passed new unit tests.

Diff Detail

Repository
rPHU libphutil
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

epriestley retitled this revision to Make prose diff algorithm more iterative, to improve prose diffs for (among other things) removed commas.
epriestley updated this object.
epriestley edited the test plan for this revision.
epriestley added a reviewer: chad.
chad edited edge metadata.

I bet all the new select UI looks great in Safari!

This revision is now accepted and ready to land. Nov 10 2016, 8:37 PM
epriestley edited edge metadata.
  • Smaller diff.

Seems fine to me? Maybe my special monitor calibration is hiding the badness?

(I made them all look like Chrome)

This revision was automatically updated to reflect the committed changes.