
Improve prose diffs for changes spanning very large blocks of intermediate text
ClosedPublic

Authored by epriestley on Nov 16 2016, 6:05 PM.
Tags
None
Referenced Files
F13090931: D16881.diff
Thu, Apr 25, 2:39 AM
Subscribers
None

Details

Summary

Ref T7643. The failure case described in T7643#200778 is a change, followed by more than 128 sentences, followed by another change.

Because the coarsest level is "split on sentences", this hits the maximum length guard and the engine just gives up, marking the whole diff as changed.

Add a new level 0 for splitting on paragraphs. This allows us to accommodate a greater range of reasonable input texts.

This will still fail for a change, followed by more than 128 paragraphs, followed by another change. But hopefully that's outside the realm of cases which we reasonably need to handle.

(Because a "paragraph" here is "text between newlines", some types of text may have a lot of "paragraphs" and we may need to continue tweaking this: for example, remarkup tables or inline code blocks.)

Also, reduce the amount of work we do after hitting an internal limit.
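
The levels described above can be sketched roughly as follows. This is a minimal Python illustration, not the libphutil PHP code: the names `MAX_PIECES`, `split_level` and `prose_diff` are hypothetical, `difflib.SequenceMatcher` stands in for whatever diff algorithm the engine really uses, and the exact semantics of the 128-piece guard are an assumption based on this summary.

```python
import re
from difflib import SequenceMatcher

MAX_PIECES = 128  # assumed guard: longest changed run we try to diff at one level

# Hypothetical levels: 0 = paragraphs (text between newlines), 1 = sentences, 2 = words.
PATTERNS = [r'(\n+)', r'([.!?]+\s+)', r'(\s+)']


def split_level(text, level):
    """Split text into pieces, gluing each delimiter onto the piece before it."""
    parts = re.split(PATTERNS[level], text)
    pieces = []
    for i in range(0, len(parts), 2):
        piece = parts[i] + (parts[i + 1] if i + 1 < len(parts) else '')
        if piece:
            pieces.append(piece)
    return pieces


def changed(old, new):
    """Mark old as removed and new as added, skipping empty sides."""
    return [(op, text) for op, text in (('-', old), ('+', new)) if text]


def prose_diff(old, new, level=0):
    """Return a list of (op, text) pairs, where op is '=', '-' or '+'."""
    if old == new:
        return [('=', old)] if old else []
    if not old or not new or level >= len(PATTERNS):
        return changed(old, new)

    a, b = split_level(old, level), split_level(new, level)

    # Trim the common prefix and suffix so the guard only sees the changed span.
    pre = 0
    while pre < len(a) and pre < len(b) and a[pre] == b[pre]:
        pre += 1
    suf = 0
    while suf < len(a) - pre and suf < len(b) - pre and a[-1 - suf] == b[-1 - suf]:
        suf += 1
    mid_a, mid_b = a[pre:len(a) - suf], b[pre:len(b) - suf]

    out = []
    if pre:
        out.append(('=', ''.join(a[:pre])))
    if len(mid_a) > MAX_PIECES or len(mid_b) > MAX_PIECES:
        # Limit hit: give up on this span and do no further refinement.
        out += changed(''.join(mid_a), ''.join(mid_b))
    else:
        matcher = SequenceMatcher(None, mid_a, mid_b, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            o, n = ''.join(mid_a[i1:i2]), ''.join(mid_b[j1:j2])
            if tag == 'equal':
                out.append(('=', o))
            else:
                out += prose_diff(o, n, level + 1)  # refine at the next finer level
    if suf:
        out.append(('=', ''.join(a[len(a) - suf:])))
    return out
```

In this sketch, calling `prose_diff(old, new, level=1)` mimics the old behavior of starting at sentences, while the default `level=0` starts at paragraphs. Two edits separated by 200 unchanged sentences grouped into 10 paragraphs trip the guard in the first case but not in the second.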

Test Plan

Added failing unit test; made it pass.

Diff Detail

Repository
rPHU libphutil
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

epriestley retitled this revision to Improve prose diffs for changes spanning very large blocks of intermediate text.
epriestley updated this object.
epriestley edited the test plan for this revision.
epriestley added a reviewer: chad.
chad edited edge metadata.
This revision is now accepted and ready to land. Nov 16 2016, 6:07 PM
This revision was automatically updated to reflect the committed changes.