Improve prose diffs for changes spanning very large blocks of intermediate text

Authored by epriestley on Nov 16 2016, 6:00 PM.

Description

Summary:
Ref T7643. The failure case described in T7643#200778 is a change, followed by more than 128 sentences, followed by another change.

Because the coarsest level is "split on sentences", this case hits the maximum length guards and the engine gives up, marking the whole diff as changed.

Add a new level 0 for splitting on paragraphs. This allows us to accommodate a greater range of reasonable input texts.

This will still fail for a change, followed by more than 128 paragraphs, followed by another change, but hopefully that's outside the realm of cases we reasonably need to handle.

(Because a "paragraph" here is "text between newlines", some types of text may have a lot of "paragraphs" and we may need to continue tweaking this: for example, remarkup tables or inline code blocks.)
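The level structure described above can be sketched roughly as follows. This is an illustrative Python sketch built on difflib, not the engine changed by this revision. The names prose_diff, LEVELS, MAX_PIECES and _changed are hypothetical, the splitting regexes are simplifications, and it assumes that identical leading and trailing pieces are trimmed first so that only the span between the first and last change counts against the 128-piece guard.

```python
import difflib
import re

MAX_PIECES = 128  # assumed guard size, taken from the "128" in the summary

# Splitters ordered from coarsest to finest (all hypothetical). Each one keeps
# its delimiters, so "".join(pieces) reproduces the input text exactly.
LEVELS = [
    lambda t: re.findall(r"[^\n]*\n+|[^\n]+", t),            # level 0: paragraphs
    lambda t: re.findall(r".*?[.!?]+\s+|.+", t, flags=re.S),  # level 1: sentences
    lambda t: re.findall(r"\s+|\S+", t),                      # level 2: words
]


def _changed(old, new):
    """Mark a whole span as removed and/or added."""
    return [(op, text) for op, text in (("-", old), ("+", new)) if text]


def prose_diff(old, new, level=0):
    """Return a list of (op, text) pairs, where op is '=', '-' or '+'."""
    if old == new:
        return [("=", old)] if old else []
    if not old or not new or level >= len(LEVELS):
        return _changed(old, new)

    a, b = LEVELS[level](old), LEVELS[level](new)

    # Trim identical pieces from both ends of the two piece lists.
    shortest = min(len(a), len(b))
    head = 0
    while head < shortest and a[head] == b[head]:
        head += 1
    tail = 0
    while tail < shortest - head and a[-1 - tail] == b[-1 - tail]:
        tail += 1
    mid_a, mid_b = a[head:len(a) - tail], b[head:len(b) - tail]

    out = [("=", piece) for piece in a[:head]]
    if max(len(mid_a), len(mid_b)) > MAX_PIECES:
        # Guard hit: give up and mark the whole middle span as changed.
        out += _changed("".join(mid_a), "".join(mid_b))
    else:
        matcher = difflib.SequenceMatcher(None, mid_a, mid_b, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                out += [("=", piece) for piece in mid_a[i1:i2]]
            else:
                # Refine the changed span at the next, finer level.
                out += prose_diff(
                    "".join(mid_a[i1:i2]), "".join(mid_b[j1:j2]), level + 1
                )
    out += [("=", piece) for piece in a[len(a) - tail:]]
    return out


# A change, then 200 sentences spread over 20 paragraphs, then another change.
middle = "".join(
    " ".join(f"Filler {p}-{s}." for s in range(10)) + "\n" for p in range(20)
)
old_text = "The quick fox.\n" + middle + "The lazy dog.\n"
new_text = "The fast fox.\n" + middle + "The sleepy dog.\n"

with_paragraphs = prose_diff(old_text, new_text)          # starts at level 0
sentences_only = prose_diff(old_text, new_text, level=1)  # skips level 0
```

In this sketch, starting at the sentence level puts 202 pieces between the two changes, so the guard trips and the whole text is marked as changed. Starting at the paragraph level leaves only 22 pieces, so each changed paragraph can be refined down to the individual words that differ.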

Also, reduce the amount of work we do after hitting an internal limit.

Test Plan: Added failing unit test; made it pass.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T7643

Differential Revision: https://secure.phabricator.com/D16881