
Don't compute intraline diffs if the input fails a coarse check for being huge
ClosedPublic

Authored by epriestley on Oct 7 2016, 2:32 PM.
Tags
None
Subscribers
None

Details

Summary

Fixes T11744. Because intraline diffs are expensive to generate, we already bail out and decline to generate them for very long lines.

However, we currently split the inputs into lists of characters first, and only then check how long they are and decide whether to bail. For huge inputs (e.g., 1MB+), this is too late: just splitting them has a large CPU/RAM cost.

(These inputs are rare in normal source, but can appear in, e.g., JSON files written without newlines.)

Instead, add an extra "are the inputs really huge?" check first, and bail early if they are.
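The ordering described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Arcanist (PHP) change itself: the function name, both thresholds, and the use of `difflib` as a stand-in for the intraline diff algorithm are all assumptions made for the example.

```python
import difflib
from typing import List, Optional, Tuple

# Hypothetical limits for illustration; the real thresholds are not stated here.
COARSE_MAX_BYTES = 100_000  # cheap check on the raw, unsplit input
FINE_MAX_CHARS = 10_000     # check applied after splitting into characters

Opcode = Tuple[str, int, int, int, int]


def intraline_diff(old: bytes, new: bytes) -> Optional[List[Opcode]]:
    """Return character-level edit opcodes, or None if we decline to diff."""
    # Coarse check first: len() on bytes is O(1) and allocates nothing,
    # so huge inputs are rejected before any expensive work happens.
    if len(old) > COARSE_MAX_BYTES or len(new) > COARSE_MAX_BYTES:
        return None

    # Only now pay the CPU/RAM cost of building per-character lists.
    old_chars = list(old.decode("utf-8", errors="replace"))
    new_chars = list(new.decode("utf-8", errors="replace"))

    # The finer length check on the split lists still applies.
    if len(old_chars) > FINE_MAX_CHARS or len(new_chars) > FINE_MAX_CHARS:
        return None

    # difflib stands in for the real intraline diff algorithm.
    matcher = difflib.SequenceMatcher(None, old_chars, new_chars, autojunk=False)
    return matcher.get_opcodes()
```

A caller would treat a `None` result as "render the lines without intraline highlighting", which is what already happens for very long lines.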

Test Plan
  • Generated a 1MB "change a file full of Q to a file full of R" diff.
  • Before change: purged changeset cache; took about 7 seconds to load.
  • After change: purged changeset cache; took about 1 second to load.
  • Viewed some normal diffs to make sure intraline edits still displayed correctly.
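
One way to produce inputs like the ones in the first step is sketched below. The helper names and file names are hypothetical, and the original test may well have built its files differently; the only details taken from the plan are the 1MB size and the Q-to-R change.

```python
from pathlib import Path
from typing import Tuple

ONE_MB = 1024 * 1024


def write_single_line_file(path: Path, char: str, size: int = ONE_MB) -> Path:
    """Write `size` copies of `char` to `path` as one line with no newline."""
    path.write_text(char * size, encoding="ascii")
    return path


def make_fixture_pair(directory: Path) -> Tuple[Path, Path]:
    """Create the 'before' (all Q) and 'after' (all R) versions of the file."""
    before = write_single_line_file(directory / "before.txt", "Q")
    after = write_single_line_file(directory / "after.txt", "R")
    return before, after
```

Committing the first file and then replacing its contents with the second yields a change whose single line differs across the whole 1MB.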

Diff Detail

Repository
rARC Arcanist
Lint
Lint Not Applicable
Unit
Tests Not Applicable