Improve performance of remarkup block splitting
Closed, Resolved · Public

Description

See PHI1628, which provides a 4MB block of remarkup that takes a fairly long time to render.

Although this use case is bogus (attaching a hundred pages of details to Harbormaster unit results), we take a fairly long time to process a relatively reasonable 4MB input. The profile makes it look like much of this time can be saved.

Event Timeline

epriestley triaged this task as Wishlist priority. Feb 4 2020, 9:23 PM
epriestley created this task.

To attempt to justify micro-optimizing this loop:

foreach ($blocks as $key => $block) {
  // Extract each block's lines from the full line list, then join them
  // back into a single string for that block.
  $lines = array_slice($text, $block['start'], $block['num_lines']);
  $blocks[$key]['text'] = implode('', $lines);
}

...we can avoid an array_slice() and an implode() per block by looping manually (sketched below). I'd expect this to be a little faster (somewhat less function call overhead) when many blocks are small, particularly when they're a single line long. This is common in the particular corpus in PHI1628 and "probably" common in general.
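
For reference, a minimal sketch of the unrolled version, assuming $text is the zero-indexed array of source lines and the $block fields match the snippet above (this is an illustration, not the committed change):

foreach ($blocks as $key => $block) {
  $start = $block['start'];
  $end = $start + $block['num_lines'];

  // Build each block's text by appending lines directly, avoiding the
  // per-block array_slice() and implode() calls.
  $text_block = '';
  for ($ii = $start; $ii < $end; $ii++) {
    $text_block .= $text[$ii];
  }

  $blocks[$key]['text'] = $text_block;
}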

I believe array_slice() doesn't have any interesting performance characteristics. Some quick testing suggests it's ~2x faster than constructing the same array via iteration at all array sizes, but the absolute cost is only ~21ms at N=1M, so this isn't significant either way.
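
A rough harness for this kind of measurement might look like the following; absolute numbers depend on PHP version and hardware, so treat the ~2x and 21ms figures above as ballpark:

$n = 1000000;
$list = array_fill(0, $n, 'x');

// Copy via array_slice().
$t = microtime(true);
$copy = array_slice($list, 0, $n);
printf("array_slice: %.1fms\n", 1000 * (microtime(true) - $t));

// Copy via manual iteration.
$t = microtime(true);
$copy = array();
for ($ii = 0; $ii < $n; $ii++) {
  $copy[] = $list[$ii];
}
printf("iteration:   %.1fms\n", 1000 * (microtime(true) - $t));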

Long ago, implode() was much faster than repeated concatenation when building a large string, but I can't really reproduce that in modern PHP. Even when joining ~250K elements, concatenation is only ~50% slower.
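
A similar sketch for the implode()-vs-concatenation comparison (again, illustrative only; the ~250K element count mirrors the test above):

$parts = array_fill(0, 250000, "line of text\n");

// Join via implode().
$t = microtime(true);
$joined = implode('', $parts);
printf("implode: %.1fms\n", 1000 * (microtime(true) - $t));

// Join via repeated concatenation.
$t = microtime(true);
$joined = '';
foreach ($parts as $part) {
  $joined .= $part;
}
printf("concat:  %.1fms\n", 1000 * (microtime(true) - $t));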

This kind of optimization is "dangerous" in the sense that XHProf tends to over-report function call overhead, so unrolling is probably doing a lot less good than the profile suggests (and may even do harm). Still, "unrolling" this loop looks unlikely to introduce any new pathologically bad cases, even with silly, pathological inputs.