Page MenuHomePhabricator

Speed up DiffusionBrowseFileController by removing call to array_merge
ClosedPublic

Authored by jhurwitz on May 10 2014, 12:00 AM.
Tags
None
Referenced Files
F14371753: D9032.id21455.diff
Fri, Dec 20, 7:15 PM
Unknown Object (File)
Thu, Dec 12, 7:07 PM
Unknown Object (File)
Sun, Dec 8, 6:09 PM
Unknown Object (File)
Sun, Dec 8, 11:51 AM
Unknown Object (File)
Sat, Dec 7, 12:24 PM
Unknown Object (File)
Wed, Dec 4, 9:59 PM
Unknown Object (File)
Wed, Dec 4, 4:13 PM
Unknown Object (File)
Wed, Dec 4, 4:13 PM
Subscribers
Tokens
"Doubloon" token, awarded by btrahan."Mountain of Wealth" token, awarded by epriestley.

Details

Summary

Some profiling using XHProf in the Dark Console showed me that Diffusion was wasting a ton of time on array_merge. This change sped up the loading of a large file in Diffusion from 16.8 seconds to 2.4 seconds.

Test Plan

Load files in Diffusion. They all look good. Also, use a PHP shell to try to manually verify that I still kinda remember some PHP and, yes, this is functionally equivalent to what was there before.

Diff Detail

Repository
rP Phabricator
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

jhurwitz retitled this revision from to Speed up DiffusionBrowseFileController by moving array_merge behind a conditional.
jhurwitz updated this object.
jhurwitz edited the test plan for this revision. (Show Details)
jhurwitz added a reviewer: epriestley.

Really nice catch!

Update libphutil/ to clear lint. Those symbols were introduced in a more recent version. If you can't update /usr/local/libphutil, you can put development copies of arcanist/ and libphutil/ next to one another somewhere, then run that version of arc to get clean lint.

This will probably perform even faster:

$rows = array();
foreach ($stuff as $thing) {
  // ...

  $rows[] = $this->renderInlines(...);

  // ..
}

return array_mergev($rows);

Can you try that?

(Note that the final call is array_mergev() not array_merge().)

Oh look at this helpful documentation that some kind soul (who must spend all his time frustrated with PHP) wrote! https://secure.phabricator.com/book/phabflavor/article/php_pitfalls/

Thanks, will update this diff.

Wait, how does your suggestion work? This line means that $rows would be a mix of array objects and phutil_tag objects, so (I think that) you can't pass that into array_mergev.

I also don't know anything about the internals of PHP, but I wouldn't be surprised if no-op'ing on empty arrays is faster than building a giant array where half the entries are empty arrays, and then calling array_mergev. (In general, I'd expect very few lines to have inlines in Diffusion -- those should just be stuff like lint warnings, which should hopefully be rare.)

Oh, you're right -- a cleaner construction is probably:

  foreach ($cur_inlines as $cur_inline) {
    $rows[] = $cur_inline;
  }
  // ..
}

return $rows;

The problem with this "merge rarely" is that array_merge() copies both arguments, so if we have a pathological case (say, 100 inlines) we'll copy every row 100 times (or maybe 50 times on average or whatever). Not as bad as copying every row 5000 times (or however many lines are in the file) but still not good.

jhurwitz edited edge metadata.

epriestley's suggestion

jhurwitz retitled this revision from Speed up DiffusionBrowseFileController by moving array_merge behind a conditional to Speed up DiffusionBrowseFileController by removing call to array_merge.May 10 2014, 12:56 AM
jhurwitz updated this object.
jhurwitz edited the test plan for this revision. (Show Details)
epriestley edited edge metadata.
This revision is now accepted and ready to land.May 10 2014, 1:06 AM
epriestley updated this revision to Diff 21455.

Closed by commit rP9d0d1ac42f8a (authored by Jacob Hurwitz <jhurwitz@dropbox.com>, committed by @epriestley).