HomePhabricator

Fix LiskMigrationIterator scaling in O(n^2)

Description

Fix LiskMigrationIterator scaling in O(n^2)

Summary:
When running migrations that insert more than 1000 rows at a time the PhutilChunkedIterator becomes exponentially slower at generating chunks. The code below was where I encountered this issue originally.

<?php

$message_table = new HarbormasterBuildUnitMessage();

$unit_test_table = new CIUnitTestHarbormasterBuildUnitMessage();
$conn_w = $unit_test_table->establishConnection('w');

$size = 5000;
$row_iter = id(new LiskMigrationIterator($message_table))->setPageSize($size);
$chunk_iter = new PhutilChunkedIterator($row_iter, $size);

// point 1
foreach ($chunk_iter as $chunk) {
  // point 2
  $sql = array();
  ...
}

where the time between the point 1 and point 2 comment exceed 1 second around the 5k mark and become unresponsive around 15k. PHP seems to do a very poor job at converting large arrays to boolean values as can be seen with this code sample. As I was testing this I noticed it's completely data dependent so it's possibly for arrays containg scalar values this is much more performant.

<?php

$root = dirname(__FILE__);
require_once $root.'/integrator/scripts/__init_script__.php';

$bigbig = $build_unit_messages = id(new HarbormasterBuildUnitMessage())
  ->loadAllWhere('id > %s LIMIT 1000000', 0);

$start = microtime(true);
(bool)$bigbig;
echo microtime(true) - $start;

This code on my computer causes 1.88 seconds to be output as opposed to 0.0000006 seconds when wrapping it in count.

My title claim is somewhat bold. In my case that was the issue but your millage may vary.

Test Plan: As seen above.

Reviewers: Blessed Reviewers, epriestley

Reviewed By: Blessed Reviewers, epriestley

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D15869

Details

Provenance
michaeljs1990Authored on May 9 2016, 7:40 PM
industryPushed on May 9 2016, 7:40 PM
Reviewer
Blessed Reviewers
Differential Revision
D15869: Fix LiskMigrationIterator scaling in O(n^2)
Parents
rPHU5d47fb1027f5: Allow the default Remarkup <p> tag to be inline-styled for mail
Branches
Unknown
Tags
Unknown
Build Status
Buildable 12100
Build 15249: Run Core Tests