
Daemon overseers may have a slow memory leak, possibly in FuturePool
Closed, ResolvedPublic

Description

See PHI1849 and PHI1847. In both cases, a shard that is neither overallocated nor under any apparently unusual workload ran into the OOM killer.

Coarsely, phd-daemon processes are using 0.5% of host memory after launch and 0.9% after a couple days of uptime.

Narrowly, FuturePool almost certainly has a small leak via FutureIterator, since FutureIterator does not discard futures after iterating over them.

Event Timeline

epriestley created this task.

This synthetic script demonstrates the conceptual problem with FutureIterator:

<?php

require_once 'support/init/init-script.php';

$iterator = new FutureIterator(array());

$iterator->addFuture(new_future());
$iterator->addFuture(new_future());

foreach ($iterator as $future) {
  echo '<MEM> '.sprintf('%.2f', (memory_get_usage() / (1000 * 1000)))."\n";

  // The iterator retains every future it has already yielded, so memory
  // grows on each pass.
  $iterator->addFuture(new_future());

  sleep(1);
}

function new_future() {
  // Each future carries a 1MB payload so the growth is easy to see.
  return new ImmediateFuture(str_repeat('x', 1024 * 1024));
}
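
For contrast, here is roughly the behavior you would want instead (a hypothetical sketch, not the actual FutureIterator internals): an iterator that drops its reference to each future once the consumer has seen it.

<?php

// Hypothetical sketch only: a generator that discards each future once the
// caller has consumed it, so resolved futures become collectable instead of
// accumulating for the lifetime of the iteration.
function iterate_and_discard(array $futures) {
  foreach ($futures as $key => $future) {
    yield $future;

    // Drop this reference as soon as the caller is done with this future.
    unset($futures[$key]);
  }
}

The consuming loop in the script above can't fix this on its own, since it never holds the references that matter: whatever the real fix looks like, it has to release resolved futures from inside the iterator.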

This sort of thing should be able to happen with the FuturePool in Overseer, too. I was able to get a reasonable reproduction case for the leak by doing this:

  • Adding error_log(... memory_get_usage() ...) to the main overseer loop to monitor memory usage (a sketch of this follows the list).
  • Changing shouldWakePool() to return true.
  • Running bin/phd start.
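
The monitoring change is nothing sophisticated; something along these lines is sufficient (illustrative only, not an actual patch against the overseer code):

// Illustrative: emit memory usage once per pass through the overseer's main
// loop, so growth shows up in the daemon log.
error_log(
  sprintf(
    'overseer memory: %.2fMB',
    memory_get_usage() / (1000 * 1000)));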

This produces a state where:

  • The PullLocal daemon is always running, so the FuturePool can never empty and reset its FutureIterator.
  • The Taskmaster and other hibernating daemons cycle in and out of hibernation rapidly (about once per second, instead of once every few minutes).

This is otherwise a normal state; the hibernation cycle is just ~300x faster than it would normally be under an idle queue.

My expectation is that this will cause the FutureIterator inside the FuturePool to grow without bound, holding an ExecFuture for each child ever launched. This appears to be confirmed by the log, which shows the overseer's memory usage growing slowly and steadily from 15.70M to 15.93M over the course of a minute.

This corresponds to a leak rate of 312MB/day. Since this reproduction is artificially 300x faster, that extrapolates to only ~1MB/day at production speed, which isn't nearly fast enough to explain the production leak. However, the "300x faster" figure is based on an idle queue. If the queue routinely receives small tasks, pools may wake every few seconds, and the reproduction case might only be a bit faster than production.
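
For reference, that extrapolation in code form (no new figures here, just the ones quoted above):

<?php

// Scale the reproduction's observed leak rate back down by the artificial
// speedup to estimate what an idle production queue would see from this
// mechanism alone.
$repro_leak_mb_per_day = 312;  // observed in the accelerated reproduction
$artificial_speedup = 300;     // pools wake ~300x more often than an idle queue would

printf(
  "Estimated production-speed leak: ~%.1f MB/day\n",
  $repro_leak_mb_per_day / $artificial_speedup);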

Upshot: I'm not 100% confident this fully explains the observed behavior, but it looks like a real leak, and it's plausible that it accounts for all of the observations in production.

epriestley claimed this task.

I believe D21466 resolved this, since it hasn't cropped up again even though daemon runtimes between restarts have been very long at various periods in the last year.