Implement clock/trigger infrastructure for scheduling actions

Summary:
Ref T6881. Hopefully, this is the hard part.

This adds a new daemon (the "trigger" daemon) which processes triggers, schedules them, and then executes them at the scheduled time. The design is a little complicated, but has these goals:

  • High resistance to race conditions: only the application writes to the trigger table; only the daemon writes to the event table. We won't lose events if someone saves a meeting at the same time as we're sending a reminder out for it.
  • Execution guarantees: scheduled events are guaranteed to execute exactly once.
  • Support for arbitrarily large queues: the daemon will make progress even if there are millions of triggers in the queue. The cost to update the queue is proportional to the number of changes in it; the cost to process the queue is proportional to the number of events to execute.
  • Relatively good observability: you can monitor the state of the trigger queue reasonably well from the web UI.
  • Modular infrastructure: this is a very low-level construct that Calendar, Phortune, etc., should be able to build on top of.
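
As a rough sketch of how the two-table split and the cost goals above fit together (this is not the code in this change; `SketchTriggerQueue` and everything in it is made up for illustration, and it uses in-memory arrays and a heap where the real thing would use indexed database tables):

```php
<?php

// Hypothetical in-memory model; names are illustrative, not Phabricator's.
final class SketchTriggerQueue {
  private $triggers = array();  // id => period; only the application writes
  private $changes = array();   // version => trigger id (append-only log)
  private $events = array();    // id => next epoch; only the daemon writes
  private $version = 0;         // last version handed out
  private $cursor = 0;          // last version the daemon has processed
  private $due;                 // min-heap of array(epoch, id)

  public function __construct() {
    $this->due = new SplMinHeap();
  }

  // Application side: create or move a trigger by bumping its version.
  public function writeTrigger($id, $period) {
    $this->triggers[$id] = $period;
    $this->changes[++$this->version] = $id;
  }

  // Daemon side: schedule only the triggers changed since the cursor.
  // Returns how many changes were examined.
  public function schedule($now) {
    $examined = 0;
    while ($this->cursor < $this->version) {
      $id = $this->changes[++$this->cursor];
      $this->setNextEvent($id, $now + $this->triggers[$id]);
      $examined++;
    }
    return $examined;
  }

  // Daemon side: run only the events that are due, then reschedule them.
  // Returns the ids that fired, in order.
  public function execute($now) {
    $fired = array();
    while (!$this->due->isEmpty()) {
      list($epoch, $id) = $this->due->top();
      if ($epoch > $now) {
        break;
      }
      $this->due->extract();
      // Skip heap entries made stale by a later reschedule.
      if ($this->events[$id] !== $epoch) {
        continue;
      }
      $fired[] = $id;
      $this->setNextEvent($id, $epoch + $this->triggers[$id]);
    }
    return $fired;
  }

  private function setNextEvent($id, $epoch) {
    $this->events[$id] = $epoch;
    $this->due->insert(array($epoch, $id));
  }
}

$queue = new SketchTriggerQueue();
$queue->writeTrigger('A', 10);
$queue->writeTrigger('B', 30);
$queue->schedule(100);          // examines both new triggers
$queue->schedule(100);          // nothing changed, examines none
$fired = $queue->execute(110);  // only 'A' is due
```

The application never touches `$events` and the daemon never touches `$triggers`, so a concurrent save just shows up as one more change for the daemon to pick up on its next pass.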

It has these limitations for now:

  • Not very robust to bad actions: a misbehaving trigger can stop the queue fairly easily. This is OK for now since we aren't planning to make it part of any other applications for a while. We do still get execute-exactly-once, but it might not happen for a long time (until someone goes and fixes the queue), when we could theoretically continue executing other events.
  • Doesn't start automatically: normal users don't need to run this thing yet so I'm not starting it by default.
  • Not super well tested: I've vetted the basics but haven't run real workloads through this yet.
  • No sophisticated tooling: I added some basic stuff but it's missing some pieces we'll have to build sooner or later, e.g. bin/trigger cancel or whatever.
  • Intentionally not realtime: This design puts execution guarantees far above realtime concerns, and will not give you precise event execution at 1-second resolution. I think this is the correct goal to pursue architecturally, and certainly correct for subscriptions and meeting reminders. Events which execute after they have become irrelevant can simply decline to do anything (like a meeting reminder which executes after the meeting is over).
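
To illustrate the last point, here is a minimal sketch of an action that declines to do anything once it is no longer relevant. `SketchMeetingReminderAction` is hypothetical and is not a class in this change; it only shows the shape of the check:

```php
<?php

// Hypothetical reminder action; not a real Phabricator class.
final class SketchMeetingReminderAction {
  private $meetingEndEpoch;

  public function __construct($meeting_end_epoch) {
    $this->meetingEndEpoch = $meeting_end_epoch;
  }

  // Returns the reminder text, or null if the meeting is already over
  // by the time the (possibly late) event actually executes.
  public function execute($now) {
    if ($now >= $this->meetingEndEpoch) {
      return null;
    }
    return 'Reminder: meeting ends at epoch '.$this->meetingEndEpoch;
  }
}

$action = new SketchMeetingReminderAction(2000);
$on_time = $action->execute(1500);  // still relevant, returns the text
$too_late = $action->execute(2500); // meeting is over, returns null
```

The event still executes exactly once either way; the action just chooses to be a no-op when it runs too late.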

In general, the expectation for applications is:

  • When creating an object (like a calendar event) that needs to trigger a scheduled action, write a trigger (and save the PHID if you plan to update it later).
  • The daemon will process the trigger and schedule the action efficiently, in a race-free way.
  • If you want to move the action, update the trigger and the daemon will take care of it.
  • Your action will eventually dump a task into the task queue, and the task daemons will actually perform it.
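
The last step might look roughly like this. Both classes below are hypothetical stand-ins (they are not the real task queue or action API); the point is only that the action records the work and returns quickly, leaving the heavy lifting to the task daemons:

```php
<?php

// Hypothetical stand-in for the task queue; not the real Phabricator API.
final class SketchTaskQueue {
  private $tasks = array();

  public function enqueue($task_class, array $data) {
    $this->tasks[] = array('class' => $task_class, 'data' => $data);
  }

  public function getTasks() {
    return $this->tasks;
  }
}

// Hypothetical action: when the trigger fires, hand the work off.
final class SketchReminderTriggerAction {
  private $queue;
  private $eventID;

  public function __construct(SketchTaskQueue $queue, $event_id) {
    $this->queue = $queue;
    $this->eventID = $event_id;
  }

  // Called when the scheduled time arrives. Does no slow work itself.
  public function execute($now) {
    $this->queue->enqueue(
      'SketchSendReminderTask',
      array(
        'eventID' => $this->eventID,
        'firedAt' => $now,
      ));
  }
}

$tasks = new SketchTaskQueue();
$action = new SketchReminderTriggerAction($tasks, 123);
$action->execute(1000);
```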

Test Plan:
Using a test script like this:

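<?php

require_once 'scripts/__init_script__.php';

// Build a trigger that logs 'test' on a fixed period.
$trigger = id(new PhabricatorWorkerTrigger())
  ->setAction(
    new PhabricatorLogTriggerAction(
      array(
        'message' => 'test',
      )))
  ->setClock(
    new PhabricatorMetronomicTriggerClock(
      array(
        // Period between events, in seconds.
        'period' => 33,
      )))
  ->save();

// Dump the saved trigger to confirm it was written.
var_dump($trigger);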

...I queued triggers and ran the daemon:

  • Verified triggers fire;
  • verified triggers reschedule;
  • verified trigger events show up in the web UI;
  • tried different periods;
  • added some triggers while the daemon was running;
  • examined phd debug output for anything suspicious.

It seems to work in the trivial use case, at least.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T6881

Differential Revision: https://secure.phabricator.com/D11419