Page MenuHomePhabricator

Worker tasks should have priorities
Closed, ResolvedPublic

Description

It would be nice if daemon worker tasks had priorities. Some actions such as sending emails are (in my opinion) more important than other actions such as importing a new repository.

Event Timeline

joshuaspence assigned this task to epriestley.
joshuaspence raised the priority of this task from to Normal.
joshuaspence updated the task description. (Show Details)
joshuaspence added a project: Daemons.
joshuaspence added a subscriber: joshuaspence.

I think this is not too difficult to implement.

Currently, tasks are dequeued in two phases. All the logic lives in PhabricatorWorkerLeaseQuery. The first phase ("unleased") examines tasks which have never failed before. The second phase ("expired") is a cleanup phase, and examines tasks which have previously been tried, have failed, and now have expired leases.

The first phase is roughly FIFO and uses ORDER BY id ASC. The second phase is roughly ordered by lease expiry, and uses ORDER BY leaseExpires ASC.

We only try the second phase if the first phase produces no results.

I don't think the second phase needs to be adjusted -- more on that later. To make the first phase priority-ordered:

  • Add a priority column.
  • Add a (leaseOwner, priority, id) key (I think this is the correct key).
  • Let PhabricatorWorker::scheduleTask() take a $priority (or array $options?) parameter, which defaults to null.
  • When a task queues followup tasks (one in PhabricatorWorker, one in PhabricatorWorkerActiveTasks), set their priority to be the same as the parent priority if no explicit priority is specified.
  • Expose the column in the table view and detail view in the web UI.
  • Add a priority column to the "ArchivedTask" table, too.
  • Change the ORDER BY id ASC clause in the first pass to ORDER BY priority DESC, id ASC.
  • Assign priorities to existing tasks at their queue sites, maybe something like this?
TaskPriority
Alerts (Email and SMS)4000
Normal Tasks (Harbormaster, commits in imported repositories, indexing)3000
Bulk tasks queued from the web UI by future work in T51662000
Initial repository imports and search engine rebuilds1000

To test:

  • Make sure the column survives archiving (I think it will).
  • You can unit test queue behavior explicitly, see PhabricatorWorkerTestCase.

Now, there's one caveat to all of this: yielding tasks.

Tasks can throw new PhabricatorWorkerYieldException($seconds) to yield for $seconds seconds. Currently, this fails these tasks without increasing their failure count, essentially pushing them to the second ("expired") selection phase. The only task type which does this right now is "Wait for Previous Build Steps" in Harbormaster.

With the priority column, these tasks would be requeued after all unfailed tasks, which potentially means that repository imports can hold up a build queue.

At least for now, I think this is OK. If we don't do it like this, the queue can deadlock if it fills up with high-priority tasks which are blocked by low-priority tasks and continually yield. By ignoring priority on failed tasks, the queue can always make progress. At least for now, guaranteeing progress seems more important than improving responsiveness in this unusual edge case. We could re-examine this decision later (if we, e.g., add more yielders) but I don't see a simple way to both guarantee progress and respect priorities.

I don't see a simple way to both guarantee progress and respect priorities.

Okay, I did think of something: we could decrease the priority of a task when it yields. In the worst case, tasks would gradually drop to 0 priority and then get a progress guarantee. I think this is too complex and specialized to deal with for now, though.

If we did eventually implement this, I think it would look like:

  • When a task yields, decrease priority (by 1000?) to a minimum of 0.
  • Instead of selecting in two phases, select in a single phase, roughly like this:

    SELECT id, leaseOwner FROM t WHERE leaseExpires < UNIX_TIMESTAMP() ORDER BY priority DESC, leaseOwner IS NULL ASC, leaseExpires ASC, id ASC;

Then priority 4000 tasks will always run before priority 3000 tasks, e.g., if they are able, and we avoid deadlock by dropping yields down the queue.

I'm not sure if the query plan on that is any good, though, and in practice this wouldn't be much different from what we'd do without it, since a yielding harbormaster task (priority = 3000) would end up behind imports (priority = 1000) very quickly if the queue was backed up a bit (which is expected; tasks may need to yield fairly often). We could make it lose fewer priority levels, maybe, but overall I think this just isn't worth it, at least for now.

(Does it make more sense for a lower number to represent a higher priority?)

I don't have strong feelings on which way the priority numbers go. I think "higher numbers = higher priority" is the most natural rule, but it's not the rule implemented by nice, and it's not always the rule we implement elsewhere. (I think we're probably a bit inconsistent about which way orders go.)

It would be nice to have a less ambiguous rule. Maybe picking another name (other than "priority") with stronger semantics would help. We use "sequence" in some cases, and that always goes from low to high ("1st item in the sequence is 1st in the list"), but that's not really appropriate here since the positions aren't unique. However, I don't have any suggestions on this.