Page MenuHomePhabricator

Track total time from task creation to task archival

Authored by amckinley on Feb 20 2019, 7:18 PM.



Ref T5401. Depends on D20201. Add timestamps to worker tasks to track task creation, and pass that through to archive tasks. This lets us measure the total time the task spent in the queue, not just the duration it was actually running.

Also displays this information in the daemon status console; see screenshot:

Test Plan

Stopped daemons, ran bin/search index --all --background to create lots of tasks, restarted daemons, observed expected values for dateCreated and epochArchived in the archive worker table.

Also tested the changes to unarchiveTask by forcing a search task to permanently fail and then bin/worker retrying it.

Diff Detail

rP Phabricator
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

amckinley created this revision.Feb 20 2019, 7:18 PM
Owners added a subscriber: Restricted Owners Package.Feb 20 2019, 7:18 PM

I actually ran the 2nd migration when the table was full of active tasks and nothing broke. We could try to get clever by setting dateCreated and dateModified for active tasks to now(), but in the worst-case scenario where we end up doing some math wrong and show negative 2 billion seconds spent waiting in the queue, it'll fix itself 14 days after this deploys.

amckinley requested review of this revision.Feb 20 2019, 7:20 PM
epriestley accepted this revision.Feb 20 2019, 7:21 PM
epriestley added inline comments.

PhabricatorWorkerArchiveTask->unarchiveTask() might need a similar change (copy over datecreated / datemodified, but just drop archivedEpoch since it's no longer archived).

This is currently only testable by bin/worker retry-ing a permanently failed task, but I want to change bin/worker retry to let you retry successful tasks too, just with a warning ("This task already succeeded, so retrying it might do whatever it does twice, are you sure you can handle that?")

This revision is now accepted and ready to land.Feb 20 2019, 7:21 PM
amckinley added inline comments.Feb 20 2019, 10:03 PM

This doesn't quite work. dateCreated can be copied over fine, but Lisk is (I assume) detecting that the object is dirty and overriding my dateModified to be the current time.

amckinley added inline comments.Feb 20 2019, 10:10 PM

Yeah, it's willSaveObject on LiskDAO. I could implement willSaveObject on WorkerActiveTask and just do the right thing, unless you have another idea.

I think it's fine to let dateModified update to the current time -- you're technically modifying the thing.

amckinley updated this revision to Diff 48232.Feb 20 2019, 10:21 PM

Update unarchive, add code to daemon console to use this new data.

amckinley edited the summary of this revision. (Show Details)Feb 20 2019, 10:22 PM
amckinley edited the test plan for this revision. (Show Details)
amckinley requested review of this revision.Feb 20 2019, 10:25 PM

Can you take a quick look at this with the added UI changes?

epriestley accepted this revision.Feb 20 2019, 10:29 PM

Looks good to me. I'll see about getting random_bytes and random_int into the builtins map to fix that lint thing.


(This needs a libphutil/ patch too, right?)


If the other setDateModified() didn't work, this one probably has no effect either?

This revision is now accepted and ready to land.Feb 20 2019, 10:29 PM
amckinley marked 2 inline comments as done.Feb 20 2019, 10:42 PM
amckinley added inline comments.

Yep, D20201 just landed.


Oh yeah, it'll be overwritten the same way.

This revision was automatically updated to reflect the committed changes.
amckinley marked 2 inline comments as done.