
Workboards updating in real time
Closed, Resolved (Public)

Assigned To
Authored By
Apr 24 2014, 10:17 PM


If two people are watching a Workboard and one of them moves a task, the other should see the update in real time, as in Trello and Asana boards.

Not critical but very useful for remote teams planning sprints together.

Event Timeline


A couple of weird cases on state management:

  • The "default" controller currently needs to "force" the URI parameters because it doesn't know how to figure out the correct default value otherwise. This should be fixable now.
  • The "filter" controller currently needs to "force" the URI parameters, probably for the same reason? This is probably fixable fairly easily.
  • The "apply" flow, where a new custom filter is applied, currently redirects to the board URI (without state), not the state-preserved URI. This is pre-existing and should be fixable. This flow should be separated.

The "default" controller currently needs to "force" the URI parameters...
The "filter" controller currently needs to "force" the URI parameters...
The "apply" flow, where a new custom filter is applied...

These should all be resolved now that state handling is unified. D20637 removes the old "force URI parameters" code entirely.

Here's roughly where I'm headed next. I'm planning to:

  • Add a "reload" controller and make pressing "R" on a workboard update the state of every visible card. I believe the client side code for accommodating new card information is generally in good shape, but the binding code for "process an update to the whole board" vs "process an update to a single card" probably isn't.
  • Add versioning based on the ManiphestTransaction table so we only need to fetch things that have actually changed, instead of fetching the whole state of every visible card on the board.
  • Deal with cases like disabling updates during a drag? If you're dragging (or maybe even "if you recently moved your mouse"), we don't want to move cards around under you.
  • Automatically make a drag-and-drop operation include pending updates to visible cards? Maybe this actually happens sooner, since it kind of ties in with the glue code?
  • Make pressing "R" event-driven through Aphlict instead of a key you manually press.
  • Possibly do some kind of event stuff for column edits (that is, cases where a column is modified, not a card -- new column, column rename/reorder, etc), but probably "Columns have changed, click to reload." or similar, at least for the moment. I think real-time updating these is likely to be much less useful/interesting than card updates.
  • Maybe animate/cue updates better since the raw update flow is just an instantaneous state change.
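One way to picture the "don't move cards around under you during a drag" item is a small buffer that holds incoming card updates while a drag is active and applies them on drop. This is only a sketch in Python, not the real client code (which is JavaScript). The names `BoardView`, `receive_update`, `begin_drag`, and `end_drag` are hypothetical, and letting the user's own drop win over a held update is an assumption.

```python
class BoardView:
    """Toy model of a client board; all names here are hypothetical."""

    def __init__(self):
        self.cards = {}        # card ID -> column currently displayed
        self.dragging = False
        self.pending = {}      # updates held back while a drag is active

    def receive_update(self, card_id, column):
        # While dragging, keep only the newest held update for each card.
        if self.dragging:
            self.pending[card_id] = column
        else:
            self.cards[card_id] = column

    def begin_drag(self):
        self.dragging = True

    def end_drag(self, card_id, column):
        # Apply held updates first, then the user's own drop, so the
        # drop wins if both touch the same card (an assumption).
        self.dragging = False
        self.cards.update(self.pending)
        self.pending.clear()
        self.cards[card_id] = column
```

Folding the held updates into the drop like this is roughly the "include pending updates in the drag-and-drop operation" idea from the same list.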

Just thinking out loud:

See D20639. When we update a board, there are some cases where the client view of the board is out of date but a transaction-based version number won't show changes correctly. These include:

  1. A task is assigned to Alice. Alice changes her profile picture, username, and real name. The client will still show an out-of-date profile picture, username, and real name as the assignee on the task card.
  2. A task is tagged with project "Orange". The project is renamed to "Orangutan", and its color and icon are changed. The client will still show an out-of-date tag.
  3. The user changes their language setting from "English (US)" to "English (Pirate)" in another browser window. The client will still show labels like "Author" and "Owner" (instead of "Rumormonger" and "Captain").
  4. A task is part of space Security. The viewer loses access to the space. The client will still show the card.
  5. A task is visible "When the moon is waxing". The moon begins waning. The client will still show the card.
  6. A task is visible to the user, then edited so it is no longer visible. The server won't be able to see the changes now, so it won't consider the task changed after the client version, and won't report the changes to the client. The client will still show the card.

I think we just have to live with cases (1), (2), and (3). These are rare, and I think they can only be resolved with a very deep reactive framework like the one Asana built out.

In Asana, as of a moment ago, if you open two browser windows and change your username in window A, your username also changes in window B. This is extremely impressive to me technically, and I'm fairly amazed that they didn't have to compromise on this at some point. (In fairness to my model of the world and of which engineering challenges are practical to solve, changing your language setting in Asana does not work like this: you must reload the page to apply changes, and the change does not synchronize to the other window. So they did compromise somewhere, but they seem to have held the line very very deep in the stack.)

I think this level of reactive updates is practically not very useful relative to the enormous cost of building and maintaining it. It's fine to reload the page when users change usernames, projects change names/icons, you change your language setting, etc.

For (4), (5), and (6), I think we can reasonably handle these cases mostly correctly. For (4) and (5), we can't realistically generate a push event to notify the client of the update, but we can probably update into the right state on the next synchronize action. For (6), we can generate an event and synchronize immediately.

I think (6) is the easiest case. There's an actual edit and the omnipotent user sees a version bump; the real viewer just can't see the transaction which causes the bump. We can push the edit to everyone watching the board (this is how notifications work anyway) and just have to figure out how to synchronize state.

I see two approaches:

  • We pull transactions using the omnipotent user, then query tasks using the real viewer. If task X was modified in the version window but isn't visible to the viewer, we send down a "remove X from the client" message.
  • The client passes the server all the cards it can see. We pull them all, then send back "hide card X" for any that aren't in the working set.

A problem with the first approach is that, on its own, it exposes information about tasks the viewer can't see. Alice can ask "synchronize my board view from version 0", and get back a list of every card ever removed from the board. We also can't really build this query efficiently anyway. So I think the client needs to give the server a list of visible cards either way.
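To make the information leak concrete, here is a toy sketch with made-up data. `TASKS`, `removals_naive`, and `removals_scoped` are hypothetical names for illustration, not Phabricator code.

```python
# Hypothetical data: task -> (version of last change, visible to viewer?)
TASKS = {
    "T1": (5, True),
    "T2": (7, False),   # edited at version 7; viewer lost access
    "T3": (3, False),   # the viewer could never see this one
}

def removals_naive(since_version):
    """First approach alone: every changed task the viewer can't see."""
    return sorted(
        task for task, (version, visible) in TASKS.items()
        if version > since_version and not visible
    )

def removals_scoped(since_version, client_visible):
    """Same, limited to cards the client says it is already showing."""
    return [t for t in removals_naive(since_version) if t in client_visible]
```

With this data, `removals_naive(0)` reports both `T2` and `T3`, including a task the viewer never saw. `removals_scoped(0, {"T1", "T2"})` reports only `T2`, which the client was already displaying.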

From there, we can theoretically do a single query to find:

  • All tasks that are (on the board now OR in the client visible set) with changes in the version range.

This won't be able to synchronize changes in the form of (4) or (5) above properly, since they have no changes in the range. So it's probably better to do two queries:

  • All tasks on the board now with changes in the version range.
  • All tasks on the board that the client reports as visible, regardless of changes.

And, actually, this still isn't that good since it doesn't account for the inverse of (4) or (5), where a task is added to the board by weakening a Space or Policy setting.

So perhaps a better model is:

  • Version every object individually.
  • The client sends its visible set with versions.
  • We query every visible task on the board and send back updates per-object.

This is vaguely less efficient than full versioning, but much simpler. This eventually turns into a bit of a scaling mess if you have a board with 20K tasks on it, but the server could send back partial updates if it wanted and boards with 20K tasks on them are "obviously bad" anyway. We can also cheat our way through this for a long time by saying "if a board has more than X tasks on it, it doesn't live update".


  • Starting with the simple dumb thing seems best, and the viable smart things we can do that I can come up with are evolutions of the simple dumb thing, not dramatically different protocol structures.
  • We can sort of upgrade this into partial updates if necessary, but it's not clear this is really useful, and "if a board has more than 1,000 tasks on it, it doesn't live update" seems like it probably gets us 99% of the way there with 0.01% of the effort.
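A minimal sketch of the "simple dumb thing", assuming both sides can be reduced to plain `{card: version}` maps. The function name `synchronize` and the response shape are assumptions; the cutoff uses the 1,000 figure mentioned above.

```python
LIVE_UPDATE_LIMIT = 1000  # boards larger than this don't live update

def synchronize(client_versions, server_versions):
    """Compare the client's {card: version} map with what the viewer can
    see on the board right now, also given as {card: version}.

    Returns None when the board is too large to live update.
    """
    if len(server_versions) > LIVE_UPDATE_LIMIT:
        return None

    # Cards that are new to the client or whose version differs.
    update = sorted(
        card for card, version in server_versions.items()
        if client_versions.get(card) != version
    )
    # Cards the client shows which are no longer visible on the board.
    remove = sorted(c for c in client_versions if c not in server_versions)
    return {"update": update, "remove": remove}
```

Because the server map is built from what the viewer can see now, this handles cards that appear or disappear through Space or policy changes without any transaction in the version range.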

From D20653:

[The hard-coded version scheme is] not tremendously efficient, but we can make versioning better (using the largest object transaction ID)...

A possible flaw with this approach is that querying these version numbers is not terribly efficient. As far as I know, you have to execute a query in the general form:

SELECT objectPHID, MAX(id) FROM transactions WHERE objectPHID IN (A, B, C, ...) GROUP BY objectPHID;

I think this query falls under a broad umbrella of "not that bad", but isn't as good as a dedicated version record somewhere would be. Particularly, I'm unsure that MAX(id) can fully leverage the implicit id part of the <objectPHID, [id]> key during query planning.
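For reference, the query shape can be tried against an in-memory SQLite database. The two-column schema, the index name `key_object`, and the helper `load_versions` are assumptions for illustration. SQLite's planner is not MySQL's, so this says nothing about how the real query performs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    " id INTEGER PRIMARY KEY, objectPHID TEXT NOT NULL)"
)
conn.execute("CREATE INDEX key_object ON transactions (objectPHID)")
conn.executemany(
    "INSERT INTO transactions (objectPHID) VALUES (?)",
    [("PHID-TASK-a",), ("PHID-TASK-b",), ("PHID-TASK-a",), ("PHID-TASK-c",)],
)

def load_versions(phids):
    """Return {objectPHID: largest transaction id} for the given PHIDs."""
    if not phids:
        return {}
    marks = ", ".join("?" for _ in phids)
    rows = conn.execute(
        "SELECT objectPHID, MAX(id) FROM transactions"
        " WHERE objectPHID IN (" + marks + ") GROUP BY objectPHID",
        list(phids),
    )
    return dict(rows.fetchall())
```

Objects with no transactions are simply absent from the result, so a caller would need to pick a default version for them.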

If we wanted a standalone version record, I think our options are:

  • Add a column to maniphest_task and support this at the LiskDAO level -- basically, a logical clock to complement the wall-time clocks in dateCreated / dateModified.
  • Add a separate maniphest_task_version or object_version table somewhere.

Neither of these feels especially great, although neither is particularly awful either. If the logical clock is on the object we need to be mindful of transaction operations when we bump it. If it's in a separate table per-object, we have another table to deal with. If it's in a separate global table, we're touching a lot of workflows for a pretty niche benefit. Possibly some consideration for overlapping this with T12799 and letting third-party code bump logical clocks, too.

For now, I'm just going to do the "not that bad" query and we can revisit this if specific problems arise.

(We could also just use dateModified as if it were a logical clock but I'm dismissing this as "obviously bad" since it tends to lead toward haunted behavior which is hard to provide remote support for.)

a logical clock to complement the wall-time clocks

A convenient property of this approach is that $task->getVersion() just works out of the box and none of the supporting workflows need to contort themselves around loading the data. I also vaguely suspect providing a logical clock on objects would benefit API callers, although I don't have any specific use cases in mind.

For third-parties, this clock is probably most useful if it's global, not local to the object. That is, exactly one object of a given type (or perhaps even exactly one object in the whole system) is assigned version 123, so you can query Conduit calls by clock version and process only values larger than the largest value you've ever seen.
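A rough sketch of what a global clock buys a third-party caller. `GlobalClock`, `bump`, and `changed_since` are hypothetical names, not an existing API.

```python
import itertools

class GlobalClock:
    """Hands out versions that are unique across all objects."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.versions = {}  # object ID -> version of its latest edit

    def bump(self, object_id):
        # Each edit takes the next value, so no two objects share one.
        self.versions[object_id] = next(self._counter)
        return self.versions[object_id]

    def changed_since(self, last_seen):
        """Objects edited after `last_seen`, oldest change first."""
        changed = [(v, o) for o, v in self.versions.items() if v > last_seen]
        return [o for v, o in sorted(changed)]
```

A caller remembers the largest version it has processed and passes that as `last_seen` on the next poll. With per-object clocks, it would instead have to track one cursor per object.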

In any case, none of this is actually terribly important (giving the client a version 1 on every object and the server a version 2 is basically fine), so I think I'm going to hook up the Aphlict stuff next.

There's currently a bug (likely related to D20652 or D20654?) where normal edits aren't respecting card order correctly during the redraw. I suspect this is just an order parameter getting lost somewhere.