
Build Charting for Facts
Open, NormalPublic

Description

The next major step for Facts is better charting support.

I'm currently planning to follow the Query/SearchEngine model, where charts are addressed by hash (so you can quickly edit/share them) and can optionally be named/saved.

I'm not yet entirely clear on how chart "types" will work. Essentially all current charts are based on time-series facts where it makes sense for the X-axis to be "time" and the Y-axis to be "value" (often "count"). At least for now, perhaps this is left to the API, but it would be good to develop a clearer picture of this.

Charts should also support a widget and maybe some kind of {chart XYZ} quick reference markup.


Continuity Across Axes: For functions that are continuous between discrete datapoints (like "accumulate(...)"), if there are points on either side of the domain limits (e.g., a point at x = -1 and a point at x = 1), we should interpolate a point on the axis. That is, here, if we draw only points in the domain, we get this:

We'd like something more like this, but without the glitchy rendering:
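The boundary interpolation described above can be sketched roughly like this. This is a minimal illustration, not the actual implementation; the helper name and the `(x, y)` point format are assumptions:

```python
def clip_to_domain(points, x_min, x_max):
    """Keep only points inside [x_min, x_max], interpolating a point
    on each domain edge when a segment crosses it, so a continuous
    function (like "accumulate(...)") meets the axis cleanly."""
    def lerp(p0, p1, x):
        # Linear interpolation between p0 and p1 at position x.
        (x0, y0), (x1, y1) = p0, p1
        return (x, y0 + (y1 - y0) * (x - x0) / (x1 - x0))

    inside = [p for p in points if x_min <= p[0] <= x_max]
    for i in range(len(points) - 1):
        a, b = points[i], points[i + 1]
        # A segment crossing the left edge contributes a point at x_min.
        if a[0] < x_min < b[0]:
            inside.insert(0, lerp(a, b, x_min))
        # A segment crossing the right edge contributes a point at x_max.
        if a[0] < x_max < b[0]:
            inside.append(lerp(a, b, x_max))
    return inside
```

With a point at x = -1 and a point at x = 1 and a domain starting at 0, this yields an interpolated point on the axis at x = 0 instead of a line that stops short of the edge.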

Details

Differential Revisions
D20452: Separate "accumulate(...)" from Fact functions
Commits
D20506 / rPf91bef64f163: Stack chart functions in a more physical way
D20504 / rPf190c42bcd2e: Store charts earlier and build them out a little later
D20503 / rP493a6b72c1c0: Automatically select the range for charts in a general way
D20502 / rPe90360c28967: Wrap "<min, max>" chart domain pairs in an "Interval" class
D20501 / rPa80426b339c4: Provide chart function labels over the wire instead of making them up
D20500 / rPc6052b41a604: Label important data on charts
D20498 / rP81456db5594e: Roughly support stacked area charts
D20497 / rP0776b5ca2c4d: Update D3 to the current version
D20496 / rP5c1b91ab457d: Consolidate burndown logic into a "BurndownChartEngine"
D20495 / rP0aee3da19e6a: Add a "Reports" menu item to Projects
D20489 / rPf87c1ac362a7: Start the fact daemon in "bin/phd start"
D20487 / rP146317f2c447: Remove the legacy chart behavior from Maniphest
D20486 / rPf8ebc71b8f21: Replace the chart in Maniphest Reports with a chart driven by Facts
D20485 / rPff6b13872ccf: Add a rough "Chart" Dashboard Panel
D20483 / rPc458b50b8569: Render charts from storage instead of just one ad-hoc hard-coded chart
D20446 / rP70c643c685f8: Simplify implementation of "pure" Chart functions
D20445 / rPedaf17f3fe51: Make chart function argument parsing modular/flexible with 900 pages of error…
D20449 / rPe1076528ef76: Copy the "line-chart" behavior to "line-chart-legacy" to keep "Maniphest >…
D20444 / rPd8dba26a08f4: Add a "sin()" function to charts
D20443 / rPc4e4448ee649: Add chart functions "x()" and "constant(3)"
D20442 / rP23d8f000f709: Select the domain (X-axis range) for charts before pulling data
D20441 / rP954831f5336b: Separate chart functions into a class tree
D20440 / rP45b3c23148d0: Fetch chart data via async request and redraw charts when the window is resized
D20439 / rP044c6fbc19d4: Support drawing multiple functions on the same chart
D20438 / rP02981f0add6d: Fix negative chart values, add storage for charts

Event Timeline

epriestley triaged this task as Normal priority. Apr 16 2019, 5:41 PM
epriestley created this task.
epriestley updated the task description. Apr 17 2019, 1:03 PM

hmmm

After: (screenshot)

Before: (screenshot)

The concrete use cases that are currently filed are:

Multiple Project Burnups Stacked in an Area Chart: (T12403) We're close to being able to produce this today, although configuring the chart would be very complicated, and I'm not entirely sure which layer of abstraction the "stacked area" part should go into. For one project, you could draw the part below the axis like this:

[sum
  [accumulate [fact tasks.open-count.create.project <phid>]]
  [accumulate [fact tasks.open-count.status.project <phid>]]
  [accumulate [fact tasks.open-count.assign.project <phid>]]]
[scale -1]

...and the part above the axis like this, perhaps, provided we write window(min, max) (which is f(x) => x if min <= x < max; null otherwise):

[window 2019-01-01 2019-04-01]
[accumulate [fact tasks.open-count.status.project <phid>]]
[scale -1]
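The proposed window(min, max) behavior ("f(x) => x if min <= x < max; null otherwise") is simple to sketch over a series of datapoints. This is an illustration of the semantics, not the real implementation:

```python
def window(points, x_min, x_max):
    """Sketch of the proposed window(min, max) chart function: pass a
    datapoint's value through when x_min <= x < x_max, and null it out
    otherwise. Points are (x, y) pairs."""
    return [(x, y if x_min <= x < x_max else None) for (x, y) in points]
```

Composing this over an "accumulate(...)" series would blank out everything outside the window while leaving the X positions intact.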

For 2+ projects, it would be possible to write each series as a very complex function, but it's probably easier to put the "stacking" part in the display logic.

If we're shipping this as an upstream chart, I think it either needs to be heavily templated or we need a lot of templating support so that you can copy/paste a template from somewhere on the internet and get a usable chart-building-thing out of it.

Forecasting: (T12459) This is a big can of worms that I don't really want to touch any time soon. Maybe we write a linear-fit(f) function which throws a line of best fit on some data and let installs write optimistic-forecast(f) by copy/pasting it. I suspect no function here provides much real insight, projects all become highly non-linear toward the end, etc., although maybe this doesn't matter.

Stacked Bar Grouped By Time Period: (T12402) If we write a "group-by-week" function, each color in the example in T12402 is [accumulate [fact tasks.open-count.status.project <phid>]] [scale -1] [group-by-week]. Then we render it as stacked-bar instead of stacked-area or line. The caveat here is that tasks in multiple projects will be double-counted. We'd need to move those events into a "multiple projects" group. This doesn't map cleanly to the function model as it exists today.
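A group-by-week function would presumably bucket event values by the week they occurred in. A minimal sketch, assuming events are (epoch, value) pairs; the bucketing is illustrative and ignores timezone/week-start subtleties:

```python
EPOCH_WEEK = 7 * 24 * 60 * 60  # seconds per week

def group_by_week(events):
    """Sketch of a hypothetical group-by-week function: sum event
    values into weekly buckets, keyed by the bucket's starting epoch."""
    buckets = {}
    for (epoch, value) in events:
        week_start = (epoch // EPOCH_WEEK) * EPOCH_WEEK
        buckets[week_start] = buckets.get(week_start, 0) + value
    return sorted(buckets.items())
```

Each bucket then becomes one bar, with the stacked colors coming from running this over one series per status.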

Queue Length: Various requests focus on queue length, see:

In these cases, we need to ETL time-to-close. In the case of revisions, we could also pull time-to-first-response, time-to-first-accept, time-to-accept-state, number-of-updates, etc. We can do less of this automatically for tasks today, but we could do "time-to-assign" and could annotate task statuses, conceivably, if installs want to micro-manage task lifecycles.

I don't particularly love the example reports in either T12410 or T12177 as ways to visualize this data. In T12410, the magnitude is lost in the color, and the bar height shows count/points rather than lead time. The resulting chart doesn't seem especially useful for answering any of the questions posed (i.e., looking at it, there isn't much of an apparent pattern). I think it also excludes "tasks that were filed but which have not been completed", which is a pretty critical piece of information.

In T12177, the chart shows completion but not time, and doesn't indicate scale.

For the current state of the world, I suspect a better visualization is a stacked bar chart, grouped by time period. The height of the bar is the number of objects created in that period. The bar colors show the current status of those objects (open, accepted, etc).

This chart doesn't show changes over time well. I'm hesitant to put "time to close" on a minor axis like color, because the value is likely logarithmic. A histogram of age-at-close for tasks closed in a given period might be more insightful, although this omits tasks which have not been closed yet, which is pretty critical. A histogram of age-at-close for tasks created in a given period, including "still not closed", is perhaps more useful.
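The "age-at-close histogram including still-open tasks" idea can be sketched like this. The function name and the (created, closed-or-None) task format are assumptions for illustration:

```python
def age_histogram(tasks, bucket_size):
    """Sketch: histogram of age-at-close for a set of tasks, plus a
    separate count of tasks which have not been closed yet.

    tasks: list of (created_epoch, closed_epoch_or_None) pairs.
    Returns (sorted list of (bucket_start, count), still_open_count).
    """
    buckets = {}
    still_open = 0
    for created, closed in tasks:
        if closed is None:
            # Tasks with no close event get their own bucket, so the
            # chart doesn't silently omit unfinished work.
            still_open += 1
            continue
        age = closed - created
        bucket = (age // bucket_size) * bucket_size
        buckets[bucket] = buckets.get(bucket, 0) + 1
    return sorted(buckets.items()), still_open
```

The "still not closed" count could render as a final overflow bar so the omission problem described above doesn't recur.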

In any case, most of the missing stuff here is just at the display level.

I think a possible model for the UI is to have some base ChartConfigurationEngine, similar to SearchEngine, which defines a set of fields. Some fields would be common (e.g., all time-series charts can have time controls for domains).

Then we'd provide concrete subclasses of this. In Maniphest/Projects:

  • Burndown, for zero or more projects or owners. One project draws the current "burnup" chart. Multiple projects stack à la T12402. Projects can have an on-project page for their burndown, plus a link into Facts to refine the chart.
  • Task age, for zero or more projects or owners.

In Differential/Diffusion:

  • Revision age, for zero or more projects, repositories, or authors.

We could put author charts in People but I suspect this is a bit of a nest of vipers, and maybe we shouldn't even build these for now.

In Facts, we could conceivably provide a FreeformChartConfigurationEngine that would give you raw access to the underlying JSON configuration and let you pull whatever you want out of the fact engine, but I suspect this is a very advanced feature and not of much general interest.

See also PHI1226. An install has an environment where automatic rules (Herald/Owners) frequently trigger group reviewers (projects/packages), and some groups are not prompt about performing review.

It would be helpful to have charts which can try to answer questions like:

When a rule adds group X as a reviewer, how much longer do we expect review to take?

We can make a first pass at this by charting time-to-review for revisions with that reviewer against time-to-review for all other revisions, or time-to-action for that reviewer against time-to-action for all other reviewers. This is probably best as overlaid box plots, although it would be nice to show total review count too (width of the box?) so we don't run into situations like "no one did any review while the company was having its annual pool party week, so the numbers are completely whack".

A problem with "time-to-review" or "time-to-action" is that it discards cases where no action occurred. If you review 5% of revisions in 1 hour and never look at the other revisions, you shouldn't get a chart out of it that says "average review time: 30 minutes".

Multiple Project Burnups Stacked in an Area Chart

Mostly rambling for my own purposes:

I think I'm going to start here. I think we get two flavors of chart out of this today: one where the Y axis is task count; one where the Y axis is points. They should mostly be similar. Points is available if maniphest.points is enabled.

This chart accepts a list of one or more projects as parameters. When a parent project is specified, subprojects or milestones are broken down and stacked separately. Maybe we can do this all in one chart, or maybe there's a separate summarize(project) token later to do the rollup visually.

On the points chart, maybe we overlay uncosted task count? Or just put that on a separate chart? These charts may otherwise be misleading when you have 30 points of open tasks and 92 uncosted open tasks. If present, this should go on a second Y-axis.

When two or more projects are present, we need to create virtual groups for "Two or more projects" -- either one dumb "Two Or More" group, or smart "X and Y" groups, which we stack between X and Y. A "dumb" group is probably a requirement when you chart 30 different projects, so maybe we start there and then refine into "X and Y" if it's relevant. "X and Y" is a good group if your chart is "X" and "Y", but it's the same as "everything". "X and Y" might be a good group if your chart is "X and Y and Z", but we only have two slots between the stacks and it's probably better if stacking order is consistent, so the in-between groups would be forced. This might not be worthwhile.

Visually, the best stacking order is often "area", but that may feel somewhat random.

We could let users drag-to-restack.

So the specific things we need are:

  • Application-level configuration engines on top of generic charting.
  • Some kind of prototype place to put charts in Projects.
  • Stacked-area as a way to render a dataseries.
  • Support for dataseries with multiple functions (each tier of the stack), assuming the stacking logic is in the display layer ("draw this stacked area series"). Or a new object above "dataseries" that covers "a rendering of one or more series onto the chart", but I think "dataseries" is this object. Another way to approach this is "draw this specific area, bounded above by line X and below by line Y", but I suspect that's less powerful in the long run.
  • Labeling, click-stuff-to-do-things, customizable domains, etc.
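If the stacking logic lives in the display layer, the core operation is just turning several aligned series into cumulative upper boundaries. A sketch, assuming each tier is a list of y-values sampled at the same x positions:

```python
def stack_series(series_list):
    """Sketch of display-layer stacking: given several aligned series,
    return the cumulative upper boundary for each tier of the stacked
    area. Tier N is drawn between boundary N-1 and boundary N."""
    stacked = []
    totals = [0] * len(series_list[0])
    for series in series_list:
        # Each tier's boundary is its own values plus everything below.
        totals = [t + y for t, y in zip(totals, series)]
        stacked.append(list(totals))
    return stacked
```

This keeps the functions themselves ignorant of stacking, matching the "draw this stacked area series" framing above.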

I would also like to relate each point on the chart to the exact actions which affected it, e.g. "this line went up here because Txyz was rescoped". There's currently soft support for this.


When this chart shows only one project (e.g., a particular milestone, or a project with no subprojects) it would be nice to stack reasons instead. The reasons are "task created", "task status change" (i.e., task closed or reopened), and "task added/removed from project" (i.e., scope change). Knowing that a project is behind because of scope changes or late additions is potentially useful.

This is a little ways out, but the actual facts Maniphest currently generates need to be adjusted to properly support stacked-area-by-reason: currently, when a task is closed, we don't have enough data to figure out whether it should be removed from the "created into this project" area, the "reopened work" area, or the "moved work" area. This is okay if we're summing the opens/closes but not precise enough if we're breaking them down.

One issue with the model we're evolving toward is that if you view a chart in Klingon, then send the link to a user whose primary language is Kerbal, we'd prefer to relabel the chart in Kerbal, not restore serialized Klingon-language chart labels. That means label generation has to happen at render time, not at chart construction time, and I'm currently serializing a data structure which is slightly too low-level for common application cases.

There's some tension between concerns here. If we want to let you recolor or relabel a chart, save it, and present it later, we presumably need to be serializing fairly low-level primitives that are common across all charts so that serialized data can also include "but this function should be red, since the user recolored it". But perhaps the correct level of serialization is that we put unique keys on each function, then store more like "chart engine + engine configuration + per-unique-function-key-overrides" to let you specify the display behavior of each function. I don't anticipate letting you go any lower than that (e.g., move points on the chart) and things I could imagine allowing (e.g., draw on top of the chart) are compatible with higher-level serialization.
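The "chart engine + engine configuration + per-unique-function-key overrides" serialization might look something like this. All keys and values here are hypothetical, purely to make the shape concrete:

```python
import json

# Hypothetical serialized form: the engine identifies the chart type,
# the configuration holds the high-level inputs, and display overrides
# are keyed by stable per-function keys. Labels are NOT stored, so they
# can be regenerated at render time in the viewer's language.
chart_config = {
    "engineKey": "maniphest.burndown",
    "engineConfiguration": {
        "projectPHIDs": ["PHID-PROJ-example"],
    },
    "functionOverrides": {
        "function.open-count": {"color": "red"},
    },
}

serialized = json.dumps(chart_config)
```

Recoloring a function then just writes into its override slot, without lowering the whole chart to raw function primitives.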

This doesn't have a lot of impact and is easy to move around (we just move chart generation forward a little bit: Engine -> Serialize -> Deserialize -> Chart -> Render instead of Engine -> Chart -> Serialize -> Deserialize -> Render) but the change does get somewhat more mechanically difficult over time since somewhat more lines of code will need to be touched.

If we do eventually expose a "raw chart API" in some sense, the "Engine" and "Chart" steps collapse into the same step, and we need most or all of the serialize/deserialize-on-functions code regardless.

The actual facts Maniphest currently generates need to be adjusted to properly support stacked-area-by-reason: currently, when a task is closed, we don't have enough data to figure out whether it should be removed from the "created into this project" area, the "reopened work" area, or the "moved work" area.

This is kind of a mess that's somewhat hard to reason about.

Suppose a task is created in project A with a value of 20 points. We're looking at the stacked area burndown for project A. The burndown is stacking "base work" (the cost of tasks created in the project), "rescoring" (the cost of changes to score of tasks already in the project, i.e. a realization that an estimate was incorrect), "rescoping" (tasks moved into the project), and "reopening" (work which was marked complete but then reopened).

If the task value is increased to 30 points, we can record that as a +10 rescore event, put that area on top of the basic "points" area, and color it differently to show how much new work is due to rescoring.

If the task is closed, we can record that as a -10 rescore event plus a -20 base event. The size of both areas decreases.

However, if the task is rescored to 5 points, things get tricky. If we record this as a -15 rescore event, our rescore line may now have negative values and the area may be negative. This is confusing and sort of nonphysical. Although it's technically possible to draw negative area, I suspect this is a very bad path to walk down and will make the resulting chart wildly perplexing to viewers without multiple Phabricator PHDs.

We can record this as a -10 rescore event plus a -5 base event. All the areas on the chart will remain positive, which seems good. But this looks like you completed work when you really just rescored it.

Another consequence of this is that if you have a task worth 50 points, increase the score to 100, decrease the score to 1, then increase the score to 100, the chart will now show two different pictures when the task was in the same state. After the first rescore to 100, it will show 50 points of base work and 50 points of rescoring. After the second rescore to 100, it will show 1 point of base work and 99 points of rescoring. This is a plausible interpretation, but the human expectation of changing a value from X to Y and back to X is usually that it restores us to the same state.

We could record this as a -15 rescore event, then draw the function [sum [fact base] [interval [fact rescore] null 0]], i.e. draw the base score as lower (the lower bound of the negative area) but track the real value in the database. Then the X -> Y -> X state change works as expected and we draw 50 points of base score and 50 points of rescoring at both points.

This doesn't fix the rescore making it look like work was completed, but maybe that's fine.

Under this approach, I think the data is (mostly?) okay as-is, and we just need more magic in the composition functions:

  • At the top of the stack is [interval [fact rescope] 0 null], i.e. the area where added scope is positive.
  • Under that is [interval [fact rescore] 0 null], i.e. the area where rescoring is positive.
  • Under that is [interval [fact status] 0 null], i.e. the area where reopening is positive.
  • Under that is [sum [fact create] [interval [fact rescope] null 0] [interval [fact rescore] null 0] [interval [fact status] null 0]], i.e. the remaining values plus the negative areas of the other regions.
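One plausible reading of the "interval" function used above is a clamp, where null means unbounded on that side: interval(v, 0, null) keeps the positive part of a series and interval(v, null, 0) keeps the negative part. A sketch of that reading (the semantics are my interpretation, not a confirmed implementation):

```python
def interval(value, y_min, y_max):
    """Clamp a value into [y_min, y_max], where None means unbounded.
    interval(v, 0, None) yields the positive part of a series;
    interval(v, None, 0) yields the negative part."""
    if y_min is not None and value < y_min:
        return y_min
    if y_max is not None and value > y_max:
        return y_max
    return value
```

Under this reading, the X -> Y -> X rescore example works out: with base 50 and a cumulative rescore of -15, the drawn base is 50 + interval(-15, None, 0) = 35, the rescore area is interval(-15, 0, None) = 0, and the total still sums to the true 35 points.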

This is perhaps not a good area of focus for the moment, since just summing everything rather than trying to stack stuff gets us an easy "useful" result, but interval is easy to write so maybe I'll at least take a shot at this.

Not sure where to report this, but since a recent upgrade at Wikimedia, the new version of the Burnup Graph (now Reports: Burndown), has a tendency to go below zero. I'm aware the old version had inaccuracies so perhaps it was happening before as well but hidden (e.g. artificially replaced with zero or something like that).

https://phabricator.wikimedia.org/project/reports/1212/

Yeah, this is still in a transitional state, it's just been stalled for a bit (not blocked by anything, just other stuff has been getting attention). The two major issues I'm aware of right now are:

  • burnup chart data is pretty questionable; and
  • if prototypes are disabled, some stuff is broken, particularly "View Chart".

Both issues should be resolved in the next few patches once this gets more attention.

if prototypes are disabled, some stuff is broken, particularly "View Chart".

See also https://discourse.phabricator-community.org/t/maniphest-burnup-rate-chart-is-not-displaying/2852/.

I think there isn't really an easy way out of this other than marking Facts as not-a-prototype and putting prototype-and-application guards (instead of just application guards) on the integration points. This can happen relatively soon (in "development time" terms, not necessarily "linear time" terms) but I don't want to vanish parts of the UI even if they aren't entirely working.

burnup chart data is pretty questionable

I think we're closer than it looks to having a useful result here, but the opaqueness of this chart itself is a problem -- I wanted to take this opportunity to improve observability of the data before actually fixing the chart. That is, the chart should provide enough tools for a knowledgeable observer to figure out what's wrong without digging into the source and internals. The major tools I want to provide next are:

  • tooltips indicating the events associated with each datapoint; and
  • the ability to view a dataseries in tabular format.

These should make charts generally more useful, and let the viewer understand why things are doing whatever they're doing.

epriestley added a comment. Edited Fri, Sep 13, 5:12 PM

In stacked area charts, when we stack areas, we accumulate some error by stacking integer points on top of interpolated points.

For example, if tasks close at T = 1 and T = 3, the "closed task" count at T = 2 will be, say, "7.5", interpolating between the points at T=1 and T=3.

If we have a real event in a dataseries stacked on top of this, it may be raised to an actual Y-value of "18.5". This isn't meaningful as a datapoint, even though it is a geometrically accurate interpretation of the interpolated data.

That is, the real values are integer step functions. We draw them with smooth lines since this looks nicer and it's what users expect (and it's convenient for all points to have unique X values), but the aggregation should be based on the underlying step function, not the graphical interpolation. We could continue using the interpolated points for graphical aggregation, although it's probably fine if the top and bottom of an area aren't completely in sync in terms of how they're interpolating the underlying step function.
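Evaluating the underlying step function, rather than the graphical interpolation, can be sketched like this (illustrative, not the actual implementation):

```python
import bisect

def step_value(points, x):
    """Evaluate a series as the integer step function it really is:
    the value at x is the y of the last datapoint at or before x,
    rather than a linear interpolation between neighbors.

    points: sorted list of (x, y) pairs."""
    xs = [p[0] for p in points]
    i = bisect.bisect_right(xs, x) - 1
    if i < 0:
        return 0  # Before the first event, the count is zero.
    return points[i][1]
```

With close events at T = 1 and T = 3, this reports the true count at T = 2 instead of the interpolated "7.5", which is the value a stacked series above should aggregate against.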