
Build Charting for Facts
Open, NormalPublic

Assigned To
None
Authored By
epriestley
Apr 16 2019, 5:41 PM

Description

The next major step for Facts is better charting support.

I'm currently planning to follow the Query/SearchEngine model, where charts are addressed by hash (so you can quickly edit/share them) and can optionally be named/saved.

I'm not yet entirely clear on how chart "types" will work. Essentially all current charts are based on time-series facts where it makes sense for the X-axis to be "time" and the Y-axis to be "value" (often "count"). At least for now, perhaps this is left to the API, but it would be good to develop a clearer picture of this.

Charts should also support a widget and maybe some kind of {chart XYZ} quick reference markup.


Continuity Across Axes: For functions that are continuous between discrete datapoints (like "accumulate(...)"), if there are points on either side of a domain limit (e.g., a point at x = -1 and a point at x = 1 when the domain begins at x = 0), we should interpolate a point on the boundary itself. If we draw only the points inside the domain, we get this:

Screen Shot 2019-04-17 at 6.02.47 AM.png (861×976 px, 156 KB)

We'd like something more like this, but without the glitchy rendering:

Screen Shot 2019-04-17 at 6.03.14 AM.png (861×976 px, 156 KB)
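As a sketch of the boundary interpolation described above (Python for illustration; the shape of the real chart pipeline is an assumption), synthesizing a point exactly on the domain edge might look like:

```python
def clip_to_domain(points, x_min, x_max):
    # points: (x, y) pairs sorted by x. Returns the points inside
    # [x_min, x_max], adding a linearly interpolated point exactly
    # on a boundary whenever the data crosses it, so the drawn line
    # meets the edge of the chart instead of stopping short.
    def lerp(p0, p1, x):
        (x0, y0), (x1, y1) = p0, p1
        t = (x - x0) / (x1 - x0)
        return (x, y0 + t * (y1 - y0))

    out = []
    for i, (x, y) in enumerate(points):
        if x < x_min:
            # Data crosses the left boundary: synthesize a point on it.
            if i + 1 < len(points) and points[i + 1][0] >= x_min:
                out.append(lerp((x, y), points[i + 1], x_min))
        elif x > x_max:
            # Data crosses the right boundary.
            if i > 0 and points[i - 1][0] <= x_max:
                out.append(lerp(points[i - 1], (x, y), x_max))
        else:
            out.append((x, y))
    return out
```

With a point at x = -1 and a point at x = 1, clipping to a domain starting at x = 0 yields an interpolated point on the axis rather than a gap.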

Revisions and Commits

rP Phabricator
Abandoned
D20879
D20822
D20819
D20818
D20817
D20816
D20814
D20506
D20504
D20503
D20502
D20501
D20500
D20498
D20497
D20496
D20495
D20489
D20487
D20486
D20485
D20483
D20446
D20445
D20449
D20444
D20443
D20442
D20441
D20440
D20439
D20438

Event Timeline

epriestley triaged this task as Normal priority. Apr 16 2019, 5:41 PM

The concrete use cases that are currently filed are:

Multiple Project Burnups Stacked in an Area Chart: (T12403) We're close to being able to produce this today, although configuring the chart would be very complicated, and I'm not entirely sure which layer of abstraction the "stacked area" part should go into. For one project, you could draw the part below the axis like this:

[sum
  [accumulate [fact tasks.open-count.create.project <phid>]]
  [accumulate [fact tasks.open-count.status.project <phid>]]
  [accumulate [fact tasks.open-count.assign.project <phid>]]]
[scale -1]

...and the part above the axis like this, perhaps, provided we write window(min, max) (which is f(x) => x if min <= x < max; null otherwise):

[window 2019-01-01 2019-04-01]
[accumulate [fact tasks.open-count.status.project <phid>]]
[scale -1]
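A minimal sketch of the window(min, max) function defined above (the identity for x in [min, max), null otherwise), modeled here as a pointwise transform over sampled (x, y) pairs -- the actual function DSL shape is an assumption:

```python
def window(x_min, x_max):
    # window(min, max): pass y through where x_min <= x < x_max,
    # map everything else to None (null), per the definition in
    # the comment above. Pointwise over sampled (x, y) pairs.
    def apply(points):
        return [(x, y if x_min <= x < x_max else None)
                for (x, y) in points]
    return apply
```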

For 2+ projects, it would be possible to write each series as a very complex function, but it's probably easier to put the "stacking" part in the display logic.

If we're shipping this as an upstream chart, I think it either needs to be heavily templated or we need a lot of templating support so that you can copy/paste a template from somewhere on the internet and get a usable chart-building-thing out of it.

Forecasting: (T12459) This is a big can of worms that I don't really want to touch any time soon. Maybe we write a linear-fit(f) function which throws a line of best fit on some data and let installs write optimistic-forecast(f) by copy/pasting it. I suspect no function here provides much real insight, projects all become highly non-linear toward the end, etc., although maybe this doesn't matter.

Stacked Bar Grouped By Time Period: (T12402) If we write a group-by-week function, each color in the example in T12402 is [accumulate [fact tasks.open-count.status.project <phid>]] [scale -1] [group-by-week]. Then we render it as stacked-bar instead of stacked-area or line. The caveat here is that tasks in multiple projects will be double-counted. We'd need to move those events into a "multiple projects" group. This doesn't map cleanly to the function model as it exists today.
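A group-by-week step could be sketched like this (hypothetical helper; the real function would operate on epoch timestamps in the fact store):

```python
import datetime

def group_by_week(events):
    # Bucket (date, value) events by the Monday of their week and
    # sum the values, so a stacked-bar renderer gets one bar per
    # week. Sketch only; dates stand in for epoch timestamps.
    buckets = {}
    for day, value in events:
        week_start = day - datetime.timedelta(days=day.weekday())
        buckets[week_start] = buckets.get(week_start, 0) + value
    return sorted(buckets.items())
```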

Queue Length: Various requests focus on queue length.

In these cases, we need to ETL time-to-close. In the case of revisions, we could also pull time-to-first-response, time-to-first-accept, time-to-accept-state, number-of-updates, etc. We can do less of this automatically for tasks today, but we could do "time-to-assign" and could annotate task statuses, conceivably, if installs want to micro-manage task lifecycles.

I don't particularly love the example reports in either T12410 or T12177 as ways to visualize this data. In T12410, the magnitude is lost in the color, and the bar height shows count/points rather than lead time. The resulting chart doesn't seem especially useful for answering any of the questions posed (i.e., looking at it, there isn't much of an apparent pattern). I think it also excludes "tasks that were filed but which have not been completed", which is a pretty critical piece of information.

In T12177, the chart shows completion but not time, and doesn't indicate scale.

Given the current state of the world, I suspect a better visualization is a stacked bar chart, grouped by time period. The height of each bar is the number of objects created in that period; the bar colors show the current status of those objects (open, accepted, etc.).

This chart doesn't show changes over time well. I'm hesitant to put "time to close" on a minor axis like color, because the value is likely logarithmic. A histogram of age-at-close for tasks closed in a given period might be more insightful, although this omits tasks which have not been closed yet, which is pretty critical. A histogram of age-at-close for tasks created in a given period, including "still not closed", is perhaps more useful.
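The "age-at-close for tasks created in a given period, including still not closed" histogram could be sketched like this (bucket edges are illustrative, not anything the application defines):

```python
def age_at_close_histogram(tasks):
    # For tasks created in some period, bucket age-at-close, with
    # an explicit bucket for tasks that are still open. Each task
    # is (created_ts, closed_ts or None) in epoch seconds.
    labels = ['<1d', '<7d', '<30d', '>=30d', 'still open']
    edges = [1, 7, 30]  # bucket upper bounds, in days
    counts = dict.fromkeys(labels, 0)
    for created, closed in tasks:
        if closed is None:
            counts['still open'] += 1
            continue
        age_days = (closed - created) / 86400.0
        for days, label in zip(edges, labels):
            if age_days < days:
                counts[label] += 1
                break
        else:
            counts['>=30d'] += 1
    return counts
```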

In any case, most of the missing stuff here is just at the display level.

I think a possible model for the UI is to have some base ChartConfigurationEngine, similar to SearchEngine, which defines a set of fields. Some fields would be common (e.g., all time-series charts can have time controls for domains).

Then we'd provide concrete subclasses of this. In Maniphest/Projects:

  • Burndown, for zero or more projects or owners. One project draws the current "burnup" chart. Multiple projects stack ala T12402. Projects can have an on-project page for their burndown, plus a link into Facts to refine the chart.
  • Task age, for zero or more projects or owners.

In Differential/Diffusion:

  • Revision age, for zero or more projects, repositories, or authors.

We could put author charts in People but I suspect this is a bit of a nest of vipers, and maybe we shouldn't even build these for now.

In Facts, we could conceivably provide a FreeformChartConfigurationEngine that would give you raw access to the underlying JSON configuration and let you pull whatever you want out of the fact engine, but I suspect this is a very advanced feature and not of much general interest.

See also PHI1226. An install has an environment where automatic rules (Herald/Owners) frequently trigger group reviewers (projects/packages), and some groups are not prompt about performing review.

It would be helpful to have charts which can try to answer questions like:

When a rule adds group X as a reviewer, how much longer do we expect review to take?

We can make a first pass at this by charting time-to-review for revisions with that reviewer against time-to-review for all other revisions, or time-to-action for that reviewer against time-to-action for all other reviewers. This is probably best as overlaid box plots, although it would be nice to show total review count too (width of the box?) so we don't run into situations like "no one did any review while the company was having its annual pool party week, so the numbers are completely whack".

A problem with "time-to-review" or "time-to-action" is that it discards cases where no action occurred. If you review 5% of revisions in 1 hour and never look at the other revisions, you shouldn't get a chart out of it that says "average review time: 30 minutes".

Multiple Project Burnups Stacked in an Area Chart

Mostly rambling for my own purposes:

I think I'm going to start here. I think we get two flavors of chart out of this today: one where the Y axis is task count; one where the Y axis is points. They should mostly be similar. Points is available if maniphest.points is enabled.

This chart accepts a list of one or more projects as parameters. When a parent project is specified, subprojects or milestones are broken down and stacked separately. Maybe we can do this all in one chart, or maybe there's a separate summarize(project) token later to do the rollup visually.

On the points chart, maybe we overlay uncosted task count? Or just put that on a separate chart? These charts may otherwise be misleading when you have 30 points of open tasks and 92 uncosted open tasks. This should be on a second Y-axis if it's present.

When two or more projects are present, we need to create virtual groups for "Two or more projects" -- either one dumb "Two Or More" group, or smart "X and Y" groups, which we stack between X and Y. A "dumb" group is probably a requirement when you chart 30 different projects, so maybe we start there and then refine into "X and Y" if it's relevant. "X and Y" is a good group if your chart is "X" and "Y", but it's the same as "everything". "X and Y" might be a good group if your chart is "X and Y and Z", but we only have two slots between the stacks and it's probably better if stacking order is consistent, so the in-between groups would be forced. This might not be worthwhile.

Visually, the best stacking order is often "area", but that may feel somewhat random.

We could let users drag-to-restack.

So the specific things we need are:

  • Application-level configuration engines on top of generic charting.
  • Some kind of prototype place to put charts in Projects.
  • Stacked-area as a way to render a dataseries.
  • Support for dataseries with multiple functions (each tier of the stack), assuming the stacking logic is in the display layer ("draw this stacked area series"). Or a new object above "dataseries" that covers "a rendering of one or more series onto the chart", but I think "dataseries" is this object. Another way to approach this is "draw this specific area, bounded above by line X and below by line Y", but I suspect that's less powerful in the long run.
  • Labeling, click-stuff-to-do-things, customizable domains, etc.
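If the stacking logic lives in the display layer as suggested, it might amount to something like this (a sketch assuming each tier is sampled at the same x values; names are hypothetical):

```python
def stack_series(series):
    # series: a list of dataseries, each a list of (x, y) points
    # sampled at the same x values. Returns one list of
    # (x, y_bottom, y_top) bands per series, each band sitting on
    # the cumulative total of the series stacked below it.
    bands = []
    totals = None
    for points in series:
        if totals is None:
            totals = [0.0] * len(points)
        band = []
        for i, (x, y) in enumerate(points):
            band.append((x, totals[i], totals[i] + y))
            totals[i] += y
        bands.append(band)
    return bands
```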

I would also like to relate each point on the chart to the exact actions which affected it, e.g. "this line went up here because Txyz was rescoped". There's currently soft support for this.


When this chart shows only one project (e.g., a particular milestone, or a project with no subprojects) it would be nice to stack reasons instead. The reasons are "task created", "task status change" (i.e., task closed or reopened), and "task added/removed from project" (i.e., scope change). Knowing that a project is behind because of scope changes or late additions is potentially useful.

This is a little ways out, but the actual facts Maniphest currently generates need to be adjusted to properly support stacked-area-by-reason: currently, when a task is closed, we don't have enough data to figure out whether it should be removed from the "created into this project" area, the "reopened work" area, or the "moved work" area. This is okay if we're summing the opens/closes but not precise enough if we're breaking them down.

One issue with the model we're evolving toward is that if you view a chart in Klingon, then send the link to a user whose primary language is Kerbal, we'd prefer to relabel the chart in Kerbal, not restore serialized Klingon-language chart labels. That means label generation has to be at render time, not at chart construction time, and I'm serializing a data-structure which is slightly too low-level for common application cases.

There's some tension between concerns here. If we want to let you recolor or relabel a chart, save it, and present it later, we presumably need to be serializing fairly low-level primitives that are common across all charts so that serialized data can also include "but this function should be red, since the user recolored it". But perhaps the correct level of serialization is that we put unique keys on each function, then store more like "chart engine + engine configuration + per-unique-function-key-overrides" to let you specify the display behavior of each function. I don't anticipate letting you go any lower than that (e.g., move points on the chart) and things I could imagine allowing (e.g., draw on top of the chart) are compatible with higher-level serialization.

This doesn't have a lot of impact and is easy to move around (we just move chart generation forward a little bit: Engine -> Serialize -> Deserialize -> Chart -> Render instead of Engine -> Chart -> Serialize -> Deserialize -> Render) but the change does get somewhat more mechanically difficult over time since somewhat more lines of code will need to be touched.

If we do expose "raw chart API" in some sense eventually, the "Engine" and "Chart" steps are the same step anyway and we need most/all of the serialize/deserialize-on-functions code anyway.

The actual facts Maniphest currently generates need to be adjusted to properly support stacked-area-by-reason: currently, when a task is closed, we don't have enough data to figure out whether it should be removed from the "created into this project" area, the "reopened work" area, or the "moved work" area.

This is kind of a mess that's somewhat hard to reason about.

Suppose a task is created in project A with a value of 20 points. We're looking at the stacked area burndown for project A. The burndown is stacking "base work" (the cost of tasks created in the project), "rescoring" (the cost of changes to score of tasks already in the project, i.e. a realization that an estimate was incorrect), "rescoping" (tasks moved into the project), and "reopening" (work which was marked complete but then reopened).

If the task value is increased to 30 points, we can record that as a +10 rescore event, put that area on top of the basic "points" area, and color it differently to show how much new work is due to rescoring.

If the task is closed, we can record that as a -10 rescore event plus a -20 base event. The size of both areas decreases.

However, if the task is rescored to 5 points, things get tricky. If we record this as a -15 rescore event, our rescore line may now have negative values and the area may be negative. This is confusing and sort of nonphysical. Although it's technically possible to draw negative area, I suspect this is a very bad path to walk down and will make the resulting chart wildly perplexing to viewers without multiple Phabricator PHDs.

We can record this as a -10 rescore event plus a -5 base event. All the areas on the chart will remain positive, which seems good. But this looks like you completed work when you really just rescored it.

Another consequence of this is that if you have a task worth 50 points, increase the score to 100, decrease the score to 1, then increase the score to 100, the chart will now show two different pictures when the task was in the same state. After the first rescore to 100, it will show 50 points of base work and 50 points of rescoring. After the second rescore to 100, it will show 1 point of base work and 99 points of rescoring. This is a plausible interpretation, but the human expectation of changing a value from X to Y and back to X is usually that it restores us to the same state.

We could record this as a -15 rescore event, then draw the function [sum [fact base] [interval [fact rescore] null 0]], i.e. draw the base score as lower (the lower bound of the negative area) but track the real value in the database. Then the X -> Y -> X state change works as expected and we draw 50 points of base score and 50 points of rescoring at both points.

This doesn't fix the rescore making it look like work was completed, but maybe that's fine.

Under this approach, I think the data is (mostly?) okay as-is, and we just need more magic in the composition functions:

  • At the top of the stack is [interval [fact rescope] 0 null], i.e. the area where added scope is positive.
  • Under that is [interval [fact rescore] 0 null], i.e. the area where rescoring is positive.
  • Under that is [interval [fact status] 0 null], i.e. the area where reopening is positive.
  • Under that is [sum [fact create] [interval [fact rescope] null 0] [interval [fact rescore] null 0] [interval [fact status] null 0]], i.e. the remaining values plus the negative areas of the other regions.

This is perhaps not a good area of focus for the moment, since just summing everything rather than trying to stack stuff gets us an easy "useful" result, but interval is easy to write so maybe I'll at least take a shot at this.
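The interval clamp in the composition above could be sketched like this (semantics inferred from the comment -- interval(f, 0, null) keeps the positive part of f, interval(f, null, 0) the negative part -- not the real Facts function):

```python
def interval(f, lo, hi):
    # Clamp the value of f into [lo, hi], where None means
    # unbounded on that side. interval(f, 0, None) keeps the
    # positive part of f; interval(f, None, 0) the negative part.
    def g(x):
        y = f(x)
        if lo is not None and y < lo:
            return lo
        if hi is not None and y > hi:
            return hi
        return y
    return g
```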

Not sure where to report this, but since a recent upgrade at Wikimedia, the new version of the Burnup Graph (now Reports: Burndown), has a tendency to go below zero. I'm aware the old version had inaccuracies so perhaps it was happening before as well but hidden (e.g. artificially replaced with zero or something like that).

https://phabricator.wikimedia.org/project/reports/1212/

capture.png (216 KB)

Yeah, this is still in a transitional state, it's just been stalled for a bit (not blocked by anything, just other stuff has been getting attention). The two major issues I'm aware of right now are:

  • burnup chart data is pretty questionable; and
  • if prototypes are disabled, some stuff is broken, particularly "View Chart".

Both issues should be resolved in the next few patches once this gets more attention.

if prototypes are disabled, some stuff is broken, particularly "View Chart".

See also https://discourse.phabricator-community.org/t/maniphest-burnup-rate-chart-is-not-displaying/2852/.

I think there isn't really an easy way out of this other than marking Facts as not-a-prototype and putting prototype-and-application guards (instead of just application guards) on the integration points. This can happen relatively soon (in "development time" terms, not necessarily "linear time" terms) but I don't want to vanish parts of the UI even if they aren't entirely working.

burnup chart data is pretty questionable

I think we're closer than it looks to having a useful result here, but the opaqueness of this chart itself is a problem -- I wanted to take this opportunity to improve observability of the data before actually fixing the chart. That is, the chart should provide enough tools for a knowledgeable observer to figure out what's wrong without digging into the source and internals. The major tools I want to provide next are:

  • tooltips indicating the events associated with each datapoint; and
  • the ability to view a dataseries in tabular format.

These should make charts generally more useful, and let the viewer understand why things are doing whatever they're doing.

In stacked area charts, we accumulate some error by stacking integer points on top of interpolated points.

For example, if tasks close at T = 1 and T = 3, the "closed task" count at T = 2 will be, say, "7.5", interpolating between the points at T=1 and T=3.

If we have a real event in a dataseries stacked on top of this, it may be raised to an actual Y-value of "18.5". This isn't meaningful as a datapoint, even though it is a geometrically accurate interpretation of the interpolated data.

That is, the real values are integer step functions. We draw them with smooth lines since this looks nicer and it's what users expect (and it's convenient for all points to have unique X values), but the aggregation should be based on the underlying step function, not the graphical interpolation. We could continue using the interpolated points for graphical aggregation, although it's probably fine if the top and bottom of an area aren't completely in sync in terms of how they're interpolating the underlying step function.
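Evaluating the underlying step function (rather than the smoothed line) could be sketched like this, using made-up counts at T = 1 and T = 3 in the spirit of the example above:

```python
import bisect

def step_value(points, x):
    # The value of the underlying step function at x is that of
    # the last datapoint at or before x (0 before the first
    # point). Stacking should aggregate these step values rather
    # than the smoothed, interpolated line drawn on screen.
    xs = [p[0] for p in points]
    i = bisect.bisect_right(xs, x)
    return points[i - 1][1] if i > 0 else 0
```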

D20814 combines nearby points so we don't have tons of overlapping tiny circles. For now, the grouping is just based on X-axis distance:

Before: Screen Shot 2019-09-16 at 10.41.29 AM.png (62 KB)
After: Screen Shot 2019-09-16 at 10.41.38 AM.png (57 KB)

We could later imagine grouping points based on some other measure (like trying to put points at the bottom and top of sudden changes in value, or always putting a point on the 1st day of each week or 1st day of each month).
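X-axis-distance grouping in the style of D20814 could be sketched like this (the aggregates chosen here, mean x and last y, are illustrative, not necessarily what D20814 does):

```python
def group_nearby_points(points, min_dx):
    # Collapse runs of points whose x values fall within min_dx of
    # the start of the current group into a single marker, so the
    # chart doesn't draw piles of overlapping circles.
    groups = []
    for x, y in sorted(points):
        if groups and x - groups[-1][0][0] < min_dx:
            groups[-1].append((x, y))
        else:
            groups.append([(x, y)])
    return [(sum(p[0] for p in g) / len(g), g[-1][1]) for g in groups]
```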

The "compose(...)" function won't select input values correctly if the first function in the chain isn't a data function. For example, if we compose:

shift(-3600) ∘ fact(beep)

...to generate an hour-by-hour overlay chart of how many times something has beeped, we'll incorrectly select the current hour's x values, shift them by -3600 seconds, then push them through the map. This will accidentally work if we happen to sample at those points: for example, the underlying data has one datapoint exactly every 60 seconds and we're shifting by a multiple of 60. This will also accidentally work if our data function has been made continuous, e.g.:

shift(-3600) ∘ accumulate(fact(beep))

However, at least in the specific case of accumulate(...), we'll currently lose the datapoint refs if this happens.

The correct behavior in this case is likely to determine that shift(-3600) is invertible, remap the domain, generate input values from fact(beep) using the remapped domain, then pass the values back through shift(-3600)^-1. This is somewhat complex.

We can't do this if shift(-3600) is not invertible, but this kind of construction is probably meaningless anyway:

cos() ∘ fact(beep)

In this case, we could conceivably clip the domain into [-1, 1] and emit arccos() as the inverse function? I don't think the UI is going to let users have access to any of this stuff anyway, so it's pretty moot.
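The remap-the-domain fix for the invertible case could be sketched like this (semantics inferred from the comment; the function shape is an assumption):

```python
def sample_shifted_fact(datapoints, delta, x_min, x_max):
    # For shift(delta) composed over a data function: remap the
    # visible domain through the inverse shift, select the real
    # datapoints inside the remapped domain, then map them forward
    # through the shift. This keeps the actual datapoints (and
    # their refs) instead of resampling the visible domain.
    src_min, src_max = x_min - delta, x_max - delta
    selected = [(x, y) for (x, y) in datapoints if src_min <= x <= src_max]
    return [(x + delta, y) for (x, y) in selected]
```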

I believe charts in the UI now make physical sense. They aren't very fancy/polished yet, but everything I see on this install looks cohesive (no more overlapping negative areas, weird gaps, general nonsense, etc):

Screen Shot 2019-09-18 at 10.44.34 AM.png (416 KB)

The next steps I'm planning to take are focused around fixing the most severe sharp edges; notably:

  • Prototype / Fatals
    • get rid of all obsolete / policy-violating stuff so no part of the Facts application is broken;
    • get rid of the inline data generation in the chart controller, or move it behind a developer mode gate, since it's a huge performance cost in service of making development easier;
    • update the gating so that Facts can be unprototyped;
    • unprototype Facts, fixing the various weird fatals.
  • Basic Usability
    • move tabular views to a debugging UI;
    • charts should support basic stuff like zooming;
    • when you try to view a chart you don't have permission to access, we should raise this as a sensible policy exception;
    • range fitting could probably be much better without much work (e.g. "display Y max = data Y max + 5%");
    • point hover behavior is exceptionally un-useful right now and likely low-effort to fix.
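The range-fitting idea above ("display Y max = data Y max + 5%") is about one line of code; a sketch, with the 5% pad as the illustrative default:

```python
def fit_y_range(y_values, pad=0.05):
    # Display range = data range, with the top padded out by a
    # fraction of the data span so the tallest point doesn't touch
    # the chart edge. Returns (display_min, display_max).
    lo, hi = min(y_values), max(y_values)
    return (lo, hi + (hi - lo) * pad)
```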

get rid of all obsolete / policy-violating stuff so no part of the Facts application is broken;

I don't think we have any policy violations currently. The only thing you can really do is take the chart of everything, then pull every chart for every project you can see, and use the difference to derive total activity for all objects you can't see. Since this is generally in the class of information we let you know about objects you can't see, I think this is okay.

For example, if you can see T100 and T200 but nothing else, you can already determine that there are 99 tasks between them that you can't see. We're comfortable disclosing this because the value of human-readable Txxx monograms outweighs the policy implications of knowing that things you can't see exist.

Charting currently gives you very slightly more information (if you do a significant amount of legwork, you can probably figure out how many of those 99 tasks are open or closed) but I think we have to accept this as the cost of metric extraction.

I think the only obsolete part of Facts is the home screen. Minimally, I can just redirect it to the demo chart.

get rid of the inline data generation in the chart controller, or move it behind a developer mode gate, since it's a huge performance cost in service of making development easier;

I'll just gate this to developer mode for now.

update the gating so that Facts can be unprototyped;
unprototype Facts, fixing the various weird fatals.

These are straightforward.