Page MenuHomePhabricator

Harbormaster results are hard to aggregate
Open, Needs TriagePublic16 Quest Points

Description

A Revision's Harbormaster results are many jumps away from a it (RevisionDiffBuildableBuildTarget), and don't aggregate lint and unit status (They have only "pass/fail" status).

This results in several issues when using CI to test Revisions:

  • Diffs still show "No Unit Test Coverage" even if many tests exist.
  • If the CI also lints, you can see "No lint issues" followed by a list of 50 "lint errors".
  • Displaying a revision with lots of tests is slow, because we load all tests to figure out which to show, and to aggregate coverage information.
  • Revision History shows stars for lint/unit, but these don't take into account HM results.
  • The "arc lint/arc unit" auto-build also allow for excuses, which we store on the diff instead of the HM build, which is funny.

We should aggregate lints, units, and coverage on the Buildable/Build/BuildTarget level (and maybe allow "excuse" as well, to avoid some special casing).

We currently figure out the status of build/unit on the client, and send it up during arc diff using differential.creatediff; We'll need to either change this flow to use a new/modified harbormaster.sendmessage (Which we currently use for lint/unit content), or else calculate it on the server (Where we'll be aggregating anyway).


Original ticket:

Harbormaster reported unit results showing "No Unit Test Coverage"

I'm implementing Unit test reporting over harbormaster for Jenkins (via the jenkins-phabricator-plugin: https://github.com/uber/phabricator-jenkins-plugin/pull/93)

I've successfully converted the unit tests into the harbormaster format, however on the Differential summary, under "Unit" it still shows "No Unit Test Coverage", even though it has test results:

As per the conduit spec, I am sending a single harbormaster.sendmessage call with a unit array (and overall result "pass").

Is this a UI bug, or is there a (possibly undocumented?) way to update the differential to show that it does have "Unit Test Coverage" available?

Thanks.

Event Timeline

sectioneight updated the task description. (Show Details)
sectioneight added a project: Harbormaster.
sectioneight added a subscriber: sectioneight.

This is UI oddness, you're doing everything right.

arc uploads a separate lint/unit status value, which is what's shown in the UI -- it isn't currently derived from the actual tests, so supplemental tests don't affect the UI.

This behavior is miselading, although fixing it isn't as simple as just reading something else because the arc status has additional information (notably, a "skip" status if the user skipped with --nounit or --nolint).

+@avivey -- We talked about this briefly on IRC yesterday.

Basically, I think the way forward is something like:

  • We should preserve the fact that the user typed --nounit / --nolint in the UI, even if other processes have added build information later.
  • We should otherwise consider all the tests when showing the overall status.
  • If there are 5K test results, it would be nice to not have to load 5K tests every time we're rendering the page. But I think there isn't an obvious place to cache this right now.
avivey added a comment.Oct 2 2015, 6:55 PM

Should we move the coverage information out of the individual tests? I don't remember any tester that actually reports coverage per test, and we just merge the whole thing anyway on the display.

IIRC we report coverage per-test, for whatever little that's worth.

Offhand, I don't think it's particularly valuable to keep it separate, but it seems a little silly to just throw it away for no reason. We could maybe provide a more formal way to report "coverage for all tests" for test engines that don't have per-test coverage metrics. Today they presumably just shove the coverage on an arbitrary test result. Are you running into something in this vein?

avivey added a comment.Oct 2 2015, 9:46 PM

I was thinking it would make the aggregation-for-display easier, but yes, I often end up adding a "fake test" test with the coverage information.

eadler added a project: Restricted Project.Jan 11 2016, 9:36 PM
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jan 11 2016, 9:42 PM
epriestley moved this task from v2 to v3 on the Harbormaster board.Feb 29 2016, 11:46 PM
epriestley edited projects, added Harbormaster (v3); removed Harbormaster.
avivey added a project: Restricted Project.May 20 2016, 12:25 AM

Here's what I'm thinking here:

  • Add lint/unit/coverage "summary" field to each Build.
  • Recalculate whenever new lint/unit objects show up (Possibly offline via daemon).
  • arc-side lint/unit already generates a Build, so this one will hold the --nolint/--nounit information (+ excuse?)
  • Anywhere we currently show that information, load the summary information and aggregate that, instead of aggregating all lints/units directly.

This will also solve the "loading revision with 50k tests is slow" issue, because we will no longer load 50k tests for it, and it will include the right stars in Revision History if you have CI configured.

eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jul 4 2016, 9:15 PM
avivey moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jul 12 2016, 9:55 PM
avivey renamed this task from Harbormaster reported unit results showing "No Unit Test Coverage" to Harbormaster results are hard to aggregate .Aug 16 2016, 12:29 AM
avivey set the point value for this task to 16.
avivey updated the task description. (Show Details)
epriestley moved this task from v3 to Logs / Unit Logs on the Harbormaster board.Feb 21 2018, 2:41 PM
epriestley edited projects, added Harbormaster; removed Harbormaster (v3).