See T9524 for discussion. This aggregates the coverage information gathered on a per-file basis in diffusion.updatecoverage and stores concatenated coverage reports for all of the files within a given directory. Once we have that data, it gets threaded through the diffusion-pull-lastmodified calls alongside the other columns in the browse view.
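The aggregation step described above can be sketched roughly as follows. This is illustrative only, not the actual implementation: it assumes Phabricator-style coverage strings (one character per line, 'C' = covered, 'U' = uncovered, 'N' = not executable), and the function name is made up.

```python
import posixpath
from collections import defaultdict

def aggregate_directory_coverage(file_coverage):
    """Concatenate per-file coverage strings into every ancestor directory.

    file_coverage: {file path: coverage string}, e.g. {'src/a.c': 'CCU'}.
    Returns {directory path: concatenated coverage of files beneath it}.
    """
    dir_coverage = defaultdict(str)
    for path, coverage in sorted(file_coverage.items()):
        parent = posixpath.dirname(path)
        while True:
            dir_coverage[parent] += coverage
            if parent in ('', '/'):
                break  # reached the repository root
            parent = posixpath.dirname(parent)
    return dict(dir_coverage)

coverage = {
    'src/a.c': 'CCU',
    'src/lib/b.c': 'CN',
}
print(aggregate_directory_coverage(coverage))
```

The real code would presumably store these concatenated reports per (commit, path ID) rather than keying on path strings.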
- Maniphest Tasks: T9524: Show aggregate subdirectory coverage in Diffusion browse views
This isn't quite done; I'm putting it up to ask for some ideas. I've got all of the data we would need funneled into DiffusionLastModifiedController::renderColumns, but had one particular question about how to proceed: should we render coverage for each file in this same column? If so, how can we differentiate per-file coverage UI elements from per-directory UI elements?
We have a new PHUISegmentBarView which I think should be a little more flexible for rendering the bar. It's the thing that renders this:
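For a rough idea of what feeding aggregated coverage into a segment bar involves, here's a minimal sketch of deriving segment widths from a coverage string. This is not PHUISegmentBarView's actual API; the function name and return shape are assumptions for illustration.

```python
def coverage_segments(coverage, width=100):
    """Split a bar of the given width into (covered, uncovered) segments.

    'N' (not executable) lines are excluded from the denominator, so a
    file with no executable lines renders as an empty bar.
    """
    covered = coverage.count('C')
    uncovered = coverage.count('U')
    executable = covered + uncovered
    if executable == 0:
        return (0, 0)
    covered_width = round(width * covered / executable)
    return (covered_width, width - covered_width)

print(coverage_segments('CCCUN'))  # 3 of 4 executable lines covered
```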
(It might still need some CSS tweaks in this role, but it has a more promising future than the existing AphrontBar stuff does.)
I think this is all basically pretty reasonable as a first cut, except for the "update" mode issue (see inline), which also seems difficult to resolve.
The hard part of the "update" thing is probably just getting a list of all the paths you need to query for. Two things you could do:
- Add a recursive flag to diffusion.browsequery and call that to get the list of paths that exist in the repository. This is the straightforward approach but kind of a pain to test, especially for Mercurial/SVN, which you probably care exactly zero about.
- OR: Just select all the existing coverage at the given commit, figure out which paths it belongs to by looking up path IDs, and then determine how much of it matters for rebuilding the affected paths.
The latter (just grab everything, then figure out how much of it matters) may be substantially easier, and I think it's probably "scalable enough". If we did run out of memory with it, we could squeeze more headroom out of it by iterating over coverage, but the path IDs + coverage should be on the order of the number of lines + files in the codebase, which should be a reasonable number of megabytes to keep in RAM even in large repositories, I think.
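The "grab everything, then filter" idea might look something like this sketch. The row shapes and lookups here are placeholders: the real code would query the coverage table and the path-ID table, and this assumes a model where each directory aggregates its immediate children.

```python
def rebuild_candidates(coverage_rows, path_names, touched_paths):
    """Find which stored coverage matters for an update.

    coverage_rows: [(path_id, coverage string)] for the whole commit.
    path_names: {path_id: file path}, from a path-ID lookup.
    touched_paths: file paths whose coverage changed in this update.
    """
    # Every ancestor directory of a touched file must be rebuilt.
    affected_dirs = set()
    for path in touched_paths:
        parts = path.split('/')[:-1]
        for i in range(len(parts) + 1):
            affected_dirs.add('/'.join(parts[:i]))
    # Keep only files living directly under an affected directory;
    # everything else is untouched by this update.
    relevant = {}
    for path_id, coverage in coverage_rows:
        path = path_names[path_id]
        if '/'.join(path.split('/')[:-1]) in affected_dirs:
            relevant[path] = coverage
    return affected_dirs, relevant

rows = [(1, 'CC'), (2, 'UU'), (3, 'CN')]
names = {1: 'x/a.c', 2: 'x/b.c', 3: 'y/c.c'}
dirs, keep = rebuild_candidates(rows, names, {'x/a.c'})
print(sorted(keep))  # only the x/ files survive the filter
```

Since path IDs plus coverage strings are roughly proportional to lines-plus-files in the codebase, holding this whole structure in RAM should stay in the low megabytes even for large repositories, per the estimate above.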
It's also substantially better in repositories with only a limited amount of coverage, which might be many real repositories.
This potentially gives us a 1MB coverage string at the root for a million-line codebase, right? I guess that's probably not all that terrible, actually, although we could improve it by storing these in some smarter way. I'm comfortable with this approach for now, though.
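The arithmetic checks out: at one byte per line, a million-line codebase yields a root coverage string of roughly 1MB. One "smarter" storage option (my suggestion, not something decided here) would be run-length encoding, since coverage strings tend to contain long runs of identical characters:

```python
import re
from itertools import groupby

def rle_encode(coverage):
    # 'NNNNNNNNNNCCCCCUU' -> '10N5C2U'
    return ''.join(f'{len(list(g))}{c}' for c, g in groupby(coverage))

def rle_decode(encoded):
    # '10N5C2U' -> 'NNNNNNNNNNCCCCCUU'
    return ''.join(c * int(n) for n, c in re.findall(r'(\d+)(\D)', encoded))

s = 'N' * 10 + 'C' * 5 + 'U' * 2
enc = rle_encode(s)
print(enc)  # 10N5C2U
assert rle_decode(enc) == s
```

Gzipping the column would be another cheap option; either way the current plain strings are fine as a first cut.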
I think this will do the wrong thing with the update mode. Here's the problem scenario:
I think we'll replace the coverage for "x/" with only the "x/a.c" coverage, and incorrectly discard the x/b.c and x/c.c coverage. Instead, maybe the logic could work more like this:
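One hedged sketch of update-safe logic: instead of overwriting a directory's concatenated report with just the updated file's coverage, merge the new per-file coverage into the existing per-file map and rebuild each affected directory from its files. Names and shapes are illustrative; only immediate parent directories are aggregated here for brevity.

```python
def apply_update(existing_files, updated_files):
    """Merge an update without discarding sibling coverage.

    existing_files / updated_files: {file path: coverage string}.
    Returns (merged per-file map, rebuilt per-directory map).
    """
    merged = dict(existing_files)
    merged.update(updated_files)  # new coverage wins per file
    dirs = {}
    for path in sorted(merged):
        parent = '/'.join(path.split('/')[:-1])
        dirs[parent] = dirs.get(parent, '') + merged[path]
    return merged, dirs

old = {'x/a.c': 'CC', 'x/b.c': 'UU', 'x/c.c': 'NC'}
new = {'x/a.c': 'CU'}  # an update touching only x/a.c
merged, dirs = apply_update(old, new)
print(dirs['x'])  # x/b.c and x/c.c coverage is preserved
```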
I have another application (surprise surprise) that does this:
```
mysql> show create table path_coverage\G
*************************** 1. row ***************************
       Table: path_coverage
Create Table: CREATE TABLE `path_coverage` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `pathID` int(10) unsigned NOT NULL,
  `notExecutable` int(10) unsigned NOT NULL,
  `covered` int(10) unsigned NOT NULL,
  `uncovered` int(10) unsigned NOT NULL,
  `unreachable` int(10) unsigned NOT NULL,
  `health` int(10) unsigned NOT NULL DEFAULT '0',
  `reportID` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key_report_path` (`reportID`,`pathID`),
  KEY `key_health` (`health`)
) ENGINE=InnoDB AUTO_INCREMENT=1189 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)

mysql> show create table coverage_report\G
*************************** 1. row ***************************
       Table: coverage_report
Create Table: CREATE TABLE `coverage_report` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `phid` varbinary(64) NOT NULL,
  `repositoryPHID` varbinary(64) NOT NULL,
  `commitPHID` varbinary(64) NOT NULL,
  `buildPHID` varbinary(64) DEFAULT NULL,
  `dateCreated` int(10) unsigned NOT NULL,
  `dateModified` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key_repository_commit` (`repositoryPHID`,`commitPHID`),
  UNIQUE KEY `key_phid` (`phid`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)
```
I figured this implementation would be easier to get upstreamed. One step at a time? I don't mind taking the steps to eventually migrate this data to a storage model that looks something like this.
Normally I would say that I want to do it myself, but if you're willing to clean it up right now, go for it.
I have several teams that have been asking for this for months and they'll love you for it.