Page MenuHomePhabricator

Display coverage information in the diffusion browse UI
AbandonedPublic

Authored by epriestley on Dec 17 2015, 3:52 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Mar 22, 10:35 PM
Unknown Object (File)
Wed, Mar 20, 4:56 PM
Unknown Object (File)
Tue, Mar 5, 7:17 AM
Unknown Object (File)
Feb 13 2024, 4:57 AM
Unknown Object (File)
Feb 12 2024, 1:07 PM
Unknown Object (File)
Feb 12 2024, 12:40 PM
Unknown Object (File)
Jan 7 2024, 12:09 PM
Unknown Object (File)
Jan 3 2024, 6:52 PM
Subscribers

Details

Summary

See T9524 for discussion. This aggregates the coverage information gathered on a per-file basis in diffusion.updatecoverage and stores concatenated coverage reports for all of the files within a given directory. Once we have that data it gets threaded through the diffusion-pull-lastmodified calls alongside the other columns in the browse view.

Test Plan

Screen Shot 2016-02-01 at 7.50.25 PM.png (1×2 px, 373 KB)

Diff Detail

Repository
rP Phabricator
Branch
coverage_column (branched from master)
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 10442
Build 13186: Run Core Tests
Build 12761: arc lint + arc unit

Event Timeline

yelirekim retitled this revision from to Display coverage information in the diffusion browse UI.
yelirekim updated this object.
yelirekim edited the test plan for this revision. (Show Details)

This isn't quite done, I'm putting it up to ask for some ideas. I've got all of the data that we would need funneled into DiffusionLastModifiedController:: renderColumns but had one particular question about how to proceed. Should we render coverage for each file in this same column? If so, how can we differentiate per-file coverage UI elements from per-directory UI elements?

yelirekim edited edge metadata.

Moving bar rendering to a specialized coverage bar view

yelirekim edited edge metadata.

Can I do anything to make this more appealing or land-able or attention worthy?

Sorry, this one's just wedged in a weird place of probably-getting-attention-soon and also-kind-of-needing-nontrivial-attention. I'm planning to complete T4245 and adjacent work in Diffusion after Subprojects/Milestones, and expect to look at it then.

epriestley added a reviewer: epriestley.

We have a new PHUISegmentBarView which I think should be a little more flexible for rendering the bar. It's the thing that renders this:

Screen_Shot_2016-02-12_at_3.48.21_PM.png (52×238 px, 5 KB)

(It might still need some CSS tweaks in this role, but it has a more promising future than the existing AphrontBar stuff does.)

I think this is all basically pretty reasonable as a first cut, except for the "update" mode issue (see inline), which also seems difficult to resolve.

The hard part of the "update" thing is probably just getting a list of all the paths you need to query for. Two things you could do:

  • Add a recursive flag to diffusion.browsequery and call that to get the list of paths that exist in the repository. This is the straightforward approach but kind of a pain to test, especially for Mercurial/SVN, which you probably care exactly zero about.
  • OR: Just select all the existing coverage at the given commit, query which paths it is for by looking up path IDs, and then figure out how much of it matters for rebuilding the affected paths.

The latter (just grab everything, then figure out how much of it matters) may be substantially easier, and I think it's probably "scalable enough". If we did run out of memory with it, we could squeeze more headroom out of it by iterating over coverage, but the path IDs + coverage should be on the order of the number of lines + files in the codebase, which should be a reasonable number of megabytes to keep in RAM even in large repositories, I think.

It's also substantially better in repositories with only a limited amount of coverage, which might be many real repositories.

src/applications/diffusion/conduit/DiffusionUpdateCoverageConduitAPIMethod.php
92

This potentially gives us a 1MB coverage string at the root for a million-line codebase, right? I guess that's not actually probably not all that terrible, although we could improve it storing these in some smarter way. I'm comfortable with this approach for now, though.

102–110

I think this will do the wrong thing with the update mode. Here's the problem scenario:

  • Make one call in the "overwrite" mode, passing coverage for x/a.c, x/b.c and x/c.c.
  • Make a second call in the "update" mode, passing coverage for x/a.c only.

I think we'll replace the coverage for "x/" with only the "x/a.c" coverage, and incorrectly discard the x/b.c and x/c.c coverage. Instead, maybe the logic could work more like this:

  • Do everything normally.
  • Go through everything we touched and find all ancestor paths (same logic this implements, basically, so far).

But then:

  • Query all the descendant paths of all those paths that exist at the given commit (might not be trivial, currently)?
  • Query all the coverage of those descendant paths.
  • Then, aggregate and replace all the coverage data, combining the new data with anything that was already in the database.
This revision now requires changes to proceed.Feb 17 2016, 1:03 AM

I'm also happy to finish this up while I'm in here if you want, just yell. But I don't want to deprive you of the joy of pushing changes upsteram.

src/applications/diffusion/conduit/DiffusionUpdateCoverageConduitAPIMethod.php
92

I have another application (surprise surprise) that does this:

mysql> show create table path_coverage\G
*************************** 1. row ***************************
       Table: path_coverage
Create Table: CREATE TABLE `path_coverage` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `pathID` int(10) unsigned NOT NULL,
  `notExecutable` int(10) unsigned NOT NULL,
  `covered` int(10) unsigned NOT NULL,
  `uncovered` int(10) unsigned NOT NULL,
  `unreachable` int(10) unsigned NOT NULL,
  `health` int(10) unsigned NOT NULL DEFAULT '0',
  `reportID` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key_report_path` (`reportID`,`pathID`),
  KEY `key_health` (`health`)
) ENGINE=InnoDB AUTO_INCREMENT=1189 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)

mysql> show create table coverage_report\G
*************************** 1. row ***************************
       Table: coverage_report
Create Table: CREATE TABLE `coverage_report` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `phid` varbinary(64) NOT NULL,
  `repositoryPHID` varbinary(64) NOT NULL,
  `commitPHID` varbinary(64) NOT NULL,
  `buildPHID` varbinary(64) DEFAULT NULL,
  `dateCreated` int(10) unsigned NOT NULL,
  `dateModified` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key_repository_commit` (`repositoryPHID`,`commitPHID`),
  UNIQUE KEY `key_phid` (`phid`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)

for storage.

I figured that this implementation is easier to get upstreamed. One step at a time? I don't mind taking the steps to eventually migrate this data to a storage model that looks something like this.

Normally I would say that I want to do it myself, but if you're willing to clean it up right now, go for it.

I have several teams that have been asking for this for months and they'll love you for it.

epriestley edited reviewers, added: yelirekim; removed: epriestley.

I'm splitting this apart into a data half and a UI half, see followups shortly.