See T9524 for discussion. This aggregates the coverage information gathered on a per-file basis in diffusion.updatecoverage and stores concatenated coverage reports for all of the files within a given directory. Once we have that data, it gets threaded through the diffusion-pull-lastmodified calls alongside the other columns in the browse view.
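The aggregation step described above can be sketched roughly as follows. This is illustrative only, not the actual implementation: it assumes Phabricator-style coverage strings (one character per line, 'C' = covered, 'U' = uncovered, 'N' = not executable), and the function name is made up.

```python
import posixpath
from collections import defaultdict

def aggregate_directory_coverage(file_coverage):
    """Concatenate per-file coverage strings into every ancestor directory.

    file_coverage: {file path: coverage string}, e.g. {'src/a.c': 'CCU'}.
    Returns {directory path: concatenated coverage of files beneath it}.
    """
    dir_coverage = defaultdict(str)
    for path, coverage in sorted(file_coverage.items()):
        parent = posixpath.dirname(path)
        while True:
            dir_coverage[parent] += coverage
            if parent in ('', '/'):
                break  # reached the repository root
            parent = posixpath.dirname(parent)
    return dict(dir_coverage)

coverage = {
    'src/a.c': 'CCU',
    'src/lib/b.c': 'CN',
}
print(aggregate_directory_coverage(coverage))
```

The real code would presumably store these concatenated reports per (commit, path ID) rather than keying on path strings.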
- Maniphest Tasks: T9524: Show aggregate subdirectory coverage in Diffusion browse views
This isn't quite done; I'm putting it up to ask for some ideas. I've got all of the data we would need funneled into DiffusionLastModifiedController::renderColumns, but had one particular question about how to proceed: should we render coverage for each file in this same column? If so, how can we differentiate per-file coverage UI elements from per-directory UI elements?
We have a new PHUISegmentBarView which I think should be a little more flexible for rendering the bar. It's the thing that renders this:
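For a rough idea of what feeding aggregated coverage into a segment bar involves, here's a minimal sketch of deriving segment widths from a coverage string. This is not PHUISegmentBarView's actual API; the function name and return shape are assumptions for illustration.

```python
def coverage_segments(coverage, width=100):
    """Split a bar of the given width into (covered, uncovered) segments.

    'N' (not executable) lines are excluded from the denominator, so a
    file with no executable lines renders as an empty bar.
    """
    covered = coverage.count('C')
    uncovered = coverage.count('U')
    executable = covered + uncovered
    if executable == 0:
        return (0, 0)
    covered_width = round(width * covered / executable)
    return (covered_width, width - covered_width)

print(coverage_segments('CCCUN'))  # 3 of 4 executable lines covered
```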
(It might still need some CSS tweaks in this role, but it has a more promising future than the existing AphrontBar stuff does.)
I think this is all basically pretty reasonable as a first cut, except for the "update" mode issue (see inline), which also seems difficult to resolve.
The hard part of the "update" thing is probably just getting a list of all the paths you need to query for. Two things you could do:
- Add a recursive flag to diffusion.browsequery and call that to get the list of paths that exist in the repository. This is the straightforward approach but kind of a pain to test, especially for Mercurial/SVN, which you probably care exactly zero about.
- OR: Just select all the existing coverage at the given commit, figure out which paths it belongs to by looking up path IDs, and then determine how much of it matters for rebuilding the affected paths.
The latter (just grab everything, then figure out how much of it matters) may be substantially easier, and I think it's probably "scalable enough". If we did run out of memory with it, we could squeeze more headroom out of it by iterating over coverage, but the path IDs + coverage should be on the order of the number of lines + files in the codebase, which should be a reasonable number of megabytes to keep in RAM even in large repositories, I think.
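The "grab everything, then filter" idea might look something like this sketch. The row shapes and lookups here are placeholders: the real code would query the coverage table and the path-ID table, and this assumes a model where each directory aggregates its immediate children.

```python
def rebuild_candidates(coverage_rows, path_names, touched_paths):
    """Find which stored coverage matters for an update.

    coverage_rows: [(path_id, coverage string)] for the whole commit.
    path_names: {path_id: file path}, from a path-ID lookup.
    touched_paths: file paths whose coverage changed in this update.
    """
    # Every ancestor directory of a touched file must be rebuilt.
    affected_dirs = set()
    for path in touched_paths:
        parts = path.split('/')[:-1]
        for i in range(len(parts) + 1):
            affected_dirs.add('/'.join(parts[:i]))
    # Keep only files living directly under an affected directory;
    # everything else is untouched by this update.
    relevant = {}
    for path_id, coverage in coverage_rows:
        path = path_names[path_id]
        if '/'.join(path.split('/')[:-1]) in affected_dirs:
            relevant[path] = coverage
    return affected_dirs, relevant

rows = [(1, 'CC'), (2, 'UU'), (3, 'CN')]
names = {1: 'x/a.c', 2: 'x/b.c', 3: 'y/c.c'}
dirs, keep = rebuild_candidates(rows, names, {'x/a.c'})
print(sorted(keep))  # only the x/ files survive the filter
```

Since path IDs plus coverage strings are roughly proportional to lines-plus-files in the codebase, holding this whole structure in RAM should stay in the low megabytes even for large repositories, per the estimate above.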
It's also substantially better in repositories with only a limited amount of coverage, which might be many real repositories.
This potentially gives us a 1MB coverage string at the root for a million-line codebase, right? I guess that's probably not all that terrible, actually, although we could improve it by storing these in some smarter way. I'm comfortable with this approach for now, though.
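The arithmetic checks out: at one byte per line, a million-line codebase yields a root coverage string of roughly 1MB. One "smarter" storage option (my suggestion, not something decided here) would be run-length encoding, since coverage strings tend to contain long runs of identical characters:

```python
import re
from itertools import groupby

def rle_encode(coverage):
    # 'NNNNNNNNNNCCCCCUU' -> '10N5C2U'
    return ''.join(f'{len(list(g))}{c}' for c, g in groupby(coverage))

def rle_decode(encoded):
    # '10N5C2U' -> 'NNNNNNNNNNCCCCCUU'
    return ''.join(c * int(n) for n, c in re.findall(r'(\d+)(\D)', encoded))

s = 'N' * 10 + 'C' * 5 + 'U' * 2
enc = rle_encode(s)
print(enc)  # 10N5C2U
assert rle_decode(enc) == s
```

Gzipping the column would be another cheap option; either way the current plain strings are fine as a first cut.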
I think this will do the wrong thing with the update mode. Here's the problem scenario:
I think we'll replace the coverage for "x/" with only the "x/a.c" coverage, and incorrectly discard the x/b.c and x/c.c coverage. Instead, maybe the logic could work more like this:
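One hedged sketch of update-safe logic: instead of overwriting a directory's concatenated report with just the updated file's coverage, merge the new per-file coverage into the existing per-file map and rebuild each affected directory from its files. Names and shapes are illustrative; only immediate parent directories are aggregated here for brevity.

```python
def apply_update(existing_files, updated_files):
    """Merge an update without discarding sibling coverage.

    existing_files / updated_files: {file path: coverage string}.
    Returns (merged per-file map, rebuilt per-directory map).
    """
    merged = dict(existing_files)
    merged.update(updated_files)  # new coverage wins per file
    dirs = {}
    for path in sorted(merged):
        parent = '/'.join(path.split('/')[:-1])
        dirs[parent] = dirs.get(parent, '') + merged[path]
    return merged, dirs

old = {'x/a.c': 'CC', 'x/b.c': 'UU', 'x/c.c': 'NC'}
new = {'x/a.c': 'CU'}  # an update touching only x/a.c
merged, dirs = apply_update(old, new)
print(dirs['x'])  # x/b.c and x/c.c coverage is preserved
```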
I have another application (surprise surprise) that does this:
```
mysql> show create table path_coverage\G
*************************** 1. row ***************************
       Table: path_coverage
Create Table: CREATE TABLE `path_coverage` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `pathID` int(10) unsigned NOT NULL,
  `notExecutable` int(10) unsigned NOT NULL,
  `covered` int(10) unsigned NOT NULL,
  `uncovered` int(10) unsigned NOT NULL,
  `unreachable` int(10) unsigned NOT NULL,
  `health` int(10) unsigned NOT NULL DEFAULT '0',
  `reportID` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key_report_path` (`reportID`,`pathID`),
  KEY `key_health` (`health`)
) ENGINE=InnoDB AUTO_INCREMENT=1189 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)

mysql> show create table coverage_report\G
*************************** 1. row ***************************
       Table: coverage_report
Create Table: CREATE TABLE `coverage_report` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `phid` varbinary(64) NOT NULL,
  `repositoryPHID` varbinary(64) NOT NULL,
  `commitPHID` varbinary(64) NOT NULL,
  `buildPHID` varbinary(64) DEFAULT NULL,
  `dateCreated` int(10) unsigned NOT NULL,
  `dateModified` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `key_repository_commit` (`repositoryPHID`,`commitPHID`),
  UNIQUE KEY `key_phid` (`phid`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.00 sec)
```
I figured this implementation would be easier to get upstreamed. One step at a time? I don't mind taking the steps to eventually migrate this data to a storage model that looks something like this.
Normally I would say that I want to do it myself, but if you're willing to clean it up right now, go for it.
I have several teams that have been asking for this for months and they'll love you for it.