
Loading differential revision slow when lots of unit test messages exist
Open, Needs Triage, Public

Description

Case in point:

Our test suite produces a lot of messages, which cause noticeable slowdowns when trying to view a revision. (It takes ~13 seconds with xhprof enabled).

I've pinpointed the issue to (amongst others) HarbormasterUnitStatus::getUnitStatusSort, which accounts for the bulk of the page load time.

Sample xhprof profile:

Event Timeline

This comment was removed by epriestley.

We don't currently have a sortable column on the unit message table, since pass, fail, etc., aren't naturally sortable by any key MySQL can construct.

Consequently, we pull everything and then sort it in the UI. This doesn't scale gracefully to large result sizes.

Building a sortable key is kind of iffy because we may introduce new results in the future or want to change the display order of results (that is, there's no unambiguously correct ordering of pass, fail, skip, broken in the way that there is a single unambiguous ordering of 1, 2, 3), but I think we just have to do our best. The consequence of getting this ordering wrong is minor: changes to the ordering just won't be reflected in older builds.
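For example, the surrogate key could be a small integer with deliberate gaps, so result types added later can be slotted into the display order without renumbering old rows. A sketch in Python (the specific values and the handling of unknown results are assumptions, not Harbormaster's actual constants):

```python
# Illustrative mapping from unit result types to a sortable integer key.
# Gaps between values leave room for result types added later.
UNIT_RESULT_SORT = {
    'broken': 100,   # most severe: the harness itself failed
    'fail': 200,
    'skip': 400,
    'pass': 500,     # least interesting; sorts last
}

def unit_status_sort_key(result):
    # Unknown results sort near failures rather than raising an error.
    return UNIT_RESULT_SORT.get(result, 250)

rows = [{'name': 't1', 'result': 'pass'},
        {'name': 't2', 'result': 'fail'},
        {'name': 't3', 'result': 'skip'}]
rows.sort(key=lambda r: unit_status_sort_key(r['result']))
print([r['name'] for r in rows])  # → ['t2', 't3', 't1']
```

Because the key is written at insert time, changing the mapping later only affects new builds, which is exactly the "minor consequence" described above.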

I think the pathway forward here is:

  1. Add an ordering column with an appropriate key.
  2. Generate a surrogate ordering value which MySQL can sort when writing test data.
  3. Start doing limits in queries instead of pulling all the data.
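Once step (2) persists a sortable key, step (3) becomes an ordinary indexed `ORDER BY ... LIMIT` query; a sketch using sqlite3 as a stand-in for MySQL (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE unit_message (name TEXT, result TEXT, sort_key INTEGER)')
conn.execute('CREATE INDEX idx_sort ON unit_message (sort_key)')
rows = [('t_fail', 'fail', 200), ('t_skip', 'skip', 400)] + \
       [('t_pass_%d' % i, 'pass', 500) for i in range(10000)]
conn.executemany('INSERT INTO unit_message VALUES (?, ?, ?)', rows)

# The key point: ordering by an indexed surrogate key plus LIMIT means
# the database returns one page of rows instead of the whole result set.
page = conn.execute(
    'SELECT name, result FROM unit_message ORDER BY sort_key LIMIT 100'
).fetchall()
print(len(page), page[0])  # → 100 ('t_fail', 'fail')
```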

None of this is too tricky, although step (1) may be a hefty migration on installs with lots of data and step (3) may be very disruptive to the ordering of older builds. We might need to add a bin/harbormaster fix-all-the-old-order-surrogate-keys or similar to help with that.
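Whatever that tool ends up being called, its core would be a batched backfill over old rows; a hedged sketch of the shape (an in-memory stand-in for the table — a real tool would commit between batches and go through Phabricator's migration machinery):

```python
# Hypothetical backfill: assign surrogate sort keys to old rows in
# fixed-size batches so the migration never holds a huge transaction.
SORT_KEYS = {'broken': 100, 'fail': 200, 'skip': 400, 'pass': 500}

def backfill(rows, batch_size=1000):
    updated = 0
    for start in range(0, len(rows), batch_size):
        for row in rows[start:start + batch_size]:
            if row.get('sort_key') is None:
                row['sort_key'] = SORT_KEYS.get(row['result'], 250)
                updated += 1
        # A real tool would COMMIT here, between batches.
    return updated

old = [{'result': 'pass', 'sort_key': None} for _ in range(2500)]
print(backfill(old))  # → 2500
```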

T9704 also discusses slowness on inserting the tests. That may be unnecessarily slow right now, but I don't expect clients to submit unlimited numbers of tests in one API call. Instead, submit tests in chunks (say, of 1K tests per page or whatever) by calling harbormaster.sendmessage repeatedly with a work status.

Ideally, you can do this as the tests run, providing a stream of results to the user sooner.
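A chunked reporter can be a thin generator over the result list; a sketch of the payload construction only (the PHID is a placeholder, each yielded dict would be POSTed as one Conduit call, and the exact `unit` fields accepted by harbormaster.sendmessage should be checked against the Conduit documentation on your install):

```python
def unit_message_chunks(target_phid, results, chunk_size=1000):
    """Yield parameter dicts for successive harbormaster.sendmessage calls.
    Intermediate chunks use type 'work' to keep the build target open; the
    final chunk carries the overall pass/fail verdict."""
    any_fail = any(r['result'] != 'pass' for r in results)
    for start in range(0, len(results), chunk_size):
        is_last = start + chunk_size >= len(results)
        yield {
            'buildTargetPHID': target_phid,
            'type': ('fail' if any_fail else 'pass') if is_last else 'work',
            'unit': results[start:start + chunk_size],
        }

results = [{'name': 'test_%d' % i, 'result': 'pass'} for i in range(2500)]
msgs = list(unit_message_chunks('PHID-HMBT-example', results))
print(len(msgs), msgs[0]['type'], msgs[-1]['type'])  # → 3 work pass
```

Streaming as the tests run means the harness yields a chunk whenever it has accumulated `chunk_size` results, rather than buffering the whole suite first.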

In the extreme case where you have, say, a million tests, I'd expect there is probably little value in reporting each test into Harbormaster as a "pass", and your harness might want to summarize all passes into "999,998 additional tests passed" and submit two failures and one aggregate-pass. We haven't started to formally explore this yet so there may be more support in the future.

Broadly, the expectations are:

  • Reporting ten million tests per build should not impact the usability of the UI. It currently does; the steps described above will fix this.
  • Reporting ten million tests per build will require multiple calls to harbormaster.sendmessage.
  • Reporting ten million tests per build may take a long time.
  • Reporting ten million tests per build may cause storage to grow at an uncomfortable rate.
  • If you have ten million tests, some aggregation mechanism for passing tests is probably desirable as the value of recording individual passes is probably far lower than the cost of storing them, but we don't have a formal recommendation on where this should go or what it should look like yet.
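Such client-side aggregation is simple to do in the harness; a sketch (the aggregate record's name is made up, not a Harbormaster convention):

```python
def summarize_results(results):
    """Collapse all passing tests into one aggregate record, keeping each
    non-pass result individually. Purely a client-side convention."""
    passes = sum(1 for r in results if r['result'] == 'pass')
    summary = [r for r in results if r['result'] != 'pass']
    if passes:
        summary.append({
            'name': '{:,} additional tests passed'.format(passes),
            'result': 'pass',
        })
    return summary

results = ([{'name': 'test_%d' % i, 'result': 'pass'} for i in range(999998)]
           + [{'name': 'test_a', 'result': 'fail'},
              {'name': 'test_b', 'result': 'fail'}])
out = summarize_results(results)
print(len(out), out[-1]['name'])  # → 3 999,998 additional tests passed
```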

> In the extreme case where you have, say, a million tests, I'd expect there is probably little value in reporting each test into Harbormaster as a "pass", and your harness might want to summarize all passes into "999,998 additional tests passed" and submit two failures and one aggregate-pass.

This is pretty much what I've done right now. The basic unit tests are reported via arc (~2000 separate cases), and only the failing tests from CI are reported back via `harbormaster.sendmessage`.

thoughtpolice moved this task from Backlog to Details on the Haskell.org board.May 13 2016, 10:05 PM
eadler added a project: Restricted Project.Aug 5 2016, 4:45 PM
avivey added a comment.Sep 2 2016, 5:35 PM

For people who suffer from this: note that D16483 might actually have made things a little worse, by searching the DB for old messages (though not loading them); I tested with 1M messages on an SSD and didn't see any change, but YMMV.