See PHI351. An install has 256+ character unit test names. I think this is essentially always a problem before the data hits Phabricator (i.e., these names aren't useful to humans, either) but this is probably a case where we should give installs enough rope to shoot themselves in the foot. See also T11402, which discusses the total size of the test table. I expect that extracting these strings to a dimension table likely makes sense, with some absurd soft limit (like 1MB) to forestall disaster when an install uploads a unit test with a 750MB name and every interface becomes mysteriously slow.
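As a rough illustration of the dimension-table idea, the sketch below stores each test name once, keys it by a fixed-width digest, and rejects names over a soft limit. It is an in-memory stand-in, not Harbormaster's storage code: `UnitTestNameTable`, `intern`, `lookup`, and `MAX_NAME_BYTES` are hypothetical names, and the 1MB limit is only the figure floated above.

```python
import hashlib

# Hypothetical soft limit on a single test name, in bytes.
MAX_NAME_BYTES = 1024 * 1024


class UnitTestNameTable:
    """In-memory stand-in for a dimension table of unit test names."""

    def __init__(self, max_bytes=MAX_NAME_BYTES):
        self.max_bytes = max_bytes
        self._id_by_hash = {}  # digest -> surrogate id
        self._name_by_id = {}  # surrogate id -> full name

    def intern(self, name):
        """Return a small integer id for `name`, storing the string once."""
        raw = name.encode("utf-8")
        if len(raw) > self.max_bytes:
            raise ValueError(
                "test name is %d bytes; limit is %d" % (len(raw), self.max_bytes)
            )
        # Key on a fixed-width digest so the index size does not depend on
        # how long the name is.
        digest = hashlib.sha256(raw).hexdigest()
        name_id = self._id_by_hash.get(digest)
        if name_id is None:
            name_id = len(self._name_by_id) + 1
            self._id_by_hash[digest] = name_id
            self._name_by_id[name_id] = name
        return name_id

    def lookup(self, name_id):
        """Return the full name previously stored under `name_id`."""
        return self._name_by_id[name_id]
```

Rows in the test table would then carry only the integer id, so a 300-character name costs the same per result as a 10-character one.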
Storage changes should support general questions about "which specific tests are failing":
- T9365 is probably related but needs triage.
- Same with T12029.
- This gets a mention in PHI383.
- T9951 wants this storage to have a basket of random properties. Maybe? This tends to make archival/storage more difficult but doesn't seem unreasonable.
- T10123 is probably some aggregation of other stuff here.
- See also T11763.
T10635 is perhaps adjacent although I'm not sure what the current state of it is.
What unit tests have recently failed? ... On which buildables?
What build plans have failed recently? ... On which buildables?
What build steps have failed recently? ... On which buildables?
What builds are running right now against buildables of this type?
You can get some of this in the UI but it's very scattered. These are all largely reasonable questions to ask. Harbormaster doesn't do a good job of showing an overall "state of the world in builds" status dashboard or giving you the components to build one today.
What builds are running right now? What builds are running of a given plan?
These can be answered at /harbormaster/build/ but it (or some other interface) could be better at answering questions instead of just presenting information.
[ Which Drydock resources have had unit test, plan and step failures recently? ]
What builds are running within a given [Drydock] resource pool?
What builds has this Drydock resource/resource pool had run on it? What build plans? What tests have failed on it?
These are likely reasonable but the path is currently a little muddier since Harbormaster/Drydock have little explicit bridging today.
See PHI405. This task discusses some reasonable improvements to the Harbormaster/Differential integration.
See PHI446. This task discusses providing ways to access older builds and build logs instead of vanishing them from the UI completely. (This works now, but it would be nice to make the UI a little richer.)
See PHI430. Policy exceptions raised while rendering "Build Status" are currently not handled as well as they could be.
See PHI507. An install would like better support for richer build artifacts, particularly screenshots.
See T10568. The checkmark icon tooltip in Diffusion for builds could be more useful on failures, e.g. "3/5 builds passed" or, more likely, break out which builds failed.
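A minimal sketch of what such a tooltip summary could compute, assuming each build is available as a `(plan name, status)` pair. The function name `build_tooltip` and the status strings `"passed"` and `"failed"` are placeholders for illustration, not Harbormaster's actual constants.

```python
def build_tooltip(builds):
    """Summarize (plan_name, status) pairs as a short tooltip string."""
    if not builds:
        return "No builds"
    passed = sum(1 for _, status in builds if status == "passed")
    # Break out the failing plans by name so the tooltip is actionable.
    failed = sorted(name for name, status in builds if status == "failed")
    summary = "%d/%d builds passed" % (passed, len(builds))
    if failed:
        summary += "; failed: " + ", ".join(failed)
    return summary
```

Builds that are still running count toward the total but are listed neither as passed nor as failed in this sketch; a real tooltip would probably want to distinguish them.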
- It should probably be called out in the UI.
- It shouldn't be able to generate artifacts or depend on other steps.
- It should either skip target generation entirely, or generate an empty target, and then immediately re-enter the build update. Currently, we'll return to the queue to execute an empty step which does nothing.
- T13072 should happen, and BuildCommand needs to stop being transactional; it can currently race.
- Also, this whole thing might be a bad idea.
- It should possibly issue a special build command like "Render Obsolete", not "Abort".
- Long-running steps, like "Sleep" and "Drydock: Run Command", should test for build aborts while sitting in their local equivalent of a select() loop.
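The last point, about long-running steps testing for aborts, can be sketched as a wait loop that wakes up periodically to check for an abort. This is only an illustration of the pattern: `sleep_with_abort_checks` and the `is_aborted` callback are hypothetical, and a real step would check for pending build commands rather than call a Python function.

```python
import time


def sleep_with_abort_checks(duration, is_aborted, poll_interval=1.0):
    """Wait up to `duration` seconds, checking for an abort between naps.

    Returns True if the full duration elapsed, False if aborted early.
    """
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        # Hypothetical hook: in a real step this would look for a pending
        # abort on the build.
        if is_aborted():
            return False
        remaining = max(0.0, deadline - time.monotonic())
        time.sleep(min(poll_interval, remaining))
    return True
```

The poll interval trades abort latency against the cost of the check; a step waiting on a remote command would do the same check on each pass through its own wait loop.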
See T11350. This is a minor but reasonable UI improvement.
See PHI859. If you pass a poorly-named variable to Harbormaster, it should complain immediately.
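One possible reading of "complain immediately" is to validate variable names up front and fail with a hint, rather than silently ignoring an unrecognized name. The sketch below assumes that reading; `validate_variables` and the set of known names passed to it are hypothetical, not Harbormaster's API.

```python
import difflib


def validate_variables(passed, known):
    """Raise ValueError if `passed` contains names that are not in `known`."""
    unknown = sorted(set(passed) - set(known))
    if not unknown:
        return
    problems = []
    for name in unknown:
        # Suggest the closest known name, if any is reasonably similar.
        hints = difflib.get_close_matches(name, sorted(known), n=1)
        if hints:
            problems.append(
                'unknown variable "%s" (did you mean "%s"?)' % (name, hints[0])
            )
        else:
            problems.append('unknown variable "%s"' % name)
    raise ValueError("; ".join(problems))
```

For example, calling it with a misspelled name:

```python
# Illustrative names only.
known = {"build.id", "buildable.diff", "buildable.revision"}
validate_variables({"build.idd": "123"}, known)  # raises ValueError
```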
See PHI901, which is roughly T10260: Harbormaster notification rules aren't currently very flexible, and rules like "Notify on build failure: ..." and/or Herald support for something like "When build status changes to failed" would help address some reasonable use cases.
See PHI919, which discusses improvements to failed/aborted messaging.
See PHI927, which requests arc:onto be a build variable.