Page MenuHomePhabricator

Plans: 2018 Week 35 Bonus Content
Closed, ResolvedPublic



See PHI642. An install would like to give "Disable User" permission to all users so everyone can participate in a large "Battle Royale" event. This is an excellent idea but may justify shoring up some aspects of how user accounts are currently edited. This was "mostly done" in Week 33 by T13186.

The "Setup Issues" page with no setup issues renders a bit weird:

Screen Shot 2018-08-25 at 6.58.19 AM.png (429×1 px, 42 KB)

The auth.require-approval setting has a non-markup'd link to another setting (note This would be nice to do somewhat more formally perhaps since I think it's happened a few times?

Screen Shot 2018-08-27 at 7.34.40 AM.png (478×929 px, 75 KB)

See this beautiful thread: The device reaction logic for filetrees is a bit wonky, and we call a nonexistent resetdrag() method.

See PHI580. An install would like write API access to object relationships (depends on, fixes, etc). This is complicated, but perhaps plannable, and we may be able to put a reasonable facade with partial support into the upstream. See some prior discussion in T13130. This didn't get much traction previously (see T13137) so the action may just be to punt it for now. (Yeah, I punted this.)

See PHI690. An install would like access to Drydock lease metrics, notably the time a lease spent in queue. See also T13073, but this is more narrowly scoped.

See PHI610. This is yikes. This is also T12251.

See PHI710. This is a request for richer lint/unit results. It's likely entangled with T13088 but can probably be planned.

Event Timeline

epriestley created this task.

I landed D19586 (the thing which breaks T13186) and pushed it here. This generally causes us to perform more policy checks, and may cause new policy exceptions which did not exist before (because the workflows relied implicitly on weak policy enforcement). If these crop up, the remedy is generally to explicitly weaken the required capability for the action.

For example, muting previously relied on nothing actually doing a CAN_EDIT check, since you should be able to mute objects you can't edit. This change caused muting to fail because we began doing a CAN_EDIT check. I remedied this by explicitly removing the requirement, since muting shouldn't require any capabilities.

I kicked some user/profile/account stuff to T13190.

See PHI690. An install would like access to Drydock lease metrics, notably the time a lease spent in queue. See also T13073, but this is more narrowly scoped.

D19613 adds acquiredEpoch and activatedEpoch fields to DrydockLease.

The acquiredEpoch stores the first time the lease transitioned into "Acquired".

The activatedEpoch stores the first time the lease transitioned into "Active".

Normally, leases wait in the daemon queue for some time, then are acquired. They do some setup (e.g., checking out repositories), then activate. In this case, the difference between "Created" and "Acquired" is roughly how much time was spent waiting on other stuff to finish up. The difference between "Acquired" and "Active" is how much time the particular resource the lease is associated with took to get ready.

If the former time is long, you may be able to add more hosts/daemons/etc.

If the latter time is long, resource setup time might merit a look (git clone, EC2 setup, etc).

In some cases, leases activate instantly with no setup. In this case, acquiredEpoch will be NULL and activatedEpoch will be non-null. In this case, the difference between "Created" and "Activated" is entirely waiting in queue. I left acquiredEpoch as null in this case (rather than just marking the same time as activatedEpoch) so these cases could be distinguished, although I'm not sure that's terribly important.

See PHI710. This is a request for richer lint/unit results. It's likely entangled with T13088 but can probably be planned.

PHI710 wants a way to make an individual unit result link elsewhere. In that particular case, aggregated results are being reported to Harbormaster and providing a link would let the aggregated result unfold into a detailed result elsewhere.

I think the outstanding plans for unit messages are:

  • T10635 reports a performance issue in Differential with a large number of unit results. The underlying problem is that ORDER BY fail, pass isn't a keyed query. I'll revisit that since it's old, but I expect to introduce a surrogate sortable column (where fail = 2000, pass = 1000, etc) and sort by it.
  • T9951 wants richer unit result information in general, but I think it possibly predates details being added. If we make details render as remarkup (perhaps optionally?) that probably addresses it. See below for some other thoughts.
  • PHI351 discusses tests with very long names. I plan to address this by adding a separate string table and updating the test table to reference it. This should make the overall data size somewhat more manageable, too, since these strings are extremely redundant and very heavily compressible.
  • T11402 discusses large overall data size of this table. I think this is probably orthogonal to the other changes, and I'd like to make them first and then revisit exactly how we want to GC test data. We have the opportunity to GC it selectively (e.g., coverage, details can be GC'd separately). We could also GC only passes. We can also aggregate coverage data (we currently record it for each test separately). It would also be nice to leave behind a summary stub (e.g., pass/failure count) if we do fully GC it.

On "attach files to unit results" (for example, screenshots), a possible issue here is that if you attach largeish files to a test result, it would be nice to GC those files if the test result gets GC'd -- otherwise you may end up with a lot of old screenshots and other clutter building up. There's no GC on results today beyond bin/remove destroy, but there may be a more structured GC process in the future. I don't immediately see a good way to mark files for GC, but maybe this looks something like a "refcount this file" flag when you file.allocate. The behavior of this flag would be:

  • When we remove an ObjectHasFile edge, we check if the file has the "refcount this file" flag set.
  • If it does, we check if it's attached to anything else.
  • If it is not, we delete it.
  • (When you first allocate a refcounted file, we GC it after an hour if it doesn't have a refcount yet, so files which get uploaded but not attached to anything GC.)

This seems sort of reasonable? Maybe I'll file that somewhere and we can pursue it when there's more of a use case.

I'm inclined to move forward like this:

  • Add a format or similar field to harbormaster.sendmessage which defaults to "text" but may be specified as "remarkup".
  • When the details are specified as "remarkup", render the details as remarkup.

This is cheap/easy and should address PHI710 and T9951. The more involved changes are well-defined but can wait, and will need some migrations.