Garbage collect and/or compress/archive harbormaster build unit messages
Open, Needs Triage, Public


The Harbormaster build unit message table can get extremely unwieldy if left untended, especially when you do things like report coverage data for your builds.

FWIW we're writing ~36 million rows every 30 days, which takes up ~28 gigs. I've been manually clearing the table with a script that finds the lowest id in the table and then deletes all of the results that belong to the same target. Haven't seen any detrimental effects from that yet.
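For anyone wanting to try the same thing, here is a rough sketch of that kind of cleanup pass, not the actual script. It assumes a hypothetical `unit_message` table with `id` and `target` columns and uses SQLite so it runs standalone; the real Harbormaster table and column names will differ, so treat every identifier here as a placeholder.

```python
import sqlite3


def delete_oldest_target(conn, table="unit_message"):
    """Delete all rows that share a target with the lowest-id row.

    `table` and the `id` / `target` columns are hypothetical names used for
    illustration only. Returns the number of rows deleted (0 if the table
    is already empty).
    """
    # Find the target that owns the oldest (lowest id) row.
    row = conn.execute(
        f"SELECT target FROM {table} ORDER BY id ASC LIMIT 1"
    ).fetchone()
    if row is None:
        return 0

    # Remove every message belonging to that target in one statement.
    cur = conn.execute(f"DELETE FROM {table} WHERE target = ?", (row[0],))
    conn.commit()
    return cur.rowcount
```

Calling this in a loop (with a pause between iterations) trims the table one target at a time, which keeps each delete small instead of issuing one huge range delete.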

chad added a subscriber: chad. Jul 30 2016, 11:03 PM

Is this different than T5822?

Yeah, unit messages are stored in a different table.

chad added a comment. Jul 30 2016, 11:15 PM

(ノಠ益ಠ)ノ彡 sboן ʇsǝʇ ʇıun

T10635 is vaguely related, too.

eadler added a project: Restricted Project. Aug 5 2016, 4:45 PM

I'd like to sort out T9365 / T10635 (better aggregation and/or query options for unit/coverage data) first, since that likely impacts how we GC this data. Ideally, we'd leave an aggregated stub behind (e.g., pass/failure count) rather than nuking the data completely.
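To make the "aggregated stub" idea concrete, here is a minimal sketch of what it might look like. This is only an illustration, not a proposed design: the `unit_message` and `unit_summary` tables, their columns, and the `pass` / `fail` result values are all hypothetical, and SQLite is used just so the example is self-contained.

```python
import sqlite3


def create_schema(conn):
    """Create the hypothetical detail and summary tables used in this sketch."""
    conn.execute(
        "CREATE TABLE unit_message ("
        "id INTEGER PRIMARY KEY, target TEXT, result TEXT)"
    )
    conn.execute(
        "CREATE TABLE unit_summary ("
        "target TEXT PRIMARY KEY, passed INTEGER, failed INTEGER)"
    )


def archive_target(conn, target):
    """Replace a target's detail rows with a single pass/fail count row.

    Returns (passed, failed), or None if the target has no detail rows
    (so an existing stub is never overwritten with zeros).
    """
    counts = dict(
        conn.execute(
            "SELECT result, COUNT(*) FROM unit_message "
            "WHERE target = ? GROUP BY result",
            (target,),
        ).fetchall()
    )
    if not counts:
        return None

    passed = counts.get("pass", 0)
    failed = counts.get("fail", 0)

    # Write the stub and drop the details in one transaction, so a failure
    # cannot leave the target with neither details nor a summary.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO unit_summary (target, passed, failed) "
            "VALUES (?, ?, ?)",
            (target, passed, failed),
        )
        conn.execute("DELETE FROM unit_message WHERE target = ?", (target,))
    return passed, failed
```

Which aggregates are worth keeping depends on how T9365 / T10635 shake out, so the two counters here are just the simplest possible stub.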

Although T5822 discusses different data, we should probably plan the pathway through log collection at the same time. I think this became somewhat less urgent after D15380, which added compression, but we still can't grow forever.