Ref T5791. This edge table grows 2+X faster than the corresponding mail table depending on usage. Ergo, lets make sure to clean that up too in the delete code.
Details
- Reviewers
epriestley - Maniphest Tasks
- T5791: Write Herald rules for outbound mail
- Commits
- Restricted Diffusion Commit
rP189fb2660a90: MetaMTA - make sure mail garbage collection also cleans up recipient edges
careful thought
Diff Detail
- Repository
- rP Phabricator
- Lint
Lint Not Applicable - Unit
Tests Not Applicable
Event Timeline
Possibly we should implement DestructibleInterface and use PhabricatorDestructionEngine instead.
That will automatically clean up edges -- including all edge types and inverse edges, should Mail ever get those. These don't matter today but seem like reasonable things that might matter in the future. It will also do stuff like clean up any mail-related tasks in the task queue today, although I don't think this is terribly important (they should automatically fail, and it's unlikely that there are any left after 90 days).
I can come up with two downsides to this approach:
First, this will also write a PhabricatorSystemDestructionLog as a side effect, which is silly/pointless when doing garbage collection, since we'd be trading one row for another one. We could add a flag to it like setDisableLogging() to turn off logging, so we aren't just moving rows around. This seems reasonable but is a little bit of work.
Second, it's much heavier than just doing DELETEs, since it does a bunch of extra queries to look for flags, notifications, etc. The GC is single-process right now, so if an install ever generates mail faster than one process can delete it the GC won't be able to keep up. I suspect this is not a real risk for a very long time, but it might become a more realistic risk for large installs if multiple GCs start using DestructionEngine or the GC process gets heavier (although we could do multi-process GC without too much trouble).
I think this is reasonable as written and avoids both these potential downsides, but if these objects get more full-fledged in the future it might be worth converting.
Perhaps relevant is that DestructionEngine also cleans up HeraldTranscript records "for free", which these probably will have soon.
Haha, I had a TODO to implement the destruction interface, but then figured that was too heavy weight for garbage collected stuff. For now, I want to just check this in as is to prevent data growing unexpectedly.
I guess the systems design question to me is if you want garbage collection to run through the destruction engine or not?
I don't think the upcoming HeraldTranscript integration (and other feature set) makes maintaining this delete too bad. We could reduce technical cost here by having edges and HeraldTranscript modularize how they delete themselves a bit more if necessary, so some code can be shared here? IDK waving my hands here but feels directionally like it might help.
src/applications/metamta/garbagecollector/MetaMTAMailSentGarbageCollector.php | ||
---|---|---|
11 | I missed this -- this used to be time() - ttl, but was changed to just time(), altering the threshold for deleting mail from "90 days ago" to "0 seconds ago" and deleting all mail in the table. Not really a big deal, just why /mail/ is empty now. I'll fix this shortly... |