Ref T13142. When commits are pushed, we try to handle them on one of two pathways:
- Normal changes: we load these into memory and potentially apply Herald content rules to them.
- "Enormous" changes: we don't load these into memory and skip content rules for them.
The goal is to degrade gracefully when users push huge changes: they should work, just not support all the features.
However, some changes can slip through the cracks right now:
- If you push a lot of commits at once, we'll try to cache all of the changes smaller than 1GB in memory. This can require an arbitrarily large amount of RAM.
- We calculate sizes by just looking at the strlen() of the diff, but a changeset takes more RAM in PHP than the raw diff does. So even if a diff is "only" 500MB, it can take much more memory than that. On systems with relatively little memory available, this may result in OOM while processing changes that are close to the "enormous" limit.
This change makes two improvements:
- Instead of caching everything, cache only 64MB of things.
- For most pushes, this is the same, since they have less than 64MB of diffs.
- For pushes of single very large changes, this is a bit slower (more CPU) since we have to do some work twice.
- For pushes of many changes, this is slower (more CPU) since we have to do some work twice, but, critically, doesn't require unlimited memory.
- Instead of flagging changes as "enormous" at 1GB, flag them as "enormous" at 256MB.
- This reduces how much memory is required to process the largest "non-enormous" changes.
- This also gets us under Git's hard-coded 512MB "always binary" cutoff; see T13143.
- This is still completely gigantic and way larger than any normal change should be.
An additional improvement would be to try to reduce the amount of memory we need to use to hold a change in process memory. I think the other changes here alone will fix the immediate issue in PHI657, but it would be nice if the "largest non-enormous change" required only a couple gigs of RAM.