parsing giganto-normous commits causes admin headaches
Open, Wishlist, Public

Assigned To

None

Authored By

	cburroughs
	Aug 13 2015, 3:58 PM

Description

User creates a large ~5MiB commit
This explodes into a 250 MiB unified diff
The commit can't be parsed with a "reasonable" amount of memory, and the script/daemon gets killed by the OS. The task then backs off and retries, only to get killed again later. I'm not sure how much memory this particular commit would need, but ~8 GiB was insufficient.
User is sad that their diffs are not getting closed.

Ideally this would take less memory, but there will probably always be bigger diffs on the horizon. Some way to skip parsing the diff (automatically?) if it is 'Too Big' is one possibility.
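A minimal sketch of the "skip if Too Big" idea, in Python for illustration only. This is not Phabricator's implementation; the names `MAX_DIFF_BYTES`, `should_parse_diff`, and `parse_or_skip` are hypothetical, and the 64 MiB threshold is an arbitrary assumption, since the task never says what "Too Big" should mean.

```python
import os

# Hypothetical threshold: the task does not define what counts as "Too Big".
MAX_DIFF_BYTES = 64 * 1024 * 1024


def should_parse_diff(diff_size_bytes, limit=MAX_DIFF_BYTES):
    """Return True when the diff is small enough to attempt parsing."""
    return diff_size_bytes <= limit


def parse_or_skip(diff_path, parse, limit=MAX_DIFF_BYTES):
    """Parse the diff at diff_path, or return None if it exceeds the limit.

    `parse` is any callable that accepts an open binary file object.
    The size check happens before the file is read, so an oversized
    diff is never loaded into memory.
    """
    size = os.path.getsize(diff_path)
    if not should_parse_diff(size, limit):
        return None
    with open(diff_path, "rb") as handle:
        return parse(handle)
```

Under this assumed threshold, the ~250 MiB diff from the report would be skipped rather than retried until the OS kills the worker.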

https://secure.phabricator.com/chatlog/channel/6/?at=210774

Related Objects

Mentioned Here: D19748: Skip copied code detection for changes that are too large for it to be useful
T8612: Improve handling of "large" changesets

Event Timeline

cburroughs created this task. Aug 13 2015, 3:58 PM

cburroughs raised the priority of this task from to Needs Triage.

cburroughs updated the task description. (Show Details)

cburroughs added a subscriber: cburroughs.

chad added projects: Diffusion, Daemons. Aug 13 2015, 4:09 PM

Herald added a subscriber: eadler. Aug 13 2015, 4:09 PM

joshuaspence added a subscriber: joshuaspence. Aug 13 2015, 9:23 PM

T8612 is related.

devurandom added a subscriber: devurandom. Aug 14 2015, 9:28 AM

FWIW, the particular commit that was causing trouble eventually (!?!) was processed. Since I could reproduce the failure multiple times with 4x the RAM, this feels spooky. Maybe the memory consumption is non-deterministic? Or somehow very different when running in the daemon instead of the CLI?

iiam

eadler added a project: Restricted Project. Sep 15 2016, 6:08 PM

urzds added a subscriber: urzds. Jul 12 2017, 11:08 AM

As of D19748, I'm not aware of any change of size X that requires more than 8X bytes of memory to parse. This isn't ideal, but it's a fair bit better than the 32X in the original report.

It's likely fairly easy to build a diff which will fail (e.g., just make a text diff with 50GB of random nonsense -- at some point, we aren't going to be able to fit it in memory). But this rarely arises and isn't much of a big deal when it does (you can usually just bin/worker cancel the task), so it's hard to prioritize going out of the way to try to break stuff and then fix the stuff we broke on purpose.
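For scale, the 8X and 32X multipliers above can be applied to the 250 MiB unified diff from the original report. This is only back-of-the-envelope arithmetic; it assumes "size X" refers to the size of the unified diff, and the function name `estimated_parse_memory` is made up for illustration.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def estimated_parse_memory(diff_bytes, factor):
    """Estimate parse memory as a simple multiple of the diff size."""
    return diff_bytes * factor


diff_bytes = 250 * MIB  # unified diff size from the original report

# 32X: roughly the ~8 GiB that was reported as insufficient.
print(estimated_parse_memory(diff_bytes, 32) / GIB)  # 7.8125

# 8X: the bound stated as of D19748.
print(estimated_parse_memory(diff_bytes, 8) / GIB)  # 1.953125
```

Since ~8 GiB was not enough in the original report, 32X there is a lower bound on what that parse needed, not a measured peak.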

epriestley moved this task from Backlog to Far Future on the Daemons board. Feb 15 2019, 3:22 AM

epriestley moved this task from Backlog to Far Future on the Diffusion board. Apr 15 2019, 4:03 PM

Herald added a subscriber: amckinley. Apr 15 2019, 4:03 PM
