Allow daemon pools to autoscale down to 0 processes
Closed, Wontfix (Public)

Assigned To

Authored By

	epriestley
	Feb 20 2017, 4:30 PM

Description

After T7352, daemons are organized into "pools" which can autoscale up and down, so each instance can, for example, run 1 taskmaster by default and scale up to 4 when there's work to be done. This allowed us to get about 100 instances onto each repo host.

Now that we have free instances, scalability is again bottlenecked by daemon memory pressure. The most readily available fix is to allow pools to scale down below 1 process, to 0, so they don't use any resources while asleep. In theory, this gives us about 4x headroom by stopping the Taskmaster, PullLocal, and Trigger processes and leaving just the Overseer running. Realistically, we can't use all of that because we need some free memory for actual work, but 2x-3x is likely safe.
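As a rough illustration of where the "about 4x" figure comes from, here is a back-of-the-envelope sketch. It assumes every process costs roughly the same amount of memory, which is a simplification made only for this estimate; the function names are hypothetical and not part of any Phabricator tooling.

```python
# Per-instance processes named in the description above.
AWAKE = ["Overseer", "Taskmaster", "PullLocal", "Trigger"]
# With every pool scaled down to 0, only the Overseer stays resident.
ASLEEP = ["Overseer"]


def headroom(awake, asleep):
    """Ratio of resident processes before and after scaling pools to 0.

    Assumption: all processes use about the same memory.
    """
    return len(awake) / len(asleep)


def instances_per_host(current, factor):
    """Scale the current per-host instance count by a headroom factor."""
    return int(current * factor)


if __name__ == "__main__":
    print(headroom(AWAKE, ASLEEP))  # theoretical ceiling
    # The description cites about 100 instances per host today and
    # treats 2x-3x as the realistically usable range.
    print(instances_per_host(100, 2), instances_per_host(100, 3))
```

The theoretical factor is simply 4 processes divided by 1; the 2x-3x range leaves memory free for instances that are awake and doing actual work.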

The infrastructure changes out of T7352 work, but they aren't especially clean. In particular, the Overseer has some awkward responsibilities, not everything is really a pool, autoscaling does some magic, there's a lot of "dictionary of keys" stuff instead of "actual object" stuff, and so on.

I plan to clean this up first (let the Overseer have a list of Pools, not a list of DaemonHandles, then make the Pools deal with the daemon/autoscale stuff), then give daemons tools to entomb themselves.
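The shape of that plan can be sketched as follows. This is a minimal illustration in Python, not the libphutil implementation: the class names, method names, and the tick-based wake logic are all assumptions made for the example.

```python
class Daemon:
    """Stand-in for one daemon process (hypothetical)."""

    def __init__(self, name):
        self.name = name


class Pool:
    """Owns its daemons and handles its own scaling and hibernation."""

    def __init__(self, name, max_size):
        self.name = name
        self.max_size = max_size
        self.daemons = []
        self.hibernate_until = None  # timestamp, or None when awake

    def size(self):
        return len(self.daemons)

    def scale_up(self):
        # Add one daemon, up to the pool's configured limit.
        if self.size() < self.max_size:
            self.daemons.append(Daemon(self.name))

    def hibernate(self, seconds, now):
        # Drop to 0 processes and remember when to wake up.
        self.daemons.clear()
        self.hibernate_until = now + seconds

    def wake(self):
        # Cancel hibernation early, e.g. because new work arrived.
        self.hibernate_until = None

    def update(self, now):
        # Called by the overseer on every tick.
        if self.hibernate_until is not None:
            if now < self.hibernate_until:
                return  # still asleep: keep 0 processes
            self.hibernate_until = None
        if not self.daemons:
            self.scale_up()  # awake pools keep at least one daemon


class Overseer:
    """Holds a list of Pools rather than individual daemon handles."""

    def __init__(self, pools):
        self.pools = list(pools)

    def tick(self, now):
        for pool in self.pools:
            pool.update(now)

    def process_count(self):
        # The overseer itself plus every daemon in every pool.
        return 1 + sum(pool.size() for pool in self.pools)


if __name__ == "__main__":
    overseer = Overseer(
        [Pool("taskmaster", 4), Pool("pulllocal", 1), Pool("trigger", 1)])
    overseer.tick(now=0)
    print(overseer.process_count())  # overseer plus one daemon per pool
    for pool in overseer.pools:
        pool.hibernate(60, now=0)
    print(overseer.process_count())  # only the overseer remains
```

Keeping the hibernation state on the pool means the overseer only iterates over pools and never needs to know why a pool currently has 0 processes.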

Revisions and Commits

rPHU libphutil
		D17560	rPHUc0bc116bedc8 Clean up a few more daemon behaviors
		D17553	rPHUf2b2abeacf84 Send EXIT events more consistently from daemons
		D17551	rPHUbebe54d5762f Don't awaken or scale pools during daemon shutdown
		D17539	rPHU01b33af6f4d5 Clean up overseer modules slightly and provide a throttling support method
		D17538	rPHUf2bc1710cf91 Allow daemon modules to awake hibernating daemons from slumber
		D17407	rPHU0625e4d28b16 Allow daemons to "hibernate", reducing pool size to 0 for a time
	Audited	D17389	rPHUbe546154255c Reorganize PhutilDaemon into Overseers, Pools and Daemons in libphutil
rP Phabricator
		D17635	rP845a7d871666 Allow the PullLocal daemon to actually hibernate
		D17559	rP8b553d2f183b Allow taskmaster daemons to hibernate
		D17550	rPf13637627d65 Improve daemon "waiting" message, config reload behavior
		D17540	rP9099485a7125 Allow the PullLocal daemon to hibernate, and wake it when repositories need an…
		D17429	rPfcd8c9c240d4 Update `phd launch`
		D17408	rP40cc403d2385 Allow the Trigger daemon to hibernate, reducing processes to 0
		D17390	rP6f50729a9171 Update Phabricator for new daemon pool changes

Related Objects

Status	Assigned	Task
Resolved	epriestley	T12218 Reduce the operational cost of a larger Phacility cluster
Invalid	epriestley	T12217 Reduce the hardware cost of Phacility free instances
Wontfix	epriestley	T12298 Allow daemon pools to autoscale down to 0 processes

Event Timeline

epriestley created this task. Feb 20 2017, 4:30 PM

Herald added subscribers: chad, eadler. Feb 20 2017, 4:30 PM

epriestley updated the task description. Feb 20 2017, 9:42 PM

I believe I have the first part of this (restructuring the code into a more sensible Overseer > Pool > Daemon sort of thing) working, but it could use more testing. I'm going to see if we have anything else in Daemons that I can fix while I'm here to help me kick the tires a bit.

epriestley moved this task from Backlog to vNext on the Daemons board. Feb 21 2017, 12:10 AM

epriestley added a revision: D17389: Reorganize PhutilDaemon into Overseers, Pools and Daemons in libphutil. Feb 21 2017, 4:48 PM

epriestley added a revision: D17390: Update Phabricator for new daemon pool changes.

No time like the present.

C4j63T_WQAApRwR.jpg-large.jpeg (848×1 px, 129 KB)

epriestley added a commit: rPHUbe546154255c: Reorganize PhutilDaemon into Overseers, Pools and Daemons in libphutil. Feb 22 2017, 9:15 PM

epriestley added a commit: rP6f50729a9171: Update Phabricator for new daemon pool changes.

epriestley added a revision: D17407: Allow daemons to "hibernate", reducing pool size to 0 for a time. Feb 24 2017, 5:06 PM

epriestley added a revision: D17408: Allow the Trigger daemon to hibernate, reducing processes to 0. Feb 24 2017, 5:14 PM

Hibernating daemons currently show as "Waiting" in the Daemon console, but I'm not going to worry about that for now.

epriestley added a commit: rPHU0625e4d28b16: Allow daemons to "hibernate", reducing pool size to 0 for a time. Feb 24 2017, 6:50 PM

epriestley added a commit: rP40cc403d2385: Allow the Trigger daemon to hibernate, reducing processes to 0. Feb 24 2017, 6:54 PM

joshuaspence added a subscriber: joshuaspence. Feb 25 2017, 8:48 AM

epriestley mentioned this in T12317: PhabricatorTriggerDaemon shows as "Waiting" but there are no errors. Feb 26 2017, 4:01 AM

joshuaspence added a revision: D17429: Update `phd launch`. Feb 28 2017, 3:49 AM

joshuaspence added a commit: rPfcd8c9c240d4: Update `phd launch`. Mar 2 2017, 10:37 AM

epriestley mentioned this in T12412: PhutilDaemonOverseer's pidfile updates break `phd status`'s dead-daemon detection. Mar 17 2017, 3:01 AM

epriestley added a revision: D17538: Allow daemon modules to awake hibernating daemons from slumber. Mar 23 2017, 1:17 AM

epriestley added a revision: D17539: Clean up overseer modules slightly and provide a throttling support method. Mar 23 2017, 1:57 AM

epriestley added a revision: D17540: Allow the PullLocal daemon to hibernate, and wake it when repositories need an update. Mar 23 2017, 2:02 AM

epriestley added a commit: rPHUf2bc1710cf91: Allow daemon modules to awake hibernating daemons from slumber. Mar 23 2017, 5:51 PM

epriestley added a commit: rPHU01b33af6f4d5: Clean up overseer modules slightly and provide a throttling support method.

epriestley added a commit: rP9099485a7125: Allow the PullLocal daemon to hibernate, and wake it when repositories need an….

epriestley added a revision: D17550: Improve daemon "waiting" message, config reload behavior. Mar 24 2017, 3:21 PM

epriestley added a revision: D17551: Don't awaken or scale pools during daemon shutdown. Mar 24 2017, 3:23 PM

epriestley added a commit: rPf13637627d65: Improve daemon "waiting" message, config reload behavior. Mar 24 2017, 3:32 PM

epriestley added a commit: rPHUbebe54d5762f: Don't awaken or scale pools during daemon shutdown. Mar 24 2017, 3:36 PM

epriestley added a revision: D17553: Send EXIT events more consistently from daemons. Mar 24 2017, 3:56 PM

epriestley added a commit: rPHUf2b2abeacf84: Send EXIT events more consistently from daemons. Mar 24 2017, 4:19 PM

epriestley added a revision: D17559: Allow taskmaster daemons to hibernate. Mar 24 2017, 8:28 PM

epriestley added a revision: D17560: Clean up a few more daemon behaviors. Mar 24 2017, 8:44 PM

epriestley added a commit: rP8b553d2f183b: Allow taskmaster daemons to hibernate. Mar 24 2017, 8:51 PM

epriestley added a commit: rPHUc0bc116bedc8: Clean up a few more daemon behaviors. Mar 24 2017, 9:01 PM

epriestley added a revision: D17635: Allow the PullLocal daemon to actually hibernate. Apr 6 2017, 10:26 PM

epriestley added a commit: rP845a7d871666: Allow the PullLocal daemon to actually hibernate. Apr 6 2017, 10:41 PM

epriestley mentioned this in T12629: Start daemons that should be running but aren't. Apr 23 2017, 5:45 PM

epriestley mentioned this in D18024: Garbage collect old daemon records based on modification date, not creation date. May 26 2017, 3:51 PM

epriestley mentioned this in rP69538274c1ac: Garbage collect old daemon records based on modification date, not creation date. May 26 2017, 4:18 PM

epriestley mentioned this in T13052: Differentiate "Waiting" from "Restarting after an error" on the daemon console. Jan 27 2018, 8:51 PM

We no longer offer free instances, so I don't currently plan to pursue this.

I think it's also possible that we may want to remove all this autoscale/hibernation code, since it no longer serves any purpose but is very complicated. But it seems stable, T13052 excepted, so it's not on the chopping block immediately.

