Ops · Release
Active · Public

Members (1)

Watchers

  • This project does not have any watchers.

Recent Activity

Nov 3 2018

epriestley closed T13207: Cycle More AWS Hosts (October 2018) as Resolved.

I think everything here is now fully cycled, synchronized, and cleaned up.

Nov 3 2018, 6:44 PM · Phacility, Ops
epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

Taking care of these now. I expect everything to be pretty routine.

Nov 3 2018, 6:23 PM · Phacility, Ops

Oct 22 2018

epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

Plus: db018.phacility.net, repo001.phacility.net, db024.phacility.net.

Oct 22 2018, 8:46 PM · Phacility, Ops

Oct 19 2018

epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

One more of these just came in for repo003.

Oct 19 2018, 8:56 PM · Phacility, Ops

Oct 8 2018

epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

I think this is all done but want to let things run against bastion007 for a bit before I tear down bastion005.

Oct 8 2018, 3:43 PM · Phacility, Ops
epriestley added a revision to T13207: Cycle More AWS Hosts (October 2018): Restricted Differential Revision.
Oct 8 2018, 3:41 PM · Phacility, Ops
epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

I also needed to copy the old master.key from bastion005 to bastion007 in /core/lib/keystore/.

Oct 8 2018, 3:40 PM · Phacility, Ops
epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

I turned bastion.phacility.net and bastion-external.phacility.net into CNAME records and pointed them at the new bastions.

Oct 8 2018, 3:35 PM · Phacility, Ops
epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

There's a minor deadlock on bastion deployment with the current scripts: during deploy, we run deploy-key to copy the deploy key from the bastion to the target host, so that we don't need to put the entire keystore on normal cluster nodes and don't need to have the keystore on the control host (staff laptop) outside the cluster.

Oct 8 2018, 3:31 PM · Phacility, Ops

Oct 6 2018

epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

I cycled all the hosts except bastion. saux001 needs to be vetted a bit (it handles "Land Revision" from the web UI) but it isn't critical if it needs a bit more work.

Oct 6 2018, 3:38 PM · Phacility, Ops
epriestley added a commit to T13207: Cycle More AWS Hosts (October 2018): Restricted Diffusion Commit.
Oct 6 2018, 3:36 PM · Phacility, Ops
epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

I'm going to get these underway once the deploy finishes.

Oct 6 2018, 2:45 PM · Phacility, Ops

Oct 1 2018

epriestley added a commit to T13207: Cycle More AWS Hosts (October 2018): Restricted Diffusion Commit.
Oct 1 2018, 8:16 PM · Phacility, Ops
epriestley added a revision to T13207: Cycle More AWS Hosts (October 2018): Restricted Differential Revision.
Oct 1 2018, 4:37 PM · Phacility, Ops
epriestley added a comment to T13207: Cycle More AWS Hosts (October 2018).

"Use the API" seemed to work OK. Of those instances, only bastion005 is at all unusual.

Oct 1 2018, 4:35 PM · Phacility, Ops
epriestley updated the task description for T13207: Cycle More AWS Hosts (October 2018).
Oct 1 2018, 4:34 PM · Phacility, Ops
epriestley triaged T13207: Cycle More AWS Hosts (October 2018) as Normal priority.
Oct 1 2018, 3:59 PM · Phacility, Ops

Sep 11 2018

epriestley closed T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes as Resolved.

This is now live.

Sep 11 2018, 12:52 AM · Ops, Phacility
epriestley added a comment to T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes.

Deploying the changes to web now.

Sep 11 2018, 12:48 AM · Ops, Phacility

Sep 10 2018

epriestley added a comment to T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes.

I've issued all instances a 24-hour service credit for the disruption. This should be reflected on your next invoice.

Sep 10 2018, 9:17 PM · Ops, Phacility
epriestley added a comment to T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes.

Here's the request rate leading up to the rate limiting:

Sep 10 2018, 9:11 PM · Ops, Phacility
epriestley added a comment to T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes.

D19653 (above) changes the per-"Host" rate limit to require "X-Forwarded-For" be present in the request. This should exempt ELB requests from these limits.

Sep 10 2018, 8:55 PM · Ops, Phacility
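
The rule described in the comment above (apply the per-"Host" rate limit only when "X-Forwarded-For" is present) might look roughly like the sketch below. This is an illustration only, not the code in D19653: the function names `forwarded_for` and `host_rate_limit_applies` are hypothetical, and treating an empty header value as absent is an assumption.

```python
from typing import Mapping, Optional


def forwarded_for(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Forwarded-For value, or None if it is missing or empty.

    Header names are compared case-insensitively. Treating an empty value
    as "missing" is an assumption made for this sketch.
    """
    for name, value in headers.items():
        if name.lower() == "x-forwarded-for" and value.strip():
            return value.strip()
    return None


def host_rate_limit_applies(headers: Mapping[str, str]) -> bool:
    """Hypothetical check: only count a request against the per-"Host"
    rate limit when it carries an X-Forwarded-For header."""
    return forwarded_for(headers) is not None


# A request with no X-Forwarded-For header is exempt from the limit.
health_check = {"Host": "172.30.0.60"}
# A request that carries X-Forwarded-For is subject to the limit.
proxied = {"Host": "example.test", "X-Forwarded-For": "203.0.113.7"}

print(host_rate_limit_applies(health_check))
print(host_rate_limit_applies(proxied))
```

Under these assumptions, a request that arrives without the header is never counted against the per-"Host" limit, which is the exemption the comment expects for ELB requests.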
epriestley added a comment to T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes.
...
[Mon Sep 10 20:48:43.928021 2018] [:error] [pid 21570] [client 172.30.0.171:16516] Array\n(\n    [f] => \n    [h] => 172.30.0.60\n)\n
...
Sep 10 2018, 8:51 PM · Ops, Phacility
epriestley added a comment to T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes.

... in production today as a next step.

Sep 10 2018, 8:46 PM · Ops, Phacility
epriestley added a comment to T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes.

This should have the pleasant side effect of letting us drop the goofy hard-coded internal rate limiting IP list.

Sep 10 2018, 8:44 PM · Ops, Phacility
epriestley added a comment to T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes.

There are four rate limits, and I don't currently have enough information to figure out which one triggered. The rate limits are:

Sep 10 2018, 8:41 PM · Ops, Phacility
epriestley triaged T13199: All "web" host rate limited ELB `/status/` requests simultaneously after 11 months without configuration changes as High priority.
Sep 10 2018, 8:25 PM · Ops, Phacility

Sep 5 2018

epriestley moved T13076: Plans: Phacility cluster caching, renaming, and rebalance/compaction from Tentative to Soon on the Plans board.
Sep 5 2018, 1:34 PM · Plans, Ops, Infrastructure, Phacility

Aug 25 2018

epriestley closed T13183: AWS is rebooting instances in late August 2018 as Resolved.

That one seemed straightforward.

Aug 25 2018, 1:57 PM · Ops, Phacility
epriestley added a comment to T13183: AWS is rebooting instances in late August 2018.

Doing admin001 now.

Aug 25 2018, 1:52 PM · Ops, Phacility
epriestley added a comment to T13183: AWS is rebooting instances in late August 2018.

Kicking secure001 now.

Aug 25 2018, 1:52 PM · Ops, Phacility
epriestley added a comment to T13183: AWS is rebooting instances in late August 2018.

(It not being covered is covered by T12879.)

Aug 25 2018, 1:38 PM · Ops, Phacility
epriestley added a comment to T13183: AWS is rebooting instances in late August 2018.

I think the only thing on secure or admin which isn't properly covered by deploy automation is the crontab on secure001:

Aug 25 2018, 1:37 PM · Ops, Phacility
epriestley added a comment to T13183: AWS is rebooting instances in late August 2018.

I'm going to do admin001 and secure001 today.

Aug 25 2018, 1:19 PM · Ops, Phacility

Aug 18 2018

epriestley added a comment to T13183: AWS is rebooting instances in late August 2018.

Think I got through the easy ones without any issues. I suspect admin and secure may be a little more involved so I'm going to leave the cat in the bag for the moment.

Aug 18 2018, 9:21 PM · Ops, Phacility
epriestley updated the task description for T13183: AWS is rebooting instances in late August 2018.
Aug 18 2018, 9:10 PM · Ops, Phacility
epriestley updated the task description for T13183: AWS is rebooting instances in late August 2018.
Aug 18 2018, 9:04 PM · Ops, Phacility
epriestley updated the task description for T13183: AWS is rebooting instances in late August 2018.
Aug 18 2018, 9:00 PM · Ops, Phacility
epriestley added a comment to T13183: AWS is rebooting instances in late August 2018.

I'm going to stop/start at least some of these now.

Aug 18 2018, 8:57 PM · Ops, Phacility

Aug 15 2018

epriestley updated the task description for T13183: AWS is rebooting instances in late August 2018.
Aug 15 2018, 6:43 PM · Ops, Phacility
epriestley added a comment to T13062: Trying to manage anything in Gsuite is kind of not great?.

Oh, right, "meetings". I've heard of those!

Aug 15 2018, 4:56 PM · Phacility, Ops
epriestley updated the task description for T13183: AWS is rebooting instances in late August 2018.
Aug 15 2018, 4:56 PM · Ops, Phacility
20after4 added a comment to T13062: Trying to manage anything in Gsuite is kind of not great?.
Aug 15 2018, 3:04 AM · Phacility, Ops

Aug 13 2018

amckinley closed T13185: AWS Reboots Part 2, Electric Boogaloo as Resolved.

Hahah you beat me to it!

Aug 13 2018, 9:25 PM · Phacility, Ops
amckinley claimed T13183: AWS is rebooting instances in late August 2018.
Aug 13 2018, 9:25 PM · Ops, Phacility
epriestley added a comment to T13185: AWS Reboots Part 2, Electric Boogaloo.

I think T13183 is the exact same hosts. :)

Aug 13 2018, 9:24 PM · Phacility, Ops
amckinley added a comment to T13185: AWS Reboots Part 2, Electric Boogaloo.

Same as T13183?

Aug 13 2018, 9:24 PM · Phacility, Ops
epriestley added a comment to T13185: AWS Reboots Part 2, Electric Boogaloo.

Same as T13183?

Aug 13 2018, 8:45 PM · Phacility, Ops
amckinley triaged T13185: AWS Reboots Part 2, Electric Boogaloo as Normal priority.
Aug 13 2018, 7:47 PM · Phacility, Ops
epriestley updated the task description for T13183: AWS is rebooting instances in late August 2018.
Aug 13 2018, 6:14 PM · Ops, Phacility