Apr 20 2022
Almost every host currently in production was provisioned with Piledriver and things have been stable for quite a while, so I'm calling this resolved. See elsewhere for issues with Ubuntu20, mail, etc.
Calling this resolved, since it has been in production in the Phacility cluster for some time and worked correctly through relevant hardware changes.
Dec 11 2021
I put all the database migration stuff everywhere and it appears stable. I'm hooking up Postmark as an outbound pathway now. If I get that working, I'll let it sit for a while and start migrating databases.
Dec 10 2021
Finally, there are some other MySQL version issues which can be avoided with:
Dec 9 2021
The new core/ support for the API is partially deployed; the new services/ support isn't anywhere yet.
Dec 4 2021
The latest version of Phabricator itself is everywhere.
I'm going to hold it until the weekend and try deploying then if things look calm on my end.
Dec 1 2021
While waiting to deploy db stuff, I was planning to look at pruning dead data out of S3 -- but, on closer examination, the total S3 bill is something like $1/day, so no priority on that whatsoever.
Piledriver also needs to be able to provision database hosts, but these are more-or-less a trivial subset of repository hosts.
- Make InstancesStateQuery use a dictionary when building the database ref information internally.
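
For illustration only, here is a minimal Python sketch of what building ref information as a dictionary rather than a list can look like. Everything in it is an assumption: `DatabaseRef`, `build_refs_by_key`, the `host:port` key, and the sample host names are hypothetical and are not taken from InstancesStateQuery or any real Phabricator code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseRef:
    """Hypothetical stand-in for one database ref; not the real class."""
    host: str
    port: int
    role: str


def build_refs_by_key(rows):
    """Build refs keyed by "host:port" instead of appending them to a list."""
    refs = {}
    for row in rows:
        key = f"{row['host']}:{row['port']}"
        # A repeated key overwrites the earlier entry rather than adding a duplicate.
        refs[key] = DatabaseRef(row["host"], row["port"], row["role"])
    return refs


# Made-up sample rows; the third row repeats the first on purpose.
rows = [
    {"host": "db001", "port": 3306, "role": "master"},
    {"host": "db002", "port": 3306, "role": "replica"},
    {"host": "db001", "port": 3306, "role": "master"},
]
refs = build_refs_by_key(rows)
```

Keying the refs this way means a host that shows up twice collapses to a single entry, and later code can look a ref up directly instead of scanning a list.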
I completed all the repository migrations over the weekend and seemingly haven't run into any issues.
Nov 21 2021
Just for completeness, vault used to be an HAProxy host serving as an SSH load balancer, but this responsibility moved to lb001 once ELBs became able to listen on inbound port 22 and TCP forward, so there is no longer a vault class of machines.
Nov 20 2021
The new provisioning process for repository shards is:
Nov 19 2021
Piledriver was built before the FutureGraph stuff settled in T11968; it runs into the same general set of sequencing problems and yield would likely be a good approach.
Nov 18 2021
I can't figure out how to delete...
I got rid of everything I could, and nothing appears to be affected.
We have a lot of leftover VPC cruft that I'm going to nuke, notably meta and admin VPCs that (as far as I can tell) have nothing in them, and then a bunch of subnets (meta.private-a, meta.private-b, block-public-222, admin.public-a, admin.public-b, meta.public-a, meta.public-b, block-private-3) and some NGWs etc. I'm like 99% sure this stuff is all leftover from testing years ago and nothing depends on it, but I guess we'll see what happens when I delete all of it.
Here's the last known state of the world from T12816: