Feb 13 2024

epriestley closed T13701: Perform a large data export from Phacility as Resolved.

(Of course, it'll probably just work the first time now...)

Feb 13 2024, 8:04 PM · Files, Phacility

Feb 12 2024

epriestley added a comment to T13701: Perform a large data export from Phacility.

The export process is already robust at a coarse level: the dump is retained on disk and the process can be retried at the "upload the whole file again" level, then picked up with bin/host export using the --database or --database-file flags (probably with --keep-file).

Feb 12 2024, 11:29 PM · Files, Phacility
epriestley added a comment to T13701: Perform a large data export from Phacility.

The (anonymized) error the process encountered while transferring the dump to central storage was:

Feb 12 2024, 11:19 PM · Files, Phacility
epriestley triaged T13701: Perform a large data export from Phacility as Low priority.
Feb 12 2024, 11:10 PM · Files, Phacility

Nov 13 2023

epriestley updated the task description for T13700: Notes to Self, Late 2023.
Nov 13 2023, 7:10 PM · Phacility
epriestley closed T13700: Notes to Self, Late 2023 as Resolved.

See D21862.

Nov 13 2023, 7:10 PM · Phacility
epriestley added a revision to T13700: Notes to Self, Late 2023: D21875: Correct Aphlict websocket URI construction after PHP8 compatibility changes.
Nov 13 2023, 7:00 PM · Phacility
epriestley added a comment to T13700: Notes to Self, Late 2023.

Next issue: can't pull from secure.

Nov 13 2023, 6:49 PM · Phacility
epriestley added a comment to T13700: Notes to Self, Late 2023.

Issue 3:

Nov 13 2023, 6:44 PM · Phacility
epriestley added a revision to T13700: Notes to Self, Late 2023: Restricted Differential Revision.
Nov 13 2023, 6:43 PM · Phacility
epriestley added a comment to T13700: Notes to Self, Late 2023.

With bin/provision events working again:

Nov 13 2023, 6:29 PM · Phacility
epriestley triaged T13700: Notes to Self, Late 2023 as Wishlist priority.
Nov 13 2023, 6:28 PM · Phacility

Oct 26 2022

epriestley closed T13686: Disable Ubuntu unattended upgrades as Resolved.

I patched and partially deployed this in early August. Another unattended MySQL upgrade went out on Monday night, also didn't restart MySQL on affected hosts, and caused some downtime on hosts that didn't have the patch (to "disable unattended upgrades"). I've now deployed this everywhere, and am presuming this is fixed until evidence arises to the contrary.

Oct 26 2022, 7:53 PM · Phacility

Oct 25 2022

epriestley added a comment to T13686: Disable Ubuntu unattended upgrades.

See also PHI2219, PHI2220.

Oct 25 2022, 12:17 PM · Phacility

Jul 29 2022

epriestley triaged T13686: Disable Ubuntu unattended upgrades as Normal priority.
Jul 29 2022, 12:16 PM · Phacility

Apr 20 2022

epriestley lowered the priority of T11132: New Phabricator NUX from High to Wishlist.
Apr 20 2022, 10:43 PM · Design, Phacility, NUX
epriestley lowered the priority of T11456: Don't lose user in NUX flow because of Timezone issues from High to Wishlist.
Apr 20 2022, 10:43 PM · Design, Phacility, NUX
epriestley closed T10847: 30GB Phacility instance caused a series of cascading failures which left web services unreachable as Resolved.

There's nothing particularly useful or actionable here now, so closing it out. (I believe this was the most severe incident Phacility ever experienced while actively maintained.)

Apr 20 2022, 10:43 PM · Ops, Phacility
epriestley closed T12610: Audit behavior of LB healthchecks against *.phacility.com and secure.phabricator.com as Wontfix.

This hasn't caused any more problems in like 4 years, so I guess it's kind of whatever.

Apr 20 2022, 10:30 PM · Ops, Phacility
epriestley closed T13537: Support local port forwarding through Phacility cluster bastion hosts as Wontfix.

This isn't really resolved, but almost certainly does not make sense to pursue given the Phacility wind-down.

Apr 20 2022, 9:09 PM · Phacility
epriestley closed T13630: Move Phacility provisioning to Piledriver as Resolved.

Almost every host currently in production was provisioned with Piledriver and things have been stable for quite a while, so I'm calling this resolved. See elsewhere for issues with Ubuntu20, mail, etc.

Apr 20 2022, 7:10 PM · Almanac, Infrastructure, Phacility
epriestley closed T13641: Support "Disabled" devices in Almanac, a subtask of T13630: Move Phacility provisioning to Piledriver, as Resolved.
Apr 20 2022, 6:39 PM · Almanac, Infrastructure, Phacility
epriestley closed T13646: Add "E" to "variables_order" in Phacility environments as Resolved.

Moved the rest of this to T13640.

Apr 20 2022, 6:36 PM · Phacility, Infrastructure

Apr 19 2022

epriestley closed T13661: Give Phame configurable interact policies as Resolved.

I deployed this and it seems to be working properly.

Apr 19 2022, 9:07 PM · Phacility, Phame
epriestley closed T13674: Ubuntu20 systemd restart script does not reliably execute on Ubuntu20/m4 chassis hosts as Resolved.

Hey, it worked once. Good enough for me!

Apr 19 2022, 5:55 PM · Phacility
epriestley added a comment to T13674: Ubuntu20 systemd restart script does not reliably execute on Ubuntu20/m4 chassis hosts.

No dice. We need bin/upgrade to run before mysql because it has to mount the data volume. So now I'm trying this:

Apr 19 2022, 5:48 PM · Phacility
epriestley added a comment to T13674: Ubuntu20 systemd restart script does not reliably execute on Ubuntu20/m4 chassis hosts.

... service ... start rather than service ... restart ...

Apr 19 2022, 5:27 PM · Phacility
epriestley added a comment to T13674: Ubuntu20 systemd restart script does not reliably execute on Ubuntu20/m4 chassis hosts.

...probably tested...

Apr 19 2022, 5:22 PM · Phacility
epriestley triaged T13674: Ubuntu20 systemd restart script does not reliably execute on Ubuntu20/m4 chassis hosts as Low priority.
Apr 19 2022, 4:30 PM · Phacility

Apr 1 2022

epriestley added a comment to T13661: Give Phame configurable interact policies.

This has some rough edges that I'm not going to deal with for now:

Apr 1 2022, 7:52 PM · Phacility, Phame
epriestley added a revision to T13661: Give Phame configurable interact policies: D21755: Improve some UI/language for Phame posts when viewer doesn't have CAN_INTERACT.
Apr 1 2022, 7:49 PM · Phacility, Phame
epriestley added a revision to T13661: Give Phame configurable interact policies: D21754: Give Phame blog posts configurable interact policies, with a default policy of "Same as Blog".
Apr 1 2022, 7:41 PM · Phacility, Phame
epriestley added a revision to T13661: Give Phame configurable interact policies: D21753: Remove unused "MARKUP_FIELD_SUMMARY" for Phame posts.
Apr 1 2022, 7:16 PM · Phacility, Phame
epriestley added a revision to T13661: Give Phame configurable interact policies: D21751: Give Phame blogs mutable interact policies.
Apr 1 2022, 7:05 PM · Phacility, Phame
epriestley added a revision to T13661: Give Phame configurable interact policies: D21750: Fix double-bordered breadcrumbs in Phame blogs.
Apr 1 2022, 6:48 PM · Phacility, Phame
epriestley added a revision to T13661: Give Phame configurable interact policies: D21749: Remove ancient Remarkup constants from Phame and Maniphest.
Apr 1 2022, 6:46 PM · Phacility, Phame
epriestley added a revision to T13661: Give Phame configurable interact policies: D21748: Make Phame blog policies non-nullable.
Apr 1 2022, 6:43 PM · Phacility, Phame

Dec 19 2021

epriestley closed T11230: Phacility: Private Clusters as Wontfix.

See T12847. All the technical parts of this are now solved except for billing, but since Phacility is winding down I no longer plan to pursue it.

Dec 19 2021, 8:45 PM · Phacility
epriestley closed T8688: Attach and initialize backup volumes during `remote deploy` workflow as Resolved.

I resolved this in rCORE320b2854.

Dec 19 2021, 8:43 PM · Phacility
epriestley closed T12847: A Pathway Towards Private Clusters as Wontfix.

After T13630:

Dec 19 2021, 8:39 PM · Plans, Ops, Phacility
epriestley closed T12847: A Pathway Towards Private Clusters, a subtask of T11230: Phacility: Private Clusters, as Wontfix.
Dec 19 2021, 8:39 PM · Phacility
epriestley closed T13601: Support "SCA" / "3D Secure 2" in billing workflows as Wontfix.

Only one instance was impacted by this and I just credited them until 2099. I don't currently expect to pursue this.

Dec 19 2021, 8:26 PM · Phacility, Phortune
epriestley closed T13610: Support per-node billing for hosted Phacility instances as Wontfix.

I no longer expect to pursue this.

Dec 19 2021, 8:25 PM · Phortune, Phacility
epriestley closed T13618: When a Phacility "rbak" device does not exist, backups can run twice and converge to a "successful" but inconsistent state as Wontfix.
  • Hosts in the repo class are now built by Piledriver (see T13630), which automatically creates the rbak device entries, so this error isn't likely to occur again.
  • I also don't expect to launch any more hosts.
Dec 19 2021, 8:25 PM · Phacility
epriestley closed T13654: Wind Down Phacility Operations as Resolved.

I compacted secure onto new hardware (T13671) and shut down saux001 ("Land Revision") and sbuild001 (Harbormaster remote builds). I think all the remaining work is covered under T13630 (largely, just a handful of large database migrations remain).

Dec 19 2021, 8:23 PM · Phacility
epriestley closed T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware as Resolved.

I just swapped configs over without merging the LBs, since it wasn't immediately obvious to me what the Application vs Classic state of the world is and swapping was good enough.

Dec 19 2021, 7:35 PM · Phacility
epriestley added a comment to T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware.

The aphlict/notify stuff still needs to be tweaked. I think the snlb + slb setup can be merged into a single slb with "TCP (Secure)" forwarding now.

Dec 19 2021, 4:19 AM · Phacility
epriestley added a comment to T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware.

Databases are moved and secure is out of read-only mode. I'm going to adjust repository configuration, then I should be able to tear down secure001.

Dec 19 2021, 4:11 AM · Phacility
epriestley added a comment to T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware.

I'm going to put secure back into read-only mode now and move the databases to the new host.

Dec 19 2021, 12:08 AM · Phacility
epriestley added a comment to T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware.

I brought up the new host and pointed the slb001 load balancer at it. The database is still on the old host, and the new host doesn't have repositories yet, but the basics seem to be working.

Dec 19 2021, 12:07 AM · Phacility

Dec 18 2021

epriestley added a comment to T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware.

Merging 003 into 001 worked fine with a few expected tricks (e.g., when secure is in read-only mode, you can't push a change to take it out of read-only mode, since pushing is a write). Next up is launching a modern m4.large secure-pool host and then migrating the data.

Dec 18 2021, 7:57 PM · Phacility
epriestley added a comment to T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware.

I'm putting secure into read-only mode now, with the intent of completing steps 1-5 above.

Dec 18 2021, 7:13 PM · Phacility
epriestley added a revision to T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware: D21745: Add a "--database <name> ..." flag to "bin/storage dump".
Dec 18 2021, 7:06 PM · Phacility

Dec 17 2021

epriestley triaged T13671: Merge "secure003.phacility.net" into "secure001.phacility.net", then migrate to "m4.large" hardware as Low priority.
Dec 17 2021, 7:32 PM · Phacility

Dec 16 2021

epriestley lowered the priority of T13655: Provide a formal "destroyed" status for Phacility instances from Normal to Wishlist.

It would still be nice to have this from a completeness/correctness perspective, but other changes have made it less valuable:

Dec 16 2021, 2:58 PM · Phacility

Dec 11 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

I put all the database migration stuff everywhere and it appears stable. I'm hooking up Postmark as an outbound pathway now. If I get that working, I'll let it sit for a while and start migrating databases.

Dec 11 2021, 5:43 PM · Almanac, Infrastructure, Phacility

Dec 10 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

Finally, there are some other MySQL version issues which can be avoided with:

Dec 10 2021, 6:22 PM · Almanac, Infrastructure, Phacility

Dec 9 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

The new core/ support for the API is partially deployed; the new services/ support isn't anywhere yet.

Dec 9 2021, 11:13 PM · Almanac, Infrastructure, Phacility

Dec 8 2021

epriestley edited the content of Migrating Repository Shards.
Dec 8 2021, 9:39 PM · Phacility

Dec 4 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

The latest version of Phabricator itself is everywhere.

Dec 4 2021, 11:46 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

I'm going to hold it until the weekend and try deploying then if things look calm on my end.

Dec 4 2021, 9:23 PM · Almanac, Infrastructure, Phacility

Dec 2 2021

epriestley added a comment to T13037: An attacker gained staff access to Mailgun and was able to read customer API keys.

I'm satisfied that we aren't violating our commitment to our customers by continuing to use Mailgun as a service provider...

Dec 2 2021, 10:39 PM · Phacility, Security, Mail

Dec 1 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

While waiting to deploy db stuff, I was planning to look at pruning dead data out of S3 -- but, on closer examination, the total S3 bill is something like $1/day, so no priority on that whatsoever.

Dec 1 2021, 11:57 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

Piledriver also needs to be able to provision database hosts, but these are more-or-less a trivial subset of repository hosts.

Dec 1 2021, 11:47 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.
  • Make InstancesStateQuery use a dictionary when building the database ref information internally.
Dec 1 2021, 11:06 PM · Almanac, Infrastructure, Phacility
epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: Restricted Differential Revision.
Dec 1 2021, 11:03 PM · Almanac, Infrastructure, Phacility
epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: Restricted Differential Revision.
Dec 1 2021, 10:44 PM · Almanac, Infrastructure, Phacility
epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: Restricted Differential Revision.
Dec 1 2021, 9:34 PM · Almanac, Infrastructure, Phacility
epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: Restricted Differential Revision.
Dec 1 2021, 9:25 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

Piledriver also needs to be able to provision database hosts, but these are more-or-less a trivial subset of repository hosts.

Dec 1 2021, 8:44 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

I completed all the repository migrations over the weekend and seemingly haven't run into any issues.

Dec 1 2021, 8:41 PM · Almanac, Infrastructure, Phacility

Nov 22 2021

epriestley closed T13653: After an AWS event, Phacility hosts may come up with swap only partially configured as Resolved.

This appears resolved: the workflow now tests that /proc/meminfo reports an appropriate value for SwapTotal.
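
For illustration only, a check of this kind might look roughly like the Python sketch below. This is not the code from D21733 or the actual Phacility workflow. The function names, the sample text, and the minimum threshold are all hypothetical; the only fixed detail is that the Linux kernel reports total swap in /proc/meminfo under the field name SwapTotal, in kB.

```python
import re

# Hypothetical sample of /proc/meminfo content, used so the sketch runs
# anywhere. On a real Linux host you would read the file instead.
SAMPLE_MEMINFO = """MemTotal:        8167848 kB
SwapCached:            0 kB
SwapTotal:       4194300 kB
SwapFree:        4194300 kB
"""


def parse_swap_total_kb(meminfo_text):
    """Return the SwapTotal value in kB, or None if the field is absent."""
    for line in meminfo_text.splitlines():
        match = re.match(r"^SwapTotal:\s+(\d+)\s+kB\s*$", line)
        if match:
            return int(match.group(1))
    return None


def swap_is_configured(meminfo_text, minimum_kb):
    """Return True if total swap is present and at least minimum_kb.

    The threshold is an assumption for this sketch; the source does not
    say what value the real workflow treats as "appropriate".
    """
    total = parse_swap_total_kb(meminfo_text)
    return total is not None and total >= minimum_kb


if __name__ == "__main__":
    # Require at least 1 GiB of swap (an arbitrary example threshold).
    print(swap_is_configured(SAMPLE_MEMINFO, 1024 * 1024))
```

On a live host, the text would come from reading /proc/meminfo directly, and a failing check would presumably abort or retry the swap setup step rather than letting the host come up partially configured.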

Nov 22 2021, 2:02 PM · Phacility
epriestley added a revision to T13653: After an AWS event, Phacility hosts may come up with swap only partially configured: D21733: Provide an API for parsing swap information from "/proc/meminfo".
Nov 22 2021, 1:30 PM · Phacility

Nov 21 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

Just for completeness, vault used to be an HAProxy host serving as an SSH load balancer, but this responsibility moved to lb001 once ELBs became able to listen on inbound port 22 and TCP forward, so there is no longer a vault class of machines.

Nov 21 2021, 3:55 PM · Almanac, Infrastructure, Phacility

Nov 20 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

The new provisioning process for repository shards is:

Nov 20 2021, 9:02 PM · Almanac, Infrastructure, Phacility

Nov 19 2021

epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: D21732: Allow "PhutilAWSException" to identify "EBS: Not Found" errors.
Nov 19 2021, 10:27 PM · Almanac, Infrastructure, Phacility
epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: Restricted Differential Revision.
Nov 19 2021, 10:24 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

Piledriver was built before the FutureGraph stuff settled in T11968; it runs into the same general set of sequencing problems and yield would likely be a good approach.

Nov 19 2021, 10:22 PM · Almanac, Infrastructure, Phacility

Nov 18 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

I can't figure out how to delete...

Nov 18 2021, 7:24 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

I got rid of everything I could, and nothing appears to be affected.

Nov 18 2021, 7:20 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

We have a lot of leftover VPC cruft that I'm going to nuke, notably meta and admin VPCs that (as far as I can tell) have nothing in them, and then a bunch of subnets (meta.private-a, meta.private-b, block-public-222, admin.public-a, admin.public-b, meta.public-a, meta.public-b, block-private-3) and some NGWs etc. I'm like 99% sure this stuff is all leftover from testing years ago and nothing depends on it, but I guess we'll see what happens when I delete all of it.

Nov 18 2021, 6:55 PM · Almanac, Infrastructure, Phacility
epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

Here's the last known state of the world from T12816:

Nov 18 2021, 6:49 PM · Almanac, Infrastructure, Phacility
epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: Restricted Differential Revision.
Nov 18 2021, 6:26 PM · Almanac, Infrastructure, Phacility
epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: Restricted Differential Revision.
Nov 18 2021, 6:21 PM · Almanac, Infrastructure, Phacility
epriestley added a revision to T13630: Move Phacility provisioning to Piledriver: Restricted Differential Revision.
Nov 18 2021, 5:15 PM · Almanac, Infrastructure, Phacility

Nov 17 2021

epriestley added a comment to T13630: Move Phacility provisioning to Piledriver.

See also NAT carryover from T12816, via T13542.

Nov 17 2021, 8:02 PM · Almanac, Infrastructure, Phacility
epriestley closed T13542: Rebalance Phacility instances into a private subnet as Resolved.

Closing this in favor of T13630, which covers the same ground.

Nov 17 2021, 8:02 PM · Phacility

Nov 15 2021

epriestley added a comment to T13654: Wind Down Phacility Operations.

I'm planning to simply delete the Discourse forum without preserving any content.

Nov 15 2021, 4:22 PM · Phacility

Jul 21 2021

epriestley updated the task description for T13661: Give Phame configurable interact policies.
Jul 21 2021, 9:26 PM · Phacility, Phame
epriestley triaged T13661: Give Phame configurable interact policies as Low priority.
Jul 21 2021, 9:25 PM · Phacility, Phame

Jul 9 2021

epriestley closed T13656: Automate the Phacility export process, as a support action as Resolved.

For now, this has been working fine as a simple CLI flow.

Jul 9 2021, 4:39 PM · Phacility

Jun 1 2021

epriestley added a comment to T13656: Automate the Phacility export process, as a support action.

This is approximately working now, although the "button" is currently this mess:

Jun 1 2021, 7:34 PM · Phacility
epriestley closed T7148: Allow users to export their data from Phacility as Resolved.

See T13656 for followup.

Jun 1 2021, 7:32 PM · Phacility
epriestley closed T7148: Allow users to export their data from Phacility, a subtask of T7146: Improve administrative UI around backup management, as Resolved.
Jun 1 2021, 7:32 PM · Phacility
epriestley closed T7148: Allow users to export their data from Phacility, a subtask of T9303: Improve Phacility Onboarding/NUX, as Resolved.
Jun 1 2021, 7:32 PM · Phacility
epriestley added a revision to T13656: Automate the Phacility export process, as a support action: Restricted Differential Revision.
Jun 1 2021, 7:09 PM · Phacility
epriestley added a comment to T13655: Provide a formal "destroyed" status for Phacility instances.

Instances technically have a formal "Deleted" status -- but it isn't really used by anything, nothing ever puts them into that status, and there are no instances in that status. For consistency with existing CLI workflows, I'm going to rename this to "Destroyed".

Jun 1 2021, 6:14 PM · Phacility
epriestley triaged T13656: Automate the Phacility export process, as a support action as Normal priority.
Jun 1 2021, 5:40 PM · Phacility
epriestley added a comment to T13655: Provide a formal "destroyed" status for Phacility instances.

A related issue is that I think nothing currently destroys S3 data. For most instances this isn't significant, but it isn't helping anything. This should likely be part of the database destruction step, although it can probably interact with the S3 bucket directly.

Jun 1 2021, 3:39 PM · Phacility
epriestley triaged T13655: Provide a formal "destroyed" status for Phacility instances as Normal priority.
Jun 1 2021, 3:37 PM · Phacility