diff --git a/src/applications/almanac/application/PhabricatorAlmanacApplication.php b/src/applications/almanac/application/PhabricatorAlmanacApplication.php --- a/src/applications/almanac/application/PhabricatorAlmanacApplication.php +++ b/src/applications/almanac/application/PhabricatorAlmanacApplication.php @@ -83,8 +83,7 @@ phutil_tag( 'a', array( - 'href' => PhabricatorEnv::getDoclink( - 'User Guide: Phabricator Clusters'), + 'href' => PhabricatorEnv::getDoclink('Clustering Introduction'), 'target' => '_blank', ), pht('Learn More'))); diff --git a/src/applications/almanac/controller/AlmanacController.php b/src/applications/almanac/controller/AlmanacController.php --- a/src/applications/almanac/controller/AlmanacController.php +++ b/src/applications/almanac/controller/AlmanacController.php @@ -178,7 +178,7 @@ 'a', array( 'href' => PhabricatorEnv::getDoclink( - 'User Guide: Phabricator Clusters'), + 'Clustering Introduction'), 'target' => '_blank', ), pht('Learn More')); diff --git a/src/docs/user/cluster/cluster.diviner b/src/docs/user/cluster/cluster.diviner --- a/src/docs/user/cluster/cluster.diviner +++ b/src/docs/user/cluster/cluster.diviner @@ -26,6 +26,9 @@ The remainder of this document summarizes how to add redundancy to each service and where your efforts are likely to have the greatest impact. +For additional guidance on setting up a cluster, see "Overlaying Services" +and "Cluster Recipes" at the bottom of this document. + Cluster: Databases ================= @@ -44,7 +47,8 @@ Cluster: Repositories ===================== -Configuring multiple repository hosts is complex. +Configuring multiple repository hosts is complex, but is required before you +can add multiple daemon or web hosts. Repository replicas are important for availability if you host repositories on Phabricator, but less important if you host repositories elsewhere @@ -55,3 +59,123 @@ the entire history. For details, see @{article:Cluster: Repositories}. 
+ + +Cluster: Daemons +================ + +Configuring multiple daemon hosts is straightforward, but you must configure +repositories first. + +With daemons running on multiple hosts, you can transparently survive the loss +of any subset of hosts without an interruption to daemon services, as long as +at least one host remains alive. Daemons are stateless, so spreading daemons +across multiple hosts provides no resistance to data loss. + +For details, see @{article:Cluster: Daemons}. + + +Cluster: Web Servers +==================== + +Configuring multiple web hosts is straightforward, but you must configure +repositories first. + +With multiple web hosts, you can transparently survive the loss of any subset +of hosts as long as at least one host remains alive. Web hosts are stateless, +so putting multiple hosts in service provides no resistance to data loss. + +For details, see @{article:Cluster: Web Servers}. + + +Overlaying Services +=================== + +Although hosts can run a single dedicated service type, certain groups of +services work well together. Phabricator clusters usually do not need to be +very large, so deploying a small number of hosts with multiple services is a +good place to start. + +In planning a cluster, consider these blended host types: + +**Everything**: Run HTTP, SSH, MySQL, repositories and daemons on a single +host. This is the starting point for single-node setups, and usually also the +best configuration when adding the second node. + +**Everything Except Databases**: Run HTTP, SSH, repositories and daemons on one +host, and MySQL on a different host. MySQL uses many of the same resources that +other services use. It's also simpler to separate than other services, and +tends to benefit the most from dedicated hardware. + +**Just Databases**: Separating MySQL onto dedicated nodes + +Database nodes tend to benefit the most from + +**Repositories and Daemons**: Run repositories and daemons on the same host. 
+Repository hosts //must// run daemons, and it normally makes sense to +completely overlay repositories and daemons. These services tend to use +different resources (repositories are heavier on I/O and lighter on CPU/RAM; +daemons are heavier on CPU/RAM and lighter on I/O). + +Repositories and daemons are also both less latency sensitive than other +service types, so there's a wider margin of error for underprovisioning them +before performance is noticeably affected. + +These nodes tend to use system resources in a balanced way. Individual nodes +in this class do not need to be particularly powerful. + +**Frontend Servers**: Run HTTP and SSH on the same host. These are easy to set +up, stateless, and you can scale the pool up or down easily to meet demand. +Routing both types of ingress traffic through the same initial tier can +simplify load balancing. + +These nodes tend to need relatively little RAM. + + +Cluster Recipes +=============== + +This section provides some guidance on reasonable ways to scale up a cluster. + +The smallest possible cluster is **two hosts**. Run everything (web, ssh, +database, repositories, and daemons) on each host. One host will serve as the +master; the other will serve as a replica. + +Ideally, you should physically separate these hosts to reduce the chance that a +natural disaster or infrastructure disruption could disable or destroy both +hosts at the same time. + +From here, you can choose how you expand the cluster. + +To improve **scalability and performance**, separate loaded services onto +dedicated hosts and then add more hosts of that type to increase capacity. If +you have a two-node cluster, the best way to improve scalability by adding one +host is likely to separate the master database onto its own host. + +Note that increasing scale may //decrease// availability by leaving you with +too little capacity after a failure.
If you have three hosts handling traffic +and one datacenter fails, too much traffic may be sent to the single remaining +host in the surviving datacenter. You can hedge against this by mirroring new +hosts in other datacenters (for example, also separate the replica database +onto its own host). + +After separating databases, separating repository + daemon nodes is likely +the next step. + +To improve **availability**, add another copy of everything you run in one +datacenter to a new datacenter. For example, if you have a two-node cluster, +the best way to improve availability is to run everything on a third host in a +third datacenter. If you have a 6-node cluster with a web node, a database node +and a repo + daemon node in two datacenters, add 3 more nodes to create a copy +of each node in a third datacenter. + +You can continue adding hosts until you run out of hosts. + + +Next Steps +========== + +Continue by: + + - learning how Phacility configures and operates a large, multi-tenant + production cluster in ((cluster)). diff --git a/src/docs/user/cluster/cluster_daemons.diviner b/src/docs/user/cluster/cluster_daemons.diviner new file mode 100644 --- /dev/null +++ b/src/docs/user/cluster/cluster_daemons.diviner @@ -0,0 +1,59 @@ +@title Cluster: Daemons +@group intro + +Configuring Phabricator to use multiple daemon hosts. + +Overview +======== + +WARNING: This feature is a very early prototype; the features this document +describes are mostly speculative fantasy. + +You can run daemons on multiple hosts. The advantages of doing this are: + + - you can completely survive the loss of multiple daemon hosts; and + - worker queue throughput may improve. + +This configuration is simple, but you must configure repositories first. For +details, see @{article:Cluster: Repositories}. + +Since repository hosts must run daemons anyway, you usually do not need to do +any additional work and can skip this entirely. 
+ + +Adding Daemon Hosts +=================== + +After configuring repositories for clustering, launch daemons on every +repository host according to the documentation in +@{article:Cluster: Repositories}. These daemons are necessary: repositories +will not fetch, update, or synchronize properly without them. + +If your repository clustering is redundant (you have at least two repository +hosts), these daemons are also likely to be sufficient in most cases. If you +want to launch additional hosts anyway (for example, to increase queue capacity +for unusual workloads), see "Dedicated Daemon Hosts" below. + + +Dedicated Daemon Hosts +====================== + +You can launch additional daemon hosts without any special configuration. +Daemon hosts must be able to reach other hosts on the network, but do not need +to run any services (like HTTP or SSH). Simply deploy the Phabricator software +and configuration and start the daemons. + +Normally, there is little reason to deploy dedicated daemon hosts. They can +improve queue capacity, but generally do not improve availability or increase +resistance to data loss on their own. Instead, consider deploying more +repository hosts: repository hosts run daemons, so this will increase queue +capacity but also improve repository availability and cluster resistance. + + +Next Steps +========== + +Continue by: + + - returning to @{article:Clustering Introduction}; or + - configuring repositories first with @{article:Cluster: Repositories}. diff --git a/src/docs/user/cluster/cluster_webservers.diviner b/src/docs/user/cluster/cluster_webservers.diviner new file mode 100644 --- /dev/null +++ b/src/docs/user/cluster/cluster_webservers.diviner @@ -0,0 +1,42 @@ +@title Cluster: Web Servers +@group intro + +Configuring Phabricator to use multiple web servers. + +Overview +======== + +WARNING: This feature is a very early prototype; the features this document +describes are mostly speculative fantasy.
+ +You can run Phabricator on multiple web servers. The advantages of doing this +are: + + - you can completely survive the loss of multiple web hosts; and + - performance and capacity may improve. + +This configuration is simple, but you must configure repositories first. For +details, see @{article:Cluster: Repositories}. + + +Adding Web Hosts +================ + +After configuring repositories in cluster mode, you can add more web hosts +at any time: simply deploy the Phabricator software and configuration to a +host, start the web server, and then add the host to the load balancer pool. + +Phabricator web servers are stateless, so you can pull them in and out of +production freely. + +You may also want to run SSH services on these hosts, since the service is very +similar to HTTP, also stateless, and it may be simpler to load balance the +services together. + + +Next Steps +========== + +Continue by: + + - returning to @{article:Clustering Introduction}. diff --git a/src/docs/user/configuration/cluster.diviner b/src/docs/user/configuration/cluster.diviner deleted file mode 100644 --- a/src/docs/user/configuration/cluster.diviner +++ /dev/null @@ -1,50 +0,0 @@ -@title User Guide: Phabricator Clusters -@group config - -Guide on scaling Phabricator across multiple machines. - -Overview -======== - -IMPORTANT: Phabricator clustering is in its infancy and does not work at all -yet. This document is mostly a placeholder. - -IMPORTANT: DO NOT CONFIGURE CLUSTER SERVICES UNLESS YOU HAVE **TWENTY YEARS OF -EXPERIENCE WITH PHABRICATOR** AND **A MINIMUM OF 17 PHABRICATOR PHDs**. YOU -WILL BREAK YOUR INSTALL AND BE UNABLE TO REPAIR IT. - -See also @{article:Almanac User Guide}. - - -Managing Cluster Configuration -============================== - -Cluster configuration is managed primarily from the **Almanac** application. 
- -To define cluster services and create or edit cluster configuration, you must -have the **Can Manage Cluster Services** application permission in Almanac. If -you do not have this permission, all cluster services and all connected devices -will be locked and not editable. - -The **Can Manage Cluster Services** permission is stronger than service and -device policies, and overrides them. You can never edit a cluster service if -you don't have this permission, even if the **Can Edit** policy on the service -itself is very permissive. - - -Locking Cluster Configuration -============================= - -IMPORTANT: Managing cluster services is **dangerous** and **fragile**. - -If you make a mistake, you can break your install. Because the install is -broken, you will be unable to load the web interface in order to repair it. - -IMPORTANT: Currently, broken clusters must be repaired by manually fixing them -in the database. There are no instructions available on how to do this, and no -tools to help you. Do not configure cluster services. - -If an attacker gains access to an account with permission to manage cluster -services, they can add devices they control as database servers. These servers -will then receive sensitive data and traffic, and allow the attacker to -escalate their access and completely compromise an install. diff --git a/src/docs/user/configuration/managing_daemons.diviner b/src/docs/user/configuration/managing_daemons.diviner --- a/src/docs/user/configuration/managing_daemons.diviner +++ b/src/docs/user/configuration/managing_daemons.diviner @@ -113,25 +113,16 @@ - See @{article:Diffusion User Guide} for details about tuning the repository daemon. -== Multiple Machines == -If you have multiple machines, you should use `phd launch` to tweak which -daemons launch, and split daemons across machines like this: +Multiple Hosts +============== - - `PhabricatorRepositoryPullLocalDaemon`: Run one copy on any machine. 
- On each web frontend which is not running a normal copy, run a copy - with the `--no-discovery` flag. - - `PhabricatorTriggerDaemon`: Run one copy on any machine. - - `PhabricatorTaskmasterDaemon`: Run as many copies as you need to keep - tasks from backing up. You can run them all on one machine or split them - across machines. +For information about running daemons on multiple hosts, see +@{article:Cluster: Daemons}. -A gratuitously wasteful install might have a dedicated daemon machine which -runs `phd start` with a large pool of taskmasters set in the config, and then -runs `phd launch PhabricatorRepositoryPullLocalDaemon -- --no-discovery` on each -web server. This is grossly excessive in normal cases. -= Next Steps = +Next Steps +========== Continue by: