src/docs/user/cluster/cluster_partitioning.diviner
- This file was added.
@title Cluster: Partitioning and Advanced Configuration
@group cluster

Guide to partitioning Phabricator applications across multiple database hosts.

Overview
========

WARNING: Partitioning is a prototype.

You can partition Phabricator's applications across multiple databases. For
example, you can move an application like Files or Maniphest to a dedicated
database host.

The advantages of doing this are:

  - moving heavily used applications to dedicated hardware can help you
    scale; and
  - you can match application workloads to hardware or configuration to make
    operating the cluster easier.

This configuration is complex, and very few installs need to pursue it.
Phabricator will normally run comfortably with a single database master even
for large organizations.

Partitioning generally does not do much to increase resilience or make it
easier to recover from disasters, and is primarily a mechanism for scaling.

If you are considering partitioning, you likely want to configure replication
with a single master first. Even if you choose not to deploy replication, you
should review and understand how replication works before you partition. For
details, see @{article:Cluster: Databases}.

What Partitioning Does
======================

When you partition Phabricator, you move all of the data for one or more
applications (like Maniphest) to a new master database host. This is possible
because Phabricator stores data for each application in its own logical
database (like `phabricator_maniphest`) and performs no joins between databases.

If you're running into scale limits on a single master database, you can move
one or more of your most commonly used applications to a second database host
and continue adding users. You can keep partitioning applications until all
heavily used applications have dedicated database servers.

Alternatively or additionally, you can partition applications to make operating
the cluster easier. Some applications have unusual workloads or requirements,
and moving them to separate hosts may make things easier to deal with overall.

For example, if Files accounts for most of the data on your install, you might
move it to a different host to make backing up everything else easier.

Configuration Overview
======================

To configure partitioning, you will add multiple entries to `cluster.databases`
with the `master` role. Each `master` should specify a new `partition` key,
which contains a list of the application databases it should host.

One master may be specified as the `default` partition. Applications which are
not explicitly assigned elsewhere will be assigned to it.

When you define multiple `master` databases, you must also specify which master
each `replica` database follows. Here's a simple example config:

```lang=json
...
"cluster.databases": [
  {
    "host": "db001.corporation.com",
    "role": "master",
    "user": "phabricator",
    "pass": "hunter2!trustno1",
    "port": 3306,
    "partition": [
      "default"
    ]
  },
  {
    "host": "db002.corporation.com",
    "role": "replica",
    "user": "phabricator",
    "pass": "hunter2!trustno1",
    "port": 3306,
    "master": "db001.corporation.com:3306"
  },
  {
    "host": "db003.corporation.com",
    "role": "master",
    "user": "phabricator",
    "pass": "hunter2!trustno1",
    "port": 3306,
    "partition": [
      "file",
      "passphrase",
      "slowvote"
    ]
  },
  {
    "host": "db004.corporation.com",
    "role": "replica",
    "user": "phabricator",
    "pass": "hunter2!trustno1",
    "port": 3306,
    "master": "db003.corporation.com:3306"
  }
],
...
```

In this configuration, `db001` is a master and `db002` replicates it.
`db003` is a second master, replicated by `db004`.
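
The master/replica relationships in a configuration like this can be checked mechanically before deploying it. The following is a minimal sketch, not part of Phabricator: `replicas_by_master` is a hypothetical helper, and it assumes each replica's `master` value has the form `host:port`, as in the example above.

```lang=python
# Hypothetical helper for illustration; Phabricator does not ship this.
def replicas_by_master(databases):
    """Group replica hosts under the "host:port" of the master they follow."""
    masters = {
        "{}:{}".format(entry["host"], entry["port"]): []
        for entry in databases
        if entry["role"] == "master"
    }
    for entry in databases:
        if entry["role"] != "replica":
            continue
        ref = entry.get("master")
        if ref not in masters:
            # The replica names a master that is not configured.
            raise ValueError(
                "replica {} follows unknown master {!r}".format(entry["host"], ref)
            )
        masters[ref].append(entry["host"])
    return masters


# Trimmed copy of the example config (credentials omitted).
example_databases = [
    {"host": "db001.corporation.com", "role": "master", "port": 3306,
     "partition": ["default"]},
    {"host": "db002.corporation.com", "role": "replica", "port": 3306,
     "master": "db001.corporation.com:3306"},
    {"host": "db003.corporation.com", "role": "master", "port": 3306,
     "partition": ["file", "passphrase", "slowvote"]},
    {"host": "db004.corporation.com", "role": "replica", "port": 3306,
     "master": "db003.corporation.com:3306"},
]

topology = replicas_by_master(example_databases)
```

Raising on an unknown master is a design choice for this sketch: a replica that points at a host missing from the list is almost certainly a typo.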

Applications have been partitioned like this:

  - `db003`/`db004`: Files, Passphrase, Slowvote
  - `db001`/`db002`: Default (all other applications)

Not all of the database partition names are the same as the application
names. You can get a list of databases with `bin/storage databases` to identify
the correct database names.
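
To make the assignment rule concrete, here is a rough sketch of how a partition name could be resolved to a master: an explicit entry wins, and anything else falls through to the `default` master. `master_for` is a hypothetical helper written for this guide, not Phabricator's actual implementation. The name `maniphest` is also an assumption, inferred from the example, which suggests partition names are the logical database names without the `phabricator_` prefix.

```lang=python
# Hypothetical helper for illustration; not Phabricator's real routing code.
def master_for(databases, partition_name):
    """Return the host of the master that should serve partition_name."""
    default_host = None
    for entry in databases:
        if entry.get("role") != "master":
            continue
        partitions = entry.get("partition", [])
        if partition_name in partitions:
            # An explicit assignment always wins.
            return entry["host"]
        if "default" in partitions:
            default_host = entry["host"]
    if default_host is None:
        raise LookupError(
            "no master hosts {!r} and no default is set".format(partition_name)
        )
    return default_host


# Trimmed copy of the example config (credentials omitted).
partitioned_databases = [
    {"host": "db001.corporation.com", "role": "master",
     "partition": ["default"]},
    {"host": "db002.corporation.com", "role": "replica",
     "master": "db001.corporation.com:3306"},
    {"host": "db003.corporation.com", "role": "master",
     "partition": ["file", "passphrase", "slowvote"]},
    {"host": "db004.corporation.com", "role": "replica",
     "master": "db003.corporation.com:3306"},
]

files_host = master_for(partitioned_databases, "file")
tasks_host = master_for(partitioned_databases, "maniphest")
```

With the example config, `file` resolves to `db003` and an unlisted name such as `maniphest` resolves to the default master, `db001`.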

Launching a New Partition
=========================

To add a new partition, follow these steps:

  - Set up the new database host or hosts.
  - Add the new database to `cluster.databases`, but keep its "partition"
    configuration empty (just an empty list). This lets Phabricator interact
    with the new host without sending any traffic to it yet. If this is the
    first time you are partitioning, you will also need to configure your
    existing master as the new "default".
  - Run `bin/storage upgrade` to initialize the schemata on the new hosts.
  - Stop writes to the applications you want to move by putting Phabricator
    in read-only mode, or shutting down the webserver and daemons, or telling
    everyone not to touch anything.
  - Dump the data from the application databases on the old master.
  - Load the data into the application databases on the new master.
  - Reconfigure the "partition" setup so that Phabricator knows the databases
    have moved.
  - While still in read-only mode, check that all the data appears to be
    intact.
  - Resume writes.

You can do this with a small, rarely used application first (on most installs,
Slowvote might be a good candidate) if you want to run through the process
end-to-end before performing a larger, higher-stakes migration.
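
The two configuration changes in the steps above (staging the new master with an empty partition list, then reassigning databases once the data has been loaded) can be sketched as plain transformations of the `cluster.databases` list. Both functions are hypothetical helpers for illustration, not Phabricator tooling. The sketch assumes the existing single master has no `partition` key before you partition for the first time. Dumping and loading the data is deliberately left out.

```lang=python
import copy


# Hypothetical helpers for illustration; not Phabricator tooling.
def stage_new_master(databases, new_master):
    """Add new_master with an empty partition list so it gets no traffic."""
    staged = copy.deepcopy(databases)
    masters = [entry for entry in staged if entry["role"] == "master"]
    # First-time partitioning: the existing master becomes the default.
    if len(masters) == 1 and "partition" not in masters[0]:
        masters[0]["partition"] = ["default"]
    staged.append(dict(new_master, role="master", partition=[]))
    return staged


def move_partitions(databases, host, names):
    """After the data is loaded, assign the named partitions to host."""
    moved = copy.deepcopy(databases)
    found = False
    for entry in moved:
        if entry["role"] != "master":
            continue
        current = entry.get("partition", [])
        if entry["host"] == host:
            found = True
            entry["partition"] = sorted(set(current) | set(names))
        else:
            # Remove the names from any master that previously held them.
            entry["partition"] = [p for p in current if p not in names]
    if not found:
        raise ValueError("no master named {!r}".format(host))
    return moved


before = [{"host": "db001.corporation.com", "role": "master", "port": 3306}]
staged = stage_new_master(
    before, {"host": "db003.corporation.com", "port": 3306}
)
final = move_partitions(staged, "db003.corporation.com", ["slowvote"])
```

Both helpers return new lists rather than editing in place, so the previous configuration stays available if you need to revert.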

How Partitioning Works
======================

If you have multiple masters, Phabricator keeps the entire set of schemata up
to date on all of them. When you run `bin/storage upgrade` or other storage
management commands, they generally affect all masters (if they do not, they
will prompt you to be more specific).

When an application reads or writes normal data (for example, to query a list
of tasks), Phabricator connects only to the master that the application is
assigned to.

In most cases, a master will not have any data in most of the databases which
are not assigned to it. If it does (for example, because it previously hosted
the application), the data is ignored. This approach (of maintaining all
schemata on all hosts) makes it easier to move data and to quickly revert
changes if a configuration mistake occurs.

There are some exceptions to this rule. For example, all masters keep track
of which patches have been applied to that particular master so that
`bin/storage upgrade` can upgrade hosts correctly.

Phabricator does not perform joins across logical databases, so there are no
meaningful differences in runtime behavior if two applications are on the same
physical host or different physical hosts.

Next Steps
==========

Continue by:

  - returning to @{article:Clustering Introduction}.