F15425650
D15763.id.diff
diff --git a/src/docs/user/cluster/cluster_databases.diviner b/src/docs/user/cluster/cluster_databases.diviner
--- a/src/docs/user/cluster/cluster_databases.diviner
+++ b/src/docs/user/cluster/cluster_databases.diviner
@@ -6,31 +6,76 @@
Overview
========
-WARNING: This feature is a very early prototype; the features this document
-describes are mostly speculative fantasy.
-
You can deploy Phabricator with multiple database hosts, configured as a master
and a set of replicas. The advantages of doing this are:
- faster recovery from disasters by promoting a replica;
- - graceful degradation if the master fails;
- - reduced load on the master; and
+ - graceful degradation if the master fails; and
- some tools to help monitor and manage replica health.
This configuration is complex, and many installs do not need to pursue it.
-Phabricator can not currently be configured into a multi-master mode, nor can
-it be configured to automatically promote a replica to become the new master.
-
If you lose the master, Phabricator can degrade automatically into read-only
mode and remain available, but can not fully recover without operational
intervention unless the master recovers on its own.
+Phabricator does not currently send read traffic to replicas unless the master
+has failed, so configuring a replica will not spread any load away from the
+master. Future versions of Phabricator are expected to be able to distribute
+some read traffic to replicas.
+
+Phabricator can not currently be configured into a multi-master mode, nor can
+it be configured to automatically promote a replica to become the new master.
+There are no current plans to support multi-master mode or autonomous failover,
+although this may change in the future.
+
Setting up MySQL Replication
============================
-TODO: Write this section.
+To begin, set up a replica database server and configure MySQL replication.
+
+If you aren't sure how to do this, refer to the MySQL manual for instructions.
+The MySQL documentation is comprehensive and walks through the steps and
+options in good detail. You should understand MySQL replication before
+deploying it in production: Phabricator layers on top of it, and does not
+attempt to abstract it away.
+
+Some useful notes for configuring replication for Phabricator:
+
+**Binlog Format**: Phabricator issues some queries which MySQL will detect as
+unsafe if you use the `STATEMENT` binlog format (the default in older versions
+of MySQL). Instead, use `MIXED` (recommended) or `ROW` as the `binlog_format`.
+
+**Grant `REPLICATION CLIENT` Privilege**: If you give the user that Phabricator
+will use to connect to the replica database server the `REPLICATION CLIENT`
+privilege, Phabricator's status console can give you more information about
+replica health and state.
+
+**Copying Data to Replicas**: Phabricator currently uses a mixture of MyISAM
+and InnoDB tables, so it can be difficult to guarantee that a dump is wholly
+consistent and suitable for loading into a replica because MySQL uses different
+consistency mechanisms for the different storage engines.
+
+To limit downtime but still produce a consistent dump, consider leaving
+Phabricator running but configured in read-only mode while dumping:
+
+  - Stop all the daemons.
+  - Set `cluster.read-only` to `true` and deploy the new configuration. The
+    web UI should now show that Phabricator is in "Read Only" mode.
+  - Dump the database. Running `bin/storage dump --for-replica` adds the
+    `--master-data` flag to the underlying command and includes a
+    `CHANGE MASTER ...` statement in the dump.
+  - Once the dump finishes, turn `cluster.read-only` off again to restore
+    service. Continue loading the dump into the replica normally.
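
The steps above could be scripted roughly as follows. This is a minimal sketch, not part of the patch. It assumes a single-host install where `bin/config set` is how configuration is deployed and where `bin/storage dump` writes the dump to standard output. The function names and the `run` parameter are hypothetical helpers introduced for illustration.

```python
import subprocess


def build_dump_steps(root):
    """Return the commands for a read-only dump, grouped by phase.

    `root` is the Phabricator checkout; the layout under it is an assumption.
    """
    return {
        "prepare": [
            [f"{root}/bin/phd", "stop"],
            [f"{root}/bin/config", "set", "cluster.read-only", "true"],
        ],
        "dump": [f"{root}/bin/storage", "dump", "--for-replica"],
        "restore": [f"{root}/bin/config", "set", "cluster.read-only", "false"],
    }


def dump_for_replica(root, output_path, run=subprocess.run):
    """Stop daemons, enter read-only mode, dump, then leave read-only mode."""
    steps = build_dump_steps(root)
    for command in steps["prepare"]:
        run(command, check=True)
    try:
        # Capture the dump from stdout into the output file.
        with open(output_path, "wb") as out:
            run(steps["dump"], check=True, stdout=out)
    finally:
        # Leave read-only mode even if the dump fails.
        run(steps["restore"], check=True)
```

The `try`/`finally` is a design choice so that a failed dump does not leave the install stuck in read-only mode. The sketch does not restart the daemons or load the dump into the replica.
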
+
+**Log Expiration**: You can configure MySQL to automatically clean up old
+binary logs on startup with the `expire_logs_days` option. If you do not
+configure this and do not explicitly purge old logs with `PURGE BINARY LOGS`,
+the binary logs on disk will grow unboundedly and relatively quickly.
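
The binlog format and log expiration notes can be condensed into a small pre-flight check. This is an illustrative sketch, not part of the patch: `check_replication_settings` is a hypothetical helper, and its input is assumed to be a plain dict of MySQL variable names to values (for example, collected from `SHOW VARIABLES`).

```python
def check_replication_settings(variables):
    """Return a list of warnings for a dict of MySQL variable -> value."""
    warnings = []

    # STATEMENT is flagged because some Phabricator queries are unsafe with it.
    binlog_format = str(variables.get("binlog_format", "")).upper()
    if binlog_format not in ("MIXED", "ROW"):
        warnings.append(
            "binlog_format should be MIXED (recommended) or ROW, got %r"
            % binlog_format
        )

    # A value of 0 (or a missing value) means logs are never expired.
    expire_days = int(variables.get("expire_logs_days", 0) or 0)
    if expire_days == 0:
        warnings.append(
            "expire_logs_days is not set; purge binary logs manually "
            "or they will keep growing"
        )

    return warnings
```

An empty list means neither of the two notes applies. The check does not cover the `REPLICATION CLIENT` privilege or the consistency of the dump.
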
+
+Once you have a working replica, continue below to tell Phabricator about it.
Configuring Replicas
@@ -207,7 +252,38 @@
Promoting a Replica
===================
-TODO: Write this section.
+If you lose access to the master database, Phabricator will degrade into
+read-only mode. This is described in greater detail below.
+
+The easiest way to get out of read-only mode is to restore the master database.
+If the database recovers on its own or operations staff can revive it,
+Phabricator will return to full working order after a few moments.
+
+If you can't restore the master or are unsure you will be able to restore the
+master quickly, you can promote a replica to become the new master instead.
+
+Before doing this, you should first assess how far behind the master the
+replica was when the link died. Any data which was not replicated will either
+be lost or become very difficult to recover after you promote a replica.
+
+For example, if a task `T1234` had been created on the master but had not yet
+replicated when you promote the replica, a new, unrelated `T1234` may be
+created on the replica after promotion. Even if you can recover the master
+later, merging the data will be difficult because each database may have
+conflicting changes which can not be merged easily.
+
+If there was a significant replication delay at the time of the failure, you
+may want to try harder or spend more time attempting to recover the master
+before choosing to promote.
+
+If you choose to promote, disable replication on the replica and mark it as
+the `master` in `cluster.databases`. Remove the original master from the
+configuration and deploy the change to all surviving hosts.
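
The configuration side of a promotion can be sketched as a small transformation of `cluster.databases`. This is an illustration, not part of the patch. It assumes the setting is a list of entries with `host` and `role` keys, and `promote_replica` and the hostnames in the usage note are hypothetical.

```python
def promote_replica(databases, new_master_host):
    """Return a new cluster.databases list with one replica promoted.

    The old master entry is dropped and the chosen replica is marked as
    the master. The input list is not modified.
    """
    replica_hosts = [db["host"] for db in databases if db.get("role") == "replica"]
    if new_master_host not in replica_hosts:
        raise ValueError("%r is not a configured replica" % new_master_host)

    promoted = []
    for db in databases:
        if db.get("role") == "master":
            continue  # Remove the original (lost) master.
        entry = dict(db)  # Copy so the caller's config is left untouched.
        if entry["host"] == new_master_host:
            entry["role"] = "master"
        promoted.append(entry)
    return promoted
```

For instance, promoting `db002.example.com` in a list containing a master `db001.example.com` and a replica `db002.example.com` yields a single entry with `db002.example.com` as the master. The sketch covers only the configuration change: replication still has to be disabled on the promoted host in MySQL, and any remaining replicas would need to be pointed at the new master.
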
+
+Once write service is restored, you should provision, deploy, and configure a
+new replica by following the steps you took the first time around. You are
+critically vulnerable to a second disruption until you have restored the
+redundancy.
Unreachable Masters
D15763: Fill in missing cluster database documentation