Diviner Phabricator User Docs Cluster: Repositories

Contents
Overview
How Reads and Writes Work
HTTP vs HTTPS
Repository Hosts
Setting up Cluster Services
Migrating to Clustered Services
Expanding a Cluster
Contracting a Cluster
Monitoring Services
Monitoring Repositories
Cluster Failure Modes
Write Interruptions
Loss of Leaders
Ambiguous Leaders
Backups
Ad-Hoc Maintenance Locks
Next Steps

Cluster: Repositories
Phabricator User Documentation (Cluster Configuration)

Configuring Phabricator to use multiple repository hosts.

Overview

If you use Git, you can deploy Phabricator with multiple repository hosts, configured so that each host is readable and writable. The advantages of doing this are:

you can completely survive the loss of repository hosts;
reads and writes can scale across multiple machines; and
read and write performance across multiple geographic regions may improve.

This configuration is complex, and many installs do not need to pursue it.

This configuration is not currently supported with Subversion or Mercurial.

How Reads and Writes Work

Phabricator repository replicas are multi-master: every node is readable and writable, and a cluster of nodes can (almost always) survive the loss of any arbitrary subset of nodes so long as at least one node is still alive.

Phabricator maintains an internal version for each repository, and increments it when the repository is mutated.

Before responding to a read, replicas make sure their version of the repository is up to date (no node in the cluster has a newer version of the repository). If it isn't, they block the read until they can complete a fetch.

Before responding to a write, replicas obtain a global lock, perform the same version check and fetch if necessary, then allow the write to continue.

Additionally, repositories passively check other nodes for updates and replicate changes in the background. After you push a change to a repository, it will usually spread passively to all other repository nodes within a few minutes.

Even if passive replication is slow, the active replication makes acknowledged changes sequential to all observers: after a write is acknowledged, all subsequent reads are guaranteed to see it. The system does not permit stale reads, and you do not need to wait for a replication delay to see a consistent view of the repository no matter which node you ask.

HTTP vs HTTPS

Intracluster requests (from the daemons to repository servers, or from webservers to repository servers) are permitted to use HTTP, even if you have set security.require-https in your configuration.

It is common to terminate SSL at a load balancer and use plain HTTP beyond that, and the security.require-https feature is primarily focused on making client browser behavior more convenient for users, so it does not apply to intracluster traffic.

Using HTTP within the cluster leaves you vulnerable to attackers who can observe traffic within a datacenter, or observe traffic between datacenters. This is normally very difficult, but within reach for state-level adversaries like the NSA.

If you are concerned about these attackers, you can terminate HTTPS on repository hosts and bind to them with the "https" protocol. Just be aware that the security.require-https setting won't prevent you from making configuration mistakes, as it doesn't cover intracluster traffic.

Other mitigations are possible, but securing a network against the NSA and similar agents of other rogue nations is beyond the scope of this document.

Repository Hosts

Repository hosts must run a complete, fully configured copy of Phabricator, including a webserver. They must also run a properly configured sshd.

If you are converting existing hosts into cluster hosts, you may need to revisit Diffusion User Guide: Repository Hosting and make sure the system user accounts have all the necessary sudo permissions. In particular, cluster devices need sudo access to ssh so they can read device keys.

Generally, these hosts will run the same set of services and configuration that web hosts run. If you prefer, you can overlay these services and put web and repository services on the same hosts. See Clustering Introduction for some guidance on overlaying services.

When a user requests information about a repository that can only be satisfied by examining a repository working copy, the webserver receiving the request will make an HTTP service call to a repository server which hosts the repository to retrieve the data it needs. It will use the result of this query to respond to the user.

Setting up Cluster Services

To set up clustering, first register the devices that you want to use as part of the cluster with Almanac. For details, see Cluster: Devices.

NOTE: Once you create a service, new repositories will immediately allocate on it. You may want to disable repository creation during initial setup.

NOTE: To create clustered services, your account must have the "Can Manage Cluster Services" capability. By default, no accounts have this capability, and you must enable it by changing the configuration of the Almanac application. Navigate to the Alamanc application configuration as follows: Home → Applications → Almanac → Configure → Edit Policies → Can Manage Cluster Services

Once the hosts are registered as devices, you can create a new service in Almanac:

First, register at least one device according to the device clustering instructions.
Create a new service of type Phabricator Cluster: Repository in Almanac.
Bind this service to all the interfaces on the device or devices.
For each binding, add a protocol key with one of these values: ssh, http, https.

For example, a service might look like this:

Service: repos001.mycompany.net
- Binding: repo001.mycompany.net:80, protocol=http
- Binding: repo001.mycompany.net:2222, protocol=ssh

The service itself has a closed property. You can set this to true to disable new repository allocations on this service (for example, if it is reaching capacity).

Migrating to Clustered Services

To convert existing repositories on an install into cluster repositories, you will generally perform these steps:

Register the existing host as a cluster device.
Configure a single host repository service using only that host.

This puts you in a transitional state where repositories on the host can work as either on-host repositories or cluster repositories. You can move forward from here slowly and make sure services still work, with a quick path back to safety if you run into trouble.

To move forward, migrate one repository to the service and make sure things work correctly. If you run into issues, you can back out by migrating the repository off the service.

To migrate a repository onto a cluster service, use this command:

$ ./bin/repository clusterize <repository> --service <service>

To migrate a repository back off a service, use this command:

$ ./bin/repository clusterize <repository> --remove-service

This command only changes how Phabricator connects to the repository; it does not move any data or make any complex structural changes.

When Phabricator needs information about a non-clustered repository, it just runs a command like git log directly on disk. When Phabricator needs information about a clustered repository, it instead makes a service call to another server, asking that server to run git log instead.

In a single-host cluster the server will make this service call to itself, so nothing will really change. But this is an effective test for most possible configuration mistakes.

If your canary repository works well, you can migrate the rest of your repositories when ready (you can use bin/repository list to quickly get a list of all repository monograms).

Once all repositories are migrated, you've reached a stable state and can remain here as long as you want. This state is sufficient to convert daemons, SSH, and web services into clustered versions and spread them across multiple machines if those goals are more interesting.

Obviously, your single-device "cluster" will not be able to survive the loss of the single repository host, but you can take as long as you want to expand the cluster and add redundancy.

After creating a service, you do not need to clusterize new repositories: they will automatically allocate onto an open service.

When you're ready to expand the cluster, continue below.

Expanding a Cluster

To expand an existing cluster, follow these general steps:

Register new devices in Almanac.
Add bindings to the new devices to the repository service, also in Almanac.
Start the daemons on the new devices.

For instructions on configuring and registering devices, see Cluster: Devices.

As soon as you add active bindings to a service, Phabricator will begin synchronizing repositories and sending traffic to the new device. You do not need to copy any repository data to the device: Phabricator will automatically synchronize it.

If you have a large amount of repository data, you may want to help this process along by copying the repository directory from an existing cluster device before bringing the new host online. This is optional, but can reduce the amount of time required to fully synchronize the cluster.

You do not need to synchronize the most up-to-date data or stop writes during this process. For example, loading the most recent backup snapshot onto the new device will substantially reduce the amount of data that needs to be synchronized.

Contracting a Cluster

If you want to remove working devices from a cluster (for example, to take hosts down for maintenance), first do this for each device:

Change the writable property on the bindings to "Prevent Writes".
Wait a few moments until the cluster synchronizes (see "Monitoring Services" below).

This will ensure that the device you're about to remove is not the only cluster leader, even if the cluster is receiving a high write volume. You can skip this step if the device isn't working property to start with.

Once you've stopped writes and waited for synchronization (or if the hosts are not working in the first place) do this for each device:

Disable the bindings from the service to the device in Almanac.

If you are removing a device because it failed abruptly (or removing several devices at once; or you skip the "Prevent Writes" step), it is possible that some repositories will have lost all their leaders. See "Loss of Leaders" below to understand and resolve this.

If you want to put the hosts back in service later:

Enable the bindings again.
Change writable back to "Allow Writes".

This will restore the cluster to the original state.

Monitoring Services

You can get an overview of repository cluster status from the Config → Repository Servers screen. This table shows a high-level overview of all active repository services.

Repos: The number of repositories hosted on this service.

Sync: Synchronization status of repositories on this service. This is an at-a-glance view of service health, and can show these values:

Synchronized: All nodes are fully synchronized and have the latest version of all repositories.
Partial: All repositories either have at least two leaders, or have a very recent write which is not expected to have propagated yet.
Unsynchronized: At least one repository has changes which are only available on one node and were not pushed very recently. Data may be at risk.
No Repositories: This service has no repositories.
Ambiguous Leader: At least one repository has an ambiguous leader.

If this screen identifies problems, you can drill down into repository details to get more information about them. See the next section for details.

Monitoring Repositories

You can get a more detailed view the current status of a specific repository on cluster devices in Diffusion → (Repository) → Manage Repository → Cluster Configuration.

This screen shows all the configured devices which are hosting the repository and the available version on that device.

Version: When a repository is mutated by a push, Phabricator increases an internal version number for the repository. This column shows which version is on disk on the corresponding device.

After a change is pushed, the device which received the change will have a larger version number than the other devices. The change should be passively replicated to the remaining devices after a brief period of time, although this can take a while if the change was large or the network connection between devices is slow or unreliable.

You can click the version number to see the corresponding push logs for that change. The logs contain details about what was changed, and can help you identify if replication is slow because a change is large or for some other reason.

Writing: This shows that the device is currently holding a write lock. This normally means that it is actively receiving a push, but can also mean that there was a write interruption. See "Write Interruptions" below for details.

Last Writer: This column identifies the user who most recently pushed a change to this device. If the write lock is currently held, this user is the user whose change is holding the lock.

Last Write At: When the most recent write started. If the write lock is currently held, this shows when the lock was acquired.

Cluster Failure Modes

There are three major cluster failure modes:

Write Interruptions: A write started but did not complete, leaving the disk state and cluster state out of sync.
Loss of Leaders: None of the devices with the most up-to-date data are reachable.
Ambiguous Leaders: The internal state of the repository is unclear.

Phabricator can detect these issues, and responds by freezing the repository (usually preventing all reads and writes) until the issue is resolved. These conditions are normally rare and very little data is at risk, but Phabricator errs on the side of caution and requires decisions which may result in data loss to be confirmed by a human.

The next sections cover these failure modes and appropriate responses in more detail. In general, you will respond to these issues by assessing the situation and then possibly choosing to discard some data.

Write Interruptions

A repository cluster can be put into an inconsistent state by an interruption in a brief window during and immediately after a write. This looks like this:

A change is pushed to a server.
The server acquires a write lock and begins writing the change.
During or immediately after the write, lightning strikes the server and destroys it.

Phabricator can not commit changes to a working copy (stored on disk) and to the global state (stored in a database) atomically, so there is necessarily a narrow window between committing these two different states when some tragedy can befall a server, leaving the global and local views of the repository state possibly divergent.

In these cases, Phabricator fails into a frozen state where further writes are not permitted until the failure is investigated and resolved. When a repository is frozen in this way it remains readable.

You can use the monitoring console to review the state of a frozen repository with a held write lock. The Writing column will show which device is holding the lock, and whoever is named in the Last Writer column may be able to help you figure out what happened by providing more information about what they were doing and what they observed.

Because the push was not acknowledged, it is normally safe to resolve this issue by demoting the device. Demoting the device will undo any changes committed by the push, and they will be lost forever.

However, the user should have received an error anyway, and should not expect their push to have worked. Still, data is technically at risk and you may want to investigate further and try to understand the issue in more detail before continuing.

There is no way to explicitly keep the write, but if it was committed to disk you can recover it manually from the working copy on the device (for example, by using git format-patch) and then push it again after recovering.

If you demote the device, the in-process write will be thrown away, even if it was complete on disk. To demote the device and release the write lock, run this command:

phabricator/ $ ./bin/repository thaw <repository> --demote <device>

Any committed but unacknowledged data on the device will be lost.

Loss of Leaders

A more straightforward failure condition is the loss of all servers in a cluster which have the most up-to-date copy of a repository. This looks like this:

There is a cluster setup with two devices, X and Y.
A new change is pushed to server X.
Before the change can propagate to server Y, lightning strikes server X and destroys it.

Here, all of the "leader" devices with the most up-to-date copy of the repository have been lost. Phabricator will freeze the repository refuse to serve requests because it can not serve reads consistently and can not accept new writes without data loss.

The most straightforward way to resolve this issue is to restore any leader to service. The change will be able to replicate to other devices once a leader comes back online.

If you are unable to restore a leader or unsure that you can restore one quickly, you can use the monitoring console to review which changes are present on the leaders but not present on the followers by examining the push logs.

If you are comfortable discarding these changes, you can instruct Phabricator that it can forget about the leaders by doing this:

Disable the service bindings to all of the leader devices so they are no longer part of the cluster.
Then, use bin/repository thaw to --demote the leaders explicitly.

To demote a device, run this command:

phabricator/ $ ./bin/repository thaw rXYZ --demote repo002.corp.net

Any data which is only present on the demoted device will be lost.

If you do this, you will lose unreplicated data. You will discard any changes on the affected leaders which have not replicated to other devices in the cluster.

If you have lost an entire cluster and replaced it with new devices that you have restored from backups, you can aggressively wipe all memory of the old devices by using --demote <service> and --all-repositories. This is dangerous and discards all unreplicated data in any repository on any device.

phabricator/ $ ./bin/repository thaw --demote repo.corp.net --all-repositories

After you do this, continue below to promote a leader and restore the cluster to service.

Ambiguous Leaders

Repository clusters can also freeze if the leader devices are ambiguous. This can happen if you replace an entire cluster with new devices suddenly, or make a mistake with the --demote flag. This may arise from some kind of operator error, like these:

Someone accidentally uses bin/repository thaw ... --demote to demote every device in a cluster.
Someone accidentally deletes all the version information for a repository from the database by making a mistake with a DELETE or UPDATE query.
Someone accidentally disables all of the devices in a cluster, then adds entirely new ones before repositories can propagate.

If you are moving repositories into cluster services, you can also reach this state if you use clusterize to associate a repository with a service that is bound to multiple active devices. In this case, Phabricator will not know which device or devices have up-to-date information.

When Phabricator can not tell which device in a cluster is a leader, it freezes the cluster because it is possible that some devices have less data and others have more, and if it chooses a leader arbitrarily it may destroy some data which you would prefer to retain.

To resolve this, you need to tell Phabricator which device has the most up-to-date data and promote that device to become a leader. If you know all devices have the same data, you are free to promote any device.

If you promote a device, you may lose data if you promote the wrong device and some other device really had more up-to-date data. If you want to double check, you can examine the working copies on disk before promoting by connecting to the machines and using commands like git log to inspect state.

Once you have identified a device which has data you're happy with, use bin/repository thaw to --promote the device. The data on the chosen device will become authoritative:

phabricator/ $ ./bin/repository thaw rXYZ --promote repo002.corp.net

Any data which is only present on other devices will be lost.

Backups

Even if you configure clustering, you should still consider retaining separate backup snapshots. Replicas protect you from data loss if you lose a host, but they do not let you rewind time to recover from data mutation mistakes.

If something issues a --force push that destroys branch heads, the mutation will propagate to the replicas.

You may be able to manually restore the branches by using tools like the Phabricator push log or the Git reflog so it is less important to retain repository snapshots than database snapshots, but it is still possible for data to be lost permanently, especially if you don't notice the problem for some time.

Retaining separate backup snapshots will improve your ability to recover more data more easily in a wider range of disaster situations.

Ad-Hoc Maintenance Locks

Occasionally, you may want to perform maintenance to a clustered repository which requires you modify the actual content of the repository.

For example: you might want to delete a large number of old or temporary branches; or you might want to merge a very large number of commits from another source.

These operations may be prohibitively slow or complex to perform using normal pushes. In cases where you would prefer to directly modify a working copy, you can use a maintenance lock to safely make a working copy mutable.

If you simply perform this kind of content-modifying maintenance by directly modifying the repository on disk with commands like git update-ref, your changes may either encounter conflicts or encounter problems with change propagation.

You can encounter conflicts because directly modifying the working copy on disk won't prevent users or Phabricator itself from performing writes to the same working copy at the same time. Phabricator does not compromise the lower-level locks provided by the VCS so this is theoretically safe -- and this rarely causes any significant problems in practice -- but doesn't make things any simpler or easier.

Your changes may fail to propagate because writing directly to the repository doesn't turn it into the new cluster leader after your writes complete. If another node accepts the next push, it will become the new leader -- without your changes -- and all other nodes will synchronize from it.

Note that some maintenance operations (like git gc, git prune, or git repack) do not modify repository content. In theory, these operations do not require a maintenance lock: lower-level Git locks should protect them from conflicts, and they can not be affected by propagation issues because they do not propagate. In practice, these operations are not conflict-free in all circumstances. Using a maintenance lock may be overkill, but it's probably still a good idea.

To use a maintenance lock:

Open two terminal windows. You'll use one window to hold the lock and a second window to perform maintenance.
Run bin/repository lock <repository> ... in one terminal.
When the process reports that repositories are locked, switch to the second terminal and perform maintenance. The repository lock process should still be running in your first terminal.
After maintenance completes, switch back to the first terminal and answer the prompt to confirm maintenance is complete.

The workflow looks something like this:

$ ./bin/repository lock R2

These repositories will be locked:

      - R2 Git Test Repository

While the lock is held: users will be unable to write to this repository,
and you may safely perform working copy maintenance on this node in another
terminal window.

    Lock repositories and begin maintenance? [y/N] y

Repositories are now locked. You may begin maintenance in another terminal
window. Keep this process running until you complete the maintenance, then
confirm that you are ready to release the locks.

    Ready to release the locks? [y/N] y

Done.

As maintenance completes, the push log for the repository will be updated to reflect that you performed maintenance.

If the lock is interrupted, you may encounter a "Write Interruptions" condition described earlier in this document. See that section for details. In most cases, you can resolve this issue by demoting the node you are working on.

Next Steps

Continue by:

returning to Clustering Introduction.

Defined: src/docs/user/cluster/cluster_repositories.diviner:1

Cluster: RepositoriesPhabricator User Documentation (Cluster Configuration)