src/docs/user/cluster/cluster_repositories.diviner
| comes back online. | comes back online. | ||||
| If you are unable to restore a leader or unsure that you can restore one | If you are unable to restore a leader or unsure that you can restore one | ||||
| quickly, you can use the monitoring console to review which changes are | quickly, you can use the monitoring console to review which changes are | ||||
| present on the leaders but not present on the followers by examining the | present on the leaders but not present on the followers by examining the | ||||
| push logs. | push logs. | ||||
| If you are comfortable discarding these changes, you can instruct Phabricator | If you are comfortable discarding these changes, you can instruct Phabricator | ||||
| that it can forget about the leaders in two ways: disable the service bindings | that it can forget about the leaders by doing this: | ||||
| to all of the leader devices so they are no longer part of the cluster, or use | |||||
| `bin/repository thaw` to `--demote` the leaders explicitly. | |||||
| If you do this, **you will lose data**. Either action will discard any changes | |||||
| on the affected leaders which have not replicated to other devices in the | |||||
| cluster. | |||||
| To remove a device from the cluster, disable all of the bindings to it | - Disable the service bindings to all of the leader devices so they are no | ||||
| in Almanac, using the web UI. | longer part of the cluster. | ||||
| - Then, use `bin/repository thaw` to `--demote` the leaders explicitly. | |||||
| {icon exclamation-triangle, color="red"} Any data which is only present on | To demote a device, run this command: | ||||
| the disabled device will be lost. | |||||
| To demote a device without removing it from the cluster, run this command: | |||||
| ``` | ``` | ||||
| phabricator/ $ ./bin/repository thaw rXYZ --demote repo002.corp.net | phabricator/ $ ./bin/repository thaw rXYZ --demote repo002.corp.net | ||||
| ``` | ``` | ||||
| {icon exclamation-triangle, color="red"} Any data which is only present on | {icon exclamation-triangle, color="red"} Any data which is only present on | ||||
| **this** device will be lost. | the demoted device will be lost. | ||||
| If you do this, **you will lose unreplicated data**. You will discard any | |||||
amckinley: Maybe "you will lose unreplicated data"? Only a Sith deals in absolutes. | |||||
| changes on the affected leaders which have not replicated to other devices | |||||
| in the cluster. | |||||
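When several leaders need to be demoted, the single-device command above has to be repeated per device. The following is a minimal sketch of one way to generate those commands for review before running them. The helper name `demote_commands` and the second hostname `repo003.corp.net` are hypothetical and do not come from the documentation; the sketch only prints commands, it does not execute them, and it assumes the service bindings have already been disabled in Almanac.

```shell
#!/bin/sh
# Hypothetical helper (not part of Phabricator): print one
# `bin/repository thaw ... --demote` command per leader device
# so an operator can review them before running anything.
demote_commands() {
  repo="$1"   # repository identifier, e.g. rXYZ
  shift       # remaining arguments are device names
  for device in "$@"; do
    echo "./bin/repository thaw $repo --demote $device"
  done
}

# repo003.corp.net is an illustrative hostname, not from the docs.
demote_commands rXYZ repo002.corp.net repo003.corp.net
```

Printing instead of executing is a deliberate choice here: demotion discards unreplicated data, so each command should be checked by hand first.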
| Ambiguous Leaders | Ambiguous Leaders | ||||
| ================= | ================= | ||||
amckinley: I'm not sure I understand the use case for this. Isn't the sequence ("demote", "disable") strictly worse than ("disable", "demote")? | |||||
epriestley: In this case, you don't disable at all: "we know what's wrong and the host is going to be back online in a few minutes, so I don't want to bother disabling and re-enabling later". This is pretty marginal and I tried to hedge it with the line below ("...safer to disable the device..."). I should perhaps remove this block completely (and partly kept it just because it was sort of already there), but there's also maybe some value if you accidentally demoted without disabling and are freaking out now? I'll give it a little more thought. Maybe also like "someone pushed a 200GB commit and it's having trouble replicating". Also pretty marginal. | |||||
amckinley: Yeah I'd probably just remove this block entirely, but I also don't think it's really hurting anyone. | |||||
| Repository clusters can also freeze if the leader devices are ambiguous. This | Repository clusters can also freeze if the leader devices are ambiguous. This | ||||
| can happen if you replace an entire cluster with new devices suddenly, or make | can happen if you replace an entire cluster with new devices suddenly, or make | ||||
| a mistake with the `--demote` flag. This may arise from some kind of operator | a mistake with the `--demote` flag. This may arise from some kind of operator | ||||
| error, like these: | error, like these: | ||||
| - Someone accidentally uses `bin/repository thaw ... --demote` to demote | - Someone accidentally uses `bin/repository thaw ... --demote` to demote | ||||
| every device in a cluster. | every device in a cluster. | ||||