Upgrading: Changes to Aphlict (Real-Time Notifications) Configuration
Closed, Resolved (Public)

Description

The way Aphlict (the real-time notification server) is configured has changed.

What Has Changed

Previously, configuration was controlled primarily by these config values in Phabricator:

  • notification.enabled
  • notification.log
  • notification.client-uri
  • notification.server-uri
  • notification.pidfile
  • notification.ssl-key
  • notification.ssl-cert

Some additional configuration was provided via flags to bin/aphlict:

  • --client-host
  • --client-port

All of these options and flags have been removed. They have been replaced with two new configuration sources:

  • On the Phabricator/frontend side, the new notification.servers configuration option now provides a comprehensive description of notification services from a client/frontend viewpoint.
  • On the Aphlict/backend, Aphlict is now configured with a JSON file, which provides a comprehensive description of notification services from a server/backend viewpoint.

How to Upgrade

After you update, you'll probably receive setup warnings about the removed configuration options. Copy the values down if they aren't self-evident, then remove them according to the instructions. That should clear the setup warnings.

When you start Aphlict, it now reads a configuration file. The default file it reads is phabricator/conf/aphlict/aphlict.default.json. Here's what the file looks like:

{
  "servers": [
    {
      "type": "client",
      "port": 22280,
      "listen": "0.0.0.0",
      "ssl.key": null,
      "ssl.cert": null
    },
    {
      "type": "admin",
      "port": 22281,
      "listen": "127.0.0.1",
      "ssl.key": null,
      "ssl.cert": null
    }
  ],
  "logs": [
    {
      "path": "/var/log/aphlict.log"
    }
  ],
  "pidfile": "/var/tmp/aphlict/pid/aphlict.pid"
}

If your setup was fairly similar to the old defaults, it's possible you don't need to change this file at all. However, if you changed ports, added SSL, moved logfiles, etc., you'll need to update this file for those customizations.

You can either copy it into phabricator/conf/aphlict/aphlict.custom.json (this path is .gitignored, and read instead of aphlict.default.json if it exists) and modify it, or put it somewhere else and use bin/aphlict --config path/to/config.json to specify the file explicitly when starting Aphlict.

If you had customized notification.pidfile or notification.log, copy the values into pidfile or the path value in the logs list, respectively.

If you had customized notification.ssl-key or notification.ssl-cert, update the "ssl.key" and "ssl.cert" keys in the "client" server definition (the "admin" server did not previously support or use SSL keys, so don't touch those if you're just updating).

If you had customized ports, adjust them in the corresponding "port" keys. The old client-uri corresponds to the "client" server, and the old server-uri corresponds to the "admin" server.
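As a rough illustration of the adjustments above, here is a sketch of what an aphlict.custom.json might look like for an install that had changed its client port, added SSL, and moved its log and pidfile. The keys are the ones from aphlict.default.json. Every value that differs from the defaults (port 8443, the /etc/ssl/... paths, the /var/log/phabricator/... path, and the /var/run/... path) is a hypothetical placeholder, not a recommendation. JSON has no comment syntax, so these assumptions are noted here rather than in the file:

```json
{
  "servers": [
    {
      "type": "client",
      "port": 8443,
      "listen": "0.0.0.0",
      "ssl.key": "/etc/ssl/private/phabricator.example.com.key",
      "ssl.cert": "/etc/ssl/certs/phabricator.example.com.crt"
    },
    {
      "type": "admin",
      "port": 22281,
      "listen": "127.0.0.1",
      "ssl.key": null,
      "ssl.cert": null
    }
  ],
  "logs": [
    {
      "path": "/var/log/phabricator/aphlict.log"
    }
  ],
  "pidfile": "/var/run/aphlict/aphlict.pid"
}
```

The "admin" server is left at its defaults here, in line with the advice above not to touch its SSL keys when you are only migrating existing settings.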

Once you've updated the file, run bin/aphlict debug to check it. You should get output confirming your logfiles and servers are set up reasonably, or useful messages if you have configuration errors.

If things look good, use bin/aphlict start (possibly with a --config flag) to start the server.

Now, you need to configure notification.servers in Phabricator. This will have similar values, except from a client/frontend perspective. Here's the default value suggested as an example:

[
  {
    "type": "client",
    "host": "phabricator.mycompany.com",
    "port": 22280,
    "protocol": "https"
  },
  {
    "type": "admin",
    "host": "127.0.0.1",
    "port": 22281,
    "protocol": "http"
  }
]

Generally, you'll specify a "client" server that has the same values as your old notification.client-uri, and an "admin" server that has the same values as your old notification.server-uri.

Note that these frontend/client values no longer have to match the backend/server Aphlict values and are free to vary in whatever way you desire. Suppose you want a setup like this:

  • Browsers connect to loadbalancer.com:1234 over SSL.
  • The load balancer strips SSL.
  • The load balancer forwards the traffic to backend.com:6789.

Previously, this was difficult to configure, but it is now straightforward: put loadbalancer.com, 1234 and https in the Phabricator frontend configuration. Set up your load balancer. Put 6789 and no SSL in the backend configuration.
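A sketch of that load balancer setup, using the hostnames and ports from the bullets above. The "admin" entries are not part of the scenario: they are simply left at the default values shown earlier, which assumes Phabricator can reach the admin port at 127.0.0.1:22281. Adjust them if your hosts are laid out differently. The log and pidfile paths are likewise just the defaults. The frontend value for notification.servers might look like this:

```json
[
  {
    "type": "client",
    "host": "loadbalancer.com",
    "port": 1234,
    "protocol": "https"
  },
  {
    "type": "admin",
    "host": "127.0.0.1",
    "port": 22281,
    "protocol": "http"
  }
]
```

And the Aphlict configuration file on backend.com might look like this, with the "client" server listening on 6789 and no SSL because the load balancer has already stripped it:

```json
{
  "servers": [
    {
      "type": "client",
      "port": 6789,
      "listen": "0.0.0.0",
      "ssl.key": null,
      "ssl.cert": null
    },
    {
      "type": "admin",
      "port": 22281,
      "listen": "127.0.0.1",
      "ssl.key": null,
      "ssl.cert": null
    }
  ],
  "logs": [
    {
      "path": "/var/log/aphlict.log"
    }
  ],
  "pidfile": "/var/tmp/aphlict/pid/aphlict.pid"
}
```

Note that the two "client" definitions deliberately disagree on port and protocol: the frontend describes what browsers connect to, and the backend describes what Aphlict actually listens on.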

Why Things Changed

T10697 has more technical details about this change. The major motivating factors were:

More Explicit Configuration: Users frequently had difficulty configuring Aphlict. A major source of confusion and difficulty was its use of a single set of configuration options to define both the frontend and backend behaviors.

The original intent of having less configuration was to make things easier, but in practice there was widespread interest in routing websocket requests through some sort of middle layer, which meant that the client and server views of the world were often not the same. The old method took too many shortcuts, had too many weird pieces of implicit magic, and frequently led users astray (tasks merged into T10697 are a boneyard of these issues).

Preparation for Clustering: We plan to soon allow multiple Aphlict servers on different hosts to work together (discussed in T6915), so that Phabricator can transparently survive the loss of some subset of the notification hosts. This wouldn't have worked with the old configuration, since Phabricator needs to know about multiple nodes -- it can't reasonably survive the loss of a node if that's the only node it knows about. We could have added new configuration on top of the old configuration, but given the other problems it seemed better in the long run to rewrite everything.

More Flexibility: There are some additional configuration options (like log verbosity and Node memory tweaks) that I'd like to add at some point. Beyond clustering, turning everything into lists of dictionaries gives us more options to add these features in the future without needing to fill top-level configuration up with a ton of new options.

epriestley renamed this task from Upgrading: Changes to Aphlict Configuration to Upgrading: Changes to Aphlict (Real-Time Notifications) Configuration. Apr 13 2016, 6:18 PM
epriestley updated the task description. Apr 13 2016, 10:52 PM
chad added a subscriber: chad. Apr 13 2016, 11:32 PM
nevogd added a subscriber: nevogd. Apr 14 2016, 6:29 AM
epriestley moved this task from Backlog to vNext on the Aphlict board. Apr 14 2016, 12:22 PM

Example: secure.phabricator.com

Here's how I modified the configuration on this host (secure.phabricator.com) to account for this change.

Today, we use a simple configuration here: notification traffic is sent through an ELB unmodified (no SSL termination, no port changes) directly to a backend host, and SSL is terminated by Aphlict on the backend host. The admin server is a default setup on the local host.

Here's what the config looked like before the change (most of these values are defaults):

To create a new secure.aphlict.json, I copied phabricator/conf/aphlict/aphlict.default.json and modified it slightly. The only values I needed to change were the SSL values for the "client" server, to add SSL termination to Aphlict. Here's the new file:

secure.aphlict.json
{
  "servers": [
    {
      "type": "client",
      "port": 22280,
      "listen": "0.0.0.0",
      "ssl.key": "/core/conf/ssl/secure.phabricator.com.key",
      "ssl.cert": "/core/conf/ssl/secure.phabricator.com.crt"
    },
    {
      "type": "admin",
      "port": 22281,
      "listen": "127.0.0.1",
      "ssl.key": null,
      "ssl.cert": null
    }
  ],
  "logs": [
    {
      "path": "/var/log/aphlict.log"
    }
  ],
  "pidfile": "/var/tmp/aphlict/pid/aphlict.pid"
}

To create new notification.servers configuration, I copied the default example from the config UI and modified it. The only value I needed to change was host. Here's the new config (we specify this in a file, not the web UI, so it's PHP instead of JSON, but has the same structure):

notification.servers (secure.phabricator.com)
'notification.servers' => array(
  array(
    'type' => 'client',
    'host' => 'secure.phabricator.com',
    'port' => 22280,
    'protocol' => 'https',
  ),
  array(
    'type' => 'admin',
    'host' => '127.0.0.1',
    'port' => 22281,
    'protocol' => 'http',
  ),
),

It should be fixed in HEAD; it just regenerates only once every 24 hours or so, I think.

Following the upgrade, is there a way to send test notifications? Also, /notification/status/ is now a 404 error.

There's a new status panel in Config → Notification Servers. I'm consolidating cluster status information there.

Nothing should direct you to the old /notification/status/ panel any more -- let me know if I missed something and I'll update it.

I plan to rework test notifications to fix T10743, and possibly provide a "broadcast to all users" mechanism at the same time. In the meantime, an easy way to test notifications is to join a room with only yourself in Conpherence in two browser windows and chat. If notifications are working, the messages will appear in both windows.

Thank you for the update, I didn't think of using Conpherence. I haven't found anything that directed me to /status; I just had the URL as part of our validation after an upgrade, and went to use it.

Note that it is possible to set a path to the websocket for every server as well. This is handy, for example, when your notification websocket is hosted behind the same URL as Phabricator itself. We have https://phabricator.ourcompany.net/ for Phabricator itself, and https://phabricator.ourcompany.net/ws/ for the notifications websocket. This corresponds to the following client config:

notification.servers
[
  {
    "type": "client",
    "host": "phabricator.ourcompany.net",
    "port": 443,
    "protocol": "https",
    "path": "/ws/"
  },
  {
    "type": "admin",
    "host": "127.0.0.1",
    "port": 22281,
    "protocol": "http"
  }
]

(According to the file PhabricatorNotificationServerRef.php, the other undocumented option is "disabled".)
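For illustration, a server entry using the disabled option mentioned above might look like the sketch below. This assumes disabled takes a boolean and causes Phabricator to skip that server. Neither the type nor the behavior is confirmed in this thread, so check PhabricatorNotificationServerRef.php in your checkout before relying on it. The host and path are reused from the example above:

```json
[
  {
    "type": "client",
    "host": "phabricator.ourcompany.net",
    "port": 443,
    "protocol": "https",
    "path": "/ws/",
    "disabled": true
  }
]
```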

urzds added a subscriber: urzds. Aug 26 2016, 7:22 PM
epriestley closed this task as Resolved. Feb 18 2017, 2:32 AM
epriestley claimed this task.

This has been live for a while now.