Page MenuHomePhabricator

WebSockets access problems with reverse SSL proxy
Closed, WontfixPublic

Assigned To
Authored By
sgielen
Aug 31 2015, 3:28 PM
Tags
Referenced Files
F767807: T9293.patch
Aug 31 2015, 3:31 PM
Tokens
"Manufacturing Defect?" token, awarded by renatomassaro."Love" token, awarded by cwang."Love" token, awarded by ablekh."Love" token, awarded by tycho.tatitscheff.

Description

We're having some problems configuring Aphlict for our site that uses reverse SSL proxies for access to Phabricator. It appears that Phabricator does not correctly convert the notification.client-uri scheme (to wss:// instead of ws://) in this case. I'll attach a proposed patch, but it requires some discussion.

In our setup, we have an internal machine called hawaii hosting Phabricator over HTTP. This machine is never accessed directly. Instead, we have configured a machine with an external IP address and hostname, say phabricator.example.org, which runs nginx listening for requests over HTTPS and forwarding them to the HTTP port. This is (part of) the Nginx configuration:

server {
	listen 443 ssl;
	[...SSL parameters...]

	server_name phabricator.example.org;

	location / {
		proxy_set_header Host phabricator.example.org;
		proxy_set_header X-Real-IP $remote_addr;
		proxy_set_header X-Forwarded-Proto $scheme;
		proxy_set_header Front-End-Https on;
		proxy_pass http://hawaii:8095;
	}
}

This machine is also supposed to relay WebSockets traffic to the Phabricator instance. Therefore, we configured the Aphlict server to listen on client port 22280 and admin port 22281, set the notification.client-uri to https://localhost/ws/ and the notification.server-uri to http://localhost:22281. We added (amongst others) the following block of Nginx configuration to the reverse SSL proxy:

upstream websocket_pool {
	ip_hash;
	server hawaii:22280;
}

server {
	[..other directives as above..]
	location = /ws/ {
		proxy_pass http://websocket_pool;
		proxy_http_version 1.1;
		proxy_set_header Upgrade $http_upgrade;
		proxy_set_header Connection "upgrade";
		proxy_set_header X-Real-IP $remote_addr;
		proxy_set_header X-Forwarded-Proto $scheme;
		proxy_set_header Front-End-Https on;
		proxy_read_timeout 9999999999;
	}
}

Even though we could now succesfully open a WebSocket and send messages via https://phabricator.example.org/ws/ that ended up in the Aphlict logs, the Phabricator main page could not connect. Eventually, I traced this down to the config.websocketURI being set to ws://phabricator.example.org/ws/ instead of wss://phabricator.example.org/ws; hotfixing this in Aphlict.js "fixed" the problem and made the notification server work beautifully.

Tracing it further, this (seemingly) wrong choice was made in src/view/page/PhabricatorStandardPageView.php. Here, notification.client-uri is read and the protocol is set to either "ws" or "wss". However, this choice is made based on whether the current connection, according to Phabricator, is HTTPS. It seems to me that this choice should depend on the scheme of notification.client-uri (i.e. $client_uri->getProtocol() == "https"), not on the current connection. And, making things even more interesting, the current connection *was* actually over HTTPS, but AprontRequest.php only checks $_SERVER['HTTPS'] and not the standard X-Forwarded-Proto / Front-End-Https headers. It's possible that this HTTPS choice is made incorrectly in more places because of this.

The attached patch fixes this problem by making the choice of scheme in $client_uri based on the current scheme in $client_uri. The two scenarios where this may break existing setups are:

  • Current request is HTTP, notification.client-uri describes a https:// URI but Aphlict doesn't use SSL. In that case, I wonder why notification.client-uri describes a HTTPS URI in the first place.
  • Current request is HTTPS, notification.client-uri describes a http:// URI and Aphlict uses SSL. This will probably occur in various places, as https://secure.phabricator.com/book/phabricator/article/notifications/ describes that the notification.client-uri can be set to http://localhost/ws/ if this type of setup is used.

Another possible fix would be to change AphrontRequest::isHTTPS to look at the X-Forwarded-Proto or Front-End-Https headers as well:

  • If current request is HTTPS but Aphront listens on HTTP, browser would not allow the connection currently as well, so we're not breaking anything by switching that to HTTPS
  • If current request is HTTP, then you're on your own if you set headers indicating your connection is actually HTTPS if it's not.

It does not sound to me like the right solution, but it might prevent breakage for some users.

Event Timeline

sgielen updated the task description. (Show Details)
sgielen added a subscriber: sgielen.

commit 33c3ebe0606580efc22c410f654a9fa76655779a
Author: Sjors Gielen <sjors@sjorsgielen.nl>
Date:   Mon Aug 31 17:29:00 2015 +0200

    Determine WebSocket URI scheme by looking at the configured client URI instead of looking at current request headers. Refs T9293.

diff --git a/src/view/page/PhabricatorStandardPageView.php b/src/view/page/PhabricatorStandardPageView.php
index 90325bd..0f04d23 100644
--- a/src/view/page/PhabricatorStandardPageView.php
+++ b/src/view/page/PhabricatorStandardPageView.php
@@ -455,7 +455,7 @@ final class PhabricatorStandardPageView extends PhabricatorBarePageView {
           $client_uri->setDomain($this_host->getDomain());
         }
 
-        if ($request->isHTTPS()) {
+        if ($client_uri->getProtocol() == "https") {
           $client_uri->setProtocol('wss');
         } else {
           $client_uri->setProtocol('ws');

T8982 is somewhat related, since it is about wrong parsing of notification.client-uri (but it is not related in that it may have the same fix).

I noticed that one too though, if you don't configure a port number it will set "" as the port number leading to (in our case) listening on a random port. This is why we added --client-port=22280 to our invocation of bin/aphlict start, to force it there regardless of the setting of notification.client-uri.

eadler added a project: Restricted Project.Jan 9 2016, 1:04 AM
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.

I would vote for X-Forwarded-Proto support in AphrontRequest::isHTTPS

epriestley claimed this task.
epriestley added a subscriber: epriestley.

To resolve this, configure your server so that $_SERVER['HTTPS'] is set correctly. This impacts other checks, so it is much better to fix it than try to guess our way around it by examining aphlict.client-uri.

We will not trust X-Forwarded-Proto, X-Forwarded-For, etc., by default because they are not trustworthy in the general case (they are completely client-controlled if you aren't behind a load-balancer).

See this article in the documentation for help with trusting these headers:

https://secure.phabricator.com/book/phabricator/article/configuring_preamble/

We will not trust X-Forwarded-Proto, X-Forwarded-For, etc., by default because they are not trustworthy in the general case (they are completely client-controlled if you aren't behind a load-balancer).

I agree with this.

To resolve this, configure your server so that $_SERVER['HTTPS'] is set correctly. This impacts other checks, so it is much better to fix it than try to guess our way around it by examining aphlict.client-uri.

I'm not sure I fully agree with this. $_SERVER['HTTPS'] says something about the SSLness of the current page request. However, Phabricator (Aphlict?) will start new requests from the client browser, which can be SSL even if the original connection was not SSL. Since the configuration parameter notification.client-uri contains a scheme already, would it not be the least confusing option to use this scheme in deciding whether to start these requests over HTTP or HTTPS, instead of using the current request scheme?

$_SERVER['HTTPS'] should always reflect the SSL-ness of the original connection. If it does not, it is misconfigured.

$_SERVER['HTTPS'] should always reflect the SSL-ness of the original connection. If it does not, it is misconfigured.

I agree. But my point still stands even if it is configured correctly.

Using client-uri will cause us to use wss:// on pages which are http:// on the client. This seems very confusing/unexpected to me. Although this is probably fine usually, it defies user intent and may not work if, for example, the server certificate is self-signed and the user is using http specifically because their client does not recognize the CA. Such a user would be confused: "I'm connecting with HTTP and it says HTTP in my browser, but I'm getting an SSL certificate error. I specifically switched to HTTP because I know that this client can not trust the certificate."

Everything else that cares about SSL-ness uses the same source (options like security.require-https, security.strict-transport-security, etc). Using something else in this case, uniquely, seems confusing/unexpected to me.

Using client-uri will cause us to use wss:// on pages which are http:// on the client. This seems very confusing/unexpected to me. Although this is probably fine usually, it defies user intent and may not work if, for example, the server certificate is self-signed and the user is using http specifically because their client does not recognize the CA. Such a user would be confused: "I'm connecting with HTTP and it says HTTP in my browser, but I'm getting an SSL certificate error. I specifically switched to HTTP because I know that this client can not trust the certificate."

That is true, but will only happen if client-uri has an SSL-based scheme, which is very explainable and understandable and is unlikely to happen by accident, right? It's the first place where you'll look once you see the browser opening SSL connections to the configured notification URI. In the case of a self-signed certificate, the administrator should not use an SSL-based scheme in the client-uri because he knows that users will get these errors.

This case is different, I would say, because we are configuring a URI directly, and then ignoring the schema part of the configured URI (which carries the same meaning) for the sake of consistency between the page and the websocket requests...

I'll make a note on T10697 to split these options into explicit host and port options. The use of uri options which combined hosts and ports was intended to make specifying hosts and ports more convenient, but I think it's clear that it was a grievous error.