Aphlict does not work using SSL and node 6.9.1
Open, Needs Triage, Public

Description

NOTE: I am not sure this is something that Phabricator can fix, but it may affect other installs.

We've noticed that notifications haven't been working on our install for the past several weeks. When attempting some basic testing, the aphlict server quits unexpectedly:

Client running curl
cspeck@host ~> curl https://testing.company.com:22280 -vvvvv
* Rebuilt URL to: https://testing.company.com:22280/
*   Trying 10.x.x.x...
* Connected to testing.company.com (10.x.x.x) port 22280 (#0)
* Server aborted the SSL handshake
* Closing connection 0
curl: (35) Server aborted the SSL handshake
Server running in debug
[cspeck@testing phabricator]$ sudo -u nginx ./bin/aphlict debug --config ./conf/aphlict/aphlict.custom.json
Reading configuration from: /usr/local/phacility/phabricator/conf/aphlict/aphlict.custom.json
Starting Aphlict server in foreground...
Launching server:

    $ node '--max-old-space-size=256' -- '/usr/local/phacility/phabricator/support/aphlict/server/aphlict_server.js' '--config=/usr/local/phacility/phabricator/conf/aphlict/aphlict.custom.json'

[11/18/2016, 9:03:04 PM] Starting servers (service PID 10345).
[11/18/2016, 9:03:04 PM] Logging to "/var/log/aphlict.log".
[11/18/2016, 9:03:04 PM] Started client server (Port 22280, With SSL).
[11/18/2016, 9:03:04 PM] Started admin server (Port 22281, No SSL).
[11/18/2016, 9:03:04 PM] This server has fingerprint "um6kb7vpJTDQ88RM".
>>> Server exited!

Node.js 6.9.1 is the current LTS release, which is what yum installs or upgrades to by default. The changelog for Node 6.9.1 does not explicitly mention SSL, though it does reference some hashing changes. The issue might also depend on which version of OpenSSL is installed on the system, but I'm not sure of the best way to change the OpenSSL version that Node or ws is compiled against.
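To narrow down the OpenSSL question, it may help to ask the node binary itself which OpenSSL it was built against, since that is not necessarily the version the system package manager reports. A minimal sketch (the function name `describeRuntime` is just illustrative):

```javascript
// Report the Node.js version and the OpenSSL version this node binary
// was built against. This may differ from the system's openssl package.
function describeRuntime() {
  return {
    node: process.version,
    openssl: process.versions.openssl,
  };
}

console.log(describeRuntime());
```

Running this under each node binary being compared (v6.9.0, v6.9.1, v7.1.0) would show whether the OpenSSL version differs between the working and failing builds.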

I tried adding logging all over the aphlict server and ws. In every place I tried (I only know a little JavaScript), all I could conclude is that the error very likely happens so early in the SSL handshake that it is deep inside ws or Node.js's internal use of OpenSSL.
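One place where extra logging might help is the TLS server object itself. Node's `tls.Server` (which `https.Server` extends) emits a `tlsClientError` event when a handshake fails before a secure connection is established. The sketch below is an assumption about how that could be wired up; `attachTlsLogging` and the `log` callback are hypothetical names, not part of aphlict, and it will not catch a crash inside native code.

```javascript
const { EventEmitter } = require('events');

// Attach listeners that report handshake failures and server-level errors.
// `server` is expected to be an https.Server or tls.Server; any
// EventEmitter works for trying the function out.
function attachTlsLogging(server, log) {
  server.on('tlsClientError', (err, socket) => {
    // Fired when an error occurs before the secure connection is established.
    log('TLS handshake failed: ' + err.message);
  });
  server.on('error', (err) => {
    log('Server error: ' + err.message);
  });
  return server;
}

// Try it with a stand-in emitter rather than a real server.
const demo = attachTlsLogging(new EventEmitter(), console.log);
demo.emit('tlsClientError', new Error('example handshake failure'), null);
```

If nothing is logged when the client connects and the process still exits, that would be a hint that the failure is below the JavaScript layer.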

Workarounds
  1. Swap out the node binary for v6.9.0 or v7.1.0 and re-install ws (just in case). I've tested both versions, and doing this seems to get things working again.
  2. Let nginx handle the websocket request and terminate the SSL (by following these steps). I tried this and it does work.

Workarounds that don't work: I tried installing and using older versions of ws, since the aphlict project does not specify which version it relies on. The current version is v1.1.1, and I tried v0.8, v0.7, and v0.6, all with the same results.

Environment
notification.servers
[
  {
    "type": "client",
    "host": "testing.company.com",
    "port": 22280,
    "protocol": "https"
  },
  {
    "type": "admin",
    "host": "127.0.0.1",
    "port": 22281,
    "protocol": "http"
  }
]
aphlict.custom.json
{
  "servers": [
    {
      "type": "client",
      "port": 22280,
      "listen": "0.0.0.0",
      "ssl.key": "/etc/ssl/company_com.key",
      "ssl.cert": "/etc/ssl/company_com.pem",
      "ssl.chain": null
    },
    {
      "type": "admin",
      "port": 22281,
      "listen": "127.0.0.1",
      "ssl.key": null,
      "ssl.cert": null,
      "ssl.chain": null
    }
  ],
  "logs": [
    {
      "path": "/var/log/aphlict.log"
    }
  ],
  "pidfile": "/var/tmp/aphlict/pid/aphlict.pid"
}
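As a sanity check on the `ssl.key` and `ssl.cert` entries above, the files can be loaded into a TLS context outside of aphlict. This is only a sketch under the assumption that aphlict passes these files to Node's TLS layer more or less unchanged; `checkTlsFiles` is a hypothetical helper, not an aphlict function.

```javascript
const fs = require('fs');
const tls = require('tls');

// Try to build a TLS context from a key and certificate file.
// Returns { ok, error } instead of throwing, so it is easy to script.
function checkTlsFiles(keyPath, certPath) {
  try {
    tls.createSecureContext({
      key: fs.readFileSync(keyPath),
      cert: fs.readFileSync(certPath),
    });
    return { ok: true, error: null };
  } catch (err) {
    // Typically an unreadable file or PEM data OpenSSL cannot parse.
    return { ok: false, error: err.message };
  }
}

// Example call with the paths from the config above:
// console.log(checkTlsFiles('/etc/ssl/company_com.key', '/etc/ssl/company_com.pem'));
```

Run as the same user as the server (nginx in this setup), this would also surface file permission problems.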
cspeckmim edited the task description. Nov 19 2016, 3:14 AM
$ nodejs --version
v0.10.25

uh they sure made a lot of versions all of a sudden

If you hit this again, have a dev environment, or anyone else runs into it, one thing you can try is using openssl s_client instead of curl, which might give you more information about what is going awry in the SSL handshake. Something like this:

$ openssl s_client -connect secure.phabricator.com:22280
CONNECTED(00000003)
depth=1 /C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certs.godaddy.com/repository//CN=Go Daddy Secure Certificate Authority - G2
verify error:num=20:unable to get local issuer certificate
verify return:0
---
Certificate chain
 0 s:/OU=Domain Control Validated/CN=secure.phabricator.com
   i:/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certs.godaddy.com/repository//CN=Go Daddy Secure Certificate Authority - G2
 1 s:/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./OU=http://certs.godaddy.com/repository//CN=Go Daddy Secure Certificate Authority - G2
   i:/C=US/ST=Arizona/L=Scottsdale/O=GoDaddy.com, Inc./CN=Go Daddy Root Certificate Authority - G2
---
Server certificate
-----BEGIN CERTIFICATE-----
MIIFTTCCBDWgAwIBAgIIYOYIzRD7ysYwDQYJKoZIhvcNAQELBQAwgbQxCzAJBgNV
...

Not sure if that will reveal anything or not, but sometimes it's informative.
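A similar probe can be written in Node itself, which may be handy for comparing client-side behavior across node versions. This is a rough sketch, not a replacement for `openssl s_client`; `probeTls` and its `timeoutMs` parameter are hypothetical names, and certificate verification is deliberately disabled so that only the handshake is exercised.

```javascript
const tls = require('tls');

// Attempt a TLS handshake and resolve with the negotiated protocol and
// cipher name, or reject with the connection or handshake error.
function probeTls(host, port, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port, rejectUnauthorized: false }, () => {
      const result = {
        protocol: socket.getProtocol(),
        cipher: socket.getCipher().name,
      };
      socket.end();
      resolve(result);
    });
    // Give up if the server never completes the handshake.
    socket.setTimeout(timeoutMs, () => socket.destroy(new Error('timed out')));
    socket.on('error', reject);
  });
}

// Example call:
// probeTls('secure.phabricator.com', 22280).then(console.log, console.error);
```

A rejection here only confirms that the handshake failed from the client's point of view; the server-side cause still has to be found separately.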

chad added a subscriber: chad. Nov 19 2016, 9:24 PM

What about Node.js v0.10 and v0.12?

If you're still currently using Node.js v0.10 or v0.12, it is time to begin the transition to v4 or v6. Both v0.10 and v0.12 are considered to be in Maintenance mode currently and will fall off our support plan completely later this year.

I tried the openssl connection you suggested and it still bombed.

[cspeck@testing ~]$ openssl s_client -connect testing.company.com:22280
CONNECTED(00000003)
139702005200800:error:140790E5:SSL routines:SSL23_WRITE:ssl handshake failure:s23_lib.c:184:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 247 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
---

Looking around the internet for a similar error code led me to http://stackoverflow.com/questions/6467182/ssl-trouble-openssl

You are receiving a handshake failure alert from the server, which means that some other error is occurring, it is not the certificate validation that fails. You should look at the server side logs for clues about what has failed.

Unfortunately, as far as I know, Node.js doesn't log any more details about the failure.

avivey added a subscriber: avivey. Nov 21 2016, 10:03 PM

Tried again with node v6.9.4 (the current version of node on CentOS 7), and it still has the same problem 😢