Summary
I have been running multiple Phabricator daemons on two independent servers (one on Ubuntu 14.04, the other on 16.04) for more than a year.
To update Phabricator daily, I run a script that stops all the daemons (without --force), updates Phabricator, and restarts them. This script worked well for a year.
However, recently (I don't know exactly when the problem started, because I have rebooted the servers frequently this month), the phd daemons no longer shut down properly:
$ ./bin/phd status
Log Daemon Host Overseer Started Class Arguments
679 8006:2jacsm3 RND02 8006 Apr 12 2017, 4:53:39 AM PhabricatorTaskmasterDaemon
678 8006:akgm6dl RND02 8006 Apr 12 2017, 4:53:39 AM PhabricatorTriggerDaemon
677 8006:3pbxj62 RND02 8006 Apr 12 2017, 4:53:39 AM PhabricatorRepositoryPullLocalDaemon
$ ./bin/phd stop
There are processes running that look like Phabricator daemons but have no corresponding PID files:
7909 php ./phd-daemon
8006 php ./phd-daemon
When I run only one Phabricator instance, there is no problem (I can't reproduce it).
But when more than one Phabricator instance is running, phd says there are no corresponding PID files, even though bin/phd status reports the correct PIDs.
I checked each pid directory and confirmed that the PID files exist and that the daemon.*** files contain the correct PIDs.
But when I send the stop command to phd, it says the processes have no corresponding PID files.
Of course, the same message (There are processes ... no corresponding PID files:) also appeared in the past whenever multiple Phabricator instances were running, but phd could still find and stop the processes recorded in /var/tmp/phd*/pid/. Now it no longer does so.
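The manual check described above (do the PID files point at live processes?) could be scripted roughly as follows. This is only a sketch, not part of Phabricator: the helper name check_pid_dir is hypothetical, and it assumes that each PID file is named daemon.<PID>, which may not match the actual layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Phabricator): for every PID file in a
# phd pid directory, report whether the process it names is still alive.
# ASSUMPTION: PID files are named daemon.<PID>; adjust the parsing if the
# PID is stored only inside the file.
check_pid_dir() {
  local dir="$1" f pid
  for f in "$dir"/daemon.*; do
    [ -e "$f" ] || continue            # glob matched nothing
    pid="${f##*.}"                     # text after the last dot
    case "$pid" in
      ''|*[!0-9]*) continue ;;         # skip non-numeric suffixes
    esac
    # kill -0 sends no signal; it only tests that the PID can be signalled,
    # so run this as the daemon's owner (or root).
    if kill -0 "$pid" 2>/dev/null; then
      echo "$pid alive"
    else
      echo "$pid dead"
    fi
  done
}

# Example: check_pid_dir /var/tmp/phdA/pid
```

Running it once per environment (for example against /var/tmp/phdA/pid and /var/tmp/phdB/pid) would show whether any PID file refers to a process that has already exited.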
Reproducing Steps:
Run the script repeatedly:
- Stop each Phabricator daemon
- (Update each Phabricator)
- Start each Phabricator daemon
Expected results:
- Every time, each daemon stops and starts again.
- No duplicate daemons for any environment/configuration.
Actual result:
- Even though bin/phd status for each environment lists the corresponding daemons, bin/phd stop does not stop them.
- Each environment ends up with duplicated daemons.
- When an environment has two or three duplicated daemons (usually two), bin/phd stop sometimes stops one of them, but not all of them (I'm not sure under what conditions).
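To make the reproduction measurable, the number of overseer processes can be counted after each run of the script. The following is a minimal sketch: count_overseers is a hypothetical helper name, and the match string "php ./phd-daemon" is taken from the phd output quoted in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count overseer processes in a process listing read
# from stdin. Overseers appear as "php ./phd-daemon" in the phd output.
count_overseers() {
  # grep -c prints the number of matching lines; "|| true" hides its
  # non-zero exit status when that number is 0.
  grep -c 'php \./phd-daemon' || true
}

# Example loop (commented out so that sourcing this file has no side effects):
# for run in 1 2 3; do
#   sudo bash reproduce.sh >/dev/null 2>&1
#   echo "run $run: $(ps -eo pid,args | count_overseers) overseer(s)"
# done
```

With two environments, the expected count stays at 2 after every run; a count that grows from run to run indicates daemons that were not stopped.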
Configurations/environments
Phabricator version
phabricator cd7547dc5760bd0fde42f38118dcb9af3ddc17a0 (Wed, Apr 12)
arcanist a59cfca5f190c44403dfc7449c678a2aa1626bb4 (Wed, Apr 5)
phutil fb9e0642c4ea9065e68d2cd2b250c0fa71190e7b (Tue, Apr 11)
Phabricator installed paths
Each environment has its own dedicated pid-directory and log-directory:
- /opt/phabA/phabricator/
- /var/tmp/phdA/pid
- /var/tmp/phdA/log
- /opt/phabB/phabricator/
- /var/tmp/phdB/pid
- /var/tmp/phdB/log
The same applies to aphlict, but aphlict has no problem.
The script
echo "Stops Phabricators"
su userA -c "cd /opt/phabA/phabricator; ./bin/phd status;./bin/phd stop;./bin/aphlict stop"
su userB -c "cd /opt/phabB/phabricator; ./bin/phd status;./bin/phd stop;./bin/aphlict stop"
sleep 4
echo "Update Phabricators"
# for each
# git pull ...
# bin/storage upgrade --force
echo "Restart Phabricators"
su userA -c "cd /opt/phabA/phabricator; ./bin/phd start;./bin/aphlict start"
# sleep for DB connection issue
sleep 10
su userB -c "cd /opt/phabB/phabricator; ./bin/phd start;./bin/aphlict start"
Because the server had a memory issue, I periodically counted every process, and I am sure that exactly the expected number of Phabricator daemon instances was running throughout that year.
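A periodic count like the one mentioned above can be broken down per user, which makes it easier to see which environment accumulates extra daemons. This is a sketch under assumptions: count_by_user is a hypothetical helper, and it expects input in the two-column form produced by ps -eo user,args.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tally overseer processes ("php ./phd-daemon") per
# user from "user args" lines on stdin, e.g. the output of: ps -eo user,args
count_by_user() {
  awk '$2 == "php" && $3 == "./phd-daemon" { n[$1]++ }
       END { for (u in n) print u, n[u] }' | sort
}

# Example: ps -eo user,args | count_by_user
```

In the setup described here, each of userA and userB should show a count of 1 when only one overseer is running per environment.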
More logs
First time
$ sudo bash reproduce.sh
There are no running Phabricator daemons.
There are no running Phabricator daemons.
Reading configuration from: phabricator/conf/aphlict/aphlict.default.json
Stopping Aphlict Server (8982)...
Aphlict Server (8982) exited normally.
There are no running Phabricator daemons.
There are no running Phabricator daemons.
Reading configuration from: phabricator/conf/aphlict/aphlict.custom.json
Stopping Aphlict Server (8481)...
Aphlict Server (8481) exited normally.
Storage is up to date. Use "storage status" for details.
Synchronizing static tables...
Verifying database schemata on "localhost:3306"...
Found no adjustments for schemata.
There are no running Phabricator daemons.
Freeing active task leases...
Freed 0 task lease(s).
Launching daemons:
(Logs will appear in "/var/tmp/phdA/log/daemons.log".)
(Pool: 1) PhabricatorRepositoryPullLocalDaemon
(Pool: 1) PhabricatorTriggerDaemon
(Pool: 4) PhabricatorTaskmasterDaemon
Done.
Reading configuration from: phabricator/conf/aphlict/aphlict.default.json
Aphlict is not running.
Writing logs to: /var/log/aphlictA.log
Aphlict Server started.
Storage is up to date. Use "storage status" for details.
Synchronizing static tables...
Verifying database schemata on "databasehost"...
Found no adjustments for schemata.
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9626 php ./exec_daemon.php PhabricatorTaskmasterDaemon
Stop these processes by re-running this command with the --force parameter.
Freeing active task leases...
Freed 0 task lease(s).
Launching daemons:
(Logs will appear in "/var/tmp/phdB/log/daemons.log".)
(Pool: 1) PhabricatorRepositoryPullLocalDaemon
(Pool: 1) PhabricatorTriggerDaemon
(Pool: 4) PhabricatorTaskmasterDaemon
Done.
Reading configuration from: phabricator/conf/aphlict/aphlict.custom.json
Aphlict is not running.
Writing logs to: /var/log/aphlictB.log
Aphlict Server started.
Second time
$ sudo bash reproduce.sh
Log Daemon Host Overseer Started Class Arguments
997 9619:mm4c5h3 RND02 9619 Apr 12 2017, 5:45:20 AM PhabricatorTaskmasterDaemon
996 9619:funyzsh RND02 9619 Apr 12 2017, 5:45:20 AM PhabricatorTriggerDaemon
995 9619:csufoho RND02 9619 Apr 12 2017, 5:45:20 AM PhabricatorRepositoryPullLocalDaemon
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9716 php ./phd-daemon
Stop these processes by re-running this command with the --force parameter.
Reading configuration from: phabricator/conf/aphlict/aphlict.default.json
Stopping Aphlict Server (9640)...
Aphlict Server (9640) exited normally.
Log Daemon Host Overseer Started Class Arguments
688 9716:f5req5t RND02 9716 Apr 12 2017, 5:45:31 AM PhabricatorTaskmasterDaemon
687 9716:rfrbgrx RND02 9716 Apr 12 2017, 5:45:31 AM PhabricatorTriggerDaemon
686 9716:kj6ny5f RND02 9716 Apr 12 2017, 5:45:31 AM PhabricatorRepositoryPullLocalDaemon
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9716 php ./phd-daemon
Stop these processes by re-running this command with the --force parameter.
Reading configuration from: phabricator/conf/aphlict/aphlict.custom.json
Stopping Aphlict Server (9737)...
Aphlict Server (9737) exited normally.
Storage is up to date. Use "storage status" for details.
Synchronizing static tables...
Verifying database schemata on "localhost:3306"...
Found no adjustments for schemata.
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9716 php ./phd-daemon
Stop these processes by re-running this command with the --force parameter.
Freeing active task leases...
Freed 0 task lease(s).
Launching daemons:
(Logs will appear in "/var/tmp/phdA/log/daemons.log".)
(Pool: 1) PhabricatorRepositoryPullLocalDaemon
(Pool: 1) PhabricatorTriggerDaemon
(Pool: 4) PhabricatorTaskmasterDaemon
Done.
Reading configuration from: phabricator/conf/aphlict/aphlict.default.json
Aphlict is not running.
Writing logs to: /var/log/aphlictA.log
Aphlict Server started.
Storage is up to date. Use "storage status" for details.
Synchronizing static tables...
Verifying database schemata on "databasehost"...
Found no adjustments for schemata.
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9716 php ./phd-daemon
9843 php ./phd-daemon
9850 php ./exec_daemon.php PhabricatorTaskmasterDaemon
Stop these processes by re-running this command with the --force parameter.
Freeing active task leases...
Freed 0 task lease(s).
Launching daemons:
(Logs will appear in "/var/tmp/phdB/log/daemons.log".)
(Pool: 1) PhabricatorRepositoryPullLocalDaemon
(Pool: 1) PhabricatorTriggerDaemon
(Pool: 4) PhabricatorTaskmasterDaemon
Done.
Reading configuration from: phabricator/conf/aphlict/aphlict.custom.json
Aphlict is not running.
Writing logs to: /var/log/aphlictB.log
Third time
$ sudo bash reproduce.sh
Log Daemon Host Overseer Started Class Arguments
998 9843:m7qiaxl localhost 9843 Apr 12 2017, 5:46:19 AM PhabricatorRepositoryPullLocalDaemon
1000 9843:7i2c3wx RND02 9843 Apr 12 2017, 5:46:20 AM PhabricatorTaskmasterDaemon
999 9843:evldzyb RND02 9843 Apr 12 2017, 5:46:20 AM PhabricatorTriggerDaemon
997 9619:mm4c5h3 RND02 9619 Apr 12 2017, 5:45:20 AM PhabricatorTaskmasterDaemon
996 9619:funyzsh RND02 9619 Apr 12 2017, 5:45:20 AM PhabricatorTriggerDaemon
995 9619:csufoho RND02 9619 Apr 12 2017, 5:45:20 AM PhabricatorRepositoryPullLocalDaemon
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9716 php ./phd-daemon
9843 php ./phd-daemon
9940 php ./phd-daemon
Stop these processes by re-running this command with the --force parameter.
Reading configuration from: phabricator/conf/aphlict/aphlict.default.json
Stopping Aphlict Server (9864)...
Aphlict Server (9864) exited normally.
Log Daemon Host Overseer Started Class Arguments
691 9940:um3qovz RND02 9940 Apr 12 2017, 5:46:30 AM PhabricatorTaskmasterDaemon
690 9940:ddskuvn RND02 9940 Apr 12 2017, 5:46:30 AM PhabricatorTriggerDaemon
689 9940:f6cwbq7 RND02 9940 Apr 12 2017, 5:46:30 AM PhabricatorRepositoryPullLocalDaemon
688 9716:f5req5t RND02 9716 Apr 12 2017, 5:45:31 AM PhabricatorTaskmasterDaemon
687 9716:rfrbgrx RND02 9716 Apr 12 2017, 5:45:31 AM PhabricatorTriggerDaemon
686 9716:kj6ny5f RND02 9716 Apr 12 2017, 5:45:31 AM PhabricatorRepositoryPullLocalDaemon
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9716 php ./phd-daemon
9843 php ./phd-daemon
9940 php ./phd-daemon
Stop these processes by re-running this command with the --force parameter.
Reading configuration from: phabricator/conf/aphlict/aphlict.custom.json
Stopping Aphlict Server (9961)...
Aphlict Server (9961) exited normally.
Storage is up to date. Use "storage status" for details.
Synchronizing static tables...
Verifying database schemata on "localhost:3306"...
Found no adjustments for schemata.
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9716 php ./phd-daemon
9843 php ./phd-daemon
9940 php ./phd-daemon
Stop these processes by re-running this command with the --force parameter.
Freeing active task leases...
Freed 0 task lease(s).
Launching daemons:
(Logs will appear in "/var/tmp/phdA/log/daemons.log".)
(Pool: 1) PhabricatorRepositoryPullLocalDaemon
(Pool: 1) PhabricatorTriggerDaemon
(Pool: 4) PhabricatorTaskmasterDaemon
Done.
Reading configuration from: phabricator/conf/aphlict/aphlict.default.json
Aphlict is not running.
Writing logs to: /var/log/aphlictA.log
Aphlict Server started.
Storage is up to date. Use "storage status" for details.
Synchronizing static tables...
Verifying database schemata on "databasehost"...
Found no adjustments for schemata.
There are processes running that look like Phabricator daemons but have no corresponding PID files:
9619 php ./phd-daemon
9716 php ./phd-daemon
9843 php ./phd-daemon
9940 php ./phd-daemon
10360 php ./phd-daemon
10367 php ./exec_daemon.php PhabricatorTaskmasterDaemon
Stop these processes by re-running this command with the --force parameter.
Freeing active task leases...
Freed 0 task lease(s).
Launching daemons:
(Logs will appear in "/var/tmp/phdB/log/daemons.log".)
(Pool: 1) PhabricatorRepositoryPullLocalDaemon
(Pool: 1) PhabricatorTriggerDaemon
(Pool: 4) PhabricatorTaskmasterDaemon
Done.
Reading configuration from: phabricator/conf/aphlict/aphlict.custom.json
Aphlict is not running.
Writing logs to: /var/log/aphlictB.log
Aphlict Server started.
P.S.
- I tried changing bin/phd start to bin/phd restart, but the results were the same. I also killed all daemons, cleaned up the PID files for each daemon, and rebooted, but got the same results.
- I host multiple Phabricator instances because the two teams will split up and use different offices/servers later.