
Service failures in JIRA can cascade into service failures in Phabricator
Open, Normal, Public

Description

See PHI1211. An install reports that a dead or hanging JIRA instance cascaded into major issues in Phabricator, likely by exhausting the available PHP-FPM workers while requests waited on JIRA.

This kind of failure is rare, but its impact is severe, and it is likely to become more common and more damaging as Doorkeeper sees more use in the future.

  • Timeouts: Doorkeeper services, including JIRA, should have timeouts.
  • Health Checks: We can reasonably put a general rate limiting / health check layer in between Phabricator and service calls in this class. If an external JIRA degrades, we can reduce or eliminate traffic to the service.
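
As a rough illustration of how the two items above could fit together, here is a minimal sketch in Python rather than in Phabricator's PHP. It is not Phabricator or Doorkeeper code: `CircuitBreaker`, `ServiceUnavailable`, `fetch_with_timeout`, and all thresholds are hypothetical names and values chosen for the example. The idea is that every outbound call carries a timeout, and after a few consecutive failures further calls are skipped for a cooldown period, so workers fail fast instead of waiting on a dead service.

```python
import time
import urllib.request


class ServiceUnavailable(Exception):
    """Raised when a call is skipped because the service looks unhealthy."""


class CircuitBreaker:
    """Hypothetical health-check layer in front of an external service."""

    def __init__(self, failure_threshold=3, cooldown_seconds=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value
        self.cooldown_seconds = cooldown_seconds    # assumed value
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None  # time the service was marked unhealthy

    def allow(self):
        """Return True if a call may be attempted right now."""
        if self.opened_at is None:
            return True
        # After the cooldown, let a probe call through to test recovery.
        return self.clock() - self.opened_at >= self.cooldown_seconds

    def call(self, fn):
        """Run fn() unless the service is currently marked unhealthy."""
        if not self.allow():
            raise ServiceUnavailable("skipping call; service marked unhealthy")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                # Mark (or re-mark) the service unhealthy from this moment.
                self.opened_at = self.clock()
            raise
        # Any success resets the breaker.
        self.failures = 0
        self.opened_at = None
        return result


def fetch_with_timeout(url, timeout_seconds=5.0):
    """Fetch a URL, giving up if the socket blocks longer than the timeout."""
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return response.read()
```

A caller would wrap each service request, for example `breaker.call(lambda: fetch_with_timeout(uri))`, and treat `ServiceUnavailable` as a cue to render without the external data. In a multi-process setup such as PHP-FPM, the failure count would need to live in shared storage (a cache, for instance) rather than in process memory; this sketch keeps it in memory only for brevity.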

Adjacently:

  • T5378 has added a new URI specialization extension point. The DoorkeeperJIRARemarkupRule rule should become a URI specialization rule instead of a standalone rule with elevated priority.