
Fast repository polling on multiple repositories causes 2Hz polling of remote host
Closed, Invalid (Public)

Description

According to our GitLab statistics, our Phabricator instance (which has 46 repositories configured) is polling GitLab more than twice per second. This is causing heavy load on the GitLab servers.

The top ten report of requests per hour according to GitLab:

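| GitLab report (hits per hour) | Diffusion poll interval |
|-------------------------------|-------------------------|
| 194                           | 15s                     |
| 193                           | 15s                     |
| 193                           | 22s                     |
| 190                           | 15s                     |
| 190                           | 15s                     |
| 187                           | 22s                     |
| 185                           | 15s                     |
| 183                           | 15s                     |
| 169                           | 15s                     |
| 125                           | 15s                     |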

Event Timeline

award updated the task description.
award added projects: Daemons, Diffusion.
award mentioned this in Z1336: General Chat.
award added a subscriber: award.

@epriestley I know you mentioned that a diffusion.looksoon would trigger this, but we have no known users of this command.

I'm not sure I understand your math -- here's what I get:

(194 requests / hour) * (1 hour / 60 minutes) * (1 minute / 60 seconds) ~= 0.05 Hz ~= 1 request every 20 seconds

This seems like the expected behavior. At a poll rate of 2Hz, I would expect about 7,200 requests per hour for each repository.

Am I misunderstanding?

Can you explain what you're seeing which makes you think that this is causing "heavy load"?
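The unit conversion in the comment above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the inputs are the figures quoted in this thread (194 hits per hour for the busiest repository, and the 2 Hz claim).

```python
SECONDS_PER_HOUR = 3600


def hits_per_hour_to_hz(hits_per_hour: float) -> float:
    """Convert a requests-per-hour count to requests per second (Hz)."""
    return hits_per_hour / SECONDS_PER_HOUR


def hz_to_hits_per_hour(rate_hz: float) -> float:
    """Convert a rate in Hz to the equivalent requests-per-hour count."""
    return rate_hz * SECONDS_PER_HOUR


# Busiest repository in the table: 194 hits per hour.
busiest_hz = hits_per_hour_to_hz(194)
print(f"{busiest_hz:.3f} Hz, one request every {1 / busiest_hz:.1f} s")

# What a true per-repository poll rate of 2 Hz would look like.
print(f"2 Hz would be {hz_to_hits_per_hour(2):.0f} requests per hour")
```

For 194 hits per hour this works out to about 0.054 Hz, or one request roughly every 18.6 seconds, which is in line with the "about every 20 seconds" estimate.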

Oops! That's embarrassing. These numbers were gathered from multiple places and multiple people, and mass confusion ensued. The 2Hz number came from the total requests to GitLab rather than the requests per repository. I'll update the title/description accordingly.

The issue here is that when we have so many repositories (46) with such a short polling interval, we are causing a huge amount of traffic just to check whether there are new commits. The table only shows our top ten repos; many other repos poll at similar intervals.
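To put the aggregate figure in context, the per-repository numbers can be summed. The sketch below uses only the top-ten table from the task description; the upper bound assumes, purely for illustration, that all 46 repositories poll at the 15s minimum, which the thread does not actually state.

```python
# Hits per hour for the ten busiest repositories, from the table above.
TOP_TEN_HITS_PER_HOUR = [194, 193, 193, 190, 190, 187, 185, 183, 169, 125]

REPO_COUNT = 46            # repositories configured in Phabricator
MIN_POLL_INTERVAL_S = 15   # shortest interval seen in the table


def aggregate_hz(hits_per_hour):
    """Total request rate in Hz for a list of per-repository hourly counts."""
    return sum(hits_per_hour) / 3600


top_ten_hz = aggregate_hz(TOP_TEN_HITS_PER_HOUR)

# Assumption: every repository polls at the minimum interval (worst case).
upper_bound_hz = REPO_COUNT / MIN_POLL_INTERVAL_S

print(f"top ten combined: {top_ten_hz:.2f} Hz")
print(f"worst case for {REPO_COUNT} repos: {upper_bound_hz:.2f} Hz")
```

The top ten alone account for roughly 0.5 Hz, and the worst case for 46 repositories is about 3 Hz, so a combined rate a little over 2 Hz is plausible for the whole instance.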

I know in T8227 you mention that adding options is a sign of a bad design, and I would agree. In this case, I think the bad design comes from using a polling model. Adding a hook-based system would allow the relationship between a Phabricator instance and an external repository to scale much better.

Regardless of a hook-based system, I am not sure why it is so important for Phabricator to be so close to real time on an active repository. Especially in an arc-based workflow where arc can call diffusion.looksoon, is 15s really an appropriate minimum polling time?

award renamed this task from Repositories are being polled at 2Hz to Fast repository polling on multiple repositories causes 2Hz polling of remote host. Nov 19 2015, 2:41 AM
award updated the task description.

At that scale, is there any reason to not use Phabricator's repository hosting instead of GitLab? You can mirror repositories from Phabricator to GitLab if you still need some GitLab functionality (see the Mirrors option under Edit Repository).

> huge amount of traffic

Can you quantify this more specifically? A total of 2 requests / second does not seem like a large amount of traffic to me. In cases where the poll doesn't find new data, git fetch should essentially just compare hashes and exit, and the cost should be roughly comparable to the cost of other requests like HTTP requests. Are these requests hugely expensive when executed via GitLab? Are you just assuming that they're expensive without actually measuring it? Are your repositories unusual in some way which makes git fetch operations fundamentally expensive?
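The "compare hashes and exit" idea can be illustrated with the ref listing that `git ls-remote` prints (one `<hash><TAB><ref>` pair per line). This is a hedged sketch and not Phabricator's or git's actual implementation: the function names and the sample hashes are hypothetical, and a real fetch does more negotiation than this.

```python
from typing import Dict, List


def parse_ls_remote(output: str) -> Dict[str, str]:
    """Parse `git ls-remote` style output into a {ref name: commit hash} map."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        commit_hash, ref_name = line.split("\t", 1)
        refs[ref_name] = commit_hash
    return refs


def changed_refs(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return refs that were added, removed, or now point at a different hash."""
    names = previous.keys() | current.keys()
    return sorted(n for n in names if previous.get(n) != current.get(n))


# Hypothetical listings from two consecutive polls (shortened fake hashes).
first_poll = "1111aaaa\trefs/heads/master\n2222bbbb\trefs/heads/stable\n"
second_poll = "1111aaaa\trefs/heads/master\n3333cccc\trefs/heads/stable\n"

before = parse_ls_remote(first_poll)
after = parse_ls_remote(second_poll)

print(changed_refs(before, before))  # nothing changed: the poll is a no-op
print(changed_refs(before, after))   # one branch moved: there is data to fetch
```

To measure what a no-op poll actually costs on a given server, timing a plain `git ls-remote` against one of the affected repositories would be one way to get a number to report.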

I am not sure of the underlying implementation of GitLab, but our phabricator instance has been flagged as the #1 user of GitLab resources. Our GitLab instance hosts thousands of repositories, so being the single largest user of GitLab is concerning since we only have ~50 repositories.

@hach-que GitLab is the officially supported solution for us, whereas Phabricator is locally hosted. We would like to avoid developers pushing directly to Phabricator, as it has no enterprise support or uptime guarantees and is running on consumer-grade hardware, whereas GitLab has a dedicated support team.

Sounds like you could easily fork and change the polling frequency. The calculation is at https://secure.phabricator.com/diffusion/P/browse/master/src/applications/repository/storage/PhabricatorRepository.php$1527, and you can drop it to an hour just by changing that line.

epriestley claimed this task.
epriestley added a subscriber: jcarrillo7.

Since we don't have access to whatever GitLab is measuring, we can't understand this issue in greater detail. In particular, we cannot reproduce the underlying heavy load in a way we can examine. We have not seen similar reports from other users, except @jcarrillo7, who has not filed any details. We have many installs with hundreds of repositories and my expectation is that the operation is cheap, so I suspect this may be an issue unique to GitLab.

This stuff used to be configurable and have hook support and users frequently had difficulty configuring and using it. I believe you and @jcarrillo7 are the only users who have encountered difficulty here since the last set of changes, circa D10323 (more than a year ago). Polling somewhat aggressively is extremely easy to configure, which is a paramount concern given the forces this project faces. If there are particular, reproducible circumstances under which this polling creates an unreasonable amount of load, please feel free to resubmit this issue with instructions on how to reproduce, measure, and observe this load. If this load exists in the general case, I'd prefer to find a way to continue polling (which is easy to configure) but reduce the load it creates; this may be possible, but we can't pursue it without a way to reproduce the issue.

You can fork Phabricator locally and adjust the polling behaviors. @avivey points at the algorithm.

We offer Phabricator hosting with dedicated support through Phacility.

For what it's worth, we have a very similar setup (40ish repositories in Phabricator, all hosted in GitLab). We have not observed any performance issues such as those described in this ticket.