
high server load due to continuous hg calls
Open, Needs Triage, Public

Description

In my Phabricator instance I have several externally hosted repositories: some Git repos, one small Mercurial repo, and one large Mercurial repo with over 1000 branches. CPU load jumps up to 100% whenever I have the large Mercurial repo turned on. Looking at the process list, I see around 100-200 hg calls on the server at all times.

The large Mercurial repo has already been fully imported into the database, at least as reported by Phabricator, and I don't see any writes going into the database while all the hg calls are happening.

It doesn't seem to occur with other repositories.
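One quick way to count them (the grep pattern is just an approximate match for the hg invocations listed below):

ps aux | grep '[h]g log' | wc -l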

Uptime

root@andyt-ubuntu:/etc/apache2# uptime
 09:43:35 up 2 days, 18:10,  4 users,  load average: 58.86, 55.50, 49.40

hg call processes

ph-daem+ 22367  0.0  0.0  37264  5304 ?        R    09:44   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '43fb2866610810f9d375c7fb830cb5187844c068'
ph-daem+ 22368  0.0  0.0  37264  5464 ?        R    09:44   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '53ac7368a8706e2593005fd6f4f28c8f29d91ed2'
ph-daem+ 22369  0.0  0.0  37128  5260 ?        R    09:44   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '62dcbd2c6d220b0f14b0f44134fb0090bff43ea4'
ph-daem+ 22370  0.0  0.0  37264  5468 ?        R    09:44   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'dd10c3338cf591fdae61005804bad0cae8916905'
ph-daem+ 22371  0.0  0.0   4444   656 ?        S    09:44   0:00 sh -c hg log --template='{node}' --rev ''\''5c75504853979ec2e31e5b1dd558519f589a440c'\'''
ph-daem+ 22373  0.0  0.0  29708  4376 ?        R    09:44   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '5c75504853979ec2e31e5b1dd558519f589a440c'
ph-daem+ 22374  0.0  0.0   4444   656 ?        S    09:44   0:00 sh -c hg log --template='{node}' --rev ''\''a6f2142cd8cf328c3db6bd8bb4cc389195b11307'\'''
ph-daem+ 22375  0.0  0.0   4444   652 ?        S    09:44   0:00 sh -c hg log --template='{node}' --rev ''\''52238cf1a495429a7b32087de839c03269a03fc3'\'''
ph-daem+ 22376  0.0  0.0   4444   656 ?        S    09:44   0:00 sh -c hg log --template='{node}' --rev ''\''241ff3a36a6038d88b43c9ecdeaadc3f84d0361a'\'''
ph-daem+ 22377  0.0  0.0  20468  3152 ?        R    09:44   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'a6f2142cd8cf328c3db6bd8bb4cc389195b11307'
ph-daem+ 22378  0.0  0.0  20472  3400 ?        R    09:44   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '52238cf1a495429a7b32087de839c03269a03fc3'
ph-daem+ 22379  0.0  0.0   4444   104 ?        R    09:44   0:00 sh -c hg log --template='{node}' --rev ''\''241ff3a36a6038d88b43c9ecdeaadc3f84d0361a'\'''
ph-daem+ 22380  0.0  0.0   4444   652 ?        S    09:44   0:00 sh -c hg log --template='{node}' --rev ''\''938c969d7b398ec5b7c5df43f9717622eb8d92f5'\'''

Event Timeline

wokuku raised the priority of this task from to Needs Triage.
wokuku updated the task description.
wokuku added a subscriber: wokuku.

We are experiencing this exact same issue after upgrading our Phabricator to the latest version last week.

@epriestley - hate to be "that guy", but can you triage this? It's really impacting our server in an "unbreak this!" kind of way: it's slowing down all of Phabricator noticeably and, most importantly, it's preventing all email notifications because the average CPU load is constantly too high, which is crippling our workflow.

Does your remote also have an enormous number of branches?

@epriestley: Enormous is relative, but yeah, I'd say it has an abnormal number of branches.
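For context, the total branch count (including closed branches) can be checked with something like:

hg branches --closed | wc -l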

D12548 probably fixes this. Let me know how things look after that lands.

epriestley claimed this task.

Presuming this is fixed since no one's yelling, but let us know if you're still seeing issues.

I still see a very large number of Mercurial processes. However, it seems to have helped with load. Is there a way to limit the number of hg processes that get run for this?

There's no way to limit the number of processes, no. What are the processes you're still seeing? We shouldn't generally be running the same hg log command anymore, so it would be surprising to me if you're seeing a smaller number of very similar processes.

Here is a quick shot of what appears in top.

Screen Shot 2015-05-04 at 10.51.16 AM.png (1×644 px, 524 KB)

Are those processes old, from before your most recent restart? (e.g., the "STARTED" column in ps auxwww) I think <defunct> usually indicates that the process is not in good shape. Maybe try this (rough commands are sketched after these steps):

  • Stop the daemons with phd stop.
  • Kill any hg process which still exists.
  • Restart the daemons with phd start.
  • See if things are in better shape.
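As a rough command sequence (the daemon user and the path to phd vary per install; ph-daemon-user below is a placeholder):

./bin/phd stop
# kill any leftover hg processes owned by the daemon user
pkill -u ph-daemon-user hg
./bin/phd start
# see what is still running
ps auxwww | grep '[h]g '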

After doing that, this is what I see within moments of issuing phd start, and our idle CPU slowly trickles to 0%. User 501 is our Phabricator user (daemon-user).

501       6898  0.2  0.5 319804 23108 ?        S    11:03   0:00 php /var/www/code.interworks.com/phabricator/scripts/daemon/phd-daemon
501       6901  2.3  0.7 320672 30364 ?        Ss   11:03   0:00 php /var/www/code.interworks.com/libphutil/scripts/daemon/exec/exec_daemon.php PhabricatorRepositoryPullLocalDaemon
501       6902  0.4  0.7 321200 30572 ?        Ss   11:03   0:00 php /var/www/code.interworks.com/libphutil/scripts/daemon/exec/exec_daemon.php PhabricatorTriggerDaemon
501       6903  0.7  0.7 317776 27524 ?        Ss   11:03   0:00 php /var/www/code.interworks.com/libphutil/scripts/daemon/exec/exec_daemon.php PhabricatorTaskmasterDaemon
501       7963  3.6  0.8 324060 33828 ?        R    11:03   0:00 php /var/www/code.interworks.com/phabricator/bin/repository update -- TBB
501       8105  1.8  0.7 319592 29240 ?        R    11:03   0:00 php /var/www/code.interworks.com/phabricator/bin/repository update -- FRBS
501       8106  0.8  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8107  0.8  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8110  0.8  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8111  0.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8112  0.8  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8113  0.8  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8117  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8118  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8119  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8120  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8124  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8125  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8126  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8129  0.8  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8130  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8131  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8137  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8138  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8139  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8140  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8141  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8142  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8143  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8147  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8148  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8149  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8153  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8154  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8155  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8159  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8160  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8161  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8165  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8166  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8167  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8172  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8173  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8174  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8175  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8179  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8180  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8181  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8185  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8186  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8187  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8191  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8192  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8193  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8197  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8198  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8199  1.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8203  1.1  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8204  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8205  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8209  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8210  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8215  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8216  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8217  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8218  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8223  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8225  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8226  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8227  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8228  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8229  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8230  1.8  0.6 316624 26332 ?        R    11:03   0:00 php /var/www/code.interworks.com/phabricator/bin/repository update -- TABSRVLOG
501       8231  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8232  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8233  1.4  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8234  1.2  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8235  1.5  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8236  1.5  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8237  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8238  1.5  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8239  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8240  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8241  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8242  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8243  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8244  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8245  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8246  1.7  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8247  1.5  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8248  2.3  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8249  2.3  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8250  2.3  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8251  2.3  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8252  2.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8253  2.3  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8254  2.0  0.0      0     0 ?        Z    11:03   0:00 [hg] <defunct>
501       8255  2.3  0.0      0     0 ?        R    11:03   0:00 [hg]
501       8256  2.0  0.2 185188 11652 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '1fb605904c81758acb682cb73f31e85ca424856d'
501       8257  2.0  0.3 186244 12628 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '3d3bcbae2f3985aa6e945a3ef77e4afc42a8ead6'
501       8258  2.0  0.2 185192 11652 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'de8cfd94e371cf4f98f69daba6cd0d49098fb624'
501       8260  2.0  0.3 186248 12632 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '4fede2f21075a4e5fa52d4af1a504ec91d6c20fe'
501       8261  2.0  0.3 186244 12628 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'ba02ec7fcd9c53268c410ae9948c736856ce16a2'
501       8262  2.0  0.3 186244 12628 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '3ea3a3841652bfdf20c92a74317e2a74093ea954'
501       8263  2.0  0.3 186244 12628 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '144f352f01737b31ddadabfc03b0e4fe57587ff1'
501       8264  2.0  0.3 186244 12616 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'f880aac8ed87569ab11176d07fe4acf808b5228f'
501       8266  3.0  0.3 185784 12368 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '8e3ea1d7545cbc74dbab682cb187ffc5fca38df2'
501       8267  3.0  0.3 186248 12632 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'e5bc87c700b5522b4a7232395073c823ab5711fb'
501       8269  2.5  0.3 185652 12176 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '69c521620abd0c75753fd5097a1d8f26193c7a21'
501       8270  2.5  0.3 185652 12192 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'bd4434ba8a6b60674c8ac5e22aaa59f0607d6fde'
501       8272  2.5  0.3 185648 12168 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '39f6268757a770f1256abf985aaeb292c061492c'
501       8273  2.5  0.3 185648 12192 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '56a099a5842316e1de921a1198982f130ac3ba73'
501       8275  2.0  0.3 185652 12052 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'fde147c0afcfecde2fc8e32b62254dbc9584e0ff'
501       8276  2.0  0.3 185648 12032 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '920735f469877f6f9cb4de89caeaa1656abee82e'
501       8277  2.5  0.3 185648 12048 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'ccf3de62b5de65196666a0ba519b4051dd299dfa'
501       8278  2.0  0.3 185648 12032 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '1ee309bedb432140b8b1292920a1836af0d18e5c'
501       8279  2.5  0.3 185648 12052 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'ab0a56c28aa71dfad022df8321f3f5905db4f1be'
501       8280  2.0  0.3 185648 12024 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'a2a7f90db42b3329a30b8dc867bbc885c5bb7cb6'
501       8282  2.0  0.2 185648 11740 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '761eaa98ff22ceda811b6cfc90a2a47f47b0364e'
501       8283  2.0  0.3 185648 11820 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '61bfbc58828477922f799eedfac1823999317a14'
501       8285  3.0  0.2 184960 11500 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'feed2db6efb4f8449ff6ae760b15fcefb5a81f99'
501       8286  3.0  0.2 184112 10616 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '9ad74c8a54e1b5027bdb2732dfc93b12001c9584'
501       8288  3.0  0.2 184112 10580 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'c4ab5a66336f754d124b91c0e20d0064a478c910'
501       8289  4.0  0.2 179976 10508 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'f36aacdd954f0749bc3b9bcf16859ffb71782159'
501       8291  4.0  0.2 177580 10208 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '8809d164d3b7d84df45cd7b5fdf92dab32668683'
501       8293  3.0  0.2 176780  9456 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'b64414a9db8873beeed6044226f08902b36060c5'
501       8294  3.0  0.2 176784  9460 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '2a8a5e8fb34a87d9b1a1f1ae6490c471bc80b392'
501       8295  3.0  0.2 176784  9532 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'ff9e1674a46d21ad1ef8ea093b9ed4dff480d9cd'
501       8296  3.0  0.2 176780  9420 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '8140c6df4ed8bcf88561dc353a5200035e452b8d'
501       8297  2.0  0.2 176780  9452 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '8885cb1082debe55c11622ec6c204701a5bbebb5'
501       8298  2.0  0.2 176460  9224 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '6326605af7ce3994d5fdeccf25e8351bd3f2fd6f'
501       8300  2.0  0.2 175932  8628 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '6357919c8f20334d0eee78649bdcd76171218be7'
501       8301  3.0  0.2 176464  9284 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'ed946fe4e428baf1e0fbb0a73bf260cd703aa034'
501       8302  2.0  0.2 175940  8564 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '9f26af5cb492a2b1bdece0231e8d55f9bac955f0'
501       8303  2.0  0.2 175940  8580 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '6c70a89a1526261c01a9d84b60275ec8c4db6938'
501       8304  2.0  0.2 175012  7888 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'a01d58defa9dbb09d636eaf318d6a13cad2c50d3'
501       8305  2.0  0.1 172804  7608 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'f173b9449d128f02a3c89f2721d9828022f282cb'
501       8306  2.0  0.1 172800  7612 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '3c4d7e1ae8e94900943ade804f697712504dbe37'
501       8307  1.0  0.1 170512  7332 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'e12bf0e43f0d47ac25778e48e48552cbf0e4dfa1'
501       8308  2.0  0.1 170740  7384 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'c9ee5954ed7d7d4e288fcd0bab9b2bbf03232348'
501       8309  1.0  0.1 172800  7612 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '4b6e74ee1cd3b30252bbae62bf81579ab075f734'
501       8312  0.0  0.1 168272  7288 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '414865523bcf060c72408a23cc601c0a5054a371'
501       8313  0.0  0.1 168272  7288 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '3b6e6d45349a1d46f54838058aa6ded578cae4f5'
501       8317  0.0  0.1 168276  7116 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'a2a0e072bc59e83287dbcca06c2f82e0af7b7acb'
501       8319  0.0  0.1 159508  6600 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '5ed27aec9f4d67af079219ae89742eb8c1d8c6c9'
501       8320  0.0  0.1 157340  6472 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '446b06b777b5a2d097501ef73b9b7b3d2977d443'
501       8322  0.0  0.1 155180  5240 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '28727a2cf0298ceff850d1497c4ed57a38599637'
501       8323  0.0  0.1 155180  5984 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'f6ced5583627f0ce583ee4d7301573833a5b7082'
501       8326  0.0  0.1 122252  4696 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'adfd69853cdc7851d8bb0a2b0a652e1271df403e'
501       8327  0.0  0.1 122104  4552 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev '72d00e07e029a645f55f09b631e810a674ee01aa'
501       8328  0.0  0.1 117552  4212 ?        R    11:03   0:00 /usr/bin/python /usr/bin/hg log --template={node} --rev 'ac38494b814dcdd4101da4273b09ac67ec693bd5'
501       8329  0.0  0.1 288160  6816 ?        R    11:03   0:00 php /var/www/code.interworks.com/phabricator/bin/repository update -- WSN

uptime

11:08:15 up 7 days, 18:32,  1 user,  load average: 33.90, 35.56, 36.91

Hmm, okay. I'll take another look at this and see if I can reproduce it.

@epriestley Any update on this? I'm seeing this issue with my hg repo. I only have one repo, and I don't think it has loads of branches, yet it is creating a load average of 25 and 100% CPU on the VM I'm running Phabricator on, and the instance is almost unusable because of it. Thanks.

@andrew-dixon Sorry, there's no secret internal tracker full of juicy updates that we're hiding from the public. If we haven't updated a task, there aren't any updates on it. See Planning for a little more discussion on this in general.

We are having the same issue with one of our Mercurial repositories. This particular repository has 206 branches: 14 open, 192 closed. When a new commit is pushed to this repository, the repository update task is spawning an hg log process for every branch, causing high CPU. This process is repeated every 15 seconds for several hours.
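Roughly, the behavior we observe is equivalent to this shell sketch (not the actual Phabricator code, just an illustration of one hg invocation per branch head):

for rev in $(hg heads --closed --template '{node}\n'); do
    hg log --template '{node}' --rev "$rev"
done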

After stepping through the code, I think I have traced the source of the problem to [[ https://secure.phabricator.com/source/phabricator/browse/master/src/applications/repository/engine/PhabricatorRepositoryRefEngine.php;c3bdcb4ca85487921909f0202aa760e8ed61404a$71 | the call to removeMissingCommits on line 71 of the PhabricatorRepositoryRefEngine class ]]. In the comments of this method I found a link to task T5839 and this comment:

In T5839#118045, @kdz13 wrote:

The newly added method removeMissingCommits was causing a lot of unneeded thrash on our mercurial repository. We have 140 branches, and phabricator was launching a process to check each revision every 15 seconds, and rather severely thrashing our disk. It seems to me that there's no need to call this method for mercurial branches, since the information contained in $branches was already obtained by running 'hg log' and should be up to date.

When commenting out line 71, the high CPU issue is resolved for us (we don't use Git repositories at the moment). However, I'm not sure how this should be fixed properly. Is @kdz13 right, does this method not apply to Mercurial repositories? Or might the problem be hidden deeper, for example somewhere in [[https://secure.phabricator.com/source/phabricator/browse/master/src/applications/diffusion/query/lowlevel/DiffusionLowLevelResolveRefsQuery.php;c3bdcb4ca85487921909f0202aa760e8ed61404a$223-298|DiffusionLowLevelResolveRefsQuery]]?

Possibly related: T12129

Steps to reproduce:

  • Create a Mercurial repository with a large number of branches, hosted in Phabricator (a rough sketch for generating such a repository follows these steps)
  • Make sure the repository is fully imported
  • Run repository update on the repository and observe the number of hg log processes. For instance, use strace to trace process syscalls, then count the execve calls to /usr/bin/hg:
strace -o log.txt -f -e trace=process ./bin/repository update 1
grep "/usr/bin/hg" log.txt | wc -l
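A quick way to generate such a test repository (the repository name, file name, and branch names are arbitrary):

hg init manybranches && cd manybranches
echo base > file.txt && hg add file.txt && hg commit -m 'base'
for i in $(seq 1 500); do
    hg branch "branch-$i"
    echo "$i" >> file.txt
    hg commit -m "commit on branch-$i"
done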

Version info:

ubuntu 14.04.5
phabricator c3bdcb4ca85487921909f0202aa760e8ed61404a (Sun, Feb 5)
arcanist 67a47acebd6b5e5809745db12955ad251c66004e (Mon, Feb 13)
phutil 58fb3b94464e1c00d0f348d1501bec6b8690253c (Mon, Feb 13)
apache 2.4.7
php 5.5.9
mercurial 2.8.2

Please let me know if I can provide you with more information.

↑ I second @rros: commenting out line 71 of the PhabricatorRepositoryRefEngine class fixed it for me. I don't know about possible secondary effects, but I have the feeling it was not working as intended anyway.

I can confirm T12129 is indeed related, and the suggested workaround is very effective. Thanks for this!
For the record, our repo has about 60 branches total (only about a dozen of them active).