
Mercurial fails to parse (A or B or ... or Z) revsets with more than about 300 items
Closed, Resolved (Public)

Description

Hello!
I have a clean CentOS 6.5 install with everything Phabricator needs to run.
When Phabricator tries to update a Mercurial repo with over 11k commits, I get an error:

[root@Phabricator ~]# /var/www/html/phabricator/bin/repository refs F
Updating refs in 'F'...
[2014-08-18 11:06:51] EXCEPTION: (CommandException) Command failed with error #1!
COMMAND
hg -v log --template '{node}\n' --rev ''\''bc5dcb8f646d389ea9abaf49eeb91560d8dabc27'\'' - ('\''b6cd88f28626b6a1423132bc97dfc78f0d6dc235'\'' or '\''8a35c2ed530f744b5c6317a1329cd2df57bdf8e0'\'' or '\''2537f9caf097bf7cb81e35f56587250d24a725c1'\'' or '\''7e3331c441cff42b81bbec2d70c2e0a660de9a56'\'' or '\''5e0a4037a8d882a0578a8d66f160eaca47d328ce'\'' or '\''b368bd0d4a31b67fa52f68092f202e3ef22644a8'\'' or '\''f36c30cba31853790c1e8676f281fd06fca23c89'\'' or '\''ad32948904ae2c37702ab1cf143e1b34fa8624c2'\'' or '\''1839a2201de3d3e081f45bd091deeaca3c2cc9f0'\'' or '\''1b440288859b84995a78cd527857a6ab5f5f4044'\'' or '\''b336b09dc811ebca4a6dfa15d25182b60370a4ac'\'' or '\''59052dced4a961c421020710343d67f7b1a4b4be'\'' or '\''9a9272e2d090f6e794636fbfd436c47659713dbc'\'' or '\''9df3208832eadb42eb6cb50ff742efeb9616587a'\'' or '\''14e3d65e8afdce3472207c8551158e0bc8f86a94'\'' or '\''4397c0c356070be9a35304e93d5644b35d951053'\'' or '\''aadfac19bf5567225816ae2a84bfbd3b05ce6ded'\'' or '\''441805e0b5fbb6b9e7111... (54,937 more bytes) ...

STDOUT
(empty)

STDERR

** unknown exception encountered, please report by visiting
** http://mercurial.selenic.com/wiki/BugTracker
** Python 2.6.6 (r266:84292, Jan 22 2014, 09:37:14) [GCC 4.4.7 20120313 (Red Hat 4.4.7-4)]
** Mercurial Distributed SCM (version 3.1+4-aca137619a45)
** Extensions loaded:

Traceback (most recent call last):
File "/usr/bin/hg", line 43, in <module>
mercurial.dispatch.run()
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 28, in run
sys.exit((dispatch(request(sys.argv[1:])) or 0) & 255)
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 69, in dispatch
ret = _runcatch(req)
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 138, in _runcatch
return _dispatch(req)
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 820, in _dispatch
cmdpats, cmdoptions)
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 600, in runcommand
ret = _runcommand(ui, options, cmd, d)
F... (120,024 more bytes) ... at [/src/future/exec/ExecFuture.php:397]
#0 ExecFuture::resolvex() called at [/src/applications/repository/storage/PhabricatorRepository.php:294]
#1 PhabricatorRepository::execxLocalCommand(string, string, string) called at [/src/applications/repository/engine/PhabricatorRepositoryRefEngine.php:243]
#2 PhabricatorRepositoryRefEngine::loadNewCommitIdentifiers(string, array) called at [/src/applications/repository/engine/PhabricatorRepositoryRefEngine.php:182]
#3 PhabricatorRepositoryRefEngine::updateCursors(array, array, string, array) called at [/src/applications/repository/engine/PhabricatorRepositoryRefEngine.php:69]
#4 PhabricatorRepositoryRefEngine::updateRefs() called at [/src/applications/repository/management/PhabricatorRepositoryManagementRefsWorkflow.php:41]
#5 PhabricatorRepositoryManagementRefsWorkflow::execute(PhutilArgumentParser) called at [/src/parser/argument/PhutilArgumentParser.php:394]
#6 PhutilArgumentParser::parseWorkflowsFull(array) called at [/src/parser/argument/PhutilArgumentParser.php:290]
#7 PhutilArgumentParser::parseWorkflows(array) called at [/scripts/repository/manage_repositories.php:22]

Is there any way to fix it?

Event Timeline

Jaraill raised the priority of this task from to Needs Triage.
Jaraill triaged this task as Normal priority.
Jaraill updated the task description.
Jaraill added a subscriber: Jaraill.

Can you run the command again with the --trace parameter specified?

[root@Phabricator ~]# /var/www/html/phabricator/bin/repository refs F --trace

Hopefully there's something a bit more usable in the more verbose debug output.

Otherwise, I would mess around with a newer / older version of Mercurial, but that's really shooting in the dark.

Do you have ~10,000 branch heads or bookmarks in this repository?

Specifically, I think this issue is related to the enormous length of the command. The command is supposed to look like:

hg log --rev X - (A or B or C)

...where "X" is a newly discovered branch head and A, B and C are previously known branch heads. This allows us to find commits which are new in the repository since the last time we looked at it.

In this case, it looks like (A or B or C) is actually (A or B or ... 10,000 refs ... or Z9999). I would expect us to try to issue this command only if the repository actually has that number of branch heads.
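
For a sense of scale, here is a minimal Python sketch of how a flat revset of this shape grows with the number of known heads. The helper name `build_flat_revset` and the plain single-quote quoting are assumptions made for illustration; Phabricator's real code is PHP and quotes with hgsprintf.

```python
def build_flat_revset(new_head, known_heads):
    """Return "X - (A or B or ...)" as one flat revset string.

    Hypothetical helper for illustration only, not Phabricator code.
    """
    quoted_new = "'%s'" % new_head
    if not known_heads:
        # Nothing known yet: just ask for the new head itself.
        return quoted_new
    excluded = " or ".join("'%s'" % head for head in known_heads)
    return "%s - (%s)" % (quoted_new, excluded)


if __name__ == "__main__":
    print(build_flat_revset("X", ["A", "B", "C"]))

    # Stand-in for a 40-character commit hash; the value is made up.
    fake_hash = "a" * 40
    print(len(build_flat_revset(fake_hash, [fake_hash] * 1412)))
```

With 40-character hashes, every additional known head adds 46 characters (the quoted hash plus " or "), so a little over 1,400 heads already produces a revset of roughly 65,000 characters.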

Trace: {F192790}
I tried Phabricator on CentOS 7 with Python 2.7.5 and hg 2.6.2 on the same repo, and got the same result.
The branch history in the web UI pages up to offset 1100, so the number of branch heads is close to that:

http://phabricator/diffusion/F/branches/default/?offset=1100

But

[root@Phabricator F]# hg branches | wc -l
221

And there are no bookmarks

Do the branches shown in the web UI make sense to you? Are they obviously wrong?

In T5896#7, @epriestley wrote:

Do you have ~10,000 branch heads or bookmarks in this repository?

No, I only meant that the repo has about 1100 branch heads (open and closed).

Sorry if the figure of 221 open branches was confusing; I was just trying to give as much useful information as possible.

I'm running into this problem too, with Mercurial 3.1.2 (with Python 2.7.6) and phabricator 3463ce8a514f87287cd961ded284e60153e851d8 (from Fri 3rd Oct).

hg heads -c --template "{branch}\n" | wc -l

shows 1412 existing heads (only 5 of which are open), and the 'hg log' command that Phabricator runs is 65028 characters long.
The trace doesn't appear to show it, but running the same command manually, hg fails with:

  File "/usr/lib64/python2.7/site-packages/mercurial/revset.py", line 1930, in _getaliasarg
    if (len(tree) == 3 and tree[:2] == _aliasarg
RuntimeError: maximum recursion depth exceeded in cmp

I had hoped I might be able to avoid it by telling Phabricator to only check our most recent branches, but it seems it still tries to look at the entire repo.
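
The "maximum recursion depth exceeded" error above is what you would expect if each `or` adds one level of nesting to the parse tree and the tree is then walked recursively. The thread does not confirm that this is how revset.py builds its tree, so treat the following as a toy model under that assumption, not as Mercurial's parser. All names here are made up. (On Python 3 the exception is RecursionError, a subclass of the RuntimeError seen in the Python 2 traceback.)

```python
import sys


def left_nested_or(n):
    """Build the tree a left-associative parse of 't0 or t1 or ... ' could give."""
    tree = ("symbol", "t0")
    for i in range(1, n):
        # Each additional "or" wraps the whole previous tree one level deeper.
        tree = ("or", tree, ("symbol", "t%d" % i))
    return tree


def count_symbols(tree):
    """Walk the tree recursively, the way a naive tree pass would."""
    if tree[0] == "symbol":
        return 1
    return count_symbols(tree[1]) + count_symbols(tree[2])


def survives(n):
    """Return True if a tree with n terms can be walked without overflowing."""
    try:
        return count_symbols(left_nested_or(n)) == n
    except RecursionError:
        return False


if __name__ == "__main__":
    limit = sys.getrecursionlimit()
    print(survives(10))         # shallow tree
    print(survives(limit * 2))  # deeper than the interpreter allows
```

In this model the tree depth equals the number of terms, so the failure threshold depends on the interpreter's recursion limit and on how many stack frames each level costs, not on the length of the command line as such.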

It's hard to wait, so I found my own way to work around this problem.

In phabricator/src/applications/repository/engine/PhabricatorRepositoryRefEngine.php,
after

private function loadNewCommitIdentifiers(
  $new_head,
  array $all_closing_heads) {

  $repository = $this->getRepository();
  $vcs = $repository->getVersionControlSystem();
  switch ($vcs) {
    case PhabricatorRepositoryType::REPOSITORY_TYPE_MERCURIAL:

I've added/changed:

$size = 20;
$arr_size = sizeof($all_closing_heads);
$iteration = round($arr_size / $size);
for($i=0; $i<=$iteration; $i++){
   $all_closing_heads_sub[$i] = array_slice($all_closing_heads, $i*$size, $size);

        if ($all_closing_heads_sub[$i]) {
          $escheads = array();
          foreach ($all_closing_heads_sub[$i] as $head) {
            $escheads[] = hgsprintf('%s', $head);
          }
          $escheads = implode(' or ', $escheads);
          list($stdout) = $this->getRepository()->execxLocalCommand(
            'log --template %s --rev %s',
            '{node}\n',
            hgsprintf('%s', $new_head).' - ('.$escheads.')');
        } else {
          list($stdout) = $this->getRepository()->execxLocalCommand(
            'log --template %s --rev %s',
            '{node}\n',
            hgsprintf('%s', $new_head));
        }

This change splits $all_closing_heads into subarrays, so Phabricator can process them without overloading the command line.
You can change the size of each subarray by editing the $size value.

Updates now complete OK.

P.S.:
Don't forget the closing bracket

}
      case PhabricatorRepositoryType::REPOSITORY_TYPE_GIT:

This appears to solve the problem for me, though Phabricator appears to ignore the 'Track Only' setting and instead imports all hundred or so branches; I don't know if that's at all related to @Jaraill's change though. And I'm also running into T7100.

If this change doesn't cause any regressions, could it be committed?

The change breaks "Track Only", breaks repositories without "Track Only", breaks new imports, runs 75x slower than necessary in the cases above, etc.
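
Independent of the specific regressions listed above, plain set algebra shows why running one query per chunk is not a drop-in replacement for the single query. In revsets, `-` is set difference and `or` is union, so `X - (A or B or C or D)` is the intersection of the per-chunk results, not any one of them. The sketch below uses ordinary Python sets with made-up contents and function names; it says nothing about which of these combinations the patch ends up using.

```python
def minus_all(candidates, excluded_chunks):
    """Single query: candidates - (chunk1 or chunk2 or ...)."""
    excluded = set().union(*excluded_chunks)
    return candidates - excluded


def per_chunk_results(candidates, excluded_chunks):
    """One query per chunk: [candidates - chunk1, candidates - chunk2, ...]."""
    return [candidates - chunk for chunk in excluded_chunks]


if __name__ == "__main__":
    candidates = {1, 2, 3, 4, 5}
    chunks = [{1, 2}, {3}]

    single = minus_all(candidates, chunks)
    parts = per_chunk_results(candidates, chunks)

    print(sorted(single))                            # the intended answer
    print([sorted(p) for p in parts])                # each chunk alone keeps too much
    print(sorted(set.intersection(*parts)))          # intersection matches the single query
    print(sorted(set().union(*parts)))               # union does not
```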

This is rooted in a limitation with Mercurial. I've filed a report upstream:

http://bz.selenic.com/show_bug.cgi?id=4624

Here's an example of a "reasonable" command which Mercurial can not parse on my system:

hg log --rev 'tip - (1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1 or 1)'

Mercurial has run into this internally before, e.g. here:

http://bz.selenic.com/show_bug.cgi?id=3604

...but in that case it was easier to fix the revset than remove the parser limitation.

Since this is fundamentally a Mercurial parser limitation, we can work around it by tricking the parser. This construction parses fine:

hg log --rev 'tip - ((1 or 2 or ... or 127) or (128 or 129 or ... or 255))'

That should let us survive until someone has a repository with about 90,000 branch heads. Hopefully Mercurial will have removed the parser limitation by then.

D12549 applies the strategy recursively, although we'll probably hit shell argument length limits before we hit 65K branch heads.
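
A rough Python sketch of the grouping idea, for readers who want to see the shape of the output. This is not the code from D12549 (which is PHP and may differ in detail); `group_or` and the default group size of 128 are assumptions chosen to resemble the example above.

```python
def group_or(terms, group_size=128):
    """Join terms with "or", parenthesizing them into nested groups.

    No parenthesized group ever contains more than group_size items, so the
    flat "or" runs the parser sees stay short. Hypothetical helper.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    if not terms:
        raise ValueError("need at least one term")

    terms = list(terms)
    # Repeatedly collapse the list into parenthesized groups until the
    # top level itself is short enough.
    while len(terms) > group_size:
        terms = [
            "(" + " or ".join(terms[i:i + group_size]) + ")"
            for i in range(0, len(terms), group_size)
        ]
    return "(" + " or ".join(terms) + ")"


if __name__ == "__main__":
    print(group_or(["1", "2", "3", "4"], group_size=2))
    print("tip - " + group_or([str(n) for n in range(1, 9)], group_size=4))
```

Each pass shrinks the list by a factor of the group size, so the nesting depth grows only logarithmically with the number of heads, while the total string length still grows linearly.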

epriestley renamed this task from Update error on mercurial repo with 11k+ commits to Mercurial fails to parse (A or B or ... or Z) revsets with more than about 300 items. (Apr 25 2015, 9:43 PM)

The Mercurial upstream fixed this but we need to retain the (1 or 2) or (3 or 4) code for backward compatibility anyway, so I don't plan to remove it until we run into a real problem it causes other than "it is icky".