Page MenuHomePhabricator

Populating Symbol indexes fail
Closed, DuplicatePublic

Description

I am configuring phabricator/differential to use symbols as per instructions @ http://www.phabricator.com/docs/phabricator/article/Diffusion_User_Guide_Symbol_Indexes.html#configuring-differential and I keep getting an XHPAST Exception:

[2012-10-20 13:27:56] EXCEPTION: (Exception) XHPAST: failed to decode tree. at [/home/jeffery/src/libphutil/src/parser/xhpast/api/XHPASTTree.php:86]

#0 XHPASTTree::newFromDataAndResolvedExecFuture(<?php

class RatingCertificationImportExport
{
public static function getImportExportDefinition()
{

		return ImportExportDefinition::create()
			->exportReorderable( TRUE )
			->type( 'project' )->label( 'Projects' )
				->field( 'certificationNumber' )->label( 'Certification Number' )->int()->importKey()
			<!-- SNIP -->
				->field( 'certificateType' )->label( 'Certificate Type' );

}
}
, Array of size 3 starting with: { 0 => 0 }) called at [/home/jeffery/src/phabricator/scripts/symbols/generate_php_symbols.php:45]

Is this because of configuration/memory allocation or a bug in the symbols generator?

Revisions and Commits

Event Timeline

jeffery added a project: Differential.
jeffery added a subscriber: jeffery.

Is the <!-- SNIP --> a very large number of -> operators (a hundred or more)?

I think the problem is that PHP's json_decode() fails to decode structures above a certain depth, depending on which version of PHP you're running:

  • Prior to 5.2.3, the depth was hardcoded at 20.
  • From 5.2.3 until 5.3.0, the depth was hardcoded at 128.
  • As of 5.3.0, the default is 512 and the depth is configurable.

However, while it's configurable, we don't currently pass a $depth parameter to json_decode() even if we're running PHP 5.3.0 or newer. So, workarounds:

  • You can simply not import symbols from this file, if it's an unusual file that's rare in your codebase. For example, find ... | grep -v path/to/that/file | import_symbols.php.
  • You could restructure the file to have a lower maximum tree depth by breaking the chained calls apart.
  • Or you can increase the tree depth so the import works. If you're running a version older than 5.3.0, you'll need to upgrade to 5.3.0 or newer to do this. If that doesn't solve the problem on its own, the tree depth is presumably larger than 512. You can manually adjust it here:

https://secure.phabricator.com/diffusion/PHU/browse/master/src/parser/xhpast/api/XHPASTTree.php;b4c3f04174119ade$84

Modify that call to have a third parameter, like this:

$data = json_decode($stdout, true, 65535);

If that works, I think we could do it conditionally in the upstream.

Let me know if that's helpful.

There are about 53 -> operators in that statement.

I have PHP 5.3.17 and increased the json_decode "depth" and it fixed the above error. But now I get a new error:

Parsing input from stdin...
[2012-10-21 00:16:00] ERROR 2: proc_open(): fork failed - Cannot allocate memory at [/home/jeffery/src/libphutil/src/future/exec/ExecFuture.php:510]
  #0 proc_open('/home/jeffery/src/libphutil/src/parser/xhpast/bin/xhpast', Array of size 3 starting with: { 0 => Array of size 2 starting with: { 0 => pipe } }, Array , null) called at [/home/jeffery/src/libphutil/src/future/exec/ExecFuture.php:510]
  #1 ExecFuture::isReady() called at [/home/jeffery/src/libphutil/src/future/FutureIterator.php:336]
  #2 FutureIterator::updateWorkingSet() called at [/home/jeffery/src/libphutil/src/future/FutureIterator.php:272]
  #3 FutureIterator::next() called at [/home/jeffery/src/phabricator/scripts/symbols/generate_php_symbols.php:42]
[2012-10-21 00:16:00] EXCEPTION: (Exception) Failed to open process. at [/home/jeffery/src/libphutil/src/future/exec/ExecFuture.php:512]
  #0 ExecFuture::isReady() called at [/home/jeffery/src/libphutil/src/future/FutureIterator.php:336]
  #1 FutureIterator::updateWorkingSet() called at [/home/jeffery/src/libphutil/src/future/FutureIterator.php:272]
  #2 FutureIterator::next() called at [/home/jeffery/src/phabricator/scripts/symbols/generate_php_symbols.php:42]
Looking up path IDs...
Preparing queries...
Purging old symbols...
Loading 9,893 symbols...
Done.

I tried upping the php cli memory to 256M, but that didn't help either

It might be the parallelism that's killing you -- try setting the 8 here to 1:

https://secure.phabricator.com/diffusion/P/browse/master/scripts/symbols/generate_php_symbols.php;468aeaef0db76b7c$42

i.e.:

- foreach (Futures($futures)->limit(8) as $file => $future) {
+ foreach (Futures($futures)->limit(1) as $file => $future) {

If that works, we can make it a flag.

Hi Evan, That change didn't make any difference (or maybe it did?, I am not sure) :( But I came across similar issue for an other project (https://github.com/composer/composer/issues/945), so I added 2GB extra swap (Total swap 3GB) and this time the "fork failed" error disappeared. But now it failed with a parse error.

Parsing input from stdin...
[2012-10-21 10:25:36] EXCEPTION: (XHPASTSyntaxErrorException) XHPAST Parse Error: syntax error, unexpected ',' on line 377
 at [/home/jeffery/src/libphutil/src/parser/xhpast/api/XHPASTTree.php:78]
  #0 XHPASTTree::newFromDataAndResolvedExecFuture(<

I was watching the memory consumption in "top" and the php process topped 1912MB before the failure. I have only 2GB on this VM. I will try to execute the generator when I have more memory.

BTW can I execute the generator from an other machine (which has more memory) and populate the symbols to the remote phabricator machine?

Yeah, you can run the symbol import script from any machine which has access to the database, and you can run the symbol identification script from any machine and just save the file, then pass the file to the import script.

ok after much testing on my mac with plenty of memory to play with, I encountered one particular file which was always failing. I have created a test case for you. Take note of the usage of single and double quotes in the $data array. This is perfectly valid code, but it fails to parse.

<?php

class FooBar
{
    public function __construct()
    {
        $data = array();
        $data["foo"]['bar'] = 'table';
        $sql = "SELECT * from {$data["foo"]['bar']}";
    }
}

$fooBar = new FooBar;

The code I am dealing with is legacy and there are all sorts of oddities in it. When trying to generate the symbols, it fail with the following error:

[2012-10-21 22:22:49] EXCEPTION: (XHPASTSyntaxErrorException) XHPAST Parse Error: syntax error, unexpected T_STRING on line 9
 at [/Users/jeffery/Documents/Code/libphutil/src/parser/xhpast/api/XHPASTTree.php:78]
  #0 XHPASTTree::newFromDataAndResolvedExecFuture(<?php

class FooBar
{
	public function __construct()
	{
		$data = array();
		$data["foo"]['bar'] = 'table';
		$sql = "SELECT * from {$data["foo"]['bar']}";
	}
}

$fooBar = new FooBar;
, Array of size 3 starting with: { 0 => 1 }) called at [/Users/jeffery/Documents/Code/phabricator/scripts/symbols/generate_php_symbols.php:45]

Also would it be a good idea to echo a backtrace when such exceptions occur? When you are dealing with large amounts of files, it is hard to recognise the path of the problem file just by looking at the content.

epriestley lowered the priority of this task from Normal to Wishlist.Oct 21 2012, 8:33 PM

Ah, yeah, this is an actual bug with the parser. We use as simplified version of the rules for double-quoted strings which do not correctly parse strings like:

"{$x[""]}"

Since this is rare/awkward/suspicious syntax anyway ("${x["}"]}" is a syntax error) I'll probably wait until the next time I'm making other changes to the parser to fix it -- we're due for an update before too long for the new PHP 5.4 features like traits. I'll note this on T635 which is a sort of catchall for xhpast stuff.

We also definitely should raise better errors. Ideally we could do all the symbol extraction in C in the parser itself which would make everything at least an order of magnitude faster and more memory-efficient, even...

✘ Merged into T635.

joshuaspence changed the visibility from "All Users" to "Public (No Login Required)".Jan 12 2016, 3:55 AM