Page MenuHomePhabricator

PCRE segfaults readily with default "pcre.backtrack_limit" and "pcre.recursion_limit" values
Closed, ResolvedPublic

Description

An install reported a segfault which I traced to this regex in a Herald rule:

@^((?!xyz/).)*\.abc?$@

I think this simplifies to /^(.)*$/, which captures every character, although I'm not certain that's enough to cause backtracking/recursion on its own.

I avoided this by changing pcre.backtrack_limit and pcre.recursion_limit to 10000. I expect to deploy some flavor of this as a configuration change to the cluster.

Open questions:

  • What are reasonable limits?
  • What's a test case for hitting the backtrace limit? The recursion limit?
  • How can we distinguish between hitting the backtrack/recursion limit and other failures? Does preg_match() just return false nicely?

Event Timeline

epriestley triaged this task as Normal priority.Mar 6 2018, 7:43 PM
epriestley created this task.

On my system, this script:

<?php

$n = $argv[1];

$backtrack_limit = ini_get('pcre.backtrack_limit');
$recursion_limit = ini_get('pcre.recursion_limit');

echo "Backtrack Limit: {$backtrack_limit}\n";
echo "Recursion Limit: {$recursion_limit}\n";

$pattern = '/^(.)*$/';
$subject = str_repeat('a', $n);

var_dump(preg_match($pattern, $subject));

...reports the default values (1,000,000 and 100,000 respectively) and matches when run with values 0-2729, and fails (returns false) with values 2730+.

Adjusting either limit to 1000 has no effect on when it fails, so this is probably some internal "maximum number of captures" which is unaffected by configuration.

If I change the regexp to /^.*x$/, it begins failing at the backtrack limit. It can still match above this limit if the subject actually matches.

I'm not having much luck tricking PCRE into crashing locally.

Since I can't really figure out how to make this stuff segfault or what reasonable ranges for these values are, I'm tentatively going to lower these values to 10000 in production (based on sage advice from Rasmus in https://bugs.php.net/bug.php?id=36463) and see what that breaks.

I'd ideally like to issue setup guidance about this, too, but I'd like a better understanding of it so I don't just have to write something like "u should probably set this to magic value X or spooky ghosts will attack ur computer".

We can detect this by testing the preg_match() return value for false. Our consistency on doing this isn't great.

I've deployed these configuration changes to production and they appear to have resolved the issue.

epriestley renamed this task from PCRE segfaults readily with default "pcre.backtrace_limit" and "pcre.recursion_limit" values to PCRE segfaults readily with default "pcre.backtrack_limit" and "pcre.recursion_limit" values.Nov 3 2020, 7:36 PM
epriestley updated the task description. (Show Details)

Calls to preg_* can still be swapped to phutil_preg_* to improve behavior, but this seems basically resolved.