See comments. The regexp-based implementation segfaults on fairly small inputs, and there is no way to prevent it. Do it nice and slow in PHP instead.
Details
Ran unit tests.
Diff Detail
- Repository: rPHU libphutil
- Branch: bmp2
- Lint: Lint Passed
- Unit: Tests Passed
Event Timeline
Unfortunate.
This test case doesn't reproduce for me (Debian box), but the PHP bug report indicates that capturing triggers the segfault. Does the same issue reproduce if capturing is disabled?
"/^(?:". ...
src/utils/utf8.php | |
---|---|
72–73 | This function could be condensed a bit (and possibly sped up) if it were written with bitwise operators instead of comparisons, but this looks good to me if you prefer this style. It is likely more readable as-is, too. |
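
To make the comparison-versus-bitwise point concrete, here is a minimal sketch of both styles for one small task: working out how long a UTF-8 sequence should be from its lead byte. Both function names are hypothetical and this is not the code at lines 72–73; it only illustrates the trade-off being discussed.

```php
<?php
// Hypothetical helpers for illustration only; not the code under review.

// Comparison style: classify a lead byte by numeric range.
function example_seq_length_compare($byte) {
  if ($byte < 0x80) {
    return 1;
  } else if ($byte >= 0xC0 && $byte <= 0xDF) {
    return 2;
  } else if ($byte >= 0xE0 && $byte <= 0xEF) {
    return 3;
  } else if ($byte >= 0xF0 && $byte <= 0xF7) {
    return 4;
  }
  // Continuation bytes (0x80-0xBF) and 0xF8-0xFF cannot start a sequence.
  return 0;
}

// Bitwise style: mask off the payload bits and test the prefix pattern.
function example_seq_length_bitwise($byte) {
  if (($byte & 0x80) === 0x00) {
    return 1; // 0xxxxxxx
  } else if (($byte & 0xE0) === 0xC0) {
    return 2; // 110xxxxx
  } else if (($byte & 0xF0) === 0xE0) {
    return 3; // 1110xxxx
  } else if (($byte & 0xF8) === 0xF0) {
    return 4; // 11110xxx
  }
  return 0;
}
```

The two functions return the same value for every byte from 0 to 255. Neither rejects lead bytes that can only begin overlong or out-of-range sequences (0xC0, 0xC1, 0xF5-0xF7), so a strict validator needs additional checks in either style.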
Yeah, I wasn't able to get anything that had even approximately the same behavior. Here's a simple case on my machine:
>>> orbital ~ $ php -r "preg_match('/^(?:a)+$/', str_repeat('a', 1024 * 64));"
Segmentation fault: 11
>>> orbital ~ $ php -v
PHP 5.5.8 (cli) (built: Jan 21 2014 11:16:51)
Copyright (c) 1997-2013 The PHP Group
Zend Engine v2.5.0, Copyright (c) 1998-2013 Zend Technologies
If it shows up in profiles we can provide an extension (T2312) or do something faster for short strings. This isn't really that slow; it's just dramatically slower than it would be in C.
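
One possible reading of "something faster for short strings" is sketched below, purely as an illustration. It assumes that short inputs can safely go through PCRE and that longer inputs should take a byte-by-byte loop. The function names and the `EXAMPLE_FAST_PATH_LIMIT` cutoff are hypothetical and not part of libphutil, and a safe cutoff would depend on the PCRE build and stack size. The fast path relies on `preg_match()` with the `u` modifier failing on invalid UTF-8, which is a different technique from the pattern in this diff.

```php
<?php
// Hypothetical cutoff: an assumption for illustration, not a measured value.
const EXAMPLE_FAST_PATH_LIMIT = 1024;

// Slow path: walk the string one byte at a time in plain PHP.
function example_is_utf8_slowly($string) {
  $len = strlen($string);
  $ii = 0;
  while ($ii < $len) {
    $byte = ord($string[$ii]);
    if ($byte < 0x80) {
      $ii++;
      continue;
    }

    // For each lead byte: number of continuation bytes, plus the allowed
    // range of the FIRST continuation byte (this rules out overlong forms,
    // surrogates, and code points above U+10FFFF).
    if ($byte >= 0xC2 && $byte <= 0xDF) {
      list($extra, $min, $max) = array(1, 0x80, 0xBF);
    } else if ($byte == 0xE0) {
      list($extra, $min, $max) = array(2, 0xA0, 0xBF);
    } else if ($byte == 0xED) {
      list($extra, $min, $max) = array(2, 0x80, 0x9F);
    } else if ($byte >= 0xE1 && $byte <= 0xEF) {
      list($extra, $min, $max) = array(2, 0x80, 0xBF);
    } else if ($byte == 0xF0) {
      list($extra, $min, $max) = array(3, 0x90, 0xBF);
    } else if ($byte >= 0xF1 && $byte <= 0xF3) {
      list($extra, $min, $max) = array(3, 0x80, 0xBF);
    } else if ($byte == 0xF4) {
      list($extra, $min, $max) = array(3, 0x80, 0x8F);
    } else {
      return false; // stray continuation byte or invalid lead byte
    }

    if ($ii + $extra >= $len) {
      return false; // sequence is truncated
    }

    for ($jj = 1; $jj <= $extra; $jj++) {
      $next = ord($string[$ii + $jj]);
      $lo = ($jj == 1) ? $min : 0x80;
      $hi = ($jj == 1) ? $max : 0xBF;
      if ($next < $lo || $next > $hi) {
        return false;
      }
    }
    $ii += $extra + 1;
  }
  return true;
}

// Dispatcher: PCRE for short strings, the PHP loop for everything else.
function example_is_utf8($string) {
  if (strlen($string) < EXAMPLE_FAST_PATH_LIMIT) {
    // With the "u" modifier, preg_match() returns false on invalid UTF-8.
    return preg_match('//u', $string) === 1;
  }
  return example_is_utf8_slowly($string);
}
```

With this split, the 64 KiB input from the reproduction above would never reach PCRE at all; only strings under the cutoff would.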