T9732
Change Details
{F952841} {F952844} I traced through the source code and found that UserName and RealName are tokenized; the tokenizing code is at [[ https://secure.phabricator.com/diffusion/P/browse/master/src/applications/typeahead/datasource/PhabricatorTypeaheadDatasource.php$110 | applications/typeahead/datasource/PhabricatorTypeaheadDatasource.php$110 ]]. The problem is that the preg regexp "/\s+/" splits the single Unicode character "忠" in two. I created a [[https://gist.github.com/qiu8310/6b31f2fde6eea35fadce| gist ]] to show why "忠" gets split. Is there a PHP setting that stops "/\s/" from matching code points in the range 128-255? If not, I think "\s" should be replaced with "[\t\n\f\r]".
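The byte-level effect can be sketched as follows. This is a minimal illustration in Python, not the Phabricator code: the names `LOCALE_LIKE_SPACE`, `ASCII_SPACE`, and `tokenize` are hypothetical, and the first pattern only emulates a byte-oriented `\s` that treats 0xA0 (the no-break space in Latin-1) as whitespace, which is an assumption about how the PHP side behaves. The ASCII-only class also includes a plain space, which is an assumption so that names still split into words.

```python
import re

# UTF-8 encoding of U+5FE0 "忠": three bytes, the last of which is 0xA0.
raw = "忠".encode("utf-8")

# Assumption: emulate a byte-oriented \s whose character table also
# classifies 0xA0 (NBSP in Latin-1) as whitespace.
LOCALE_LIKE_SPACE = re.compile(rb"[\s\xa0]+")

# ASCII-only whitespace class; the leading space is included on purpose.
ASCII_SPACE = re.compile(rb"[ \t\n\f\r]+")


def tokenize(pattern, data):
    """Split bytes on the given pattern and drop empty tokens."""
    return [token for token in pattern.split(data) if token]


name = b"Zhong " + raw

# The trailing 0xA0 is consumed as a separator, leaving a truncated
# (invalid) UTF-8 sequence as the second token.
broken = tokenize(LOCALE_LIKE_SPACE, name)

# With the ASCII-only class the three bytes of the character stay together.
intact = tokenize(ASCII_SPACE, name)

print(broken)
print(intact)
```

Here the second token in `broken` is no longer valid UTF-8, while the second token in `intact` still decodes to "忠". On the PHP side, the PCRE `u` modifier, which makes the pattern and subject be treated as UTF-8, may be another option worth checking.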