Description

[Wilds] Fix phutil_is_utf8_slowly() to reject reserved UTF-16 surrogate character ranges

Summary:
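Ref T13209. See T11525. We want to reject certain 3-byte sequences as invalid Unicode: those that would encode the code points reserved for UTF-16 surrogates (U+D800 through U+DFFF, i.e. bytes ED A0 80 through ED BF BF). The primary reason is that json_decode() does not accept them.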

We currently reject them correctly if we go down the fast path in phutil_is_utf8() via mb_check_encoding(), but incorrectly accept them if we go down the slow path.

Add test coverage that the slow path has the same behavior as the fast path, and then make the slow path reject these byte sequences.
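As an illustration of the check described above, here is a minimal sketch in Python rather than PHP. It is not the code from this change: `is_utf8_fast` and `is_utf8_slowly` are hypothetical stand-ins for the fast and slow paths, and Python's strict decoder plays the role of mb_check_encoding(). The key point is that after a lead byte of 0xED the first continuation byte may not exceed 0x9F, which excludes the surrogate range.

```python
def is_utf8_fast(data: bytes) -> bool:
    """Stand-in for the fast path: defer to the runtime's strict decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def is_utf8_slowly(data: bytes) -> bool:
    """Hypothetical byte-by-byte validator that also rejects surrogates."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain ASCII byte.
            i += 1
            continue
        if 0xC2 <= b <= 0xDF:
            need, lo, hi = 1, 0x80, 0xBF
        elif 0xE0 <= b <= 0xEF:
            need = 2
            lo = 0xA0 if b == 0xE0 else 0x80  # excludes overlong forms
            hi = 0x9F if b == 0xED else 0xBF  # excludes U+D800..U+DFFF
        elif 0xF0 <= b <= 0xF4:
            need = 3
            lo = 0x90 if b == 0xF0 else 0x80  # excludes overlong forms
            hi = 0x8F if b == 0xF4 else 0xBF  # excludes code points above U+10FFFF
        else:
            # Stray continuation byte or a lead byte that is never valid.
            return False
        if i + need >= n:
            # Sequence is truncated.
            return False
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(2, need + 1):
            if not 0x80 <= data[i + k] <= 0xBF:
                return False
        i += need + 1
    return True
```

A test in the spirit of this change would feed the same byte strings to both paths and require identical answers, including the boundary cases just outside the surrogate range (ED 9F BF and EE 80 80), which must still be accepted.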

Test Plan:

  • Added failing tests.
  • Made them pass on OSX and Windows 10.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13209

Differential Revision: https://secure.phabricator.com/D19724