
Reject nonminimal representations of UTF-8 at the beginning of the 3-byte BMP range
ClosedPublic

Authored by epriestley on Feb 23 2014, 8:35 PM.

Details

Summary

Ref T1191. The byte sequences \xE0\x80\x80 through \xE0\x9F\xBF are nonminimal ("overlong") 3-byte encodings of code points U+0000 through U+07FF, which already have a preferred, shorter 1- or 2-byte representation. MySQL and mbstring both reject them, and we should too.
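
To illustrate why this range is nonminimal, here is a minimal Python sketch. It is not the PHP code in src/utils/utf8.php; the function names are hypothetical and chosen only for this example. It extracts the code point bits from a 3-byte pattern and flags any sequence whose lead byte is \xE0 and whose second byte is below \xA0.

```python
def three_byte_payload(seq: bytes) -> int:
    """Extract the code point bits from a 3-byte UTF-8 pattern.

    Hypothetical helper for illustration: it does NOT check minimality,
    it only unpacks 1110xxxx 10xxxxxx 10xxxxxx into an integer.
    """
    return ((seq[0] & 0x0F) << 12) | ((seq[1] & 0x3F) << 6) | (seq[2] & 0x3F)


def is_overlong_three_byte(seq: bytes) -> bool:
    """Return True for sequences in the range E0 80 80 through E0 9F BF.

    These decode to U+0000..U+07FF, which fit in 1 or 2 bytes, so the
    3-byte form is a nonminimal representation.
    """
    return (
        len(seq) == 3
        and seq[0] == 0xE0
        and 0x80 <= seq[1] <= 0x9F
        and 0x80 <= seq[2] <= 0xBF
    )


def strict_decoder_accepts(seq: bytes) -> bool:
    """Report whether Python's built-in strict UTF-8 decoder accepts seq."""
    try:
        seq.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Under these assumptions, `three_byte_payload(b"\xE0\x9F\xBF")` yields 0x7FF, a code point whose minimal encoding is the 2-byte sequence \xDF\xBF, while \xE0\xA0\x80 (U+0800) is the first 3-byte sequence that is actually minimal.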

Test Plan

Ran unit tests.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

arice added inline comments.
src/utils/utf8.php
126

*like

src/utils/utf8.php
126

o why thank you

epriestley updated this revision to Unknown Object (????). Feb 23 2014, 11:24 PM
  • "lik" -> "like"