[Wilds] Sanitize UTF8 output in `tsprintf(...)` under Windows

Authored by epriestley on Oct 2 2018, 5:43 PM.

Description

Summary:
Ref T13209. In PHP, when you `echo` or `print` certain invalid UTF-8 sequences to the `cmd.exe` terminal under Windows 10, the entire string just vanishes into the ether: nothing is printed at all.

I ran into this because `arc unit` was reporting "1 failing test" but not actually printing the test failure. The failing test was the surrogate filtering test, and its failure message contained a reserved UTF-16 surrogate sequence ("Expected: <filtered result>; Actual: <unfiltered result>"). See D19724.

To limit the damage this can cause, explicitly `phutil_utf8ize(...)` the output under Windows. I'm not doing this unconditionally because, when the sanitization isn't needed, I think it's slightly better to leave the output alone: occasionally, the raw input is useful for debugging or understanding something.
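
As a rough illustration of the "sanitize only under Windows" idea, here is a minimal standalone sketch. It is not the code in this change: `sanitize_for_terminal()` and `replace_invalid_utf8()` are hypothetical names, the byte-level regex is a stand-in for `phutil_utf8ize(...)`, and replacing each bad byte with `?` is an assumption made for the example.

```php
<?php
// Hypothetical stand-in for a UTF-8 sanitizer; NOT phutil_utf8ize() itself.
// Keeps every well-formed UTF-8 character and replaces any other byte.
function replace_invalid_utf8(string $text, string $replacement = '?'): string {
  // Byte-oriented pattern (no /u modifier): group 1 matches one well-formed
  // UTF-8 character, excluding overlongs and surrogates (ED A0..BF xx).
  $pattern =
    '/([\x00-\x7F]'
    . '|[\xC2-\xDF][\x80-\xBF]'
    . '|\xE0[\xA0-\xBF][\x80-\xBF]'
    . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
    . '|\xED[\x80-\x9F][\x80-\xBF]'
    . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'
    . '|[\xF1-\xF3][\x80-\xBF]{3}'
    . '|\xF4[\x80-\x8F][\x80-\xBF]{2})'
    . '|./s';

  return preg_replace_callback(
    $pattern,
    function (array $m) use ($replacement): string {
      // Group 1 is only set when a well-formed character matched.
      return (isset($m[1]) && $m[1] !== '') ? $m[1] : $replacement;
    },
    $text);
}

// Hypothetical wrapper: sanitize only when writing to a Windows terminal,
// and otherwise pass the raw bytes through untouched.
function sanitize_for_terminal(string $text, ?bool $is_windows = null): string {
  if ($is_windows === null) {
    // Assumption: a backslash directory separator means Windows.
    $is_windows = (DIRECTORY_SEPARATOR === '\\');
  }
  return $is_windows ? replace_invalid_utf8($text) : $text;
}

// Forcing the Windows path for demonstration:
echo sanitize_for_terminal("a\xED\xA0\x80b", true), "\n"; // prints "a???b"
```

Passing the platform flag as a parameter keeps the sketch testable on any OS; on non-Windows systems the default path returns the input byte-for-byte, which preserves the raw data for debugging.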

Test Plan:

  • Wrote a script which ran `echo tsprintf("%s", "<invalid surrogate sequence>");`.
  • On Windows 10 in cmd.exe, saw it print something instead of printing nothing.
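
A sketch of what such a test script could look like without libphutil loaded, so it echoes the raw bytes directly instead of going through `tsprintf(...)`. The byte sequence `ED A0 80` (U+D800 written as three UTF-8-style bytes) is an assumed example of an invalid surrogate sequence, and `is_wellformed_utf8()` is a hypothetical helper, not a libphutil function.

```php
<?php
// Hypothetical helper: true if $bytes is well-formed UTF-8 according to PCRE.
function is_wellformed_utf8(string $bytes): bool {
  // With the /u modifier, preg_match() returns false for malformed subjects.
  return preg_match('//u', $bytes) === 1;
}

// Assumed example input: the UTF-16 high surrogate U+D800 encoded as bytes.
$surrogate = "\xED\xA0\x80";
$line = "before[" . $surrogate . "]after\n";

// Report validity on stderr so the result is visible even if the echo
// below prints nothing in the terminal.
fwrite(STDERR, is_wellformed_utf8($line) ? "valid\n" : "invalid\n");
echo $line;
```

Running this from `cmd.exe` before and after the change (with the `echo` routed through `tsprintf(...)` in the real script) would be one way to compare "prints nothing" against "prints something".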

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13209

Differential Revision: https://secure.phabricator.com/D19725