[Wilds] Sanitize UTF8 output in `tsprintf(...)` under Windows

Authored by epriestley on Oct 2 2018, 5:43 PM.


Ref T13209. In PHP, when you echo or print a string containing certain invalid byte sequences to the cmd.exe terminal under Windows 10, the entire string just vanishes into the ether: nothing is printed at all, not even the valid parts.

I ran into this because arc unit was reporting "1 failing test" but not actually printing a test failure. That's because the failing test was the surrogate filtering test, and its failure message contained a reserved UTF16 surrogate sequence ("Expected: <filtered result>; Actual: <unfiltered result>"), so the whole message was dropped. See D19724.

To try to limit the damage this can cause, explicitly phutil_utf8ize(...) the output under Windows. I'm not doing this unconditionally because, when we don't need to sanitize, I think it's slightly better not to: occasionally, the raw input might be useful in debugging or understanding something.
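
The conditional sanitizing described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this change: sketch_utf8ize() is a hypothetical stand-in for phutil_utf8ize(...) (whose actual replacement behavior may differ), sketch_terminal_output() and its $is_windows parameter are invented for the example, and the platform check via PHP_OS is an assumption about how Windows might be detected.

```php
<?php

// Hypothetical stand-in for phutil_utf8ize(): replace every byte that is not
// part of a well-formed UTF-8 sequence with U+FFFD (the replacement character).
function sketch_utf8ize($string) {
  // Well-formed UTF-8 only: no overlong forms, nothing above U+10FFFF, and
  // no surrogates (U+D800..U+DFFF, which would be encoded as ED A0..BF xx).
  $valid =
    '[\x00-\x7F]'.
    '|[\xC2-\xDF][\x80-\xBF]'.
    '|\xE0[\xA0-\xBF][\x80-\xBF]'.
    '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'.
    '|\xED[\x80-\x9F][\x80-\xBF]'.
    '|\xF0[\x90-\xBF][\x80-\xBF]{2}'.
    '|[\xF1-\xF3][\x80-\xBF]{3}'.
    '|\xF4[\x80-\x8F][\x80-\xBF]{2}';

  // No "u" modifier: the pattern runs in byte mode, so the trailing "."
  // consumes exactly one stray byte whenever no valid sequence matches.
  return preg_replace_callback(
    '/('.$valid.')|./s',
    function ($m) {
      // Group 1 is only present when a well-formed sequence matched.
      return isset($m[1]) ? $m[1] : "\xEF\xBF\xBD";
    },
    $string);
}

// Sanitize only when the output is headed for a Windows terminal; elsewhere
// the raw bytes pass through untouched. $is_windows is a hypothetical
// parameter so both branches can be exercised on any platform.
function sketch_terminal_output($string, $is_windows = null) {
  if ($is_windows === null) {
    $is_windows = (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN');
  }
  return $is_windows ? sketch_utf8ize($string) : $string;
}
```

Under these assumptions, a script would call echo sketch_terminal_output($text) instead of echoing $text directly. For example, the bytes "\xED\xA0\x80" (the surrogate U+D800 written in UTF-8 style, picked here purely as an illustration) come back as three replacement characters on the Windows branch and unchanged on the other branch.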

Test Plan:

  • Wrote a script which ran echo tsprintf("%s", "<invalid surrogate sequence>").
  • On Windows 10 in cmd.exe, saw it print something instead of printing nothing.

Reviewers: amckinley

Reviewed By: amckinley

Maniphest Tasks: T13209

Differential Revision: https://secure.phabricator.com/D19725