Wrote a bunch of test cases to cover these changes; all of them now pass.
Fuzzed `json_encode(phutil_utf8ize($string))` on random strings in a loop. Before these changes it would fail after a handful of attempts, in less than a second. After these changes, I ran it for several minutes and didn't see any failures.
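For reference, the fuzz loop was roughly along these lines. This is a minimal sketch, not the exact script: the helper name `fuzz_json_encode()`, the 64-byte length cap, and the iteration count are assumptions made for illustration. The sanitizer is passed in as a callable so the sketch runs without libphutil loaded.

```php
<?php

/**
 * Hypothetical helper (not part of libphutil): feed random byte strings
 * through a sanitizer and check that json_encode() accepts the result.
 *
 * Returns the first raw input whose sanitized form fails to encode, or
 * null if every iteration succeeded.
 */
function fuzz_json_encode($sanitize, $iterations, $max_length = 64) {
  for ($ii = 0; $ii < $iterations; $ii++) {
    // Build a string of random bytes, most of which are not valid UTF-8.
    $length = mt_rand(0, $max_length);
    $string = '';
    for ($jj = 0; $jj < $length; $jj++) {
      $string .= chr(mt_rand(0, 255));
    }

    // Suppress warnings, since some PHP versions warn on invalid UTF-8
    // instead of (or in addition to) returning false.
    $encoded = @json_encode(call_user_func($sanitize, $string));

    if ($encoded === false || json_last_error() !== JSON_ERROR_NONE) {
      return $string;
    }
  }

  return null;
}
```

With libphutil loaded, something like `fuzz_json_encode('phutil_utf8ize', 1000000)` should return `null` if no failing input turns up; a non-null return value is the raw input that triggered the failure, which can go straight into a new test case.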