We recently fixed phutil_utf8_shorten() to account for combining characters, so it now does a generally reasonable job of shortening an input to a given number of characters and producing a valid UTF8 output string.
Shortening to a given number of characters is generally what we want, since we most often use this function to shorten titles or summaries and make things fit in a limited display area.
However, sometimes we want to shorten an input string to a given number of bytes. One example is D6118, where we want to reduce an email's length to under a certain size. phutil_utf8_shorten() does not guarantee a maximum byte size. Theoretically, an input might have one x followed by an arbitrarily large number of combining characters, which counts as a single character but can occupy an arbitrarily large number of bytes.
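To make the gap between character count and byte count concrete, here is a small standalone illustration. It does not call libphutil; it assumes PCRE's `\X` (one base character plus its combining marks) is a reasonable stand-in for what the shortening function treats as a single character.

```php
<?php

// One "x" followed by 100 copies of U+0301 COMBINING ACUTE ACCENT.
// U+0301 is encoded as the two bytes 0xCC 0x81 in UTF-8.
$input = 'x' . str_repeat("\xCC\x81", 100);

// Count display characters: \X matches a base character together with
// any combining marks that follow it. (Assumption: this approximates
// the character-counting behavior described above.)
preg_match_all('/\X/u', $input, $matches);
$display_chars = count($matches[0]);

// Count raw bytes.
$bytes = strlen($input);

printf("display characters: %d, bytes: %d\n", $display_chars, $bytes);
```

A limit expressed in characters would let this whole string through, even though it is far larger than a short byte budget.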
There are a couple of approaches here. We could introduce a second function (phutil_utf8_truncate(), or phutil_utf8_shorten_bytes()). We could also add another parameter to phutil_utf8_shorten(), e.g. add $byte_length.
The implementation of byte truncation is nearly identical to the existing implementation: we just need to count strlen($char) against the length instead of implicitly counting 1.
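A minimal sketch of that idea, under stated assumptions: the function name `sketch_utf8_shorten_bytes()` is hypothetical, it splits characters with PCRE's `\X` rather than reusing whatever phutil_utf8_shorten() does internally, and it appends no terminal marker. It is meant to show the byte accounting, not to be the actual patch.

```php
<?php

/**
 * Hypothetical sketch, not the libphutil implementation.
 *
 * Shorten a UTF-8 string to at most $max_bytes bytes without splitting
 * a base character from its combining marks.
 */
function sketch_utf8_shorten_bytes($string, $max_bytes) {
  if (strlen($string) <= $max_bytes) {
    return $string;
  }

  // Split into display characters (base character plus combining marks).
  // With the /u flag, preg_match_all() returns false on invalid UTF-8.
  if (preg_match_all('/\X/u', $string, $matches) === false) {
    throw new InvalidArgumentException('Expected a valid UTF-8 string.');
  }

  $result = '';
  $used = 0;
  foreach ($matches[0] as $char) {
    // Count this character's byte length against the budget, instead of
    // implicitly counting 1 per character.
    $cost = strlen($char);
    if ($used + $cost > $max_bytes) {
      break;
    }
    $result .= $char;
    $used += $cost;
  }

  return $result;
}
```

With this accounting, a character that does not fit in the remaining budget is dropped whole, so the result can be shorter than the limit but never longer.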