
In "phutil_loggable_string()", encode every byte above 0x7F
ClosedPublic

Authored by epriestley on Apr 14 2020, 9:52 PM.
Tags
None
Subscribers
None

Details

Summary

Ref T13507. Currently, this function is a bit conservative about which bytes it encodes, so passing it a string of binary garbage may produce output that is not valid UTF-8.

This could be refined somewhat, since encoding every high byte is less than ideal when the input contains valid UTF-8. The ideal behavior for byte sequences where all bytes are larger than 0x7F is probably a variation of "phutil_utf8ize()" that replaces invalid bytes with "<0xXX>" instead of the Unicode error glyph.
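As a rough illustration of that "ideal" variant, the sketch below keeps valid UTF-8 sequences intact and renders only the undecodable bytes as "<0xXX>". It is written in Python purely for illustration and is not the Arcanist code; the names `utf8ize_with_hex` and `hexreplace` are hypothetical.

```python
import codecs


def _hex_replace(err):
    # Hypothetical error handler: render each undecodable byte as "<0xXX>"
    # rather than substituting the Unicode replacement character.
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    return ("".join("<0x%02X>" % b for b in bad), err.end)


codecs.register_error("hexreplace", _hex_replace)


def utf8ize_with_hex(data: bytes) -> str:
    # Valid UTF-8 sequences decode normally; invalid bytes go through
    # the handler registered above.
    return data.decode("utf-8", errors="hexreplace")
```

With this sketch, `utf8ize_with_hex(b"caf\xc3\xa9 \xff")` keeps the valid "é" and only rewrites the stray 0xFF byte.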

For now, just err on the side of mangling.
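A minimal sketch of the "err on the side of mangling" approach, in Python for illustration only (the real function is PHP, and its implementation is not shown on this page). The function name `loggable_string` and the `\xXX` escape format are assumptions; bytes at or below 0x7F are passed through unchanged here, whatever the real function does with control characters.

```python
def loggable_string(data: bytes) -> str:
    # Escape every byte above 0x7F, even when it is part of a valid
    # UTF-8 sequence, so the result is always pure ASCII.
    out = []
    for byte in data:
        if byte > 0x7F:
            out.append("\\x%02X" % byte)  # assumed escape format
        else:
            out.append(chr(byte))
    return "".join(out)
```

The trade-off is visible with valid input: `loggable_string("é".encode("utf-8"))` yields the two escaped bytes `\xC3\xA9` rather than the character itself, but the output can never be invalid UTF-8.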

Test Plan

Dumped various binary payloads in the new gzip setup check, saw sensible output in the web UI.

Diff Detail

Repository
rARC Arcanist
Branch
log1
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 24110
Build 33196: Run Core Tests
Build 33195: arc lint + arc unit

Event Timeline

This revision was not accepted when it landed; it landed in state Needs Review.
Apr 14 2020, 11:03 PM
This revision was automatically updated to reflect the committed changes.