
In "phutil_loggable_string()", encode every byte above 0x7F
ClosedPublic

Authored by epriestley on Apr 14 2020, 9:52 PM.
Tags
None
Referenced Files
F12852419: D21117.id50292.diff
Fri, Mar 29, 7:16 AM
F12841279: D21117.id.diff
Thu, Mar 28, 9:05 PM
F12839726: D21117.id50293.diff
Thu, Mar 28, 7:48 PM
Subscribers
None

Details

Summary

Ref T13507. Currently, this function is a bit conservative about what it encodes, and passing it a string of binary garbage may result in an output which is not valid UTF8.

This change could be refined somewhat, since encoding every high byte is less than ideal when the input contains valid UTF8. The ideal behavior for byte sequences where all bytes are larger than 0x7F is probably a variation of "phutil_utf8ize()" that replaces invalid bytes with "<0xXX>" instead of the Unicode error glyph.
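
To make the refined idea concrete, here is a minimal sketch in Python (not the PHP used by Arcanist, and not part of this revision). It assumes the goal is to keep valid UTF-8 sequences intact and render only undecodable bytes as "<0xXX>". The names `_hex_escape`, `hex_escape_sketch`, and `utf8ize_with_hex_sketch` are hypothetical and exist only for illustration.

```python
import codecs


def _hex_escape(exc):
    """Error handler: render each undecodable byte as "<0xXX>"."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("<0x%02X>" % b for b in bad)
    # Resume decoding right after the bytes that failed.
    return (replacement, exc.end)


# Hypothetical handler name, registered only for this sketch.
codecs.register_error("hex_escape_sketch", _hex_escape)


def utf8ize_with_hex_sketch(data: bytes) -> str:
    """Keep valid UTF-8 as-is; replace invalid bytes with "<0xXX>"."""
    return data.decode("utf-8", errors="hex_escape_sketch")


if __name__ == "__main__":
    print(utf8ize_with_hex_sketch("café".encode("utf-8")))
    print(utf8ize_with_hex_sketch(b"a\xffb"))
```

Under this approach, text that is already valid UTF-8 passes through unchanged, which is the property the paragraph above describes as missing from the simpler approach.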

For now, just err on the side of mangling.
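
As a rough illustration of "erring on the side of mangling", the following Python sketch encodes every byte above 0x7F, whether or not it belongs to a valid UTF-8 sequence. It is not the actual "phutil_loggable_string()" implementation: the function name `loggable_string_sketch` is hypothetical, the "<0xXX>" output format is borrowed from the summary as an assumption, and the handling of control characters by the real function is not covered here.

```python
def loggable_string_sketch(data: bytes) -> str:
    """Encode every byte above 0x7F as "<0xXX>"; pass other bytes through.

    The escape format is an assumption made for this sketch.
    """
    parts = []
    for byte in data:
        if byte > 0x7F:
            parts.append("<0x%02X>" % byte)
        else:
            parts.append(chr(byte))
    return "".join(parts)


if __name__ == "__main__":
    # Valid UTF-8 input is mangled too, which is the acknowledged tradeoff.
    print(loggable_string_sketch("é".encode("utf-8")))
    # The first bytes of a gzip stream: 0x8B is encoded, the others are not.
    print(loggable_string_sketch(b"\x1f\x8b\x08"))
```

Because the result contains only bytes in the ASCII range, it is always valid UTF-8, at the cost of also mangling multibyte characters that were valid in the input.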

Test Plan

Dumped various binary payloads in the new gzip setup check, saw sensible output in the web UI.

Diff Detail

Repository
rARC Arcanist
Branch
log1
Lint
Lint Passed
Unit
Tests Passed
Build Status
Buildable 24110
Build 33196: Run Core Tests
Build 33195: arc lint + arc unit

Event Timeline

This revision was not accepted when it landed; it landed in state Needs Review.
Apr 14 2020, 11:03 PM
This revision was automatically updated to reflect the committed changes.