Changeset View
Standalone View
src/view/phui/PHUIInvisibleCharacterView.php
- This file was added.
<?php | |||||
/** | |||||
* API for replacing whitespace characters and some control characters with | |||||
* their printable representations. This is useful for debugging and | |||||
* displaying more helpful error messages to users. | |||||
* | |||||
*/ | |||||
final class PHUIInvisibleCharacterView extends AphrontView { | |||||
private $inputText; | |||||
private $plainText = false; | |||||
// This is a list of the common invisible characters that are | |||||
jcox: I think some of these might cause more confusion if we convert them to the actual name vs. just… | |||||
Done Inline ActionsYeah, I agree on only spelling out the commonly-used ones. It's more clear while debugging to see something like <0x14> (which maybe you can guess at the origin of based on the value -- a bad string length or constant getting added to the string incorrectly or something) than "DEVICE CONTROL 4" which no one uses regularly. I'd probably be fairly conservative: spell out NULL and stuff that has actual keys on a keyboard, and just leave most of the other stuff as hex escapes? epriestley: Yeah, I agree on only spelling out the commonly-used ones. It's more clear while debugging to… | |||||
// actually typeable. Other invisible characters will simply | |||||
Done Inline ActionsI initially had this as a const but apparently you can't have const arrays until PHP 5.6. I currently have 5.5.9 installed. Is this the right way to do const arrays in phab's codebase or should I use the 5.6 way? jcox: I initially had this as a `const` but apparently you can't have `const` arrays until PHP 5.6. I… | |||||
Done Inline ActionsThis is fine/correct. We support PHP way back to the stone ages so you can never use anything cool (well, cool for PHP, like "const"). Sorry. epriestley: This is fine/correct. We support PHP way back to the stone ages so you can never use anything… | |||||
// be displayed as their hex representations. | |||||
private static $invisibleChars = array( | |||||
"\x00" => 'NULL', | |||||
"\t" => 'TAB', | |||||
"\n" => 'NEWLINE', | |||||
"\x20" => 'SPACE', | |||||
); | |||||
public function __construct($input_text) { | |||||
$this->inputText = $input_text; | |||||
} | |||||
Not Done Inline ActionsConsider making this a separate property with accessors (setPlainText(true)). This avoids new Thing($object, true, true, true, false, 3, -1, null); if we add more stuff later -- or, realistically, avoids needing to find/replace every constructor once we get to 2-3 parameters so we don't ever end up in 7-parameter territory. (Or, make renderAsPlainText() a public method, and keep render() as the way to get HTML.) epriestley: Consider making this a separate property with accessors (`setPlainText(true)`). This avoids… | |||||
public function setPlainText($plain_text) { | |||||
$this->plainText = $plain_text; | |||||
return $this; | |||||
} | |||||
public function getStringParts() { | |||||
$input_text = $this->inputText; | |||||
$text_array = phutil_utf8v($input_text); | |||||
for ($ii = 0; $ii < count($text_array); $ii++) { | |||||
$char = $text_array[$ii]; | |||||
$char_hex = bin2hex($char); | |||||
if (array_key_exists($char, self::$invisibleChars)) { | |||||
$text_array[$ii] = array( | |||||
'special' => true, | |||||
'value' => '<'.self::$invisibleChars[$char].'>', | |||||
); | |||||
} else if (ord($char) < 32) { | |||||
$text_array[$ii] = array( | |||||
'special' => true, | |||||
'value' => '<0x'.$char_hex.'>', | |||||
); | |||||
} else { | |||||
$text_array[$ii] = array( | |||||
'special' => false, | |||||
'value' => $char, | |||||
); | |||||
} | |||||
} | |||||
return $text_array; | |||||
} | |||||
private function renderHtmlArray() { | |||||
$html_array = array(); | |||||
$parts = $this->getStringParts(); | |||||
foreach ($parts as $part) { | |||||
if ($part['special']) { | |||||
$html_array[] = phutil_tag( | |||||
'span', | |||||
array('class' => 'invisible-special'), | |||||
$part['value']); | |||||
} else { | |||||
$html_array[] = $part['value']; | |||||
} | |||||
} | |||||
return $html_array; | |||||
} | |||||
Done Inline ActionsThis should probably be phutil_utf8v() ("turn a byte-string of UTF8 into a list/vector of UTF8 characters") to handle UTF8 correctly. I'm also not really sure what we should have this method do for weird UTF8. There are a lot of bizarre things that users can type that won't be obvious when displayed -- T6003 has a couple of examples like all this garbage: X̀́̂̃̄̅̆̇̈̉̊̋̌̍̎̏ Probably fine to largely leave it as-is for now and we can deal with it once it becomes a real problem. epriestley: This should probably be `phutil_utf8v()` ("turn a byte-string of UTF8 into a list/vector of… | |||||
Done Inline ActionsBy convention, prefer $ii over $i -- it responds slightly better to "find / find next" in non-syntax-aware editors, but also brings fortune and good luck. epriestley: By convention, prefer `$ii` over `$i` -- it responds slightly better to "find / find next" in… | |||||
private function renderPlainText() { | |||||
$parts = $this->getStringParts(); | |||||
$res = ''; | |||||
foreach ($parts as $part) { | |||||
$res .= $part['value']; | |||||
} | |||||
return $res; | |||||
} | |||||
public function render() { | |||||
require_celerity_resource('phui-invisible-character-view-css'); | |||||
if ($this->plainText) { | |||||
return $this->renderPlainText(); | |||||
} else { | |||||
return $this->renderHtmlArray(); | |||||
} | |||||
} | |||||
} |
I think some of these might cause more confusion if we convert them to the actual name vs. just rendering the codepoint representation (ie. \x11 should probably just be displayed as \x11 rather than <DEVICE CONTROL 1>).
That could be just because I'm more familiar with some characters than others. Consistency might be the best bet here. Any thoughts on which ones I should remove/keep?