Details
- Reviewers
epriestley - Group Reviewers
Blessed Reviewers - Maniphest Tasks
- T5297: Create a native JSON parser.
- Commits
- rPHU5559a21fe307: Add a native JSON parser.
Wrote some basic unit tests and ran arc unit.
Diff Detail
- Repository
- rPHU libphutil
- Branch
- json
- Lint
Lint Passed - Unit
Tests Passed - Build Status
Buildable 1225 Build 1225: [Placeholder Plan] Wait for 30 Seconds
Event Timeline
Here is a small example:
<?php require_once(__DIR__.'/src/__phutil_library_init__.php'); $parser = new PhutilJSONParser(); var_dump($parser->parse('{}')); var_dump($parser->parse('{"foo": "bar"}')); var_dump($parser->parse('{"foo": "bar",}'));
Output
array(0) {
}
array(1) {
["foo"]=>
string(3) "bar"
}
PHP Fatal error: Uncaught exception 'JsonLintParsingException' with message 'Parse error on line 1:
{"foo": "bar",}
-------------^
Expected: 'STRING' - It appears you have an extra trailing comma' in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php:327
Stack trace:
#0 /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php(224): JsonLintJsonParser->parseError('Parse error on ...', Array)
#1 /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php(17): JsonLintJsonParser->parse('{"foo": "bar",}')
#2 /home/joshua/workspace/github.com/phacility/libphutil/test.php(8): PhutilJSONParser->parse('{"foo": "bar",}')
#3 {main}
thrown in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php on line 327| src/parser/PhutilJSONParser.php | ||
|---|---|---|
| 4–9 | Prefer to make these calls/includes inside the parse() function, in the interest of consistency and having includes be as close to side-effect free as possible. None of the files in this library do anything shady (e.g., use globals) so it shouldn't affect behavior. | |
I'm not sure on whether we want to keep the nested directory structure in externals/jsonlint/?
Some other minor inlines.
| externals/jsonlint/src/Seld/JsonLint/JsonParser.php | ||
|---|---|---|
| 26 | I'm not set on this naming convention... basically I just dropped the namespace and prepended all of the class names with JsonLint. | |
| src/parser/PhutilJSONParser.php | ||
| 18 | We should probably catch JsonLintParsingExceptions and throw our own exception. Perhaps: try { return (array)$parser->parse($json); } catch (JsonLintParsingException $ex) { throw new PhuitilJSONParserException($ex); } | |
Also, we now will have PhutilJSON and PhutilJSONParser... these should maybe be consolidated?
Consolidating the classes and throwing a more tailored exception both seem reasonable.
I think the naming change is fine, and retaining the directory structure will make diffing/updating a little bit easier if we want to pull in patches from the upstream (or try to send them anything).
One inline, looks good otherwise.
| src/parser/PhutilJSONParser.php | ||
|---|---|---|
| 19–21 | This seems a little odd. I would expect this method to throw if you call it with empty input. What's the reasoning here? | |
| src/parser/PhutilJSONParser.php | ||
|---|---|---|
| 19–21 | I figured that PhutilJSONParser::parse($json) should behave identically (where possible) to json_decode($json, true). json_decode('', true) returns null. | |
| src/parser/PhutilJSONParser.php | ||
|---|---|---|
| 19–21 | json_decode(null, true) and json_decode(false, true) also return null | |
Oh, I don't think they should behave similarly -- phutil_json_decode() is the desired behavior in my book. Particularly, inputs like 3, true, "0", quoted strings, etc., produce "best-effort" answers from json_decode() that are never what I expect from a reasonable API.
Upon actually reading the diff, I guess this parser also handles all that stuff?
I think the default mode should require the input to be a real JSON object ({...}) and throw if it isn't. There could be some alternate "whatever man, it's probably JSON" mode (or set of modes?) which parses bare strings, arrays, integers, etc., but I think this is rarely desired.
| src/parser/PhutilJSONParser.php | ||
|---|---|---|
| 19–21 | Without this conditional, the following happens when trying to parse an empty string: PHP Warning: str_repeat(): Second argument has to be greater than or equal to 0 in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/Lexer.php on line 82
PHP Fatal error: Uncaught exception 'PhutilJSONParserException' with message 'Parse error on line 1:
^
Expected one of: 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '['' in /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php:22
Stack trace:
#0 /home/joshua/workspace/github.com/phacility/libphutil/test3.php(8): PhutilJSONParser->parse('')
#1 {main}
thrown in /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php on line 22 | |
Bumping this back to "Request Review" because I'm am now a little unclear if this is what you are expecting.
| src/parser/PhutilJSONParser.php | ||
|---|---|---|
| 33 | (After future patches, the instanceof goes away, right?) | |
| 36 | For clarity, maybe this should say: "... is not a valid JSON object.". | |
| 37 | This should use PhutilReadableSerializer::printShort(), or something like that, instead of var_export(), since the behavior of var_export() in the presence of recursive datastructures is very bad (they shouldn't be possible here) and not great in the presence of very long strings (they are possible here). | |
I think this is desirable, and we'll basically end up with a "shallow/UI" usage:
try {
$result = parse($input);
} catch (PhutilJSONParserException $ex) {
throw new PhutilProxyException("details about exactly what you did wrong", $ex);
}...and a "deep/continue" usage:
try {
$result = parse($input);
} catch (PhutilJSONParserException $ex) {
$result = array();
}These are a little verbose, but consistent, hard to get wrong, and seem generally reasonable to me.
| src/parser/PhutilJSONParser.php | ||
|---|---|---|
| 37 | Oh, yuck. Fix printShort(), I guess? | |