Details
- Reviewers
epriestley - Group Reviewers
Blessed Reviewers - Maniphest Tasks
- T5297: Create a native JSON parser.
- Commits
- rPHU5559a21fe307: Add a native JSON parser.
Wrote some basic unit tests and ran arc unit.
Diff Detail
- Repository
- rPHU libphutil
- Branch
- json
- Lint
Lint Passed - Unit
Tests Passed - Build Status
Buildable 1224 Build 1224: [Placeholder Plan] Wait for 30 Seconds
Event Timeline
Here is a small example:
<?php require_once(__DIR__.'/src/__phutil_library_init__.php'); $parser = new PhutilJSONParser(); var_dump($parser->parse('{}')); var_dump($parser->parse('{"foo": "bar"}')); var_dump($parser->parse('{"foo": "bar",}'));
Output
array(0) { } array(1) { ["foo"]=> string(3) "bar" } PHP Fatal error: Uncaught exception 'JsonLintParsingException' with message 'Parse error on line 1: {"foo": "bar",} -------------^ Expected: 'STRING' - It appears you have an extra trailing comma' in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php:327 Stack trace: #0 /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php(224): JsonLintJsonParser->parseError('Parse error on ...', Array) #1 /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php(17): JsonLintJsonParser->parse('{"foo": "bar",}') #2 /home/joshua/workspace/github.com/phacility/libphutil/test.php(8): PhutilJSONParser->parse('{"foo": "bar",}') #3 {main} thrown in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php on line 327
src/parser/PhutilJSONParser.php | ||
---|---|---|
3–8 | Prefer to make these calls/includes inside the parse() function, in the interest of consistency and having includes be as close to side-effect free as possible. None of the files in this library do anything shady (e.g., use globals) so it shouldn't affect behavior. |
I'm not sure on whether we want to keep the nested directory structure in externals/jsonlint/?
Some other minor inlines.
externals/jsonlint/src/Seld/JsonLint/JsonParser.php | ||
---|---|---|
25 | I'm not set on this naming convention... basically I just dropped the namespace and prepended all of the class names with JsonLint. | |
src/parser/PhutilJSONParser.php | ||
17 | We should probably catch JsonLintParsingExceptions and throw our own exception. Perhaps: try { return (array)$parser->parse($json); } catch (JsonLintParsingException $ex) { throw new PhuitilJSONParserException($ex); } |
Also, we now will have PhutilJSON and PhutilJSONParser... these should maybe be consolidated?
Consolidating the classes and throwing a more tailored exception both seem reasonable.
I think the naming change is fine, and retaining the directory structure will make diffing/updating a little bit easier if we want to pull in patches from the upstream (or try to send them anything).
One inline, looks good otherwise.
src/parser/PhutilJSONParser.php | ||
---|---|---|
19–21 | This seems a little odd. I would expect this method to throw if you call it with empty input. What's the reasoning here? |
src/parser/PhutilJSONParser.php | ||
---|---|---|
19–21 | I figured that PhutilJSONParser::parse($json) should behave identically (where possible) to json_decode($json, true). json_decode('', true) returns null. |
src/parser/PhutilJSONParser.php | ||
---|---|---|
19–21 | json_decode(null, true) and json_decode(false, true) also return null |
Oh, I don't think they should behave similarly -- phutil_json_decode() is the desired behavior in my book. Particularly, inputs like 3, true, "0", quoted strings, etc., produce "best-effort" answers from json_decode() that are never what I expect from a reasonable API.
Upon actually reading the diff, I guess this parser also handles all that stuff?
I think the default mode should require the input to be a real JSON object ({...}) and throw if it isn't. There could be some alternate "whatever man, it's probably JSON" mode (or set of modes?) which parses bare strings, arrays, integers, etc., but I think this is rarely desired.
src/parser/PhutilJSONParser.php | ||
---|---|---|
19–21 | Without this conditional, the following happens when trying to parse an empty string: PHP Warning: str_repeat(): Second argument has to be greater than or equal to 0 in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/Lexer.php on line 82 PHP Fatal error: Uncaught exception 'PhutilJSONParserException' with message 'Parse error on line 1: ^ Expected one of: 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '['' in /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php:22 Stack trace: #0 /home/joshua/workspace/github.com/phacility/libphutil/test3.php(8): PhutilJSONParser->parse('') #1 {main} thrown in /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php on line 22 |
Bumping this back to "Request Review" because I'm am now a little unclear if this is what you are expecting.
src/parser/PhutilJSONParser.php | ||
---|---|---|
33 | (After future patches, the instanceof goes away, right?) | |
36 | For clarity, maybe this should say: "... is not a valid JSON object.". | |
37 | This should use PhutilReadableSerializer::printShort(), or something like that, instead of var_export(), since the behavior of var_export() in the presence of recursive datastructures is very bad (they shouldn't be possible here) and not great in the presence of very long strings (they are possible here). |
I think this is desirable, and we'll basically end up with a "shallow/UI" usage:
try { $result = parse($input); } catch (PhutilJSONParserException $ex) { throw new PhutilProxyException("details about exactly what you did wrong", $ex); }
...and a "deep/continue" usage:
try { $result = parse($input); } catch (PhutilJSONParserException $ex) { $result = array(); }
These are a little verbose, but consistent, hard to get wrong, and seem generally reasonable to me.
src/parser/PhutilJSONParser.php | ||
---|---|---|
37 | Oh, yuck. Fix printShort(), I guess? |