Page MenuHomePhabricator

Add a native JSON parser.
ClosedPublic

Authored by joshuaspence on Jun 18 2014, 5:07 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Nov 21, 8:19 AM
Unknown Object (File)
Sun, Nov 17, 2:16 PM
Unknown Object (File)
Sun, Nov 10, 9:29 PM
Unknown Object (File)
Fri, Nov 8, 12:10 PM
Unknown Object (File)
Wed, Oct 30, 7:34 PM
Unknown Object (File)
Sat, Oct 26, 9:02 PM
Unknown Object (File)
Fri, Oct 25, 6:29 PM
Unknown Object (File)
Oct 23 2024, 11:25 PM
Subscribers
Tokens
"The World Burns" token, awarded by joshuaspence.

Details

Summary

Ref T5297. Add JsonLint as an external library and provide PhutilJSONParser as a wrapper around the library's functionality.

Test Plan

Wrote some basic unit tests and ran arc unit.

Diff Detail

Repository
rPHU libphutil
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

joshuaspence retitled this revision from to Add a native JSON parser..
joshuaspence updated this object.
joshuaspence edited the test plan for this revision. (Show Details)
joshuaspence added a reviewer: epriestley.

Here is a small example:

<?php

require_once(__DIR__.'/src/__phutil_library_init__.php');

$parser = new PhutilJSONParser();
var_dump($parser->parse('{}'));
var_dump($parser->parse('{"foo": "bar"}'));
var_dump($parser->parse('{"foo": "bar",}'));

Output

array(0) {
}
array(1) {
  ["foo"]=>
  string(3) "bar"
}
PHP Fatal error:  Uncaught exception 'JsonLintParsingException' with message 'Parse error on line 1:
{"foo": "bar",}
-------------^
Expected: 'STRING' - It appears you have an extra trailing comma' in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php:327
Stack trace:
#0 /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php(224): JsonLintJsonParser->parseError('Parse error on ...', Array)
#1 /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php(17): JsonLintJsonParser->parse('{"foo": "bar",}')
#2 /home/joshua/workspace/github.com/phacility/libphutil/test.php(8): PhutilJSONParser->parse('{"foo": "bar",}')
#3 {main}
  thrown in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/JsonParser.php on line 327
epriestley edited edge metadata.
epriestley added inline comments.
src/parser/PhutilJSONParser.php
4–9

Prefer to make these calls/includes inside the parse() function, in the interest of consistency and having includes be as close to side-effect free as possible. None of the files in this library do anything shady (e.g., use globals) so it shouldn't affect behavior.

This revision is now accepted and ready to land.Jun 18 2014, 5:11 PM

I'm not sure on whether we want to keep the nested directory structure in externals/jsonlint/?

Some other minor inlines.

externals/jsonlint/src/Seld/JsonLint/JsonParser.php
26

I'm not set on this naming convention... basically I just dropped the namespace and prepended all of the class names with JsonLint.

src/parser/PhutilJSONParser.php
18

We should probably catch JsonLintParsingExceptions and throw our own exception. Perhaps:

try {
  return (array)$parser->parse($json);
} catch (JsonLintParsingException $ex) {
  throw new PhuitilJSONParserException($ex);
}
joshuaspence edited edge metadata.
  • Move requires inside PhutilJSONParser::json

Also, we now will have PhutilJSON and PhutilJSONParser... these should maybe be consolidated?

Consolidating the classes and throwing a more tailored exception both seem reasonable.

I think the naming change is fine, and retaining the directory structure will make diffing/updating a little bit easier if we want to pull in patches from the upstream (or try to send them anything).

OK, I'm just writing a bunch of test cases and then this should be good to go.

  • Add unit tests
  • Add a PhutilJSONException class to wrap JsonLintParsingException
  • Add a warning for developers.
  • Rename exception to be more explicit.
  • Slightly better error handling in tests
joshuaspence edited the test plan for this revision. (Show Details)

One inline, looks good otherwise.

src/parser/PhutilJSONParser.php
19–21

This seems a little odd. I would expect this method to throw if you call it with empty input. What's the reasoning here?

src/parser/PhutilJSONParser.php
19–21

I figured that PhutilJSONParser::parse($json) should behave identically (where possible) to json_decode($json, true). json_decode('', true) returns null.

src/parser/PhutilJSONParser.php
19–21

json_decode(null, true) and json_decode(false, true) also return null

Oh, I don't think they should behave similarly -- phutil_json_decode() is the desired behavior in my book. Particularly, inputs like 3, true, "0", quoted strings, etc., produce "best-effort" answers from json_decode() that are never what I expect from a reasonable API.

Upon actually reading the diff, I guess this parser also handles all that stuff?

I think the default mode should require the input to be a real JSON object ({...}) and throw if it isn't. There could be some alternate "whatever man, it's probably JSON" mode (or set of modes?) which parses bare strings, arrays, integers, etc., but I think this is rarely desired.

I think the default mode should require the input to be a real JSON object ({...}) and throw if it isn't.

I think that JSON arrays ([...]) should be allowed too?

src/parser/PhutilJSONParser.php
19–21

Without this conditional, the following happens when trying to parse an empty string:

PHP Warning:  str_repeat(): Second argument has to be greater than or equal to 0 in /home/joshua/workspace/github.com/phacility/libphutil/externals/jsonlint/src/Seld/JsonLint/Lexer.php on line 82
PHP Fatal error:  Uncaught exception 'PhutilJSONParserException' with message 'Parse error on line 1:

^
Expected one of: 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '['' in /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php:22
Stack trace:
#0 /home/joshua/workspace/github.com/phacility/libphutil/test3.php(8): PhutilJSONParser->parse('')
#1 {main}
  thrown in /home/joshua/workspace/github.com/phacility/libphutil/src/parser/PhutilJSONParser.php on line 22
  • Minor change to comment
  • Throw more exceptions.
joshuaspence edited edge metadata.

Bumping this back to "Request Review" because I'm am now a little unclear if this is what you are expecting.

epriestley edited edge metadata.
epriestley added inline comments.
src/parser/PhutilJSONParser.php
33

(After future patches, the instanceof goes away, right?)

36

For clarity, maybe this should say: "... is not a valid JSON object.".

37

This should use PhutilReadableSerializer::printShort(), or something like that, instead of var_export(), since the behavior of var_export() in the presence of recursive datastructures is very bad (they shouldn't be possible here) and not great in the presence of very long strings (they are possible here).

This revision is now accepted and ready to land.Jun 19 2014, 8:15 PM

I think this is desirable, and we'll basically end up with a "shallow/UI" usage:

try {
  $result = parse($input);
} catch (PhutilJSONParserException $ex) {
  throw new PhutilProxyException("details about exactly what you did wrong", $ex);
}

...and a "deep/continue" usage:

try {
  $result = parse($input);
} catch (PhutilJSONParserException $ex) {
  $result = array();
}

These are a little verbose, but consistent, hard to get wrong, and seem generally reasonable to me.

src/parser/PhutilJSONParser.php
33

Yep, after D9623.

36

Agreed.

37

Using PhutilReadableSerializer::printShort() means that an empty string will throw with ' is not a valid JSON object.'.

src/parser/PhutilJSONParser.php
37

Oh, yuck. Fix printShort(), I guess?

joshuaspence edited edge metadata.
  • Update error message
  • Use PhutilReadableSerializer::printShort instead of var_export