ArcanistJSONLintRenderer fails to produce JSON output with non-UTF-8 encodings
Open, Needs Triage, Public

Description

Test case

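Create an ISO-8859-15 encoded text file that contains at least one non-ASCII character (it is called jsonrenderer-test.txt in the commands below).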
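The sample file attached to the original report is not reproduced here. As a stand-in, a similar file could be generated with the Python sketch below. The file content and the helper names `write_latin9_sample` and `is_valid_utf8` are assumptions made for illustration and do not come from the report.

```python
import tempfile
from pathlib import Path


def write_latin9_sample(path, text='{"key": "\u20ac"}\n'):
    """Write `text` to `path` encoded as ISO-8859-15 and return the raw bytes.

    The default text is a hypothetical sample: it contains a euro sign,
    which ISO-8859-15 stores as the single byte 0xA4.
    """
    data = text.encode("iso8859_15")
    Path(path).write_bytes(data)
    return data


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "jsonrenderer-test.txt"
        raw = write_latin9_sample(target)
        print(raw, "valid UTF-8:", is_valid_utf8(raw))
```

The point of the check is that a lone 0xA4 byte is not a valid UTF-8 sequence, so any strict UTF-8 consumer will reject the file.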

Use the following ~/.arclint configuration:

{
        "linters": {
                "text": {
                        "type": "text",
                        "include": "(\\.txt$)"
                }
        }
}

Now try to lint.

The default console output looks fine:

% arc lint jsonrenderer-test.txt
>>> Lint for jsonrenderer-test.txt:


   Error  (TXT5) Bad Charset
    Source code should contain only ASCII bytes with ordinal decimal values
    between 32 and 126 inclusive, plus linefeed. Do not use UTF-8 or other
    multibyte charsets.

    >>>        1 {"aaa": "a"}

This outputs only a newline instead of a JSON-encoded lint message:

% arc lint --output json jsonrenderer-test.txt

It seems to me that the return value of the JSON encoding step is not checked here, so an encoding failure is silently printed as an empty line.
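The suspected failure mode can be illustrated with a small Python sketch. It is not Arcanist's code: `encode_lint_message`, `render_unchecked`, and `render_checked` are hypothetical names, and the encoder is written to report failure through its return value purely to mimic the behaviour described above.

```python
import json


def encode_lint_message(raw_context: bytes):
    """Return a JSON string, or None if the context is not valid UTF-8.

    Hypothetical stand-in for an encoder that signals failure through
    its return value instead of raising an exception.
    """
    try:
        context = raw_context.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return json.dumps({"code": "TXT5", "context": context})


def render_unchecked(raw_context: bytes) -> str:
    # The failure value is coerced to an empty string, so only "\n" remains.
    return (encode_lint_message(raw_context) or "") + "\n"


def render_checked(raw_context: bytes) -> str:
    # Checking the return value turns the silent failure into a visible error.
    encoded = encode_lint_message(raw_context)
    if encoded is None:
        raise ValueError("lint message could not be JSON-encoded")
    return encoded + "\n"


if __name__ == "__main__":
    latin9 = "\u20ac".encode("iso8859_15")  # euro sign as the single byte 0xA4
    print(repr(render_unchecked(latin9)))
```

With the unchecked variant, the caller cannot tell "no lint messages" apart from "encoding failed", which matches the bare newline seen in the report.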

Other related problem

A further problem is that even for messages raised with raiseLintAtPath(), the JSON renderer adds context (the start of the file), so it fails to encode any lint message for such a file. Use this file (it also contains merge conflict markers, so you can try that linter too, but this is not relevant for this test):

Use this .arclint:

{
        "linters": {
                "filename": {
                        "type": "filename",
                        "include": "(\\.txt$)"
                }
        }
}

Then run:

% arc lint jsonrenderer+test.txt
>>> Lint for jsonrenderer+test.txt:


   Error  (NAME1) Bad Filename
    Name files using only letters, numbers, period, hyphen and underscore.

The JSON output, however, is again just a newline:

% arc lint --output json jsonrenderer+test.txt

Remove the offending non-UTF-8 characters from the file and re-run:

% arc lint --output json jsonrenderer+test.txt
{"jsonrenderer+test.txt":[{"line":null,"char":null,"code":"NAME1","severity":"error","name":"Bad Filename","description":"Name files using only letters, numbers, period, hyphen and underscore.","original":null,"replacement":null,"granularity":1,"locations":[],"bypassChangedLineFiltering":null,"context":"<<<<<<<\n=======\n>>>>>>>\n"}]}

As you can see, the renderer adds context to the message, which is why the earlier run fails when the file contains non-UTF-8 characters.
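One possible mitigation, sketched here in Python and not taken from Arcanist, is to sanitize the context before encoding so that a message is always emitted. The names `safe_context` and `render` are hypothetical, and substituting U+FFFD for undecodable bytes is only one of several reasonable policies (dropping the context would be another).

```python
import json


def safe_context(raw: bytes) -> str:
    """Decode file context as UTF-8, replacing invalid bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def render(path: str, code: str, raw_context: bytes) -> str:
    """Render a single lint message as one line of JSON.

    Only a few of the fields shown in the report's output are included,
    to keep the sketch short.
    """
    message = {"line": None, "code": code, "context": safe_context(raw_context)}
    return json.dumps({path: [message]}) + "\n"


if __name__ == "__main__":
    # Conflict markers plus one ISO-8859-15 byte (0xA4) as sample context.
    print(render("jsonrenderer+test.txt", "NAME1", b"<<<<<<<\n\xa4\n"), end="")
```

The replacement character makes the damage visible in the output while keeping the rest of the lint message intact.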

Version info

libphutil: 796cb1c2ee274397a8a7bc6c10566fd751619d6d
arcanist: 822bc53ca306e06314560d8a76f68771d732e8e0