## The Problem
In the Glasgow Haskell Compiler, we have a number of ideas for small linting checks which we'd like to fold into Arcanist's linter. Here are a few:
1. Checking whether testsuite results are updated in a consistent manner (e.g. if a test's expected output on 32-bit platforms is changed, we would also expect the output for 64-bit platforms to change).
2. Checking that submodules are updated in a consistent way
3. Ensuring that AST unique identifiers don't overlap
4. Checking that build artifacts aren't accidentally committed to the repository
We already have scripts which check these invariants and would like to integrate them into the standard Arcanist workflow.
## Current options
Currently, users who want to write project-specific linters have two options:
1. Write an `ArcanistLinter` implementation in PHP
2. Define an ad-hoc format for producing linter messages which can be parsed with `ArcanistScriptAndRegexLinter`
Approach (1) can be a substantial hurdle for projects with little experience with (or desire to work with) PHP. Approach (2) is essentially a workaround for the fact that there is no standardized way of serializing Arcanist's linting messages. Such a standard serialization would make Arcanist's linting mechanism substantially more flexible at little cost.
JSON is a widely used format for serializing structured data, and for good reason: it's extensible, reasonably normalizing, and self-describing. It would be nice if Arcanist provided a linter type similar to `ArcanistScriptAndRegexLinter` but accepting JSON-serialized messages from the invoked script.
## Specification
Similar to `ArcanistScriptAndRegexLinter`, the linter type would have a minimal configuration surface, consisting solely of a command line for invoking an external linter.
The external tool would write a single JSON array to standard output. The array's elements would be JSON objects with a small vocabulary of attributes (following the model of the matches currently accepted by `ArcanistScriptAndRegexLinter`):
- `message` (required) Text describing the lint message. For example,
"This is a syntax error.".
- `name` (optional) Text summarizing the lint message. For example,
"Syntax Error".
- `severity` (optional) The word "error", "warning", "autofix", "advice",
or "disabled", in any combination of upper and lower case.
- `file` (optional) The name of the file to raise the lint message in. If
not specified, defaults to the linted file. It is generally not necessary
to specify this unless the linter can raise messages in files other than
the one it is linting.
- `line` (optional) The line number of the message.
- `char` (optional) The character offset of the message.
- `offset` (optional) The byte offset of the message. If provided, this
supersedes `line` and `char`.
- `original` (optional) The text the message affects.
- `replacement` (optional) The text that should automatically replace the
range captured by `original` in order to resolve the message.
- `code` (optional) A short error type identifier which can be used
elsewhere to configure handling of specific types of messages. For
example, "EXAMPLE1", "EXAMPLE2", etc., where each code identifies a
class of message like "syntax error", "missing whitespace", etc. This
allows configuration to later change the severity of all whitespace
messages, for example.
- `throw` (optional) If set to a string error message, `arc` will throw
the given message and all processing will abort. You can use this to
fail abruptly if you encounter unexpected output.
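
To make the proposed format concrete, here is a minimal sketch of an external linter script for the "build artifacts" check, written in Python. It is only an illustration of the attribute vocabulary above, not an existing Arcanist interface: the artifact suffix list, the `ARTIFACT1` code, the function names, and the assumption that the paths to lint arrive as command-line arguments are all hypothetical.

```python
import json
import sys

# Hypothetical list of build-artifact suffixes; a real project would pick its own.
ARTIFACT_SUFFIXES = (".o", ".hi", ".dyn_o", ".dyn_hi")


def lint_path(path):
    """Return a list of lint-message objects for one path (empty if clean)."""
    if path.endswith(ARTIFACT_SUFFIXES):
        return [{
            "message": "Build artifact '%s' should not be committed." % path,
            "name": "Build Artifact",
            "severity": "error",
            "code": "ARTIFACT1",  # hypothetical message code
            "file": path,
        }]
    return []


def lint_paths(paths):
    """Collect the messages for all paths and serialize them as one JSON array."""
    messages = []
    for path in paths:
        messages.extend(lint_path(path))
    return json.dumps(messages, indent=2)


if __name__ == "__main__":
    # Assumption: the paths to lint are passed on the command line.
    print(lint_paths(sys.argv[1:]))
```

A script like this emits an empty array when nothing is wrong and one object per offending file otherwise, so the linter on the Arcanist side would only need to decode the array and map each object's attributes onto a lint message.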