Diviner Phabricator User Docs Arcanist User Guide: Lint

Arcanist User Guide: Lint
Phabricator User Documentation (Application User Guides)

Guide to lint, linters, and linter configuration.

This is a configuration guide that helps you set up advanced features. If you're just getting started, you don't need to look at this yet. Instead, start with the Arcanist User Guide.

This guide explains how lint works when configured in an arc project. If you haven't set up a project yet, do that first. For instructions, see Arcanist User Guide: Configuring a New Project.

Overview

"Lint" refers to a general class of programming tools which analyze source code and raise warnings and errors about it. For example, a linter might raise warnings about syntax errors, uses of undeclared variables, calls to deprecated functions, spacing and formatting conventions, misuse of scope, implicit fallthrough in switch statements, missing license headers, use of dangerous language features, or a variety of other issues.

Integrating lint into your development pipeline has two major benefits:

  • you can detect and prevent a large class of programming errors; and
  • you can simplify code review by addressing many mechanical and formatting problems automatically.

When arc is integrated with a lint toolkit, it enables the arc lint command and runs lint on changes during arc diff. The user is prompted to fix errors and warnings before sending their code for review, and lint issues which are not fixed are visible during review.

There are many lint and static analysis tools available for a wide variety of languages. Arcanist ships with bindings for many popular tools, and you can write new bindings fairly easily if you have custom tools.

Available Linters

To see a list of available linters, run:

$ arc linters

Arcanist ships with bindings for a number of linters that can check for errors or problems in JS, CSS, PHP, Python, C, C++, C#, Less, Puppet, Ruby, JSON, XML, and several other languages.

Some general purpose linters are also available. These linters can check for cross-language issues like sensible filenames, trailing or mixed whitespace, character sets, spelling mistakes, and unresolved merge conflicts.

If you have a tool you'd like to use as a linter that isn't supported by default, you can write bindings for it. For information on writing new linter bindings, see Arcanist User Guide: Customizing Lint, Unit Tests and Workflows.

Configuring Lint

To configure lint integration for your project, create a file called .arclint at the project root. This file should be in JSON format, and look like this:

{
  "linters": {
    "sample": {
      "type": "pep8"
    }
  }
}

Here, the key ("sample") is a human-readable label identifying the linter. It does not affect linter behavior, so just choose something that makes sense to you.

The type specifies which linter to run. Use arc linters to find the names of the available linters.

Including and Excluding Files: By default, a linter will run on every file. This is appropriate for some linters (like the Filename linter), but normally you only want to run a linter like pep8 on Python files. To include or exclude files, use include and exclude:

{
  "linters": {
    "sample": {
      "type": "pep8",
      "include": "(\\.py$)",
      "exclude": "(^third-party/)"
    }
  }
}

The include key is a regular expression (or list of regular expressions) identifying paths the linter should be run on, while exclude is a regular expression (or list of regular expressions) identifying paths which it should not run on.

Thus, this configures a pep8 linter named "sample" which will run on files ending in ".py", unless they are inside the "third-party/" directory.

In these examples, regular expressions are written in this style:

"(example/path)"

They can be specified with any delimiters, but using ( and ) means you don't have to escape slashes in the expression, so it may be more convenient to specify them like this. If you prefer, these are all equivalent:

"(example/path)i"
"/example\\/path/i"
"@example/path@i"

You can also exclude files globally, so no linters run on them at all. Do this by specifying exclude at top level:

{
  "exclude": "(^tests/data/)",
  "linters": {
    "sample": {
      "type": "pep8",
      "include": "(\\.py$)",
      "exclude": "(^third-party/)"
    }
  }
}

Here, the addition of a global exclude rule means no linter will be run on files in "tests/data/".

Running Multiple Linters: Often, you will want to run several different linters. Perhaps your project has a mixture of Python and Javascript code, or you have some PHP and some JSON files. To run multiple linters, just list them in the linters map:

{
  "linters": {
    "jshint": {
      "type": "jshint",
      "include": "(\\.js$)"
    },
    "xml": {
      "type": "xml",
      "include": "(\\.xml$)"
    }
  }
}

This will run JSHint on .js files, and SimpleXML on .xml files.

Adjusting Message Severities: Arcanist raises lint messages at various severities. Each message has a different severity: for example, lint might find a syntax error and raise an error about it, and find trailing whitespace and raise a warning about it.

Normally, you will be presented with lint messages as you are sending code for review. In that context, the severities behave like this:

  • error When a file contains lint errors, they are always reported. These are intended to be severe problems, like a syntax error. Unresolved lint errors require you to confirm that you want to continue.
  • warning When a file contains warnings, they are reported by default only if they appear on lines that you have changed. They are intended to be minor problems, like unconventional whitespace. Unresolved lint warnings require you to confirm that you want to continue.
  • autofix This level is like warning, but if the message includes patches they will be applied automatically without prompting.
  • advice Like warnings, these messages are only reported on changed lines. They are intended to be very minor issues which may merit a note, like a "TODO" comment. They do not require confirmation.
  • disabled This level suppresses messages. They are not displayed. You can use this to turn off a message if you don't care about the issue it detects.

By default, Arcanist tries to select reasonable severities for each message. However, you may want to make a message more or less severe, or disable it entirely.

For many linters, you can do this by providing a severity map:

{
  "linters": {
    "sample": {
      "type": "pep8",
      "severity": {
        "E221": "disabled",
        "E401": "warning"
      }
    }
  }
}

Here, the lint message E221 (which is "multiple spaces before operator") is disabled, so it won't be shown. The message E401 (which is "multiple imports on one line") is set to "warning" severity.

If you want to remap a large number of messages, you can use severity.rules and specify regular expressions:

{
  "linters": {
    "sample": {
      "type": "pep8",
      "severity.rules": {
        "(^E)": "warning",
        "(^W)": "advice"
      }
    }
  }
}

This adjusts the severity of all "E" codes to "warning", and all "W" codes to "advice".

Locating Binaries and Interpreters: Normally, Arcanist expects to find external linters (like pep8) in $PATH, and be able to run them without any special qualifiers. That is, it will run a command similar to:

$ pep8 example.py

If you want to use a different copy of a linter binary, or invoke it in an explicit way, you can use interpreter and bin. These accept strings (or lists of strings) identifying places to look for linters. For example:

{
  "linters": {
    "sample": {
      "type": "pep8",
      "interpreter": ["python2.6", "python"],
      "bin": ["/usr/local/bin/pep8-1.5.6", "/usr/local/bin/pep8"]
    }
  }
}

When configured like this, arc will walk the interpreter list to find an available interpreter, then walk the bin list to find an available binary. If it can locate an appropriate interpreter and binary, it will execute those instead of the defaults. For example, this might cause it to execute a command similar to:

$ python2.6 /usr/local/bin/pep8-1.5.6 example.py

Additional Options: Some linters support additional options to configure their behavior. You can run this command get a list of these options and descriptions of what they do and how to configure them:

$ arc linters --verbose

This will show the available options for each linter in detail.

Running Different Rules on Different Files: Sometimes, you may want to run the same linter with different rulesets on different files. To do this, create two copies of the linter and just give them different keys in the linters map:

{
  "linters": {
    "pep8-relaxed": {
      "type": "pep8",
      "include": "(^legacy/.*\\.py$)",
      "severity.rules": {
        "(.*)": "advice"
      }
    },
    "pep8-normal": {
      "type": "pep8",
      "include": "(\\.py$)",
      "exclude": "(^legacy/)"
    }
  }
}

This example will run a relaxed version of the linter (which raises every message as advice) on Python files in "legacy/", and a normal version everywhere else.

Example .arclint Files: You can find a collection of example files in arcanist/resources/arclint/ to use as a starting point or refer to while configuring your own .arclint file.

Advanced Configuration: Lint Engines

If you need to specify how linters execute in greater detail than is possible with .arclint, you can write a lint engine in PHP to extend Arcanist. This is an uncommon, advanced use case. The remainder of this section overviews how the lint internals work, and discusses how to extend Arcanist with a custom lint engine. If your needs are met by .arclint, you can skip to the next section of this document.

The lint pipeline has two major components: linters and lint engines.

Linters are programs which detect problems in a source file. Usually a linter is an external script, which Arcanist runs and passes a path to, like jshint or pep8.

The script emits some messages, and Arcanist parses the output into structured errors. A piece of glue code (like ArcanistJSHintLinter or ArcanistPEP8Linter) handles calling the external script and interpreting its output.

Lint engines coordinate linters, and decide which linters should run on which files. For instance, you might want to run jshint on all your .js files, and pep8.py on all your .py files. And you might not want to lint anything in externals/ or third-party/, and maybe there are other files which you want to exclude or apply special rules for.

By default, Arcanist uses the ArcanistConfigurationDrivenLintEngine engine if there is an .arclint file present in the working copy. This engine reads the .arclint file and uses it to decide which linters should be run on which paths. If no .arclint is present, Arcanist does not select an engine by default.

You can write a custom lint engine instead, which can make a more powerful set of decisions about which linters to run on which paths. For instructions on writing a custom lint engine, see Arcanist User Guide: Customizing Lint, Unit Tests and Workflows.

To name an alternate lint engine, set lint.engine in your .arcconfig to the name of a class which extends ArcanistLintEngine. For more information on .arcconfig, see Arcanist User Guide: Configuring a New Project.

You can also set a default lint engine by setting lint.engine in your global user config with arc set-config lint.engine, or specify one explicitly with arc lint --engine <engine>. This can be useful for testing.

There are several other engines bundled with Arcanist, but they are primarily predate .arclint and are effectively obsolete.

Using Lint to Improve Code Review

Code review is most valuable when it's about the big ideas in a change. It is substantially less valuable when it devolves into nitpicking over style, formatting, and naming conventions.

The best response to receiving a review request full of style problems is probably to reject it immediately, point the author at your coding convention documentation, and ask them to fix it before sending it for review. But even this is a pretty negative experience for both parties, and less experienced reviewers sometimes go through the whole review and point out every problem individually.

Lint can greatly reduce the negativity of this whole experience (and the amount of time wasted arguing about these things) by enforcing style and formatting rules automatically. Arcanist supports linters that not only raise warnings about these problems, but provide patches and fix the problems for the author -- before the code goes to review.

Good linter integration means that code is pretty much mechanically correct by the time any reviewer sees it, provides clear rules about style which are especially helpful to new authors, and has the overall effect of pushing discussion away from stylistic nitpicks and toward useful examination of large ideas.

It can also provide a straightforward solution to arguments about style, if you adopt a policy like this:

  • If a rule is important enough that it should be enforced, the proponent must add it to lint so it is automatically detected or fixed in the future and no one has to argue about it ever again.
  • If it's not important enough for them to do the legwork to add it to lint, they have to stop complaining about it.

This may or may not be an appropriate methodology to adopt at your organization, but it generally puts the incentives in the right places.

Philosophy of Lint

Some general thoughts on how to develop lint effectively, based on building lint tools at Facebook:

  • Don't write regex-based linters to enforce language rules. Use a real parser or AST-based tool. This is not a domain you can get right at any nontrivial complexity with raw regexes. That is not a challenge. Just don't do this.
  • False positives are pretty bad and should be avoided. You should aim to implement only rules that have very few false positives, and provide ways to mark false positives as OK. If running lint always raises 30 warnings about irrelevant nonsense, it greatly devalues the tool.
  • Move toward autocorrect rules. Most linters do not automatically correct the problems they detect, but Arcanist supports this and it's quite valuable to have the linter not only say "the convention is to put a space after comma in a function call" but to fix it for you.

Next Steps

Continue by: