Page MenuHomePhabricator

Need a way to define common include/exclude patterns in lint
Open, WishlistPublic

Description

Need a way to define common include/exclude patterns in lint.

As it is, our .arclint file has a ton of copy and pasted regex entries. We should be able to define names /groups and use them so if someting needs to change we can do it in one place.

I was using traits for this in the linter itself but it would be nice to have it in the JSON.

Event Timeline

LegNeato raised the priority of this task from to Needs Triage.
LegNeato updated the task description. (Show Details)
LegNeato added a project: Arcanist.
LegNeato added a subscriber: LegNeato.

A way to support this would be to let .arclint be PHP as well as json. The json is super annoying to deal with, and with PHP we could use variables and such.

I'm curious as to what you mean... can you provide an example?

My opinion is that the .arclint file should stay as a JSON file. I think that allowing .arclint to be PHP would be very similar to just writing your own linter engine.

At scale the json format is a huge pain:

  1. Adding entries requires you to take care with commas, likely touching more lines than you need. This also causes rebase conflicts
  2. Can't embed comments without doing a hacky "_comment" entry, which isn't syntax highlighted
  3. It ends up looking like this in practice:

    "merge-conflict": { "type": "merge-conflict", "exclude": [ "((Vendor|assets|public)/)" ] },

    "nolint": { "type": "nolint", "exclude": [ "((Vendor|assets|public)/)" ] },

    "spelling": { "type": "spelling", "exclude": [ "((Vendor|assets|public)/)" ] },

Now if I want to go add "3rd-party" to the blacklist I need to edit a ton of entries. Also, this isn't just for blacklists...obviously I want certain classes of linters to operate on classes of files. So for "c-like" linters, I am repeatedly doing

"include": "(\\.(c|ccm|mm|h|hh)$)".

json seemed nice on paper but it is a huge pain to use in practice for anything non-trivial with multiple engineers and many linters.

OK. A couple of thoughts.

Whilst not strictly JSON, many JSON formats (like Sublime Text configurations for example) do, in fact, allow comments. We could create a custom JSON parser that strips out comments.

With regards to your common exclusions, there is actually a better way to do this. You can define a top level exclude configuration, which will be inherited by all linters. For example:

{
  "exclude": [
    "(^assets/)",
    "(^public/)",
    "(^Vendor/)",
  ],
  "linters": {
    "merge-conflict": {
      "type": "merge-conflict"
    },
    "nolint": {
      "type": "nolint"
    },
    "spelling": {
      "type": "spelling"
    }
  }
}

The global exclude stuff is actually not for *all* of the linters. We, for example, have linters that make sure vendored in stuff has a LICENSE file, etc. So that doesn't help here :-(

Possibly a bit out there, but can you maybe write a PHP (or python/bash/perl/whatever) script to generate the .arclint file? Presumably this stuff doesn't change very often, and whatever your favorite scripting language is should be able to dump out a JSON blob pretty easily.

For high-complexity use cases, writing Engines is still the recommended approach. There are a lot of rules that .arclint can't ever express, even if it's executable (e.g., linting based on file content or service lookups). While this isn't set in stone, I'd like to try to draw the line between .arclint and Engine subclassing at the point where code is executed: if you need execution, subclass Engines.

Ah, ok. In another task we were told to use the new way :D

Ah, sorry about that. Do you remember where it was? I can make sure it gets clarified for future readers.

joshuaspence triaged this task as Wishlist priority.Jun 20 2014, 3:45 PM

@nickz -- sorry, this is the actual ticket. T5057 is somewhat related, but this one is your issue.