Adds a basic linter for executing checkstyle and PMD, parsing the output into lint messages. These linters were based around the existing ArcanistCppcheckLinter.php.
Closes T9886
Differential D14632
Add Java linters, checkstyle and PMD cspeckmim on Dec 2 2015, 5:53 AM. Authored by Tags None Referenced Files
Tokens
Details
Adds a basic linter for executing checkstyle and PMD, parsing the output into lint messages. These linters were based around the existing ArcanistCppcheckLinter.php. Closes T9886 Checkstyle
"checkstyle": { "type": "checkstyle", "include": "(\\.java$)", "bin": "./misc-utils/lib/jars/checkstyle-6.13-all.jar", "flags": [ "-c", "/Users/cspeckrun/Desktop/google_checks.xml" ], "checkstyle.simplify-source-classname": true }
PMD
"pmd": { "type": "pmd", "include": "(\\.java$)", "bin": "./misc-utils/lib/pmd-5.4.1/lib/pmd-core-5.4.1.jar", "flags": [ "-rulesets", "rulesets/java/unusedcode.xml", "-dir" ] }
CPD
"pmd": { "pmd.command": "cpd", "type": "pmd", "include": "(\\.java$)", "bin": "./misc-utils/lib/pmd-5.4.1/lib/pmd-core-5.4.1.jar", "flags": [ "--minimum-tokens", "5", "--files" ] }
Diff Detail
Event TimelineThere are a very large number of changes, so older changes are hidden. Show Older Changes Comment Actions @cburroughs: I read through some of the discussion on T9223 and I'm not entirely sure I follow what's desired - whether maven would be invoking arc, vice-versa, and when, etc. I'm also not familiar with arc unit, as we have a jenkins cluster running our unit tests without any integration to phabricator. I'm guessing when that task gets figured out the XML parsing code here can just be reused as needed by maven. Comment Actions Thought a bit more about the maven side - are you imagining some thing where running arc lint runs a maven task? arc lint โ mvn basic-linting โ report back to arc In that case you would need a new ArcanistExternalLinter which would invoke maven, using the "flags" field to indicate which maven task to run. There's difficulty however in knowing how to parse the maven output -- since different tasks would have different outputs. Maybe something like: "my-maven-lint": { "type": "maven", "include": "(\\.java$)", "bin": "maven", "flags": [ "maven-lint-task" ], "maven.task_type": "checkstyle" } Where the maven.task_type field references another ArcanistExternalLinter which is used to parse the output of the maven task. Something similar could be done for gradle. We use gradle, but only for creating builds and not for linting or unit tests. Comment Actions
Comment Actions These changes are functionally complete and I have verified them according to the test plan - I removed the [WIP] flag from the title and I consider this now ready for review. Comment Actions Going to add PMD as another java linter as well - which provides more typical linting capabilities compared to code-style of checkstyle. Comment Actions My last commit was diff'd and reverted the initial checkstyle linter.. I wish I had more experience with git. Comment Actions OK This is ready to go. I added PMD support as well, allowing for it's sub-command 'CPD' for copy-paste detection. Comment Actions Fixed CPD to not throw an error when there's no copy/paste detected. Also adjusted the message to not spit out literal '\n' Comment Actions I apologize for the noise on this one. I've run through all the test cases I wrote up in the Test Plan field and everything appears to be working as expected.
Comment Actions I also use simplify classnames, but left the default to be explicit. I don't feel particularly strong one way about it -- mostly that I don't really know the extent to how users use it (would having colliding class names for checkstyle lint checks be common?). I'll plan to change it soon from your feedback. Comment Actions This is what it looks like without the simplify: i.e., the full message is there, and it's coming from the code (I assume). So even if someone is writing custom rules, and gives them the same class names for different issues (Which is pretty out there, imho), the long error would be different. I haven't actually looked into this - maybe checkstyle exposes something else that would be a better name / short message? Comment Actions The XML format doesn't have much else data -- here's the source that writes out the XML for an error. The source is the fully-qualified classname of the checker class. @Override public void addError(AuditEvent event) { if (event.getSeverityLevel() != SeverityLevel.IGNORE) { writer.print("<error" + " line=\"" + event.getLine() + "\""); if (event.getColumn() > 0) { writer.print(" column=\"" + event.getColumn() + "\""); } writer.print(" severity=\"" + event.getSeverityLevel().getName() + "\""); writer.print(" message=\"" + encode(event.getMessage()) + "\""); writer.println(" source=\"" + encode(event.getSourceName()) + "\"/>"); } } I just poked around at their other formatter (the default one) which does use this code, essentially same thing except also removes 'Check' from the beginning of the checker's classname. /** * Returns check name without 'Check' suffix. * @param event audit ivent. * @return check name without 'Check' suffix. */ private static String getCheckShortName(AuditEvent event) { final String checkFullName = event.getSourceName(); final String checkShortName; final int lastDotIndex = checkFullName.lastIndexOf('.'); if (lastDotIndex == -1) { if (checkFullName.endsWith(SUFFIX)) { checkShortName = checkFullName.substring(0, checkFullName.lastIndexOf(SUFFIX)); } else { checkShortName = checkFullName.substring(0, checkFullName.length()); } } else { if (checkFullName.endsWith(SUFFIX)) { checkShortName = checkFullName.substring(lastDotIndex + 1, checkFullName.lastIndexOf(SUFFIX)); } else { checkShortName = checkFullName.substring(lastDotIndex + 1, checkFullName.length()); } } return checkShortName; } The only other way to change that would be to make a custom 'logger' or there may be way to register a custom AuditEventFormatter. Digging through the AuditEvent class it looks like the only thing it tracks is the Class<?> of the checker - so getting further detail on the source for a more friendly name is likely not easily doable without a lot of customization in checkstyle itself. Comment Actions "CustomImportOrderCheck" is not that bad a name, and "Custom Import Order" (Caps to space, sans "Check") is only a little bit clearer (For Java programmers ๐ ). I'm not sure any more effort is useful beyond the current simplify code. Comment Actions I have an update for defaulting to simplifying class names as @avivey suggests* however the unit test is now failing. It looks like the paths resolved when running unit test are not generating correct path names anymore: 'java' '-jar' '/usr/local/bin/checkstyle-6.13-all.jar' '-f' 'xml' '-c' '/Users/cspeckrun/Projects/neandrake/phacility/arcanist/src/lint/linter/tests/checkstyle/google_checks.xml Notice that it's using /linter/tests/ and not /linter/__tests__/. I don't really know right now why this is happening. I did verify when setting the path in the ArcanistCheckstyleLinterTest class that dirname(__FILE__) does properly resolve with the __tests__ so there's probably something else going on that I'm not sure where to look. The change I made is working appropriately however. _* I actually changed the format of the config option as it felt weird to just swap the default -- in that if you want to affect the behavior you would set the setting to false. I changed it to instead be checkstyle.fully-qualify-sourcename which defaults to false. Comment Actions When I took this code to my library, the checkstyle linter refused to analyze files not ending with .java, so the tests failed anyway (CS 6.4-ish, I think, even after adding .lint-test in the xml). I'm not sure what the state of the world in T10038 is, but I think it's something like "Very little chance of upstreaming", so you can probably cheat a little in the test code if it will get things going. I have no guess about the /__test__/ becoming /test/ thing. Do other linter tests run? Comment Actions It might make sense to extract the Interpreter Flags to a separate diff, because it's more generally useful. Comment Actions
I specifically had to modify the ArcanistLinterTestCase to write out the intermediate file using a different name because of this issue (which was also an issue for PMD) (see my inline comment: https://secure.phabricator.com/D14632#inline-49549). The issue with tests vs. __test__ I'm not sure how to wiggle in order to get passing. I explicitly have it set a path with __tests__ and somewhere along the line it's getting translated to tests. Other UT's seem to run fine (I don't have some of the tools installed though so they skip). Comment Actions
I wish I had done this earlier, as my knowledge of using git to manipulate commits is extremely limited -- I think I'd just need to know how to rebase this change on top of another diff/commit. Additionally I've had some difficulty in the past working with dependent diffs with arc/phab, though it was with using mercurial. Comment Actions OK - I extracted the relevant changes to a new revision for now: D15067 Will need to figure out how to update this diff to be dependent on it rather than include the changes itself. I think:
Comment Actions That should work (use git rebase -i for both 1 and 2). You might want to squash everything - that might reduce the total number of conflicts to resolve. Then arc diff <other commit> to update this one... Comment Actions Inversing the checkstyle flag for simplifying sourcename and defaulting so that sourcenames are simplified. Comment Actions @avivey @cspeckmim Sorry to intrude, but I'm also interested in this functionality. I've read the comments but what needs to be done to move this forward? I assume getting D15067 landed? And I think the unit tests for this change need to be fixed? Comment Actions Yes those things need to happen. The changeset was working but then I worked on splitting out part of it into D15067 but haven't revisited this since. Part of the reason is that there's little chance this would be accepted upstream (T10038#150629), and I haven't yet had a chance to work within my organization to setup a fork of arcanist which all developers would install from, in order to utilize this functionality. Comment Actions @cspeckmim So basically T5055 needs to be done before any linter can be added in a maintainable way? There is no recent activity in T5055 and since it is an architectural change it is really hard for the community to do since it would always require closely working with upstream and apparently they have other priorities. I guess there is not much else to do then to off-load the linting to the CI build engine then. Comment Actions Unfortunately I think that's the current situation until they are funded to prioritize T5055. Comment Actions I would be willing to chime in for T5055 however I doubt anyone will want to foot the entire bill for such a large architectural change. I also think that upstream is the one that would benefit the most from it since it enables people to contribute more easy. I mean we both clearly want to contribute by extending phabricator (you even made the contribution already) but are unable. Comment Actions Until T5055, this change looks like it can live in the Community Resources page... The only modified file is a test-engine, so just bundling a copy of the test engine in there would just work. It's far from ideal, yeah, but at least it's usable. Comment Actions In any case, these lints can take a long time to run, probably too long for users to agree to run them, so running them in the CI isn't a bad choice. Comment Actions
@avivey - This was split into two diffs and is dependent on D15067, as it does require some changes to the ArcanistExternalLinter. Unfortunately not as simple as dropping in new linter files. Comment Actions A workaround I did for this in my package looks like this:
resources/java-jar.sh #! /usr/bin/env bash java -jar "$@"
ArcanistCheckstyleLinter.php public function getDefaultInterpreter() { $lib = 'libname'; $root = dirname(phutil_get_library_root($lib)); return $root.'/resources/java-jar.sh'; } This probably doesn't work on Windows, but most other things don't either. Comment Actions @cspeckmim: Is there any chance this will be merged into master? (Sorry for joining in late) Comment Actions As-is definitely not (caused some UT failures and probably needs more UT). I stopped working on it primarily as Phacility has communicated that they would no longer be taking new linters upstream, per T10038 and related tasks/discussions. Comment Actions Hello! Comment Actions @pranshu1995 - you can manually run PMD, or set it up to IDEs like Eclipse. Currently it's not supported by arcanist. Comment Actions @cspeckmim Okay! Thanks for replying. |