Page MenuHomePhabricator

Support `.arcunit`, similar to `.arclint`
Closed, ResolvedPublic

Description

Broad idea is to support a .arcunit file which is structurally similar to .arclint, since .arclint seems like an improvement on the old way of doing things.

Some maybe-good-ideas:

  • Separate the pipeline into three parts: a part that finds tests, a part that executes tests, and a part that parses test results?
    • Rules for where tests go seem to vary a lot, but there are at least a few common patterns ("run every test file" being most common).
      • "Run every path matching some regexp as a test, ignoring modified files" is probably the most common rule, and maybe even covers like 90% of cases.
    • Test execution is usually similar to running an external linter.
    • Since many engines support common output formats (I think?) like XUnit (I think?) separating parsing might be valuable here, while it's not very useful for lint.
  • I think that's about it? I feel like this needs less magic than lint does.

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Maybe something like this:

{
  "paths": {
    "@^src/(.*)/(.*)\.php$@": "src/$1/__tests__/$2TestCase.php",
    "@^webroot/rsrc/@": "src/__tests__/PhabricatorCelerityTestCase.php"
  },
  "engines": [
    "phutil"
  ],
  "parsers": [
    "phutil"
  ]
}

I'm not sure what sort of configuration would be required for engines and parsers.

We might need one more level of stuff to let you run tests in multiple languages, but I think something like that would work.

I'd guess that we can get ~80% of cases with "always run all these tests when a JS file is touched" and ~90% of cases with something like your example with capture/replace, but, e.g., the PHPUnit logic we currently implement is a big complicated mess and libphutil has special logic too. But maybe I'm overthinking that part and we can basically just do a regexp map and then let the handful of special engines implement their weird default rule or take special config or something.

The parsers might also not get specified here at all, since the engines probably dictate the parsers, I guess? So maybe that's just split up internally and not exposed in .arcunit. I can't think of any reason why you'd want to force an engine to emit/parse in a specific format.

{
  "jsunit" : {
    "paths": {
      "type": "all",
      "trigger": "(\\.js$)",
      "run": "(tests/js/.*\\.js$)"
    },
    "engine": {
      "type": "jsunit"
    },
    "parser": {
      "type": "xunit"
    }
  },
  "phpunit": {
    "paths": {
      "type": "all",
      "trigger": "(\\.php$)",
      "run": "(tests/php/.*TestCase.php$)"
    },
    "engine": {
      "type": "phpunit"
    },
    "parser": {
      "type": "xunit"
    }
  }
}

With those simplifications, maybe it's more like this.

{
  "jsunit": {
    "engine": "jsunit",
    "paths": {
      "(\\.js$)": "tests/js/.*\\.js"
    }
  },
  "phpunit": {
    "engine": "phpunit",
    "paths": { 
       "@^src/(.*)/(.*)\.php$@": "tests/$1/$2TestCase.php"
    }
  }
}

Yeah, that seems pretty reasonable.

I had a bit of a look at this and I think that the unit testing code is in need of some attention anyway.

One potential issue is that there is inconstencies between the definitions of "engine" between linting and unit testing. A lint engine glues together multiple linters. A unit test engine is akin to a linter. For these reasons, I think that there may be some breaking changes here.

With .arclint, this inconsistency is hidden from the API, in that there is only one engine, and it's implicit; In theory, the word "engine" isn't used in the lint flow any more.

By analogy, a ConfigurationDrivenUnit<something> would not be exposed, and the .arcunit file will hold test "engines" in the current meaning ("uniter"? "runner"? "tester"?).

I guess I'm suggesting is to make the lint code be more like the unit code (By removing the Engine layer from it). This will still break stuff, but mostly stuff we were trying to get rid of anyway, instead of just rename.

In T5568#13, @avivey wrote:

With .arclint, this inconsistency is hidden from the API, in that there is only one engine, and it's implicit; In theory, the word "engine" isn't used in the lint flow any more.

By analogy, a ConfigurationDrivenUnit<something> would not be exposed, and the .arcunit file will hold test "engines" in the current meaning ("uniter"? "runner"? "tester"?).

I guess I'm suggesting is to make the lint code be more like the unit code (By removing the Engine layer from it). This will still break stuff, but mostly stuff we were trying to get rid of anyway, instead of just rename.

I somewhat agree. I was thinking about this a few days ago actually. I guess I sort of feel that (eventually) there should be no need for any ArcanistLintEngine other than ArcanistConfigurationDrivenLintEngine. However, to get to this stage we would need to make ArcanistConfigurationDrivenLintEngine significantly more powerful such that there would never be any good reason to write your own lint engine... and I'm not sure that this is something that we want to pursue.

With that said, I think that the inconsistencies between linting and (unit) testing is semantic more than anything else. In the linting context, an engine glues together one or more linters. In the unit testing context, there is no "glue" layer.

Support Impact The transition to .arclint seems to have reduced the lint-related support burden substantially.

While digging in there, it looks like having coverage information as a per-test datum is not actually the common practice in any test framework - in all implementations I saw, it's generated separately and just pasted into each test.
Maybe we should have it as a separate field in the diff data too?

I haven't seen much action on this recently.

Since many engines support common output formats (I think?) like XUnit (I think?) separating parsing might be valuable here, while it's not very useful for lint.

I think this fact can be used to encourage linter writers to converge on a common format?

"Run every path matching some regexp as a test, ignoring modified files" is probably the most common rule, and maybe even covers like 90% of cases.

I kind of feel like this is a dangerous assumption, especially mid to large projects since as the project size grows, running partial tests locally becomes quite important.

The parsers might also not get specified here at all, since the engines probably dictate the parsers, I guess?

For JS, I know that mocha lets you chose a bunch of different result formats (XUnit, tap etc.) so this might be a bad idea. May be make that optional and default to something sensible?

It might be nice to have some sort of .arcbuild file to describe how the code needs to be built before tests are run (this could then also be used by Harbormaster?)

Right now for compiled code, each test engine has to hard code the way things are built, which leads to support issues like T9036. By having .arcbuild, this would also greatly simplify things for projects that have multiple build engines (like one build system for server code and one build system for minifying HTML / CSS / JS resources).

Interesting idea, I haven't really considered Arcanist integrations for compiled languages.

.arcbuild would certainly assist with abstracting a lot of complexity that XUnitTestEngine (and the newer variants) have to deal with, especially if we allowed build steps in .arcbuild to designate artifact files for testing (so that the unit test engine doesn't have to calculate the paths to assemblies ugh)

See also T9223, which sort of leads toward an arc build + .arcbuild or some equivalent too.

I think this probably implies building a build tool in arc, though, not just a very thin wrapper around other build tools. I want to hold off on opening this line of development up for as long as possible.

Aaaand have arc build be called arc weld! Seems fitting ;)

Yeah, a lot of the issues mentioned in that Java thread apply to C# as well, although we also have the difficulty that there's no easy way of mapping changed files to individual tests in assemblies.

This is the reason that for C# I've resorted to writing my own linter in PHP that uses the C# AST outputted as JSON. The downside to this is that there's no real way for the linter to perform type checking - it can only look at a single file at a time, so it can only looks at information entirely local to the file, private fields in non-partial classes for example.

Big yes, please to this. I came across this task when looking into a way to exclude our Python tests when creating diffs when only our front-end JS code has been touched, for instance. It seems this feature will allow just that?

This seems a bit stalled, though. Any chance this might see the light of day in the not-too-distant-future? I might be able to spend some time to help out with this, if at all possible?

I've done some work on this and I do intend to continue working on it, but have been somewhat time constrained as of late. The foundations for this task are already in place and many unit tests engines just need to be modernized to work with the .arcunit specification.

There's a couple of other things that I'd like to implement here as well:

  • Formalize the concept of ArcanistExternalUnitTestEngine, similar to ArcanistExternalLinter.
  • Decouple test discovery and test execution. Currently, each unit test engine imposes a certain directory structure and I think that this is unnecessarily inflexible.

Any chance this could be speeded up if we were to sponsor some of the development on this?

If this would indeed let us include/exclude tests based on what code were in fact touched, this would be such a big improvement for us (speeding up every arc diff) that we may be able to help with some funding.

@asteinlein, sorry, I missed your earlier question.

There's no progress on this and we're too busy to prioritize it right now. When this does move forward, it's likely to be part of a larger effort adjacent to T10038.

Yeah, I understand, although too bad if that task is as low-pri and as large an undertaking as it seems. :/

Phabricator is still
Macro awesometown:
So much so that it's almost a shame we can't throw money at you. ;)

This has existed for some time and will be the only option after T13098.