Page MenuHomePhabricator

Touching only data files for unit tests doesn't run tests with current `.arcunit`
Open, NormalPublic

Description

Currently, the .arcunit files use this rule:

"include": "(\\.php$)"

However, this means that the tests don't run if you only touch unit test data files which don't end in .php. For example, in D13992 I touched two test cases that are in .lint-test files, but that touch didn't trigger the tests to run on arc unit. It seems like it should, since I directly changed the tests and the standard rules can discover the correct tests to run on this data.

We could use the rule (^src/) instead, or possibly just (.*) (or maybe we can omit it entirely, I don't recall the behavior of no rule).

See some prior discussion in D13853. Maybe (.*) is best?

Event Timeline

epriestley renamed this task from Touching test files doesn't run tests with current `.arcunit` to Touching only data files for unit tests doesn't run tests with current `.arcunit`.
epriestley raised the priority of this task from to Normal.
epriestley updated the task description. (Show Details)
epriestley added subscribers: epriestley, joshuaspence.

How about something like this?

"include": [
    "(\\.php$)",
    "(/__tests__/.*/data/)"
]

That wouldn't work in this specific case, but we could restructure the src/lint/linter/__tests__/ subdirectories accordingly?

What file could I touch in a __tests__/ directory that shouldn't trigger tests in that directory?

I guess the issue is knowing which test engine to run... Okay so a file in __tests__/ changes, do I need to invoke the PHP tests or the JS tests? Given the repository structure of this project we could probably just change the regex from \\.php$ to ^src/. Longer term, I wanted a way to map files to tests in a much more flexible way, such as mapping ^src/lint/linter/__tests__/xml/ to src/lint/linter/__tests__/ArcanistXmlLinterTestCase.php.

My gut is that it's OK to hand pretty broad "include" lists to most test engines and assume that they'll handle the mapping rules. Since the Phutil engine can do the right thing from any path, it seems OK to me to hand it pretty much anything.

I'd imagine include is more useful if you have to tell the engine more about the mapping. For example, the Maven test engine in T9223 works by finding pom.xml files to figure out what to do. It seems OK to me for the default behavior in this case to be: always run, and search for pom.xml files under the project root. This tends to make the defaults work correctly for simpler projects, I think.

But you might have a more complex project, project1/ and project2/, and each directory has separate non-interacting tests. Then maybe you'd specify two Maven engines, give one project1/ and one project2/, and they'd do the right (more-granular, properly-rooted) thing. I'm not totally sure that include would even be useful in this case, though, since I'd expect to just hand them everything and they'd quickly no-op if no files were actually under their roots, just like the Phutil engine does.

I'd expect "include" to not really see a ton of use in practice, unless there are engines which can't make a distinction about whether a file is relevant or not quickly even when properly configured. I think it's worthwhile for consistency and likely to find some use cases -- like using exclude to prevent tests from running just because you update some frequently-updated artifact that can't affect unit behavior -- but probably not the right tool for expressing mappings in most cases?

Do you have specific examples in mind where include/exclude are the best way to do the mappings?

I can come up with one: if we had a real dumb "just run this binary to run all the tests; it can not be decomposed" engine. That seems reasonable to some degree, and would always use include/exclude to express the mapping, and presumably just emit one pass/fail result I guess (or maybe do a script-and-regex kind of thing). So in that case it seems reasonable to have the mapping answer "do we run run-php-tests or run-js-tests", but in sophisticated cases like Phutil or Maven or whatever else I expect the engine can usually figure it out with no configuration or with engine-specific configuration (like "your project root is at project1/, not repository root"). I'd guess these cases probably don't need any include/exclude directives 95% or 99% of the time.

In the Maven case, I think it's OK for the default behavior to be something like: mvn test always runs, even if you just update the README, and for future documentation to walk through users getting this kind of over-testing setup working first.

If a project has an issue with that (maybe they update the README a lot and the tests are real, real slow), they ideally switch to using a more granular test engine which is smart enough to do the right thing on its own. Of course, in most cases this isn't possible, or it's very complicated. In these cases, you could exclude "README" or include just \.java to improve the behavior at the cost of some customization, which seems reasonable. I'd expect this to be a sort of "Tuning an Advanced Customization" section, though, not a common need (hopefully).

For the script-and-regex-unit case, too, the default would just be to run run-tests every time when anything was touched, and then if you want to be more selective you can use include and exclude or switch to a more powerful engine.