Page MenuHomePhabricator

arc unit uses the config of the current installation when running tests
Closed, ResolvedPublic

Description

While running tests after setting up a repro for T12845, I received this crash:

[2017-06-14 19:46:09] ERROR 2: krsort() expects parameter 1 to be array, null given at [/Users/amckinley/src/phacility/core/lib/phabricator/src/applications/maniphest/constants/ManiphestTaskPriority.php:129]
arcanist(head=experimental, ref.master=5d0f5afca8cd, ref.experimental=dc65bfbe5434), corgi(head=master, ref.master=47559c2f3db1), instances(head=master, ref.master=776ec695ac60), libcore(), phabricator(head=T12124, ref.master=3400f24c8b53, ref.T12124=103857925ae3), phutil(head=master, ref.master=74a1350416eb), services(head=master, ref.master=490e2a8b0429)
  #0 krsort(NULL) called at [<phabricator>/src/applications/maniphest/constants/ManiphestTaskPriority.php:129]
  #1 ManiphestTaskPriority::getConfig() called at [<phabricator>/src/applications/maniphest/constants/ManiphestTaskPriority.php:27]
  #2 ManiphestTaskPriority::getTaskPriorityKeywordsMap() called at [<phabricator>/src/applications/maniphest/__tests__/ManiphestTaskTestCase.php:197]
  #3 ManiphestTaskTestCase::moveTask(PhabricatorUser, ManiphestTask, ManiphestTask, boolean) called at [<phabricator>/src/applications/maniphest/__tests__/ManiphestTaskTestCase.php:30]
  #4 ManiphestTaskTestCase::testTaskReordering()
  #5 call_user_func_array(array, array) called at [<arcanist>/src/unit/engine/phutil/PhutilTestCase.php:492]
  #6 PhutilTestCase::run() called at [<arcanist>/src/unit/engine/PhutilUnitTestEngine.php:69]
  #7 PhutilUnitTestEngine::run() called at [<arcanist>/src/unit/engine/ArcanistConfigurationDrivenUnitTestEngine.php:147]
  #8 ArcanistConfigurationDrivenUnitTestEngine::run() called at [<arcanist>/src/workflow/ArcanistUnitWorkflow.php:167]
  #9 ArcanistUnitWorkflow::run() called at [<arcanist>/scripts/arcanist.php:420]
[2017-06-14 19:46:09] ERROR 2: Invalid argument supplied for foreach() at [/Users/amckinley/src/phacility/core/lib/phabricator/src/applications/maniphest/constants/ManiphestTaskPriority.php:28]
arcanist(head=experimental, ref.master=5d0f5afca8cd, ref.experimental=dc65bfbe5434), corgi(head=master, ref.master=47559c2f3db1), instances(head=master, ref.master=776ec695ac60), libcore(), phabricator(head=T12124, ref.master=3400f24c8b53, ref.T12124=103857925ae3), phutil(head=master, ref.master=74a1350416eb), services(head=master, ref.master=490e2a8b0429)
  #0 ManiphestTaskPriority::getTaskPriorityKeywordsMap() called at [<phabricator>/src/applications/maniphest/__tests__/ManiphestTaskTestCase.php:197]
  #1 ManiphestTaskTestCase::moveTask(PhabricatorUser, ManiphestTask, ManiphestTask, boolean) called at [<phabricator>/src/applications/maniphest/__tests__/ManiphestTaskTestCase.php:30]
  #2 ManiphestTaskTestCase::testTaskReordering()
  #3 call_user_func_array(array, array) called at [<arcanist>/src/unit/engine/phutil/PhutilTestCase.php:492]
  #4 PhutilTestCase::run() called at [<arcanist>/src/unit/engine/PhutilUnitTestEngine.php:69]
  #5 PhutilUnitTestEngine::run() called at [<arcanist>/src/unit/engine/ArcanistConfigurationDrivenUnitTestEngine.php:147]
  #6 ArcanistConfigurationDrivenUnitTestEngine::run() called at [<arcanist>/src/workflow/ArcanistUnitWorkflow.php:167]
  #7 ArcanistUnitWorkflow::run() called at [<arcanist>/scripts/arcanist.php:420]

Changing my install's config from null back to the global default resolved the issue, implying that the unit tests are pulling their configuration from my local install, instead of some mocked out config. This might be expected behavior for extracting stuff like locale, but it was counterintuitive when I saw it.

Event Timeline

This is semi-intended, because we have to pull a lot of config (like mysql.host) from your local install for things to make sense: we can't create test fixtures if we don't know how to connect to the database.

Then we override a bunch of stuff in PhabricatorTestCase->willRunTests(), largely to reset it to defaults. Some tests also do overrides.

So basically there's a kind of blacklist of "bad" config which we reset to the defaults, and leave everything else set up however you configured it.

This is arguably the wrong way to do things, and we should perhaps really do the opposite: use your settings for some specific short list of whitelisted values (like mysql.host), and use the defaults by default for everything else.

Some mild arguments against that are:

  • We do occasionally find bugs / bad assumptions as things are set up now, since different configs test different things. This is maybe slightly good on the balance?
  • By the time we reach test execution we've already run a bunch of code with the full local config, and can't really put the genie back in the bottle. If we REALLY wanted to whitelist in a completely correct way, we'd have to move this way up to the start of script execution. This might be kind of a huge mess?
  • Mostly, it just rarely matters and the current approach seems like it's good enough and not especially perilous more or less all the time, even though it's maybe a little silly/backwards.

That makes sense. Probably not much to be done for it (other than potentially doing a config override for maniphest.priorities prior to running ManiphestTaskTestCase).

This is outside the scope of this task, but does it really make sense to run unit tests against the configured production DB? Since they're supposed to be "unit tests", I feel like it would be nice to run them against sqlite or something else that doesn't depend on any potential quirks of the user's DB configuration. Not sure if we depend on MySQL-specific features anywhere that don't exist in sqlite (but fwiw my dev environment has been running on MariaDB with no problems).

I can't really come up with a problem that would prevent, even theoretically -- do you have a specific scenario in mind offhand where some kind of MySQL configuration could mask a test failure locally?

I'm not sure if SQLite implements things like, say, "IN BOOLEAN MODE" for FULLTEXT, global locks, SHOW SLAVE STATUS, the various collations, etc., the same way as MySQL. Presumably it does not implement the mysql meta-database the same way, either. We don't necessarily test that stuff today, but might in the future.

To me, the advantages of SQLite are:

  • You don't have to run a database to run the tests (but, for us, everyone running the tests always has a local database -- and if they don't, they should, since they can't test their change manually without one).
  • You don't have to do all the work to namespace all your logical databases, global lock names, etc (but we already did that work, and needed to for other reasons anyway).

At Facebook, everyone connected directly to production and namespacing would have been a trillion years of work, so SQLite made sense. But, for this project, my gut is that SQLite wouldn't realistically ever prevent any problems (I can't come up with any plausible ones, at least), but would be a lot of work to implement and might create a lot of exciting new problems rooted in behavioral differences between MySQL and SQLite.

Philosophically I'm happy with that. Just as an exercise though, I came up with:

  • For nebulous "security reasons", mysql.user might not have permissions to CREATE/DROP databases (or that permission was removed after bin/storage upgrade)
  • Someone might have really important stuff in a database they decided to name phabricator_unittests_sdfefsadfagmnxc
  • The DB might be hosted somewhere with a "pay by the byte" storage model (like RDS in AWS), and our hypothetical unit test that imports the linux kernel git repo ends up costing someone $$$ (especially because failed test runs don't get cleaned up automatically)
  • That same "import the linux kernel" unit test takes 10 minutes and massively slows down the production DB
  • Someone defeats the unit test DB sandboxing (on purpose or by accident) and accidentally reads/writes from the production databases (or acquires a global lock without releasing it)
  • Unit test DB traffic is very bursty and shows up on someone's ops dashboard as attempted data exfiltration, triggering a general freak out
  • Developers running unit tests simultaneously could interfere with each other if their randomly generated database names collide

I don't think any of this is seriously worth worrying about though, given the potential downsides you outlined. Maybe we should just add a little advisory in the developer documentation saying "beware that unit tests will run against the same database server you have configured for production" so when someone complains that running unit tests destroyed their world, we have somewhere to point to. Or another idea that just occurred to me is adding config options for unittest.mysql.user, unittest.mysql.host, etc etc that override the normal config.

Anyway, I'll leave it up to you to close this if you don't think there's anything worth addressing now.

Oh, I don't think anyone should ever be running unit tests with actual production credentials. You should assume that arc unit will immediately escape all sandboxing and destroy all the data it can touch, then run BENCHMARK() for an hour and then crash the server. But you should assume arc unit will do this even if it is "supposed" to be Totally Seriously Sandboxed 100% into SQLite.

We could maybe add more specific safeguards against this (e.g., refuse to run when not in development mode?) if we got a whiff of anyone ever doing it, but all of the "tests against production" stuff is a general problem with pointing development environments at production, I think.

epriestley claimed this task.

I think this is worth acting on (by changing MySQL sandboxing or config sandboxing or some combination of things) if:

  • We see evidence that anyone thinks arc unit on production systems is a great idea. I haven't seen any, but this is definitely a Very Bad Idea.
    • (Maybe the best defense against this would be: refuse to run if not in development mode with a huge scary warning, then turn off development mode for actually running the tests?)
  • We start seeing a lot of issues / high rate of change around the manual config reset list in PhabricatorTestCase. I think we get one new entry every ~6 months today and this doesn't feel like it's risky/churning a lot or anything right now, even though it doesn't feel especially great.

Swapping the whitelist feels like a lot of work without stronger justification and I believe no one runs arc unit in production (hopefully?) so let's just sit on this until there's more motivation.