When we run arc diff, we typically spend time doing these things, more or less sequentially:
- Prompting the user to answer various questions.
- Launching $EDITOR to ask the user for a commit message or update message.
- Running linters (roughly arc lint).
- Running unit tests (roughly arc unit).
Currently, these steps run sequentially. In theory, much of this work can happen in parallel instead.
T4281 was an earlier effort to parallelize some of this work. It treated the parent process as a sort of server and the subprocesses as a sort of clients, with the "server" giving the "clients" access to the terminal when they needed to prompt the user. This was very "clever", and also very fragile and hard to debug (users reported unreproducible hangs until I ripped the whole thing out). Although we may have fixed some of the problems with this model in the meantime, others are fairly fundamental.
Currently, there are some legitimate dependencies between these steps. Some of these we can likely remove by restructuring the workflow; others may require more finesse.
In particular, lint can emit any kind of patch, and may instruct arc to edit files in a way which changes their behavior (for example, you can write a linter which replaces every file with a haiku about pasta). If lint modifies files, unit tests which passed before the modifications may fail after the modifications. If we run lint and unit in parallel, then apply the lint fixes, and do not re-run unit tests, we may upload a bad change with metadata that says "tests pass". Realistically, it is generally safe to assume that lint does not break unit tests, but we can't be certain this is true in the general case.
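To make the hazard concrete, here is a minimal Python sketch of the conservative ordering: run lint and unit in parallel, and if lint produced any patches, throw away the unit result and run unit again. The names `run_lint`, `run_unit`, and `apply_patches` are hypothetical callables standing in for the real steps; none of them are arc APIs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_checks(run_lint, run_unit, apply_patches):
    """Run lint and unit concurrently; re-run unit if lint changed files.

    run_lint() returns a list of patches (empty if there is nothing to fix),
    run_unit() returns True when tests pass, and apply_patches(patches)
    modifies the working copy. All three are hypothetical stand-ins.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        lint_future = pool.submit(run_lint)
        unit_future = pool.submit(run_unit)
        patches = lint_future.result()
        unit_ok = unit_future.result()

    if patches:
        apply_patches(patches)
        # The earlier unit result describes the pre-patch tree, so it is
        # stale. Re-run rather than report "tests pass" for code we never
        # actually tested.
        unit_ok = run_unit()

    return unit_ok
```

This gives up the parallel speedup whenever lint emits patches. That is the safe default. Skipping the re-run is an optimization that is only correct if lint patches never change behavior.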
A couple of general technical capability questions:
- Can we actually write a PhutilConfirmFuture which doesn't block? (The answer should be "yes".)
- Can we avoid filling the output buffer in subprocesses by testing if writes to stdout would block? (This should also be a "yes".)
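As a rough illustration of the second question, this Python sketch tests whether a write to a file descriptor would block, using POSIX `select()` with a zero timeout. The helpers `can_write` and `drain_some` are hypothetical names for illustration. The sketch assumes a POSIX system and a pipe-like descriptor, and it is not how arc does (or will do) this.

```python
import os
import select


def can_write(fd, timeout=0.0):
    """Return True if select() reports fd as ready for writing right now."""
    _, writable, _ = select.select([], [fd], [], timeout)
    return bool(writable)


def drain_some(fd, pending):
    """Write at most one PIPE_BUF-sized chunk of pending if fd is ready.

    Returns the bytes that still need to be written. The caller keeps the
    remainder in memory and tries again later instead of stalling in write().
    """
    if pending and can_write(fd):
        # select.PIPE_BUF is the amount that can be written without blocking
        # to a pipe that select() has just reported as writable.
        written = os.write(fd, pending[:select.PIPE_BUF])
        return pending[written:]
    return pending
```

A subprocess would call something like `drain_some(sys.stdout.fileno(), pending)` between units of work, so a slow reader in the parent delays output but does not stall the work itself.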
I broadly expect to:
- Restructure lint and unit so they can operate in a --subprocess sort of mode which just hands back the results without acting on them.
- Do stdout testing in those subprocesses to prevent stalling on the stdout buffer.
- Keep all command-and-control logic in the main process. We can switch this to futures where it makes sense, but if the subprocesses don't block this doesn't really matter.
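The overall shape I have in mind looks roughly like the Python sketch below: the main process starts every worker before waiting on any of them, and each worker only prints a machine-readable result. The workers here are throwaway stand-in scripts, not real `arc lint` or `arc unit` invocations, and the JSON result format is an assumption made up for the example.

```python
import json
import subprocess
import sys

# Stand-in worker: prints one JSON result and exits without acting on it.
WORKER_SCRIPT = (
    "import json, sys; "
    "print(json.dumps({'worker': sys.argv[1], 'ok': True}))"
)


def start_worker(name):
    """Start a stand-in worker process; it gets no access to the terminal."""
    return subprocess.Popen(
        [sys.executable, "-c", WORKER_SCRIPT, name],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
    )


def collect(workers):
    """Wait for each worker and parse the result it handed back."""
    results = {}
    for proc in workers:
        out, _ = proc.communicate()
        result = json.loads(out)
        results[result["worker"]] = result
    return results


def run_parallel(names=("lint", "unit")):
    # Start everything first so the workers overlap, then collect.
    workers = [start_worker(name) for name in names]
    return collect(workers)
```

All decisions (prompting, applying patches, deciding whether to continue) stay in the caller of `run_parallel()`, which is the point of keeping command-and-control in the main process.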
As with other infrastructure changes, this is probably 30 diffs that do nothing and then five lines of actual changes.
I think this is not directly adjacent to other planned arc refactoring, although at least some amount of cleanup is likely inevitable and that should further the cause of T10038, etc.