Page MenuHomePhabricator

Plans: Arcanist toolsets and extensions
Open, NormalPublic

Description

See PHI416. Is lint [cache] "granularity" now completely unused?

See PHI410. An install would like some things which probably start with better arc cascade support (T3875) in the upstream before they get real upstream support.

See PHI102. An install would like the output from arc unit to be generally better.

See PHI84. See T12996. An install would like better performance from arc diff.

See PHI45. An install would like arc land in Mercurial to do a better job of cleaning up commits with the evolve extension enabled.

See T11429. Much of this is still relevant. A major plan is T5055 (package management).

See PHI481. This is a multifaceted request which largely overlaps with other existing requests (better tools for distributing extensions, customizing workflows, etc).

See PHI582. arc diff currently prompts users to mark diffs with more than 4MB of text changes as "binary". Although we probably can not remove this limit entirely (if you add a 20GB text SQL dump to a repository, we can't reasonably ship it over normal JSON Conduit calls or show it in the web UI) it could be much higher than it currently is. More importantly, the UI and workflows should clearly distinguish between "enormous file" and "binary file".

See PHI64. See T784. Changeset annotations, principally @generated, should become more flexible.

See PHI644. See T3271. At least one install would find this "Waiting on Editor" prompt useful.

See PHI644. The delay between arc diff and the $EDITOR prompt for update messages is potentially long enough to get up from your desk in large repositories. It would be nice to push more interactive prompts to the top and/or provide flags to reduce prompts.

See PHI694. An install would like support for custom configuration properties in configuration-driven unit engines, similar to the existing support in configuration-driven lint engines.

See PHI709. An install would like arc to retry network operations, e.g. after losing WiFi connectivity at a coffee shop. Although this is difficult in the general case because it's hard to prevent double writes and recover from successful writes where we lost the acknowledgement, we could likely retry *.search calls automatically.

See PHI858. The current algorithm for removing instructional comments can remove too much in the case of:

Subscribers:
#projectname

arc or phage should be more helpful.

arc --help should work like arc help.

arc quack should be more helpful (not arc quack specifically, but arc <some invalid command>). Perhaps arc quack should also be more useful, this is a pretty good command.

Some configuration options, like repository.callsign and http.basicauth.user, only make meaningful sense to provide below a particular scope. For example, arc set-config http.basicauth.user ... should likely tell the user to run arc set-config --local http.basicauth.user ... instead.

From D19697, some code could be cleaned up by moving the "pure JSON list" test to a phutil_is_natural_list() function or similar.

From D19697, there's some sketchy line unwrapping magic around workflow help in Workflow.


Followup

See PHI705. An install would explicitly like arc set-config ... to be modular.

See PHI403. An install would like to provide some arc-adjacent workflows through a different named binary, e.g. thaumaturgist undiff. In the upstream, we already effectively do this with phage. I think this is generally reasonable and should be formalized/supported.

See PHI954. When a project configures no arc linters or unit tests, it still gets an autoplan build result in the web UI. It should not. See also T13098.


Resolved

Once --config x=y is actually read, we should be explicit about when x=true means true and when it means "true". There's some support for this already. This may happen naturally as a result of general config rewrites.

Revisions and Commits

rPHU libphutil
Abandoned
D19675
rARC Arcanist
D20990
D20085
Audit Required
D19716
D19717
D19715
D19714
D19713
D19712
D19711
D19710
D19709
D19708
D19707
D19706
D19703
D19705
D19700
D19699
D19698
D19697
D19695
Audit Required
D19694
D19691
D19690
D19689
D19693
D19692
D19688
D19687
D19686
D19679
D19678
D19677
D19639

Related Objects

StatusAssignedTask
Openepriestley
Resolvedepriestley
Invalidepriestley
Openepriestley
Resolvedepriestley
OpenNone
Openepriestley
Wontfixepriestley
Wontfixepriestley
Resolvedepriestley
Resolvedepriestley
OpenNone
Resolvedepriestley
Wontfixepriestley
Wontfixjoshuaspence
Wontfixepriestley
ResolvedNone
Wontfixjoshuaspence
Resolvedepriestley
OpenNone
OpenNone
Resolvedjoshuaspence
Resolvedjoshuaspence
OpenNone
Resolvedepriestley
Openepriestley
Resolvedepriestley
Resolvedepriestley
Resolvedepriestley
OpenNone
OpenNone
Resolvedcspeckmim

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
epriestley added a subtask: Restricted Maniphest Task.Sep 24 2018, 3:49 PM
epriestley closed subtask Restricted Maniphest Task as Wontfix.

For prompting, the major things I want to do are:

  • (T13198) Make prompts modular so there's a generic mechanism for answering them from configuration/scripting. To some degree, I expect this to be "enough rope to shoot yourself in the foot" change (i.e., it plausibly enables you to build automation on all kinds of things you probably shouldn't be building automation on) but it will also let us get rid of a fair number of flags.
  • (T12996) It would be nice, although not entirely necessary, for prompts to become nonblocking.

We currently prompt with bash -c ... in most cases, and fgets(STDIN) in a subset of cases.

After the SIGINT changes in D19703, ^C no longer interrupts prompts (it is handled once the prompt exits). I think this is significantly bad. We could cascade the ^C to subprocesses.

But I'm broadly inclined to move away from the bash -c ... approach. The historical reason we do this is so that pressing the up arrow key lets you reuse lint and unit excuses, but I'd like to just remove these features.

In theory, arc unit is next on the docket so we can start getting test coverage, then Windows support. However, I suspect arc unit may unfurl into a lot of substeps.

Here are some of the bigger ideas around arc unit:

First, I plan to combine the the --json, --ugly, and --output flags into a single --format <format> flag which selects from modular formatters. We'll ship "default" and "json" formatters.

I'll possibly repurpose --output to mean --output <file>, but purely because arc unit > file is an antipattern in Bash. This may not happen immediately.

Formatters will receive results as they arrive (so they can output progress or status information) and then receive a call to render final results. The default formatter will stop rendering failures twenty pages up the scrollbuffer, and will change to balance regular feedback with making failures easily recognizable.

The --engine argument will be removed. All unit test configuration will be driven by .arcunit.

I'd like to remove or consolidate --coverage, --no-coverage, and --detailed-coverage but don't have a plan yet.

T9223 is a mess which approximately wants to be able to share an artifact between "lint" and "unit". I'm going to leave this alone for now. I think we'll probably write arc build.

I plan to let "arc land" run unit tests soon. This may introduce a concept of "phases" to unit, e.g. arc unit --phase diff vs arc unit --phase land. I'm going to let this sit for now since worst case is we need to reformat .arcunit and that's fine to do later.

Now that arc unit runs, I'm going to try to make the test suite pass so that tests are a useful tool in moving forward. We currently have 162 test failures.

Most of these are linter test failures related to changes to the WorkingCopy. Since actual linters don't care about the working copy or repository I suspect this won't be enormously difficult to resolve, although we may still be some ways away from being able to evaluate base rules and select commit ranges correctly.

The lint unit tests are significantly entangled with general lint infrastructure, so I'm now looking at making arc lint run again: I want to make sure the infrastructure is headed in the right direction overall before fixing unit tests, so we don't end up with a narrowly-scoped fix to the lint tests that needs to be rewritten once arc lint comes online. In particular, linters currently have their hands very deep in repository / working copy information. Although I think it is plausible that a linter may want access to this information (for example, we have a unit test method which reads configuration to render a link to the Phabricator install, which doesn't seem strictly bad), I think linters and unit tests should not require this information and we should not need to build an entire working copy to run a linter.

I expect to make most of the same changes to lint that I made to unit, e.g. remove support for --engine and unify outputs.

The behavior of arc lint --lintall is historically confusing: it isn't intuitive that "warning" means "show on changed lines", "error" means "show always", and "--lintall" means "--show-warnings-even-if-the-line-did-not-change". This is approximately T4287. I'd like to remove this flag and make the renderer show a hint instead:

19 warnings on lines you didn't touch were not shown. Use arc lint --severity all to show warnings on untouched lines.

I'd like to replace both --everything flags with a flag like --working-copy. Neither flag affects ignored or untracked files when run in a working copy, so --everything is misleading. This was a behavioral change from original implementation but changing the flag was too large a compatibility break at the time. A counterpoint to this is that arc lint and arc unit can currently run without a working copy (this is new in wilds and a consequence of generally better infrastructure, at least so far). It would be nice to preserve this, but then maybe the flag should be different or there should be two flags (but: yuck).

Most of the other outstanding issues with lint are not top-level issues. They may affect API decisions later on, but likely do not affect architecture decisions.

Things which are top-level are:

  • better parallelization (T12996), but I expect improved data pipelines and clearer inputs and responsibilities will make this yield easily;
  • a new "beautifier" phase (T8971), but I expect this to be part of arc build and broadly part of a later iteration where arc build comes into existence with beautifier/generator phases and unit/lint/land integrate with these workflows.

I want to make sure the infrastructure is headed in the right direction overall before fixing unit tests

Although I don't have it running yet, I do have a version in my working copy which seems likely to be headed in the right direction. It ends up sharing a very large amount of code with unit, and the shared code also makes good sense for the future build workflow, so this seems very promising to me.

Much of the current complexity in lint flows is around path management because lint treats paths as a key into a WorkingCopy, which is itself a sort of key into a RepositoryAPI. I think I now have a clear picture of how to simplify this, so I may be able to get arc unit passing by making smaller changes now that the big picture is somewhat clearer to me.

In retrospect, I should have rigged the hash for the the "Remove libphutil" commit to have some cool recognizable prefix like de1e7ed, since half of arcanist/ now ends up there on git blame.

One idea here: T12961 and T13012 are security issues related to the misuse of -- in argument lists. I think there was one more (less severe issue?) with Git elsewhere, too.

Some of the toolset changes are hardening arc against these cases. We could further harden arc by sometimes requiring -- separate flags and arguments:

$ arc diff HEAD^
Usage Exception: Separate flags and arguments with "--".
$ arc diff -- HEAD^
Creating diff...

We probably should not always require this since it's annoying for interactive use, but we'd ideally like to always require it for script use.

Maybe a reasonable test is actually checking if stdin is a TTY, and requiring explicit -- if not. I think this only gets us into trouble if you're doing something like arc diff --raw < diff.txt, which would become arc diff --raw -- < diff.txt. This is somewhat unintuitive, but rare, and we could also easily make arc diff --raw be arc diff --raw <path> instead.