Page MenuHomePhabricator

Implement internal workflows / a build engine in Arcanist
Closed, ResolvedPublic


There are several use cases for allowing arc to perform more complex "build-like" logic on various workflows. A few examples are:

  • (T9109, T9223) Run a single build program (like gradle) before arc diff, then read both lint and unit results from the output.
  • Run code generation steps before arc diff, and after merging with arc land or arc patch (for example: the Phabricator upstream has generated maps which would be easier to deal with if they were regenerated automatically).
  • Ask some external system for permission during arc land, and prompt the user about inclement weather (for example: tests are failing at HEAD, the branch is considered retired, or release engineering has frozen it).
  • On a pure Harbormaster/Drydock build pipeline, things are much simpler if arc unit can know how to perform a build and arc unit --everything can just work.
  • (T5786) arc diff currently has some hard-coded prompts which at least one install would like to disable.

Although we could pursue solutions to this class of problems individually in a tailored way, I think requirements in this class have complexity which exceeds what we can reasonably do by inventing JSON DSLs and adding hard-coded extension points. The gradle stuff needs to coordinate with later steps, the "build" and "can I land" stuff would benefit from being independently executable, and various reasonable workflows in this class have sophisticated processing, working copy mutation, or user interaction requirements.

I think it's reasonable for us to push arc in this direction, but we need to make it easy to build, distribute, and maintain extensions first. So the pathway forward is:

  • (T5055) Build a packaging system for Arcanist/Phabricator so this stuff can reasonably be distributed and kept up to date.
  • Design and implement "internal/build" workflows in Arcanist for extending and interacting with its default workflows.

These are both systems with large scope, although we can build a simple version first in both cases and get some useful functionality out of it.

(The description of this as a "build engine" makes it sound enormously complex, but the distinction I'm really making is that I think these workflows are too complex to express with JSON DSLs, and plugins which can express them are too hard to manage until we build T5055. The primary motivation for T5055 is currently linter / unit test plugins, but once it lands it also gives us much better tools to explore extending other workflows.)

Related Objects

Event Timeline

One use case I'd like for you to consider is that the build logic might want to insert data into the commit log or differential revision. For example we may want to add reviewers, subscribers, Jira tickets (from the branch name), etc. While some of this case is handled by Herald, not all of it can be without re implementing existing logic in custom PHP

eadler added a project: Restricted Project.Mar 23 2016, 12:17 AM
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.
eadler moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.

Concretely, I think this breaks down as a mixture of modernizations and new code:

Argument Parsing: (T7489, T7488) Shell completion and command/flag spelling correction should move to libphutil and PhutilArgumentParser. Although shell completion is probably not hugely useful outside of the context of arc, it's not arc specific. Spelling correction works well in all cases and every CLI command could benefit from it. (That said, I think ArcanistWorkflow is probably distinct enough that it should wrap or sit alongside PhutilArgumentWorkflow rather than extend from it -- T7487).

ArcanistConfiguration: This is a @concrete-extensible one-off that we should remove. This is an out-of-date modularization pattern from long ago that we also have in Phabricator (AphrontApplicationConfiguration). The more modern approaches are way better and we should get rid of this.

The primary extension points this provides are:

  • Run arbitrary code before a workflow.
  • Run arbitrary code after a workflow.
  • Run arbitrary cleanup code.
  • Arbitrarily rewrite commands.

I think these will be mostly obsoleted by changes here, and don't know what use cases exist for them. They at least partially predate things like arc alias. They completely predate Harbormaster. I haven't recommended anyone use these hooks in a very very long time. I believe they proved to be mostly useless in practice, as almost everything users wanted to do with them required running code inside a workflow instead.

I plan to remove them if possible, or implement ArcanistCoreEngineExtension for top-level changes if compelling use cases remain.

Possibly, ArcanistConfigurationManager should then be renamed, since that name is at least partially a result of ArcanistConfiguration existing and having a questionable name.

ArcanistWorkflow: This class does too much and is at least partially a big ball of random nonsense.

All the Conduit stuff should move elsewhere and be fixed to address any remaining bugs with logged-out access (T7385?).

The Working Copy / RepositoryAPI stuff should probably also move to separate classes and be restructured to be more on-demand. This is slightly tricky because it is good to predict this early so we can exit with an error message. However, I think there are a lot more cases where we sometimes need a working copy and sometimes do not than I had originally anticipated, and making these construct lazily is likely the better approach.

The link to ArcanistConfiguration should probably get nuked. Perhaps we still need some kind of global shared state object for communication across workflows, but I suspect we may not.

All the argument parsing should move to libphutil.

Weird stuff like "commit mode" and "clean working copy" and "should amend" should move elsewhere.

All the "loadX*" stuff should move to query classes. T11355 will probably provide the first example.

Events: All of these should be removed.

const TYPE_COMMIT_WILLCOMMITSVN   = 'commit.willCommitSVN';
const TYPE_DIFF_DIDCOLLECTCHANGES = 'diff.didCollectChanges';
const TYPE_DIFF_WILLBUILDMESSAGE  = 'diff.willBuildMessage';
const TYPE_DIFF_DIDBUILDMESSAGE   = 'diff.didBuildMessage';
const TYPE_DIFF_WASCREATED        = 'diff.wasCreated';
const TYPE_REVISION_WILLCREATEREVISION = 'revision.willCreateRevision';
const TYPE_LAND_WILLPUSHREVISION  = 'land.willPushRevision';

Instead, we should provide an ArcanistReviewEngineExtension to hook all code review workflows (diff, lint, unit, land).

Here's an effort to collect use cases for the hooks I plan to break or replace:


  • T4565 (Mar 2014) Require users to wait for builds to complete before landing.
    • In modern code, obsoleted by builtin Harbormaster prompts.
    • Implementable with planned "inclement weather" checks.
  • D5268 (Mar 2013) This introduced the event. This event was contributed; no context about why.


  • D3408 (Aug 2012) This introduced the event, "so installs can muck around with titles, etc.".
    • Likely replaced by planned "suggest reviewers / etc" hook.


  • D3259 (Aug 2012) This introduced the event, for running server-side builds.
    • In modern code, obsoleted by Harbormaster with Herald.


  • D3394 (Aug 2012) This introduced the event, for collecting metrics at Facebook.
    • Not a use case I plan to retain support for explicitly.


  • D2449 (May 2012) This introduced the event, for Hive/JIRA integration on Facebook's managed install.
    • I believe Hive has moved away from Phabricator.
    • Not a use case I plan to retain support for explicitly.
    • Likely replaced by planned "suggest reviewers / etc" hook anyway.


  • D4785 (Feb 2013) This introduced the event, for Sandcastle integration at Facebook.
    • Obsoleted by Harbormaster + Staging Areas.


  • D1103 (Nov 2011) This introduced the event, for some kind of mutation at Facebook (git-svn bridging?)
    • Not a use case I plan to retain support for explicitly.
    • We may have a more formal hook here eventually (T2920, etc).

ArcanistConfiguration Callbacks

  • These are all in the initial commit and I don't remember if they had use cases at that time. I was a little less aggressive about never ever giving users the power to do anything back then, so these may have been speculative.
  • One indirect (?) use in T11019.
  • The other hooks have no mentions I can find on this install.
  • I suspect these may be completely unused and not really useful for anything.
  • They may have been used to implement metrics at Facebook, but this isn't a use case I plan to retain explicit support for.

So it looks like we have two legitimate use cases here, both of which I expect to provide first-class support for:

  • Prefill or rewrite revision fields in arc diff based on additional information (for example: prefill Owners as reviewers, infer JIRA ticket from title or branch name).
  • Perform "inclement weather" checks in arc land ("build is broken, don't land things on master right now").
epriestley claimed this task.

I'm going to roll this forward into T13098, which is essentially a superset of all the ideas here except arc build (a new step which can: generate code, generate artifacts shared between lint/unit, and prepare a working copy to be able to execute unit tests). That will probably happen, but likely after T13098 stabilizes.