Distribution mechanism for arc extensions
Open, Wishlist, Public
Assigned To

None

Authored By

	champo
	May 14 2014, 6:59 PM

Description

The idea is to have a simpler way to distribute libphutil libraries for Arcanist than emailing a tarball to all users.

Discussion at https://secure.phabricator.com/chatlog/channel/6/?at=137189

The two big use-cases are:

Get the company's extensions/configuration to all users in an easy way (without adding them to each repo).
Distribute third-party extensions, just like apt-get/npm/etc.

A more formal list of requirements (Mostly gathered from epriestley's comments around):

It should support installing arcanist extensions, Phabricator applications, and libphutil libraries.
- i.e., it should handle configuration for phabricator and arcanist
- Maybe it should even support installing third-party stuff like linters.
- Maybe it should even support installing third-party dependencies like Node?
Packages should be signed by the author, and you should only need to trust the author to trust the package.
- totally compromising a Phabricator install should be insufficient to compromise users of that install by tainting packages. If you (@avivey) sign a package, I (@epriestley) should be unable to taint it, even if you distribute it through secure.phabricator.com.
Packages should be able to define dependencies, and it should handle installing them.
For Arcanist, packages may be specified either by the project (.arcconfig) or by global configuration (.arcrc).
It should handle running different versions of the same package in different projects.
Have a way to require an upgrade, or to alert users that it's time to upgrade a package.
Should not require the phab-marketplace to know about my extension (Because it's internal to my company and has all my secrets).
Support Linux, Mac OS X, and Windows.
"List all the things I have loaded/installed"
Should work in an environment where arc is mounted in a read-only location.

Important challenges:

Organization: Dumping directories next to things won't last very long and will run into issues with everything else here, as well as making it hard for us to do things like "list all the stuff that is installed". We would quickly need to have better rules about where stuff goes.
Versioning: How do we know something needs to be updated? How do we organize, store, and include multiple versions of a package?
Dependencies: How do we manage dependencies? How do we deal with cases like "diamond dependencies", where A depends on B and C, and B and C depend on different versions of D?
Security: How do we make sure that compromised user accounts don't lead to remote code execution on all users' machines? Code signing is probably the solution here, but it's complicated.
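
As a rough illustration of the "diamond dependencies" challenge, the sketch below walks a dependency graph and reports packages requested at more than one version. The package names, version strings, and the `find_version_conflicts` helper are hypothetical, and the model is deliberately simplified: dependencies are keyed by name only, so every version of a package is assumed to have the same dependencies.

```python
from collections import defaultdict

# Hypothetical data: package -> {dependency name: required version}.
# A depends on B and C; B and C require different versions of D.
DEPENDENCIES = {
    "A": {"B": "1.0", "C": "1.0"},
    "B": {"D": "1.0"},
    "C": {"D": "2.0"},
    "D": {},
}


def find_version_conflicts(root, dependencies):
    """Return {package: [versions]} for packages required at 2+ versions."""
    required = defaultdict(set)  # dependency name -> versions requested
    stack = [root]
    seen = set()
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        for dep, version in dependencies.get(name, {}).items():
            required[dep].add(version)
            stack.append(dep)
    return {
        name: sorted(versions)
        for name, versions in required.items()
        if len(versions) > 1
    }


conflicts = find_version_conflicts("A", DEPENDENCIES)
```

Detecting the conflict is the easy part; whether to load both versions side by side or refuse the install is the open design question raised above.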

Revisions and Commits

rARC Arcanist
	Needs Review	D21485 Packages: Load'em from .cache

Related Objects

Status	Assigned	Task
Resolved	epriestley	T8116 Prototype a package management application
Duplicate	None	T8115 arcanist plugin system
Open	None	T9223 Allow `arc diff` to run a build step like `gradle` first, then read lint and unit messages from the output
Open	None	T10620 Support complex mappings from branch names to JIRA/Maniphest tasks
Open	None	T10622 Auto add reviewers based on custom logic
Open	epriestley	T13098 Plans: Arcanist toolsets and extensions
Resolved	epriestley	T10329 Implement internal workflows / a build engine in Arcanist
Open	None	T5055 Distribution mechanism for arc extensions
Open	None	T13229 On Third-Party Integrations

Event Timeline

epriestley added a parent task: T10329: Implement internal workflows / a build engine in Arcanist.Feb 11 2016, 6:37 PM

nickz added a subscriber: nickz.Feb 18 2016, 7:02 PM

epriestley mentioned this in T10523: Herald Differential Diffs rule for "Added files"..Mar 5 2016, 9:12 PM

featherless awarded a token.Mar 6 2016, 2:46 AM

featherless added a subscriber: featherless.

epriestley mentioned this in T10605: Implement setOriginalText() and setReplacementText() in RuboCop driver.Mar 16 2016, 6:03 PM

epriestley mentioned this in T9788: config/versions page - also show dates of commits.Mar 19 2016, 2:11 AM

epriestley mentioned this in T10652: Allow arcanist to plan changes to a revision.Mar 23 2016, 4:17 PM

epriestley mentioned this in T10832: Evaluate Git remote execution vulnerabilities with 2GB pathnames.Apr 18 2016, 1:20 PM

jparise added a subscriber: jparise.Apr 26 2016, 9:21 PM

jparise awarded a token.Apr 26 2016, 9:50 PM

epriestley mentioned this in T10885: Arcanist Windows install docs - adding in additional php extensions.Apr 27 2016, 1:37 PM

epriestley mentioned this in Z1336: General Chat.Apr 29 2016, 12:16 PM

Firehed mentioned this in D15895: Modernize PhpunitTestEngine to work with .arcunit.May 12 2016, 7:11 PM

epriestley mentioned this in T10939: Support for OWNERS files.May 13 2016, 1:44 PM

epriestley mentioned this in D15913: Use "fa-shopping-bag" instead of "fa-list-alt" for Owners package icon.May 14 2016, 12:17 AM

epriestley mentioned this in T5267: Localize Phabricator.May 24 2016, 3:44 PM

epriestley mentioned this in T10895: Support`arc browse --revision <commit>` and make `arc browse` with no arguments mean `... --revision HEAD`.Jun 1 2016, 11:06 PM

epriestley mentioned this in T1982: Link to symbols defined or altered in the Revision.Jun 9 2016, 2:47 PM

epriestley mentioned this in T9789: Make it easier to write custom transaction types.Jun 9 2016, 4:40 PM

epriestley mentioned this in T11126: Allow embedding XKCD panes in Remarkup.Jun 10 2016, 12:35 AM

siepkes added a subscriber: siepkes.Jun 11 2016, 1:43 PM

siepkes mentioned this in D14632: Add Java linters, checkstyle and PMD.Jun 11 2016, 1:57 PM

svemir awarded a token.Jun 14 2016, 7:24 PM

epriestley mentioned this in T4631: Allow Differential to raise warnings on the server side via Conduit.Jun 17 2016, 2:48 PM

epriestley mentioned this in T11257: HTML in Diffusion not escaped in certain circumstances.Jul 2 2016, 1:12 PM

greggrossmeier added a subscriber: greggrossmeier.Jul 2 2016, 6:31 PM

epriestley mentioned this in T11297: Provide a way to filter arc unit coverage by coverable paths.Jul 8 2016, 2:56 PM

epriestley mentioned this in T11251: Diff dependency should cause shared lines to be hidden.Jul 12 2016, 1:24 PM

epriestley mentioned this in T11334: Add support for Ipsilon as an Auth Provider.Jul 15 2016, 5:34 PM

epriestley mentioned this in T11343: Generate default "Depends on" line in commit message when multiple diffs are stacked.Jul 18 2016, 11:03 PM

epriestley moved this task from Backlog to vMajor on the Arcanist board.Jul 21 2016, 12:11 PM

epriestley mentioned this in D14152: [Not ready for review] initial code dump of Packages app.Jul 21 2016, 2:54 PM

epriestley mentioned this in D16315: Add PackagesPackage.Jul 21 2016, 6:27 PM

epriestley mentioned this in T8116: Prototype a package management application.Jul 22 2016, 2:53 PM

isfs added a subscriber: isfs.Jul 24 2016, 7:49 AM

epriestley mentioned this in Roadmap.Jul 24 2016, 11:23 AM

epriestley merged a task: T8115: arcanist plugin system.Jul 24 2016, 11:43 AM

epriestley added subscribers: calfzhou, sophiebits, csilvers.

Changes connected to T8116 implement the initial server-side version of this. It's still very skeletal, but we probably need to make some client changes to move forward. In particular, the next object to implement is probably Signature, but signature algorithms should live in the client since the client will need to be able to verify signatures.

Roughly, arc will get new/expanded workflows:

arc upgrade: Today, this means "upgrade Arcanist". In the future, it will potentially mean several things:

Upgrade Arcanist, the client.
Upgrade extensions installed in Arcanist (very rare?).
Upgrade the current working directory (impossible/never?).
Upgrade extensions installed in libraries in the current working directory.
Upgrade global system software (future?)

I expect these to all live in the upgrade command. The default behavior will either become "upgrade everything" or "prompt, asking the user what to upgrade".

arc install: New command. This now gets several meanings:

Install a new extension into Arcanist.
Install a new extension into the Arcanist configuration for the current project.
Install a new extension or application into a library in the current working directory.
Install software on the system globally.
Download configured extensions for the current project or library.

There's some ambiguity here too, but arc install with no arguments probably means "synchronize everything so it is up to date", while arc install <package> probably means either "guess" or "prompt".

We can narrow down what arc install <package> means by giving packages types, like "Arcanist Extension", "Phabricator Application", "Library", "System Package", etc. It can then select a narrower range of reasonable install behaviors.

arc sign: New command. Sign a publisher, package, or version. This is used when publishing or attesting to the correctness of packages.

arc version: Today, this means "show Arcanist version". In the future, it will likely mean "show versions of all installed stuff" instead.
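
To make the idea of narrowing `arc install <package>` by package type concrete, here is a minimal sketch. The type names and install behaviors come from the lists above; the mapping between them and the `plan_install` helper are assumptions made for illustration, not anything the thread settles.

```python
from enum import Enum


class PackageType(Enum):
    ARCANIST_EXTENSION = "Arcanist Extension"
    PHABRICATOR_APPLICATION = "Phabricator Application"
    LIBRARY = "Library"
    SYSTEM_PACKAGE = "System Package"


class InstallTarget(Enum):
    ARCANIST = "into Arcanist"
    PROJECT_CONFIG = "into the Arcanist configuration for the current project"
    WORKING_COPY_LIBRARY = "into a library in the current working directory"
    SYSTEM = "on the system globally"


# Assumed mapping, for illustration only; the thread does not define one.
REASONABLE_TARGETS = {
    PackageType.ARCANIST_EXTENSION: [
        InstallTarget.ARCANIST,
        InstallTarget.PROJECT_CONFIG,
    ],
    PackageType.PHABRICATOR_APPLICATION: [InstallTarget.WORKING_COPY_LIBRARY],
    PackageType.LIBRARY: [
        InstallTarget.PROJECT_CONFIG,
        InstallTarget.WORKING_COPY_LIBRARY,
    ],
    PackageType.SYSTEM_PACKAGE: [InstallTarget.SYSTEM],
}


def plan_install(package_type):
    """Install directly when only one target makes sense, else prompt."""
    targets = REASONABLE_TARGETS[package_type]
    action = "install" if len(targets) == 1 else "prompt"
    return (action, targets)
```

With a table like this, the "guess or prompt" ambiguity only arises for types that have more than one reasonable target.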

We also probably need these capabilities, but can figure them out in the future:

Search packages? arc search jslint?
Add a new package source (URI of an installed version of Packages)? Just arc set-config for now?
Remove a package -- some --remove flag on arc install? Separate arc uninstall?

Making major additions and changes to arc workflows dovetails heavily with T10329 and adjacent tasks. I expect to pursue that first, to provide a more solid foundation for arc workflows to build upon, and then implement the new workflows.

Specifically, the next pieces I expect to build are:

Package types ("Arcanist Extension", "Phabricator Application", "Library", "System Software" (future)).
PackageSignature on the server, and arc sign on the client (initially, only for Publishers and Packages, probably).
Additional properties on Versions so they can actually point at a Git repository to clone (this will be modular in the future, but only support "clone a Git repo" for now).
Client-side support for cloning repos (arc install, arc version) and loading extensions.
Some sense of an upgrade channel / pathway and arc upgrade. Currently, "Versions" are not related to one another, so there's no way to specify how to upgrade version X. We can limp along without this initially since "look it up, then arc install" is fine for administrators while this is a prototype and I don't expect anyone to publish and sign 200 versions of an extension in the first week.

Upshot:

To move forward, Packages needs a mixture of client and server changes.
Arcanist workflows are getting modernized before the client changes (T10329).
After that, Packages can move forward on both the client and server.

epriestley mentioned this in T11397: Overcommit -> arc lint.Jul 30 2016, 11:38 AM

faulconbridge added a subscriber: faulconbridge.Aug 1 2016, 1:26 PM

It might be beneficial to support GPG-signed commits and tags in Phabricator's Git repos in general, and then use the same mechanism for the Arcanist packages.
Since GitHub recently started pushing this feature a bit (https://github.com/blog/2144-gpg-signature-verification), quite a few library maintainers have started signing their release tags.
And I recently spoke to a library maintainer about extending Composer to verify all packages (on install or upgrade) against a list of authors trusted by the user (or an enterprise-wide list).

I've not experimented with this stuff yet, and setting up GPG is still a pain in the ass. But the integration into Git afterwards is quite straightforward.
https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work

You could optionally use the user's GPG keyring in addition to an Arcanist-specific one.

I generally expect all signing to be external, at least for the foreseeable future.

One of the major concerns I have with Composer is that it conflates the software developer and the software packager, often assuming they are one and the same (and generally having no mechanism to identify or verify the packager).

For example, I think having a strong trust mechanism for the developer is of limited use (and potentially quite misleading) if the packager can release an "update" from v2.7.3 to v2.7.4 which actually reverts to v2.7.2, re-opening a widely disclosed and easily exploited security hole. If the developer removes the v2.7.2 tag from their repository as dangerous, the publisher can copy the repository elsewhere, restore the tag (which will have a valid signature!), and then point the package at the new repository. Generally, there's no way that I'm aware of to "unpublish" a GPG signature, but over time many signed versions of software become trivially unsafe as vulnerabilities are discovered and disclosed.

I want to primarily focus on trusting the publisher, and making it clear to the user that this is who they are trusting, and that they are trusting the publisher more or less completely. It's possibly even desirable not to show any developer-signature information to the user, as this implies that the barrier of trust the publisher must meet is lower: seeing that the package is signed by qmysteryman but the code is signed by Facebook "seems" trustworthy, but is not actually much different from only seeing that the package is signed by qmysteryman. I think a sophisticated attacker with complete control of a package is not made substantially less dangerous by only being able to undo security fixes vs being able to deploy arbitrary code.

Showing this kind of information to the publisher at the time they sign a version (to make it easier for publishers to perform due diligence before signing a package) could be useful, but it's probably some ways away.

This specific attack may not be entirely possible in practice (I'm not familiar with the Composer workflows), but I think it's broadly difficult to establish a clear capability gap between an attacker who can deploy any code at all and an attacker who can "only" deploy any code which a particular developer ever signed. At any given time, most code which a particular developer has ever signed is probably unsafe to run. I think the cryptographic assertion that the developer considered it safe to run at one time is not a very strong one: I would have made this assertion about all versions of Phabricator as I released them in the past, but would no longer make this assertion about those versions because users have discovered and reported security issues since then.

The whole "identify the packager" problem with Composer is indeed a big one, and I'm not suggesting you use Composer in any way for the Arcanist/Phabricator packages! Generally, I agree that I want to primarily trust the packager and not the developer. This is how it is handled with OS packages (rpm/deb), and companies are used to the workflow: get new packages from a somewhat trusted/signed source (e.g. Red Hat/Canonical), test them yourself, sign them with your own key as well, and then distribute them to the internal repositories, so that "end users" only trust your own internal key.

I don't know of any package manager that currently supports "revoking" a package signature, but it would be a cool feature.
GPG generally allows for "unpublishing" a key, or your signature of a key, by using revocation certs. I think you would have to use a different key (maybe a subkey) for every release and publish your trust in each one with your "publisher" key. When a new version is released, you would publish a revocation cert for the old key stating that you no longer trust it, and signature validation should then fail. But as GPG is a big box of black magic, I'd need to test this properly. I might be wrong: validation may still succeed, with the revoked key merely being unusable for future signatures, which wouldn't help much.

Going the other way and using an X.509 CA with a new key/cert for every release, using CRLs / OCSP for revocation, should work just as well.

epriestley mentioned this in T11429: Upcoming: Changes to Arcanist.Aug 4 2016, 4:07 PM

epriestley mentioned this in T11439: Retrieve Diff PHID via phid.lookup.Aug 10 2016, 8:34 PM

avivey mentioned this in D16401: Support the Midje testing framework for Clojure.Aug 16 2016, 10:47 PM

urzds added a subscriber: urzds.Aug 17 2016, 8:35 AM

epriestley mentioned this in D16429: Update Config Application UI.Aug 22 2016, 7:43 AM

epriestley mentioned this in T8236: `arc weld` should do something.Aug 22 2016, 9:17 PM

20after4 added a subscriber: 20after4.Sep 28 2016, 8:53 AM

jcox added a subscriber: jcox.Sep 28 2016, 11:03 AM

Sam2304 added a subscriber: Sam2304.Oct 25 2016, 10:04 AM

epriestley mentioned this in D16462: Update documentation for text linter.Dec 15 2016, 12:40 PM

chad mentioned this in T12024: Feature request: Release arcanist as Phar package, to reduce the cost (download, redistribute, update, mantain).Dec 16 2016, 6:47 AM

mtsgrd added a subscriber: mtsgrd.Feb 22 2017, 1:49 PM

epriestley mentioned this in T12525: amckinley's Onboarding.Apr 9 2017, 1:46 PM

jcarrillo7 added a subscriber: jcarrillo7.Apr 14 2017, 9:28 PM

pouyana added a subscriber: pouyana.Apr 19 2017, 1:48 PM

epriestley mentioned this in T12847: A Pathway Towards Private Clusters.Jun 16 2017, 5:45 PM

fcoelho added a subscriber: fcoelho.Aug 7 2017, 9:31 PM

cmmata added a subscriber: cmmata.Aug 10 2017, 8:18 AM

epriestley mentioned this in T9805: XHProf will not build on PHP7.Nov 29 2017, 8:57 PM

epriestley mentioned this in T13098: Plans: Arcanist toolsets and extensions.Mar 5 2018, 2:04 PM

epriestley mentioned this in D19372: Add a rough "!history" email command to get an entire object history via email.Apr 16 2018, 5:49 PM

epriestley mentioned this in T13129: Phlux variables can't have custom view policies.Apr 18 2018, 9:42 PM

epriestley added a parent task: T13098: Plans: Arcanist toolsets and extensions.Sep 14 2018, 6:09 PM

epriestley mentioned this in D19692: [Wilds] Remove include_path mangling and drop support for "externals/includes".Sep 18 2018, 8:07 PM

epriestley mentioned this in rARCfe0c29389518: [Wilds] Remove include_path mangling and drop support for "externals/includes".Sep 21 2018, 11:44 PM

epriestley mentioned this in T13222: 2018 Week 48-51 Bonus Content.Nov 26 2018, 5:07 PM

epriestley mentioned this in T13224: Pygments Bash lexer has explosive complexity on unterminated strings with many backslashes.Nov 30 2018, 6:53 PM

epriestley mentioned this in T814: Support HTTP Basic Auth as an authentication mechanism.Dec 12 2018, 8:23 PM

epriestley mentioned this in T13229: On Third-Party Integrations.Dec 28 2018, 9:57 PM

epriestley added a subtask: T13229: On Third-Party Integrations.

epriestley mentioned this in D20026: Add a Duo API future.Jan 23 2019, 9:33 PM

epriestley mentioned this in rP069160404fe8: Add a Duo API future.Jan 24 2019, 11:10 PM

epriestley mentioned this in D20039: Bring Duo MFA upstream.Jan 25 2019, 9:22 PM

epriestley mentioned this in rP9fd8343704ee: Bring Duo MFA upstream.Jan 29 2019, 2:26 AM

epriestley mentioned this in T11515: `arc ade` should do something.Feb 2 2019, 12:58 PM

epriestley mentioned this in T13251: Upgrading: PhutilURI Query Parameter Changes.Feb 12 2019, 10:17 PM

hoeflingd added a subscriber: hoeflingd.Feb 16 2019, 2:00 AM

• pasik added a subscriber: • pasik.Mar 5 2019, 10:38 AM

epriestley mentioned this in T9456: Evaluate upstream support for third-party build systems.Sep 23 2019, 4:28 PM

epriestley mentioned this in T12011: Support builds with Travis CI.Sep 23 2019, 4:33 PM

epriestley mentioned this in D21004: Restore old expanded include path rules for workflows which fall through.Feb 17 2020, 5:12 PM

epriestley mentioned this in rARCeb6edb27399b: Restore old expanded include path rules for workflows which fall through.Feb 17 2020, 5:24 PM

I think I'm going to start working on the Arcanist side of this soon...

Here's the high-level outline of what I'm planning:

Distribution would just be zip/tgz files for each package
Signatures would be separate objects, signing the zip file after-the-fact. So anyone can sign any package by downloading it and signing.
- The public key won't necessarily be available to Phabricator (because it's kinda funny to have the public key and signature in the same place). We'll just register the fingerprints I guess?
- Arc will have some mechanism to install a signature file from a side-channel
Packages will have a manifest file, with enough information to import them into a Phabricator install
Signature verification would (only) happen during the "install" phase.
"installing" a package is basically just extracting it to ~/.cache/arcanist/<publisher>-<package>-<version>/.
- We'll select which installed package to load at run-time using relevant configuration.
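
A minimal sketch of the "extract to the cache, select at run-time" idea follows. Only the `~/.cache/arcanist/<publisher>-<package>-<version>/` layout comes from the plan above; the function names, the `cache_root` parameter, and the shape of the configuration (a list of publisher/package/version tuples) are assumptions for illustration.

```python
from pathlib import Path


def install_dir(publisher, package, version, cache_root=None):
    """Build the cache directory for one installed package version."""
    root = Path(cache_root) if cache_root else Path.home() / ".cache" / "arcanist"
    return root / f"{publisher}-{package}-{version}"


def resolve_packages(wanted, cache_root=None):
    """Split configured packages into installed directories and missing ones.

    `wanted` is a hypothetical list of (publisher, package, version) tuples,
    e.g. as read from .arcconfig or .arcrc.
    """
    loaded, missing = [], []
    for publisher, package, version in wanted:
        path = install_dir(publisher, package, version, cache_root)
        if path.is_dir():
            loaded.append(path)
        else:
            missing.append((publisher, package, version))
    return loaded, missing
```

Because each version gets its own directory, two projects can configure different versions of the same package without conflict. Note that the sketch only builds directory names and never parses them, since a hyphen inside a publisher or package name would make parsing ambiguous.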

Update: I just found the old discussion, where we talked at length about using Git for distribution and package URI rather than publisher.package naming.
Let's see if 2015 avivey can convince 2020 avivey...

Distribution would just be zip/tgz files for each package

I haven't thought about this in too much detail, but I suspect a package version should have multiple possible variants (e.g., a zip file, a git repository, a mercurial repository, a .tgz, etc). A "package format" is some collection of methods like "get the data for this package reference", "check this signature against the reference you downloaded", "convert the wire data into disk data [e.g., decompress it]", etc.
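
One way to picture a "package format" as a collection of methods is an abstract interface with one implementation per variant. This is only a sketch: the class and method names are hypothetical, the zip variant reads from a local path rather than a real package server, and a plain SHA-256 digest comparison stands in for real signature checking.

```python
import hashlib
import hmac
import io
import zipfile
from abc import ABC, abstractmethod
from pathlib import Path


class PackageFormat(ABC):
    """Hypothetical interface; one subclass per distribution variant."""

    @abstractmethod
    def fetch(self, reference: str) -> bytes:
        """Get the wire data for this package reference."""

    @abstractmethod
    def matches(self, wire_data: bytes, expected_digest: str) -> bool:
        """Check the downloaded data against an expected digest."""

    @abstractmethod
    def unpack(self, wire_data: bytes, destination: Path) -> None:
        """Convert the wire data into data on disk."""


class LocalZipFormat(PackageFormat):
    def fetch(self, reference: str) -> bytes:
        # Assumption: the reference is a path to a zip file on local disk.
        return Path(reference).read_bytes()

    def matches(self, wire_data: bytes, expected_digest: str) -> bool:
        actual = hashlib.sha256(wire_data).hexdigest()
        return hmac.compare_digest(actual, expected_digest)

    def unpack(self, wire_data: bytes, destination: Path) -> None:
        with zipfile.ZipFile(io.BytesIO(wire_data)) as archive:
            archive.extractall(destination)
```

A Git or Mercurial variant would implement the same three methods differently, for example by treating a commit hash as the thing to verify.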

Signatures would be separate objects, signing the zip file after-the-fact. So anyone can sign any package by downloading it and signing.

(This assumes that "hashing" and "signing" are distinct operations, e.g. you transform distribution data into a hash, then sign the hash.)

Yeah, the signature should be fully computable locally. Exactly what input data you're hashing would depend on the distribution format, but if it's a zip file you just sign the file content, presumably.

This specific signature is possibly dangerous: can two zip files with the same, say, SHA1 decompress to have different data?

In PDFs, the attack is:

Find two inputs with the same hash, X1 and X2.
Build two versions of the PDF. One looks like this:

good.pdf

if (X1 === X1) {
  print good/safe content
} else {
  print evil/bad content
}

evil.pdf

if (X1 === X2) {
  print good/safe content
} else {
  print evil/bad content
}

These files differ only in the X1/X2 bytes so they have the same hash (under some hashing algorithms) if X1 and X2 have the same hash, but they have different content. I think there's a specific example of this attack here:

https://shattered.io
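
The reason the two files can share everything except the X1/X2 bytes is a property of iterated (Merkle-Damgård style) hashes such as SHA-1: once two same-length inputs reach the same internal state, appending identical data keeps the hashes equal. The sketch below demonstrates this with a deliberately weak toy hash (16-bit state, 4-byte blocks) so a collision can be found by brute force; `toy_hash` and the helper names are made up for this example and are not a real algorithm.

```python
import hashlib

BLOCK = 4          # bytes per block, tiny on purpose
IV = b"\x00\x00"   # 16-bit initial state, tiny on purpose


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncate SHA-256 to 16 bits."""
    return hashlib.sha256(state + block).digest()[:2]


def absorb(state: bytes, data: bytes) -> bytes:
    """Feed block-aligned data through the compression function."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def toy_hash(data: bytes) -> bytes:
    """Iterated hash with length padding, in the Merkle-Damgard style."""
    pad_len = -(len(data) + 1) % BLOCK
    padded = data + b"\x80" + b"\x00" * pad_len + len(data).to_bytes(BLOCK, "big")
    return absorb(IV, padded)


def find_block_collision(prefix: bytes):
    """Find two different blocks X1, X2 that collide after `prefix`."""
    state = absorb(IV, prefix)
    seen = {}
    for i in range(2 ** 16 + 1):  # pigeonhole: a repeat must occur
        block = i.to_bytes(BLOCK, "big")
        out = compress(state, block)
        if out in seen:
            return seen[out], block
        seen[out] = block
    raise AssertionError("unreachable")


prefix = b"PDF-HEAD"                       # block-aligned shared header
suffix = b"if blob == X1: good else: evil"  # shared trailer
x1, x2 = find_block_collision(prefix)
good, evil = prefix + x1 + suffix, prefix + x2 + suffix
```

Here `good` and `evil` differ in exactly one block yet hash identically under `toy_hash`, whatever the shared suffix contains.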

I'm not sure if zip files are conceptually vulnerable to the same attack or not, but it seems like they might be: perhaps there is some length field which you can make valid in the "good.zip" and invalid in the "evil.zip".

Even if this is possible, finding collisions is still hard in SHA256 and there's probably no need to be more paranoid about how things are signed.

I suspect the best approach here in general is to say "a signature is a type, like 'sha256-of-raw-files-on-disk' or 'sha256-of-zip', plus a value" and "distribution objects have zero or more signatures". Then signers can sign the wire format (".zip"), or the entire directory of raw files on disk, or a Git or Mercurial hash, or all of them, using whatever hash algorithms they prefer, and clients can accept or reject signature types. If a SHA256 collision is discovered, clients can eventually be updated to reject SHA256 signatures, etc.
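
A minimal sketch of "a signature is a type plus a value", with clients choosing which types they accept, might look like the following. The type names echo the examples above; the `Signature` class, the `verify` policy (at least one accepted signature must match, and no accepted signature may mismatch), and the use of bare digests without any key-based signing step are all assumptions for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    type: str   # e.g. "sha256-of-zip"
    value: str  # hex digest; signing the digest with a key is omitted here


# Hypothetical registry of how to compute each signature type.
HASHERS = {
    "sha1-of-zip": hashlib.sha1,
    "sha256-of-zip": hashlib.sha256,
}


def verify(wire_data: bytes, signatures, accepted_types) -> bool:
    """Check signatures of accepted types; ignore all other types."""
    matched = False
    for signature in signatures:
        if signature.type not in accepted_types:
            continue  # client policy rejects this type, e.g. a broken hash
        hasher = HASHERS.get(signature.type)
        if hasher is None:
            continue  # unknown type: cannot be checked, so it proves nothing
        actual = hasher(wire_data).hexdigest()
        if not hmac.compare_digest(actual, signature.value):
            return False
        matched = True
    return matched
```

Retiring a hash algorithm then becomes a client-side policy change: drop the type from `accepted_types` and any package carrying only that kind of signature stops verifying.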

The public key won't necessarily be available to Phabricator (because it's kinda funny to have the public key and signature in the same place). We'll just register the fingerprints I guess?

I think the public key has to be available -- you can't verify signatures otherwise.

Arc will have some mechanism to install a signature file from a side-channel

I'd expect this to all happen over HTTP with the package directory app, e.g. the only thing users install is a distribution channel.

There might be plumbing-level commands to trust a specific public key but I don't think users need to do this in general.

Signature verification would (only) happen during the "install" phase.

It might be nice-to-have to support explicit verification later, but, yeah, I wouldn't expect to verify-on-execution in the general case.

The public key won't necessarily be available to Phabricator (because it's kinda funny to have the public key and signature in the same place). We'll just register the fingerprints I guess?

I think the public key has to be available -- you can't verify signatures otherwise.

The scenario I'm worried about:

Users expect to download the public key for verification from the repository.
Repository packages.dino.com is compromised by an attacker.
Attacker replaces the public key file for epriestley.
Attacker publishes a new version of, say, the dinosaurs package and signs it with the new fake key.
New user is told "install dinosaurs, you can trust epriestley."
New user downloads the new (fake) public key, the latest (evil) package, and the signature from packages.dino.com, and they all match.

To cut off this vector, I suggest making users conscious of the question "can I really trust this public key?", by requiring the key to be installed through some other channel.

New user is told "install dinosaurs, you can trust epriestley."

I imagine they aren't. They're told some version of this instead:

Trust newly discovered public key "ab:cd:..." which "Package Server" claims is owned by "Dinosaurs, Inc"?

No entities you trust have signed this key.

Really trust this key? [y/N]

The trust mechanism is a local web of trust. The server facilitates building that web, but the client does not trust the server.

In most cases, I imagine you will bootstrap trust for a small number of keys (e.g., "Phacility, Inc" and "Your Employer, Inc" through other channels) and use their signatures to decide whether or not to trust other publishers and extensions.

This is basically a "CA" + Certificates system, similar to the SSL system.

In practice, I imagine this works out as a small set of extensions which Phacility would sign through the "Phacility Developer Program" (similar to Apple or Microsoft signing applications in their app stores), and a relatively straightforward way for "Your Employer, Inc" to distribute extensions and/or audit and then approve third-party extensions, and then a wild west of random people publishing left_pad.js.exe. But that seems sort of reasonable?

In the base case (there are no signatures you know), this degrades to "use some other channel to verify the key". But in the most common case this should be much better: whoever is deploying Phabricator at Acme, Inc can sign packages with the Acme, Inc key to approve them for employees and not have to worry about maintaining a fingerprint list somewhere or trying to convince users to actually verify that packages are on the list.
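
The client-side decision described in this comment can be sketched as follows. Everything here is hypothetical: the fingerprints, the `signers_of` mapping (which keys have signed which other keys, as reported by the server), and the one-hop rule. The sketch also assumes the endorsing signatures have already been cryptographically verified against the trusted keys, which is where the real work lies.

```python
def vouching_signers(fingerprint, signers_of, trusted):
    """Locally trusted keys that have signed this key."""
    return sorted(s for s in signers_of.get(fingerprint, ()) if s in trusted)


def decide(fingerprint, signers_of, trusted):
    """Return "trusted" or "prompt"; the server's claims are never enough."""
    if fingerprint in trusted:
        return "trusted"
    if vouching_signers(fingerprint, signers_of, trusted):
        return "trusted"
    return "prompt"  # base case: ask the user, who verifies out of band


# Keys bootstrapped through other channels (made-up fingerprints).
trusted = {"aa:11", "bb:22"}  # e.g. "Phacility, Inc", "Your Employer, Inc"

# Endorsements the package server reports: key -> keys that signed it.
signers_of = {
    "cc:33": {"bb:22"},  # approved by the employer key
    "dd:44": {"ee:55"},  # signed only by a key we do not trust
}
```

In this sketch `cc:33` is accepted silently because the employer key vouches for it, while `dd:44` falls through to the "Really trust this key? [y/N]" prompt.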

avivey added a revision: D21485: Packages: Load'em from .cache.Oct 30 2020, 4:49 PM
