Page MenuHomePhabricator

Implement "Run Remote Command" build step implementation.
ClosedPublic

Authored by hach-que on Nov 6 2013, 9:33 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Apr 23, 4:36 AM
Unknown Object (File)
Fri, Apr 19, 3:28 AM
Unknown Object (File)
Fri, Apr 19, 3:28 AM
Unknown Object (File)
Fri, Apr 19, 3:28 AM
Unknown Object (File)
Fri, Apr 19, 3:28 AM
Unknown Object (File)
Fri, Apr 19, 3:28 AM
Unknown Object (File)
Fri, Apr 19, 3:28 AM
Unknown Object (File)
Fri, Apr 19, 3:28 AM

Details

Summary

This adds a build step implementation for running a command on a remote machine over SSH. It supports merging in various variables about the build (such as the commit hash / revision ID, repository call sign, version control type and clone URI).

Test Plan

Configured a build plan to run /bin/true on localhost and the build passed. Configured a build plan to run /bin/false on localhost and the build failed.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

hach-que updated this revision to Unknown Object (????).Nov 6 2013, 9:39 PM

Rebase on master.

hach-que updated this revision to Unknown Object (????).Nov 7 2013, 9:38 PM

Escape variables that are merged in by using vcsprintf.

hach-que updated this revision to Unknown Object (????).Nov 7 2013, 9:40 PM

Fix undeclared variable when using preg_match_all.

I think sshkey being a path on disk still makes this dangerous. In theory, on some setups, I can set:

sshkey=/home/daemonuser/.ssh/id_rsa
host=localhost

Then I can SSH to the local machine and we're back to local command execution.

I think this is less exploitable than LocalCommand, but I still don't feel great about it. I think we need a host whitelist to make this safe, and I'd feel better about storing keys in some sort of real credential management thing, too.

Some technical inlines. Let me mull over the permissions model a bit and get back to you with a recommendation. This is really close to being upstreamable, I just want to lock it down a bit tighter.

src/applications/harbormaster/step/RemoteCommandBuildStepImplementation.php
36

I think we should fail if you type ${cmmomti}, not continue silently. For now, every variable is always present. If we have more dynamic variables in the future maybe we'd have to look at this, but this seems like something we can catch easily.

38

Before doing this replacement, we should:

str_replace('%', '%%', $command);

Otherwise, it will be impossible to issue, e.g.:

echo '%'
src/applications/harbormaster/step/VariableBuildStepImplementation.php
6–12 ↗(On Diff #16979)

Maybe this junk should be more like:

buildable.name
buildable.phid
buildable.vcs
buildable.uri
container.name
container.phid
container.uri

Kind of just thinking out loud, though. This is presumably easy enough to change later, so long as not everyone goes into production with it immediately.

34–43 ↗(On Diff #16979)

I think we should make this abstract or remove it, since there are currently no implementations for which it is correct.

I thought about checking the hostname, but that gets complicated as soon as you realise that there's a whole lot of ways you can actually reach localhost.

I've been thinking about the idea of having a HarbormasterBuildMachine table, which represents remote machines upon which builds are made. The idea is that you can define preallocated machines in this table via a UI and if Harbormaster asks Drydock for a resource, it adds a record into this table for the lifetime of the resource (deleting the record when the resource is released).

If we do this, then we can actually make "remote host", "remote port" and authentication details part of the build (e.g. HarbormasterBuild has buildMachinePHID and the build machine record has connection details). Users then never define any authentication information in build steps; the remote SSH information is per build, depending on what machine is being used for that build. This seems like a much more suitable long term solution (we'll need tracking of build machines when Drydock exists), addresses the security issues around this build step, and also addresses how we're going to allow for both dynamic and preallocated build machines.

I thought about checking the hostname, but that gets complicated as soon as you realise that there's a whole lot of ways you can actually reach localhost.

Yeah, agreed. I don't have much confidence that we could successfully blacklist localhost, and even if we could there are plenty of cases like "we're load balanced across 3 machines" where you can just hit another machine with a configuration similar to localhost and be in just as much trouble. The host stuff has to be a whitelist to not be full of holes, I think.

I've been thinking about the idea of having a HarbormasterBuildMachine

I think this is the right object conceptually, but that it should just be DrydockResource, not a separate Harbormaster object. A stopgap might be to provide some CLI method to write machines into the DrydockResource table, and then have Harbormaster use that as the whitelist, but this is kind of funky because it's not a great representation of where we want to end up (where a buildstep would not hard code a specific resource).

If we do this, then we can actually make "remote host", "remote port" and authentication details part of the build

Yeah, I think this is the right approach in the long term -- or maybe a slightly more general version of it. It would be nice if we could get the structure to look something like this, today:

  • The build has two steps, "Allocate Host" and "Run Remote Command".
  • "Allocate Host" produces a HarbormasterBuildArtifact representing the machine and its credentials, and "Run Remote Command" consumes the artifact.
    • In practice, this means that "Allocate Host" produces some kind of "uri" / "username" / "port" / "key" dictionary, and "Run Remote Command" can read that stuff.
    • Eventually, "Allocate Host" would produce a DrydockLease PHID, and "Run Remote Command" would consume the lease/resource via Drydock interfaces, although we might be able to take smaller steps toward that today without having to build out Drydock. For now, "Allocate Host" just has to return some reasonable dictionary out of some whitelist, although maybe there's no smaller, forward-compatible step we can take than getting some of this in Drydock.
  • Eventually, "Allocate Host" would usually be replaced with "Allocate Working Copy".

The UI for this might get a bit complicated, but I think it's ideal if we can make "Allocate Host" a normal build step and give the build steps a reasonable way to pass artifacts to one another. This will be more work now, but I think it will make a lot of things way easier in the future. This is all still kind of fuzzy, though.


I do think I am going to implement a credential management application (maybe "Passphrase") which can store authentication keys. We have one pretty reasonable use case for this now (multiple repositories with shared credentials) and more coming (mirroring to remote Git, remote hosts here). That should be a small step forward here, and serve to clean up some of the read-stuff-on-disk security issues.

Ideally I want to just sit down and spend a chunk of time trying to put a reasonable forward-looking bridge in place between this and Drydock, but I have a bunch of irons in the fire right now so I'm not sure I'll be able to. I think it will be really good if we can get such a bridge in place (even if Drydock just acts as "store a static list of host information" for now), but I'm worried it's going to be too messy to make the cut for the first reasonable iteration of Harbormaster.

I think if HarbormasterBuildMachine is going to pull in a lot of Drydock dependencies (even for preallocated machines which I was kind of thinking it wouldn't), then we're probably better off with some config values that's configured with bin/config and we call it a day.

hach-que updated this revision to Unknown Object (????).Nov 9 2013, 2:02 AM

Update remote command implementation to use whitelist.

"SSH Host" could maybe mention the config, although maybe it's good that it doesn't.

I'll setLocked(true) the config in the pull.

Closed by commit rPe64efa05c3ff (authored by @hach-que, committed by @epriestley).