Alright, let's tackle this. At the very least it'll give us a good idea of what's working in Drydock and what's not.
The general use cases that Drydock needs to cover for build machines are:
- A preset list of build machines that the user has defined (e.g. preallocated hosts).
- Allocating out new hosts using various blueprints (AWS, etc.).
- Finding a host with a suitable set of capabilities.
- Allocating out a working directory on hosts of various types.
- Invoking commands remotely on the host.
- Cleaning up the working directory when the build finishes, regardless of whether the build succeeds or fails.
- Cleaning up and terminating the host resource if it's not a preallocated resource.
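To make that lifecycle concrete, here's a rough Python sketch of how the pieces fit together. Everything in it (`Host`, `HostPool`, `run_build`, the `/tmp/build-N` path) is a hypothetical in-memory stand-in rather than anything Drydock actually provides; the point is only the shape: find or allocate a host, carve out a working directory, run the command, then always clean up, and terminate the host only when it wasn't preallocated.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    # Hypothetical model of a build machine; not a real Drydock class.
    name: str
    capabilities: frozenset
    preallocated: bool
    workdirs: set = field(default_factory=set)
    terminated: bool = False


class HostPool:
    def __init__(self, preallocated_hosts):
        self.hosts = list(preallocated_hosts)
        self._counter = 0

    def acquire(self, required):
        """Return a live host with the required capabilities, allocating one if needed."""
        required = frozenset(required)
        for host in self.hosts:
            if not host.terminated and required <= host.capabilities:
                return host
        # Nothing suitable exists: stand-in for allocating via a blueprint (AWS, etc.).
        self._counter += 1
        host = Host(f"dynamic-{self._counter}", required, preallocated=False)
        self.hosts.append(host)
        return host

    def release(self, host):
        """Terminate dynamically allocated hosts; leave preallocated ones alone."""
        if not host.preallocated:
            host.terminated = True


def run_build(pool, required, build_id, command):
    host = pool.acquire(required)
    workdir = f"/tmp/build-{build_id}"  # assumed layout, purely illustrative
    host.workdirs.add(workdir)
    try:
        return command(host, workdir)
    finally:
        # Runs whether the build succeeded or raised.
        host.workdirs.discard(workdir)
        pool.release(host)
```

The `try`/`finally` is what matters here: cleanup of the working directory and the host can't be skipped by a failing build.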
Looking at how Drydock behaves for allocating a working copy, it seems like it'll handle the AWS scenario quite well, but we'll need some sort of interface in Harbormaster for defining preallocated build machines (wherein Harbormaster will provide a blueprint and use that to create a specific resource).
We also need some mechanism to ensure that requesting a working copy uses a particular host; I'm reasonably confident this can just be an attribute on the working copy lease (host.lease=4) when requesting it.
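As a rough illustration of what I mean by pinning via an attribute, here's a small Python sketch. The matching rule (every attribute the lease asks for must equal the resource's value, and anything the lease doesn't mention is ignored) is my assumption about how this could work, not a description of what Drydock does today; `satisfies`, `find_working_copy` and the dict layout are made up for the example.

```python
def satisfies(resource_attributes, requested_attributes):
    """True if the resource has every requested attribute with the same value.

    Attributes the request leaves out act as wildcards (assumed semantics).
    """
    return all(
        resource_attributes.get(key) == value
        for key, value in requested_attributes.items()
    )


def find_working_copy(resources, requested_attributes):
    """Return the first resource satisfying the request, or None."""
    for resource in resources:
        if satisfies(resource["attributes"], requested_attributes):
            return resource
    return None


# Two hypothetical working copies living on different host leases.
working_copies = [
    {"id": 10, "attributes": {"host.lease": 3}},
    {"id": 11, "attributes": {"host.lease": 4}},
]

# Requesting host.lease=4 can only ever land on the second one.
pinned = find_working_copy(working_copies, {"host.lease": 4})
```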
If this all sounds good, I think we should proceed like so:
- Define DrydockPreallocatedHostBlueprint. This should only answer leases with an attribute of preallocated=yes and trust the attributes to provide information about the resource it allocates. I'm reasonably confident that this should allocate resources with a type of host. It can then provide the command interface based on the attributes of the resource. The resources it allocates should also have attributes preallocated=yes and remote=yes (in contrast with DrydockLocalHostBlueprint, which will only provide resources where remote=no).
- Add a UI in Harbormaster to list preallocated host resources, create new ones and edit existing ones. This will basically query Drydock for resources of the host type with attribute preallocated=yes. The UI will allow the user to define the SSH connection details as well as the target machine type (windows, mac or linux).
- Add a host lease PHID column and a working copy lease PHID column to the HarbormasterBuild object. Each build has a lease on the host and a lease for a working copy on that host.
- Update the DrydockWorkingCopyBlueprint implementation to accept an attribute of host.lease to force it to allocate on a particular host.
- Update the Harbormaster worker to request a lease for the host and working copy. It'll first request a host lease for the particular build type (windows, mac or linux) with the attribute remote=yes (but leave preallocated unspecified to allow for dynamically allocated hosts). Once it has a lease on the host, it'll request a lease on a working copy where host.lease is equal to the ID of the lease it just got. The reason for requesting the lease on the host explicitly is that we need to be sure of the target machine's type (because we will be executing build commands on it).
- Update HarbormasterBuildPlan to add a machine type column where the values can either be windows, mac or linux. We'll make this a drop-down field in the UI, but we could change it to a text field if there's demand for it later on.
- Introduce DrydockRemoteCommandBuildStepImplementation; a variant of RemoteCommandBuildStepImplementation that instead uses the command interface on the host lease to run the command in the working directory provided by the working copy lease. It seems DrydockCommandInterface already supports everything we'll need (it returns an ExecFuture) so this should be reasonably trivial.
- Ensure that build plans configured for Windows and Linux target machines execute and lease correctly, and that DrydockRemoteCommandBuildStepImplementation executes the builds correctly.
- If UploadArtifactBuildStepImplementation is present in upstream, also create a version of it that uses the host and working copy leases. We should probably add a transfer interface for transferring files independent of how the host resource is provided. This would even allow us to switch between SFTP and SCP depending on the host type (e.g. differences between Windows and Linux).
- Once the dust settles and this all looks like it's working, drop the RemoteCommandBuildStepImplementation and UploadArtifactBuildStepImplementation build steps. Remove the harbormaster.temporary.hosts.whitelist configuration option.
- Celebrate that Drydock is now being used.
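For the worker step in the plan above, the ordering I have in mind looks roughly like the Python sketch below. `FakeDrydock`, `Lease`, `lease_for_build`, the `platform` attribute name and the `working-copy` type name are all placeholders I've invented for illustration; only the `host` type and the `remote` / `host.lease` attributes come from the plan itself.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Lease:
    id: int
    resource_type: str
    attributes: dict


class FakeDrydock:
    """Hypothetical in-memory lease allocator, standing in for the real thing."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.leases = {}

    def request_lease(self, resource_type, attributes):
        if resource_type == "working-copy":
            # A working copy must be pinned to a host lease we already hold.
            host_lease = self.leases.get(attributes.get("host.lease"))
            if host_lease is None or host_lease.resource_type != "host":
                raise ValueError("working copy needs an existing host lease")
        lease = Lease(next(self._ids), resource_type, dict(attributes))
        self.leases[lease.id] = lease
        return lease


def lease_for_build(drydock, platform):
    # Step 1: lease the host explicitly so we know the machine type.
    # 'preallocated' is deliberately left out so dynamic hosts also qualify.
    host = drydock.request_lease("host", {"platform": platform, "remote": "yes"})
    # Step 2: lease a working copy pinned to that host lease.
    working_copy = drydock.request_lease("working-copy", {"host.lease": host.id})
    return host, working_copy
```

The build would then store both lease IDs (the two new columns on HarbormasterBuild) and run its commands through the host lease's command interface inside the working copy's directory.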