Page MenuHomePhabricator

Expose build variables in harbormaster build plans to the execution environment
Open, Needs TriagePublic

Description

It's becoming common for the "drydock run command" step to look something like this in our build plans:

/mnt/high_value/PIT-PSC-1/scratch/bschmersal/multi.py --notify @bschmersal -- /mnt/high_value/PIT-PSC-1/scratch/bschmersal/deploy.py -d /mnt/high_value/PIT-PSC-1/vnv_vault/ci -e -m '*CI' -A -F pit_testing,new_stanton -S SDV -c "${buildable.commit}" -B "${build.id}" -T "${target.phid}" -P "${repository.phid}" -V "${buildable.revision}" -D "${buildable.diff}"

It would be much more convenient if these were environment variables instead.

Event Timeline

I'm a little wary about this. Some of my concerns:

  • We may have a large number of variables in the future -- some discussion in T10517. I think the discussion on that task isn't too relevant, but I imagine modularizing build variables and allowing future applications to define new types of buildables (for example, releases) and build variables.
  • Some of those variables may be expensive to build. If we have a flag to always pass them, defining an expensive variable slows down every step which uses the flag. Currently, we could construct variables lazily and steps could pay only the costs for what they use (today, we don't do this, but we have no expensive variables).
    • An example of an expensive variable might be "URL to a tarball of the code", after T1816. I don't know that we'll necessarily have a use case for this, and would probably push back against it fairly hard, but you could imagine it making certain build or release steps easier to write under some set of plausible constraints.
  • Some of those variables might be functions, not variables. We likely can't pass every possible value of a function. We can just not pass them, of course, but this makes such an option flimsier and more perplexing.
  • Overriding variables like LD_PRELOAD or HTTP_PROXY (see httpoxy Vulnerability Information) can be dangerous, but the ability to expose custom build variables "seems" like a pretty safe one. I could imagine this leading to security issues where safe-seeming changes are made in the future which interact with this flag to create security issues.
  • I don't actually know how to do this in the general case without just dumping everything on the CLI in X=x Y=y command form? Env vars don't get passed over ssh by default and passing them is tricky, I think. We can shove them all on the CLI, but this probably won't work in Windows and may hit command line length issues which are surprising (install an extension which happens to define a build variable, suddenly none of your stuff works since you went over the limit). Probably not a real concern but adds to my general feeling of flimsiness here.

In this case, why does the script need all this stuff? I'd generally imagine that build scripts might fall into one of two buckets:

  • Phabricator-Agnostic scripts: these don't know that Harbormaster exists, and take very little data. Harbormaster just calls them and reads the output.
  • Phabricator-Aware scripts: these know that they're being run by Harbormaster and Phabricator, and make Conduit calls to query state, report results, etc.

I'd expect scripts in the first bucket to not need all these variables since they don't interact with Harbormaster, and scripts in the second bucket to be calling Conduit anyway so they could just build all that stuff from target.phid, so I can't immediately guess why this script needs a lot of different IDs/PHIDs if it doesn't have Conduit capability.

eadler added a project: Restricted Project.Sep 15 2016, 6:04 PM