Currently, bin/remote runs ssh -q ... -- bin/bastion ... which runs ssh -q ... -- ....
The -q flag suppresses annoying routine output, mostly about "Permanently added host X to known hosts". However, it also suppresses useful error output when things go wrong.
It's also vaguely bad that we're using -o StrictHostKeyChecking=no, since ssh then automatically trusts keys from hosts it has never seen. It would be better to know the host keys (they're knowable) and use -o StrictHostKeyChecking=yes.
One general issue is that we probably don't want to check in a big everything.txt file with a list of every host and key since this makes deployment a mess that depends on git. So we either need to check in a general-purpose file which covers everything broadly, or generate a purpose-built file before each invocation of ssh.
When a file looks like this:
hostname.phacility.net <key>
...ssh tries to add a 1.2.3.4 <key> line when it connects, and emits a notice. This can be disabled with -o CheckHostIP=no, which gets us a little bit of the way there.
We can also use wildcards in hostnames, so we could do this:
bastion*.phacility.net <bastion-key>
web*.phacility.net <web-key>
...and so on. That feels a little janky but maybe it's OK?
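If we went this way, the wildcard file could be generated from a small tier-to-key map instead of being edited by hand. This is only a rough sketch: `render_known_hosts` and the map it takes are hypothetical names, and the keys are the same placeholders used above, not real key material.

```python
def render_known_hosts(tier_keys, domain="phacility.net"):
    """Return known_hosts text with one wildcard entry per tier.

    `tier_keys` is a hypothetical map of tier prefix -> public key.
    """
    lines = []
    for tier, public_key in sorted(tier_keys.items()):
        # A pattern such as "web*.phacility.net" covers every host in the tier.
        lines.append(f"{tier}*.{domain} {public_key}")
    return "\n".join(lines) + "\n"


# Placeholder keys; real entries would carry "<keytype> <base64-key>".
print(render_known_hosts({
    "bastion": "<bastion-key>",
    "web": "<web-key>",
}), end="")
```

The output would only change when we add a tier or cycle a key, so the generated file stays small enough to check in.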
We also currently don't install consistent keys on hosts. We do present a consistent key to git clone ..., but do so by specifying HostKey in the VCS SSHD, not by overwriting /etc/ssh/... on the system.
It's also theoretically sort-of good for each host to have its own key?
So we could probably go down two paths:
1. Normalize keys: install the same key on every host, or on all hosts of a given type. Then write a wildcard file and check it into git, since it would rarely need to be updated (only when we add new tiers or cycle keys for some reason).
2. Keep unique keys: stick with whatever unique key each host comes up with (or generate a new one), so that every host has its own key. Before connecting, bin/remote writes a temporary file with exactly the hostname, IP, and key it expects, then uses -o UserKnownHostsFile to point ssh at that file. By writing the file immediately before use, we don't need to keep a giant list of every host/key in git.
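The unique-key path could look roughly like the sketch below. It assumes the caller has already looked up the expected public key somehow; `write_known_hosts` and `build_ssh_argv` are hypothetical helper names, the hostname and IP are made-up examples, and nothing here reflects how bin/remote is actually implemented.

```python
import os
import tempfile


def write_known_hosts(hostname, ip, public_key):
    """Write a one-entry known_hosts file and return its path."""
    fd, path = tempfile.mkstemp(prefix="known_hosts.")
    with os.fdopen(fd, "w") as handle:
        # known_hosts accepts a comma-separated list of names before the key,
        # so the hostname and IP can share one line.
        handle.write(f"{hostname},{ip} {public_key}\n")
    return path


def build_ssh_argv(hostname, known_hosts_path, remote_command):
    """Build an ssh command line that only trusts the purpose-built file."""
    return [
        "ssh",
        "-o", "StrictHostKeyChecking=yes",
        "-o", f"UserKnownHostsFile={known_hosts_path}",
        hostname,
        *remote_command,
    ]


# Hypothetical host, IP, and key, for illustration only.
path = write_known_hosts("web001.phacility.net", "1.2.3.4", "<web-key>")
try:
    print(build_ssh_argv("web001.phacility.net", path, ["uptime"]))
finally:
    os.remove(path)  # the file is only needed for a single connection
```

Because the file lists both the hostname and the IP, ssh has nothing new to record when it connects, so this may also make the "Permanently added" notice go away without -q.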
(2) seems a little better in some sense and is "more correct", but I'm not sure it reaaaaally defuses any meaningful attack compared to per-tier-class keys?
(2) is also a bit trickier because it means bin/bastion must also generate a file on the bastion host. Since anything generating a file probably needs to make a service call to Almanac to figure out the public key, this might add a significant bit of overhead (one local call, then one call on the bastion) compared to normalizing keys per tier and storing them in git.