Avoiding Residual SSH Keys on Ubuntu AMIs

If you’ve ever used Amazon EC2 to run Linux, you probably know that the AWS console prompts you to choose an SSH key-pair when spawning a new Linux instance. Public/private key pairs allow you to securely connect to your instance using SSH after it launches. On Ubuntu Linux, the SSH public key is made available to the instance by the Ubuntu CloudInit package. This package is installed on all Ubuntu Cloud Images and also in the official Ubuntu images available on EC2 (https://help.ubuntu.com/community/CloudInit). It runs at boot time, and adds the SSH key to the default user's ~/.ssh/authorized_keys file.

What you may not realize is that by default, CloudInit does not replace the authorized_keys file. Instead, it appends the key to the existing authorized_keys file. Depending on your build process, this can create a security exposure, because residual keys build up on the image as each new AMI is created from an instance launched from the previous one. This is especially true when the lines between development and operations are blurred (often referred to as DevOps), a huge trend that will likely continue with regard to applications that run in the cloud.

Consider this example: A team wants to launch their application in AWS. With cool features like Elastic Load Balancers (ELBs) and Auto Scaling, it’s a perfect platform to get scalability without breaking the bank. They’ll typically start by tasking someone from the team to get a clean base server image (AMI) of the operating system from a trusted source. They launch an EC2 instance from the AMI and customize the local stack to meet the specific needs of the application (install non-standard packages, get the web server configured, etc.). Once everything is installed and patched, they load the application onto the instance and perform some final configuration tweaks to get things up and running. Everything appears to be working as intended, so the initial plan to launch in AWS is greenlit.

The next logical step is for the team to create an image of the new custom build to avoid having to re-install everything in the event that the instance gets hosed or otherwise corrupted. A new AMI from the running system is created, which will serve as their new “base” AMI. Going forward, their release/deployment process will start by launching a new instance of the base AMI, applying any recent system patches (hopefully), and deploying the latest version of the code. Provided the application passes pre-production testing, the instance is ready to be pushed into production.

When using Auto Scaling, EC2 instances are launched and terminated as needed to meet changing load demands over time, with the ELB distributing traffic across whichever instances are running. Since each newly launched instance needs to originate from an AMI, a new AMI that has the exact copy of the code you want running in production will need to be generated. The newly configured deployment instance will usually serve as the source for the AMI, so a final AMI gets created for the Auto Scaling group to use.

In many environments, development teams are not given unrestricted access to production systems. If a team wants to get an AMI spawned in production, they would likely need to request this be done by a separate production support team. In all likelihood, the production team launches their AMIs with a separate SSH key that the development team rightfully does not have access to.

What could go wrong here? Every time you create an AMI from a running EC2 instance, the SSH public key that was used to spawn the instance gets copied into the AMI, along with everything else in the default user's authorized_keys file. Looking at the history of the final production AMI used in this example:

It was launched from a clean install image (no pre-loaded keys) and was configured with a new SSH key by the EC2 wizard.
That key was then copied to the “base” AMI, which was then launched during the pre-deployment setup. If a different SSH key was used during this launch, there are now two SSH keys on the new instance.
Both of those keys get copied when the “deployment” AMI is created, which might then be launched by the production team using yet another SSH key. Unless someone thought to clean out the unwanted keys before creating that final AMI, all three SSH keys end up on the running host in production.
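
The key history above can be traced with a small simulation. This is only a sketch of the append-on-launch behavior described in this article, not CloudInit's actual code; the function names and the key labels (builder-key, release-key, production-key) are made up for illustration.

```python
def launch(image_keys, launch_key):
    """Model a boot: keep every key baked into the image, then append the launch key."""
    return list(image_keys) + [launch_key]


def create_ami(instance_keys):
    """Model AMI creation: authorized_keys is snapshotted exactly as it stands."""
    return list(instance_keys)


# Hypothetical key labels; in practice these would be three different key pairs.
clean_image = []                                        # trusted source, no keys
base_instance = launch(clean_image, "builder-key")      # configured by the team
base_ami = create_ami(base_instance)                    # the "base" AMI
deploy_instance = launch(base_ami, "release-key")       # pre-deployment setup
deploy_ami = create_ami(deploy_instance)                # the "deployment" AMI
prod_instance = launch(deploy_ami, "production-key")    # launched by production

print(prod_instance)
# → ['builder-key', 'release-key', 'production-key']
```

Emptying the key list before either create_ami call would break the chain at that point, which is the fix discussed next.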

To avoid this problem, you’ll want to make sure that production engineers (not your developers) are tasked with creating AMIs that will ultimately run behind production ELBs. They should specifically remove all unwanted SSH keys from the authorized_keys file before creating the AMI.
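
One way to script that cleanup step is sketched below. The helper name prune_authorized_keys and the allow-list approach are assumptions made for illustration, not part of CloudInit or any AWS tooling. The matching is also simplified: an entry is kept if any whitespace-separated field equals an allowed base64 key blob. The demo runs against a temporary file with placeholder key material rather than a real ~/.ssh/authorized_keys.

```python
import tempfile
from pathlib import Path


def prune_authorized_keys(path, allowed_blobs):
    """Rewrite an authorized_keys file, keeping only entries whose base64
    key blob is in allowed_blobs. Returns the lines that were removed."""
    path = Path(path)
    kept, removed = [], []
    for line in path.read_text().splitlines():
        fields = line.split()
        # Blank lines and comments are preserved untouched.
        is_comment_or_blank = not fields or fields[0].startswith("#")
        if is_comment_or_blank or any(f in allowed_blobs for f in fields):
            kept.append(line)
        else:
            removed.append(line)
    path.write_text("".join(entry + "\n" for entry in kept))
    path.chmod(0o600)  # sshd expects this file not to be writable by others
    return removed


# Demo on a throwaway file; the key blobs here are placeholders, not real keys.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "authorized_keys"
    demo.write_text(
        "ssh-ed25519 AAAAexampleBuilder builder\n"
        "ssh-ed25519 AAAAexampleProd production\n"
    )
    removed = prune_authorized_keys(demo, {"AAAAexampleProd"})
    print("removed:", removed)
    print("remaining:", demo.read_text().splitlines())
```

Passing an empty set removes every key, which is usually what you want on an instance that is about to become an AMI, since the launch key is added again at the next boot.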

The process of checking for unwanted SSH keys should already be baked into most server deployment processes, but this step can easily get overlooked when AMIs are used for frequent deployments.
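
To make that check harder to overlook, it can be automated. The sketch below lists a fingerprint for every key in an authorized_keys file so that a reviewer or a build script can compare them against the keys that are supposed to be there. The function name and the parsing shortcuts are assumptions: it treats the first field that looks like a key type as the start of the key and skips entries it cannot decode. The fingerprint is the unpadded base64 of the SHA-256 digest of the key blob, which should match the SHA256 format that ssh-keygen -l prints.

```python
import base64
import hashlib

# Prefixes of common OpenSSH public key type names (simplified for this sketch).
KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-sha2-", "sk-")


def list_fingerprints(authorized_keys_text):
    """Return (fingerprint, key_type, comment) for each parsable key entry."""
    results = []
    for line in authorized_keys_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        # Entries may start with options, so scan for the key type field.
        for i, field in enumerate(fields[:-1]):
            if field.startswith(KEY_TYPE_PREFIXES):
                try:
                    blob = base64.b64decode(fields[i + 1])
                except ValueError:
                    break  # not valid base64; skip this entry
                digest = hashlib.sha256(blob).digest()
                fingerprint = "SHA256:" + base64.b64encode(digest).decode().rstrip("=")
                results.append((fingerprint, field, " ".join(fields[i + 2:])))
                break
    return results


# Placeholder blob for illustration only; a real entry holds an actual public key.
sample_blob = base64.b64encode(b"not a real key").decode()
sample = f"# build keys\nssh-ed25519 {sample_blob} builder@example\n"
for fingerprint, key_type, comment in list_fingerprints(sample):
    print(fingerprint, key_type, comment)
```

A build pipeline could fail the AMI creation step whenever this list contains a fingerprint that is not on an approved list.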