Auto-remediating Security Defects in AWS

Jan 25, 2022

The troubling reality for the modern cloud security professional is that your opponents almost certainly have more resources at their disposal than you do. That resource may be an individual hacker’s time, or the technological clout and finance of a state actor. Sadly, this disparity is likely to get worse for the majority of organisations. If you are reading this article, it is likely that you are a cloud security professional. Assuming that is correct, you are the first and last line of defence for your company and its customers. You decide which security tooling to use, how to implement it and what is in scope. You devise or contribute to cloud security policy while trying to provide enough balance to allow developers to spread their wings and experiment on the bleeding edge. It is your reputation on the line if your company gets hacked or, worse still, your customers’ data is exposed. You already know this of course and thankfully are concerned enough to read articles like this to keep ahead of the game.

However, while you sit at your brightly lit, multi-monitored desk, comfortable in the knowledge that your security alerting software is protecting you… something dark and nefarious stirs and gibbers in the shadows, thirsting for your data. Watching… Waiting…

I am willing to wager that individuals similar to yourself never thought that any of the “big hacks” could ever happen to them. We had Twitch, SolarWinds, the Colonial Pipeline, JBS, the Washington D.C. Metropolitan Police Department and a host of others all subject to spectacular and very public hacks. And that was just 2021. We’ll see more of the same in 2022 and beyond, but how do you make sure such a disaster does not happen under your watch?

There are several areas that need close attention of course. But this is an article, not a book, so we are going to look at perhaps the most important area of cloud security that you need to focus on right now and for the next year. If you take anything away from this article, I humbly ask that it is just two simple phrases. The first is “auto remediation”. I am going to say it again in case you are skim reading this article: auto remediation. Remember those two words. Put them on a post-it note next to your monitor. Look in the mirror and say them out loud every morning. Whisper them into the ear of your manager regularly. OK, maybe not the last one. That is a bit creepy. The point is that you need to put the days of “listing” security defects behind you. That is a very 2010s practice, and it will not serve you well in the 2020s. If anything, such lists merely serve to provide you with a false sense of security that your cloud security issues are “in hand” and will be fixed when (or if) you get around to it.

If a cloud security defect is left to languish in a list, a task tracking system or a kanban board, it has already been permitted to exist for too long. Those gibbering things lurking in the shadows that I mentioned earlier? Those are bots. The bots do not give you a couple of days or a couple of weeks to fix your security defects. They find them quickly. They are always searching. They never sleep. They report their findings immediately to their twisted masters or, even worse, exploit the finding directly.

You don’t just need to take my word for it of course. Look at this article from Forbes: https://www.forbes.com/sites/googlecloud/2021/04/01/bot-attacks-are-the-biggest-online-risk-you-havent-addressed/.

It states that bot attacks are the biggest online risk that you have not addressed. A phrase to take away from that article that we will all guiltily acknowledge is this: “The rush to meet customer needs online in 2020 left many organizations exposed to automated bot attacks.” In addition, the COVID-19 pandemic meant many businesses had to expand their online offerings before their cloud infrastructure was ready for the dangers of being online.

Some online newbie companies were wondering why so many customers would fill up a cart with items, only to abandon the process and never be seen again… only to discover later that these were bots probing the defences of their checkout processes. Did that freak you out? Me too. This brings us on to our second take away phrase (which is the reason we are interested in auto remediation). We need to focus on auto remediation to enable us to “beat the bot”. I’ll try and add weight to the argument:

Hackers are getting better all the time and utilizing bots to do their bidding. Their skill sets are transferable across borders to hostile state actors. The surface area they can attack us on is getting bigger and bigger due to the IoT and everything being connected. If you are not already doing so, you need to begin looking at your enterprise as Fort Knox, and the data and infrastructure within as the gold you are protecting. To achieve this we need a multi-layered defense in which you are cognizant of the threats that are out there. Who has time to research this in a normal technical security role though?

This paints an intimidating picture for those entrusted with protecting the enterprise gold. You need a proactive capability that is going to undertake most of the security legwork for you. Auto-remediation. You are not automating away your own job by embracing the automated resolution of security bugs. You are freeing yourself up to spend more time researching how you can further improve your security defenses.

So what security auto-remediation tools are available to us?

We don’t all have the luxury of being able to spend big on glamorous security software to do this for us, so I will focus on a few example solutions that you can implement yourself right now, by using reactive systems that respond to log events or alerts. I will not fully document the solutions here, otherwise this article will end up being a book. However, if you are interested in implementing any solutions detailed here, please contact me and I will be happy to send you the relevant documentation.

1. Harness Control Tower guardrails to prevent security bugs from occurring in the first place.

Disclaimer… this may hamstring your developers if abused.

AWS Control Tower is an orchestration service that extends the capabilities of AWS Organizations and is run from the organization’s management (master) account. All of your AWS accounts should be members of that organization. Control Tower applies preventive and detective controls (guardrails) to enforce policy from a central location. Such controls can help your cyber security team apply security and other cloud policies more effectively, and prevent drift from those policies through continual enforcement. For example, we could use guardrails to ensure that security logs and necessary cross-account access permissions are created, and not altered. In a nutshell, Control Tower will help your organisation adhere to its cloud security standards, meet regulatory requirements, and follow best practices more easily.
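To make the “created, and not altered” idea concrete: preventive guardrails are implemented as service control policies (SCPs) in AWS Organizations. The sketch below builds an SCP-style document that denies tampering with CloudTrail logging. It illustrates the kind of policy involved and is not a guardrail that Control Tower ships. The function name, the `Sid` and the exempt role pattern are all hypothetical.

```python
import json


def build_log_protection_scp(exempt_role_arn_pattern):
    """Return an SCP-style policy document (as a dict) that denies
    changes to CloudTrail logging for everyone except the exempt role.

    exempt_role_arn_pattern is a hypothetical break-glass role pattern,
    for example "arn:aws:iam::*:role/SecurityAdmin".
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyCloudTrailTampering",  # hypothetical name
                "Effect": "Deny",
                "Action": [
                    "cloudtrail:StopLogging",
                    "cloudtrail:DeleteTrail",
                    "cloudtrail:UpdateTrail",
                ],
                "Resource": "*",
                # Skip the deny only for principals matching the exempt pattern.
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": exempt_role_arn_pattern}
                },
            }
        ],
    }


def to_json(policy):
    """Serialise the policy the way it would be pasted into the console."""
    return json.dumps(policy, indent=2)
```

Calling `to_json(build_log_protection_scp("arn:aws:iam::*:role/SecurityAdmin"))` gives a document you could review with your developers before anything is enforced, which is a good habit given the disclaimer above.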

You can also use a Control Tower utility named Account Factory to provide a configurable account template. This helps to ensure new AWS accounts are provisioned according to security standards and pre-approved configurations.

2. Automatically block malicious IPs

In general, it is not good practice to leave ports open, particularly SSH ports. It guarantees you will be subject to daily SSH brute force and port probing attacks. But sometimes it is unavoidable. If you find yourself in the position of needing to leave ports open, there are steps you can take to automatically block malicious IPs from probing, at the VPC network ACL (NACL) level.

You can automatically block access from known malicious IP addresses attempting to access resources in your AWS cloud accounts. A “known malicious IP” is one that appears in the threat intelligence GuardDuty uses, which comes from the AWS Security team plus feeds from CrowdStrike and Proofpoint. When probing or brute force events are detected by GuardDuty, the associated IP can be immediately blocked from accessing anything in the VPC that it is trying to connect to. Here is the solution workflow:

1. A GuardDuty finding is raised with suspected malicious activity.
2. A CloudWatch Events rule is configured to filter for the GuardDuty finding types listed below:

UnauthorizedAccess:EC2/SSHBruteForce
UnauthorizedAccess:EC2/RDPBruteForce
Recon:EC2/PortProbeUnprotectedPort
Trojan:EC2/BlackholeTraffic
Backdoor:EC2/XORDDOS
UnauthorizedAccess:EC2/TorIPCaller
Trojan:EC2/DropPoint

3. A Lambda function is invoked by the CloudWatch Event and parses the GuardDuty finding.
4. State data for blocked hosts is stored in a DynamoDB table. The Lambda function checks the state table for existing host entries.
5. A notification is issued to Slack.
6. The Lambda function creates a rule in the relevant VPC NACL.
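The parsing and rule-building part of this workflow can be sketched as below. This is a minimal sketch, not the full solution. The field paths reflect the GuardDuty finding format as I understand it, so check them against a real finding from your own account. The function names are hypothetical, and the `already_blocked` set stands in for the DynamoDB state table lookup. Each returned dict is shaped to be passed to boto3’s `ec2.create_network_acl_entry(**entry)`.

```python
# Finding types from the list above that should trigger a block.
BLOCKED_FINDING_TYPES = {
    "UnauthorizedAccess:EC2/SSHBruteForce",
    "UnauthorizedAccess:EC2/RDPBruteForce",
    "Recon:EC2/PortProbeUnprotectedPort",
    "Trojan:EC2/BlackholeTraffic",
    "Backdoor:EC2/XORDDOS",
    "UnauthorizedAccess:EC2/TorIPCaller",
    "Trojan:EC2/DropPoint",
}


def extract_remote_ips(event):
    """Pull remote IPv4 addresses out of a GuardDuty finding event.

    Returns an empty list for finding types we do not act on.
    """
    detail = event.get("detail", {})
    if detail.get("type") not in BLOCKED_FINDING_TYPES:
        return []
    action = detail.get("service", {}).get("action", {})
    ips = []
    # Network connection findings carry a single remote IP.
    conn = action.get("networkConnectionAction", {})
    ip = conn.get("remoteIpDetails", {}).get("ipAddressV4")
    if ip:
        ips.append(ip)
    # Port probe findings carry a list of probing hosts.
    for probe in action.get("portProbeAction", {}).get("portProbeDetails", []):
        ip = probe.get("remoteIpDetails", {}).get("ipAddressV4")
        if ip and ip not in ips:
            ips.append(ip)
    return ips


def build_nacl_deny_entry(nacl_id, ip, rule_number):
    """Build keyword arguments for an inbound deny-all rule for one host."""
    return {
        "NetworkAclId": nacl_id,
        "RuleNumber": rule_number,
        "Protocol": "-1",  # all protocols
        "RuleAction": "deny",
        "Egress": False,  # inbound rule
        "CidrBlock": f"{ip}/32",
    }


def plan_blocks(event, nacl_id, already_blocked, next_rule_number):
    """Return NACL entries for hosts that are not already in the state table."""
    entries = []
    for ip in extract_remote_ips(event):
        if ip in already_blocked:
            continue
        entries.append(build_nacl_deny_entry(nacl_id, ip, next_rule_number))
        next_rule_number += 1
    return entries
```

NACL rules are evaluated in ascending rule-number order, so the deny entries need lower numbers than your allow rules. NACLs also have a limited number of rules, which is one reason to keep state: it lets you expire old blocks and free up room for new ones.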

3. Send real-time GuardDuty alerts to IM

You are not still using email for alerting, are you? Emails are slow, easily swallowed up in the mailbox morass and offer no way to interact with the alert. Move your real-time alerting to an instant messenger, particularly for critical detections coming from GuardDuty. Amazon GuardDuty findings can be matched by a CloudWatch Events rule that notifies an Amazon SNS topic, which in turn activates AWS Chatbot to notify your instant messaging tool, such as a Slack channel. Make the Slack (or other IM tool) notification for this channel sufficiently noisy, so that you know immediately about anything GuardDuty believes may be a transgression. The specific finding will be linked in the message.

An example process for sending GuardDuty alerts to Slack would be:

1. Create an SNS topic.
2. Configure AWS Chatbot for Slack. In the AWS web console, search for AWS Chatbot, and select Slack as the chat client from the dropdown list.
3. Select “allow” on the next screen.
4. Under Configuration details, enter a name for your configuration. The name must be unique across your account and can’t be edited later.
5. For the Slack channel, choose the channel that you want to use. To use a private Slack channel with AWS Chatbot, choose Private channel.
6. In Slack, get the channel ID of the private channel by right-clicking on the channel name and selecting Copy Link.
7. In the AWS Management Console, in the AWS Chatbot window, paste the copied link into the Channel URL field.
8. Define the IAM permissions that AWS Chatbot uses for messaging your Slack chat room.
9. For Policy templates, choose Notification permissions. This is the IAM policy template for AWS Chatbot. It provides the necessary read and list permissions for CloudWatch alarms, events and logs, and for Amazon SNS topics.
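The console steps above cover the Chatbot side. One way to feed the SNS topic is a CloudWatch Events (EventBridge) rule that matches GuardDuty findings and targets the topic. The sketch below only builds the request parameters. The rule name, target ID and severity threshold are hypothetical choices, and the dicts are shaped for boto3’s `events.put_rule(**...)` and `events.put_targets(**...)` calls.

```python
import json


def guardduty_event_pattern(min_severity=7):
    """Event pattern matching GuardDuty findings at or above a severity.

    7 and above is the range GuardDuty labels as high. The threshold is
    an assumption; lower it if you want medium findings in the channel too.
    """
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        "detail": {"severity": [{"numeric": [">=", min_severity]}]},
    }


def put_rule_params(rule_name, min_severity=7):
    """Keyword arguments for creating the rule."""
    return {
        "Name": rule_name,
        "EventPattern": json.dumps(guardduty_event_pattern(min_severity)),
        "State": "ENABLED",
    }


def put_targets_params(rule_name, sns_topic_arn):
    """Keyword arguments for pointing the rule at the SNS topic."""
    return {
        "Rule": rule_name,
        "Targets": [{"Id": "guardduty-to-sns", "Arn": sns_topic_arn}],
    }
```

Remember that the SNS topic’s access policy has to allow `events.amazonaws.com` to publish to it, otherwise the rule will match findings and nothing will reach Slack.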

4. Automate enforcement of IAM Multi Factor Authentication and key rotation

Last, but certainly not least, is ensuring that your AWS users have MFA set for console access. It is easy to become complacent over extended periods of use of any system requiring credentials. But if you do not ensure your users are enabling MFA for console access you WILL be hacked at some point. Please trust me on this. Never fear though, because our old friend Lambda will ride to the rescue once more. You can use a Lambda function to force the disabling of an IAM profile for any user who has not enabled MFA within a certain time frame.
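The decision logic for such a function can be sketched as follows. This is a sketch under assumptions: the grace period and the shape of the `users` records are my own choices. I am assuming the records are assembled from boto3’s `iam.list_users()` (for `UserName` and `CreateDate`) and `iam.list_mfa_devices(UserName=...)` (for the device count). Deleting the login profile with `iam.delete_login_profile(UserName=...)` is one possible way to disable console access.

```python
from datetime import datetime, timedelta, timezone


def users_to_disable(users, grace_days, now=None):
    """Return the names of users with no MFA device after the grace period.

    Each item in users is a dict with:
      UserName        str
      CreateDate      timezone-aware datetime
      MfaDeviceCount  int (hypothetical field, built from list_mfa_devices)
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=grace_days)
    return [
        user["UserName"]
        for user in users
        # No MFA device and the account is older than the grace period.
        if user["MfaDeviceCount"] == 0 and user["CreateDate"] <= cutoff
    ]
```

New users get a few days to register a device before anything is disabled, which avoids locking people out on their first morning.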

The same applies to ensuring IAM users do not forget to rotate their keys regularly. We can use a Lambda function to, for example, warn users once their keys are 80 days old. Keys that remain unchanged can then be deactivated after 90 days.

For this solution we would need:

1. A Lambda Function: To execute the source code (I can send an example to interested parties).
2. Simple Storage Service (S3): To store the Python package used by the Lambda function.
3. Simple Email Service (SES): To send email notifications to users for a) the 80-day warning and b) that a key has been disabled.
4. EventBridge: To run your Lambda function on a scheduled basis.
5. CloudWatch Logs: For Lambda execution logs.
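The core of the Lambda function is a small age check. Here is a minimal sketch of it using the 80 and 90 day thresholds from above. The function names are hypothetical. The key records are assumed to look like the `AccessKeyMetadata` items returned by boto3’s `iam.list_access_keys(UserName=...)`, and deactivation would be done with `iam.update_access_key(..., Status="Inactive")`.

```python
from datetime import datetime, timezone

WARN_AFTER_DAYS = 80
DEACTIVATE_AFTER_DAYS = 90


def classify_key(create_date, now=None):
    """Return "ok", "warn" or "deactivate" based on the key's age in days."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - create_date).days
    if age_days >= DEACTIVATE_AFTER_DAYS:
        return "deactivate"
    if age_days >= WARN_AFTER_DAYS:
        return "warn"
    return "ok"


def plan_key_actions(keys, now=None):
    """Map each active key ID to the action it needs, skipping healthy keys.

    Each item in keys is a dict with AccessKeyId, Status and CreateDate.
    Keys that are already inactive are ignored.
    """
    actions = {}
    for key in keys:
        if key["Status"] != "Active":
            continue
        action = classify_key(key["CreateDate"], now=now)
        if action != "ok":
            actions[key["AccessKeyId"]] = action
    return actions
```

Keeping the decision separate from the AWS calls means you can test the thresholds without touching a real account, and then let the scheduled run email the “warn” group and deactivate the rest.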

Conclusion

I sincerely hope this article has helped reinforce the belief that you undoubtedly already hold that we cannot stand still with cloud security. We, as cloud security professionals, have to move from static detection to proactive prevention to defeat the nefarious hordes who thirst for our data. The solutions here are but a tiny example of how we can try and stay ahead of the security game and beat the bot.

Don’t wait until the day after the attack to get started. Do it now!