AWS provides a large range of services to help you monitor what is going on in your AWS infrastructure. By observing running resources, collecting metrics and analyzing the data in relation to resource and system health and performance, you can set alerts and even trigger events to resolve detected problems.
If, for example, you detect an EC2 instance that is operating above a certain utilization threshold, you can trigger a scaling event to deploy additional resources to take up the processing load. Conversely, you could detect underutilization and scale in resources in response to reduced compute loads.
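The scale-out and scale-in idea can be sketched in a few lines of Python. This is only an illustration of the decision logic, not how AWS Auto Scaling is implemented; the function name and the 70% and 20% thresholds are assumptions chosen for the example.

```python
def scaling_decision(cpu_percent: float, high: float = 70.0, low: float = 20.0) -> str:
    """Return 'scale_out', 'scale_in' or 'hold' for a single CPU reading.

    The thresholds are hypothetical defaults, not AWS recommendations.
    """
    if cpu_percent > high:
        return "scale_out"   # load is above the upper threshold: add capacity
    if cpu_percent < low:
        return "scale_in"    # load is below the lower threshold: remove capacity
    return "hold"            # within the normal band: do nothing


for reading in (85.0, 45.0, 10.0):
    print(reading, scaling_decision(reading))
```

In practice you would let a scaling policy act on an averaged metric over several minutes rather than a single reading, so that short spikes do not cause resources to be added and removed repeatedly.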
Amazon CloudWatch
The first of these services is Amazon CloudWatch, which allows you to monitor the health and operations of the applications you build on AWS.
CloudWatch monitors your AWS infrastructure and applications by collecting standard and custom metrics.
You could track the CPU utilization of an EC2 instance with a CloudWatch Alarm. The alarm watches a metric and, when a threshold you define is exceeded, triggers a message via SNS or a function to respond to the event.
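As a rough sketch of what such an alarm looks like in code, the snippet below builds the parameters for the boto3 CloudWatch call `put_metric_alarm`. The helper names, the instance ID, the SNS topic ARN and the 80% threshold are all assumptions made up for the example. The client is passed in as an argument so the parameter-building part runs without AWS credentials.

```python
from typing import Any, Dict


def build_cpu_alarm(instance_id: str, topic_arn: str, threshold: float = 80.0) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch put_metric_alarm (hypothetical helper)."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",               # standard namespace for EC2 metrics
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                        # evaluate 5-minute averages
        "EvaluationPeriods": 2,               # two breaching periods in a row
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],          # SNS topic notified when the alarm fires
    }


def create_alarm(cloudwatch_client: Any, params: Dict[str, Any]) -> None:
    """Send the alarm definition to CloudWatch using the supplied client."""
    cloudwatch_client.put_metric_alarm(**params)


# Placeholder identifiers, not real resources.
params = build_cpu_alarm(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",
)
print(params["AlarmName"])
```

With boto3 installed and credentials configured, you would call `create_alarm(boto3.client("cloudwatch"), params)` to create the alarm in your account.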
As well as using the integration with AWS services to monitor and trigger events, you can also use a CloudWatch dashboard to aggregate alarms and metrics from your AWS resources, custom application metrics and on-premises servers to give you a centralized view of what is happening across your infrastructure.
AWS CloudTrail
CloudTrail is a service that records activity in your AWS account. Since everything you do on AWS is ultimately an API call, be that adding an EC2 instance, adding a row to a DynamoDB table or adding a file to an S3 bucket, CloudTrail can capture that activity, including what was requested and who made the request. Management events such as launching an instance are recorded by default, while data events such as S3 object uploads and DynamoDB item writes are only logged once you enable them on a trail.
For each request, CloudTrail captures the requester's identity and IP address, whether anything changed and the result of the API call: was the request successful, or was the change denied?
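To make that concrete, here is a minimal sketch that reads a single CloudTrail-style record and pulls out the who, what, where and outcome. The sample record is invented for illustration and heavily trimmed, and `summarise_event` is a hypothetical helper. The field names (`eventName`, `sourceIPAddress`, `userIdentity`, `readOnly`, `errorCode`) follow the CloudTrail record format, where `errorCode` only appears when a call fails.

```python
import json
from typing import Any, Dict

# Invented, trimmed-down record; real records carry many more fields.
SAMPLE_RECORD = json.loads("""
{
  "eventTime": "2024-01-15T09:30:00Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "RunInstances",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {"type": "IAMUser", "arn": "arn:aws:iam::123456789012:user/alice"},
  "readOnly": false
}
""")


def summarise_event(record: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a CloudTrail record to who / what / from where / outcome."""
    error = record.get("errorCode")  # present only if the call returned an error
    return {
        "who": record.get("userIdentity", {}).get("arn", "unknown"),
        "what": f'{record["eventSource"]}:{record["eventName"]}',
        "from_ip": record.get("sourceIPAddress", "unknown"),
        # True when the call was a write request rather than a read-only one.
        "write_call": not record.get("readOnly", True),
        "outcome": f"failed ({error})" if error else "succeeded",
    }


print(summarise_event(SAMPLE_RECORD))
```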
Most audit frameworks require the ability to track changes in IT, and CloudTrail provides much of that evidence. If an auditor wants to know whether the permissions on a security group were changed during the audit period, CloudTrail can show you all the recorded activity on that group.
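The security group question could be answered along these lines once you have CloudTrail records loaded into memory. This is a sketch under assumptions: the function name is hypothetical, the sample records are invented, and it assumes the group ID appears under `requestParameters.groupId`, which you should verify against your own logs. The event names are the EC2 API actions that add or remove security group rules.

```python
from datetime import datetime
from typing import Any, Dict, Iterable, List

# EC2 API actions that add or remove security group rules.
SG_CHANGE_EVENTS = {
    "AuthorizeSecurityGroupIngress",
    "AuthorizeSecurityGroupEgress",
    "RevokeSecurityGroupIngress",
    "RevokeSecurityGroupEgress",
}


def parse_time(value: str) -> datetime:
    """Parse the UTC timestamp format used in CloudTrail's eventTime field."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def security_group_changes(
    records: Iterable[Dict[str, Any]], group_id: str, start: datetime, end: datetime
) -> List[Dict[str, Any]]:
    """Return rule-change events for one security group inside the audit window."""
    hits = []
    for record in records:
        if record.get("eventName") not in SG_CHANGE_EVENTS:
            continue
        # Assumption: the group ID is recorded under requestParameters.groupId.
        if (record.get("requestParameters") or {}).get("groupId") != group_id:
            continue
        if start <= parse_time(record["eventTime"]) <= end:
            hits.append(record)
    return hits


# Invented records for illustration only.
records = [
    {"eventName": "AuthorizeSecurityGroupIngress", "eventTime": "2024-02-10T12:00:00Z",
     "requestParameters": {"groupId": "sg-0abc"}},
    {"eventName": "DescribeSecurityGroups", "eventTime": "2024-02-11T08:00:00Z",
     "requestParameters": None},
    {"eventName": "RevokeSecurityGroupIngress", "eventTime": "2023-11-01T08:00:00Z",
     "requestParameters": {"groupId": "sg-0abc"}},
]
window = (datetime(2024, 1, 1), datetime(2024, 3, 31))
print(len(security_group_changes(records, "sg-0abc", *window)))
```

An empty result for the audit period is the evidence the auditor is looking for, provided the trail was enabled and logging for the whole period.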
You can store CloudTrail data in secure S3 buckets and protect the log files with S3 Object Lock (or Vault Lock, if you archive them to S3 Glacier vaults) to ensure you have your audit requirements covered.
You can also enable CloudTrail Insights, which analyses the activity in your account for unusual or out-of-character patterns and alerts you to them.
AWS Trusted Advisor
Just as a business advisor can come into your business and suggest how to make it more efficient, more secure or more cost effective, AWS Trusted Advisor is an automated service that can review your AWS infrastructure and make recommendations on elements that may not be in line with best practices.
There are five pillars that Trusted Advisor measures. The service performs a number of checks and compiles categorized findings for you to look into, such as missing MFA on your root user, underutilised resources or EBS volumes that haven’t been backed up.
Cost optimization
The checks performed under cost optimization might be things like low utilization on compute resources, storage volumes or databases. Trusted Advisor will show you the checks it has performed and categorise each one as action needed (red circle), investigate (orange triangle) or looks ok (green square).
It will also estimate the amount in dollar terms you could save by taking action on the warning.
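If you pull these results programmatically, the grouping and the savings total could be sketched as below. The sample data is invented and the check IDs are placeholders. The structure is modelled on the check summaries returned by the AWS Support API (`describe_trusted_advisor_check_summaries`), with `status` values of `ok`, `warning` and `error`, so treat the exact field names as an assumption to verify against the API reference.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Invented summaries; the checkId values are placeholders, not real check IDs.
SAMPLE_SUMMARIES = [
    {"checkId": "check-low-util-ec2", "status": "warning",
     "categorySpecificSummary": {"costOptimizing": {"estimatedMonthlySavings": 120.50}}},
    {"checkId": "check-idle-elb", "status": "error",
     "categorySpecificSummary": {"costOptimizing": {"estimatedMonthlySavings": 36.00}}},
    {"checkId": "check-unassoc-eip", "status": "ok",
     "categorySpecificSummary": {"costOptimizing": {"estimatedMonthlySavings": 0.0}}},
]

# Map API-style status values to the labels used in the console description above.
STATUS_LABELS = {"error": "action needed", "warning": "investigate", "ok": "looks ok"}


def group_by_status(summaries: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group check IDs under their traffic-light label."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for summary in summaries:
        label = STATUS_LABELS.get(summary["status"], "not available")
        grouped[label].append(summary["checkId"])
    return dict(grouped)


def estimated_monthly_savings(summaries: List[Dict[str, Any]]) -> float:
    """Add up the estimated monthly savings across cost optimization checks."""
    total = 0.0
    for summary in summaries:
        cost = summary.get("categorySpecificSummary", {}).get("costOptimizing", {})
        total += cost.get("estimatedMonthlySavings", 0.0)
    return total


print(group_by_status(SAMPLE_SUMMARIES))
print(estimated_monthly_savings(SAMPLE_SUMMARIES))
```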
Performance
The performance checks will alert you to resources that are adversely affected by capacity or connectivity issues. For example, you might have EBS volumes that are limited by the type of EC2 instance they are connected to.
Security
Security checks will advise where your security policies don’t meet best practice. You may have missing MFA on your root user, errors in IAM policies or vulnerabilities in security groups, like unrestricted ports.
Fault Tolerance
This pillar is where Trusted Advisor will alert you to issues that will be a problem should a resource fail or a zone experience an outage. You may have EBS volumes with no snapshots (backups) meaning you won’t be able to recover your data should the hardware hosting your EBS volume fail.
You may also be alerted to vulnerabilities like resources that only exist in a single availability zone, and so cannot stay available if that zone has an outage.
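The single availability zone check is easy to picture in code. The sketch below is not how Trusted Advisor implements it; it simply takes an invented inventory of resources, each tagged with a hypothetical `group` (for example an application tier) and an `az`, and reports the groups that live in only one zone.

```python
from collections import defaultdict
from typing import Dict, List, Set


def single_az_groups(resources: List[Dict[str, str]]) -> List[str]:
    """Return the groups whose resources all sit in one availability zone."""
    zones: Dict[str, Set[str]] = defaultdict(set)
    for resource in resources:
        zones[resource["group"]].add(resource["az"])
    return sorted(group for group, azs in zones.items() if len(azs) == 1)


# Invented inventory for illustration.
inventory = [
    {"group": "web", "az": "us-east-1a"},
    {"group": "web", "az": "us-east-1b"},
    {"group": "db", "az": "us-east-1a"},
]
print(single_az_groups(inventory))
```

Here the web tier is spread over two zones, while the database tier would be lost in an outage of its only zone.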
Service Limits
In this pillar, AWS Trusted Advisor notifies you of any resources that are approaching service limits.
Your account might be limited to 5 VPCs in a single region. As you approach or reach that limit, for example when you deploy your fifth VPC, Trusted Advisor will let you know that you should contact AWS Support and have your service limits increased.
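A limit check of this kind boils down to comparing usage against the quota. In the sketch below, the function name is hypothetical, and the 80% warning level is an assumption based on how Trusted Advisor's service limit checks are commonly described, so confirm it against the documentation for the checks you rely on.

```python
def limit_status(used: int, limit: int, warn_percent: int = 80) -> str:
    """Classify usage against a service limit as 'ok', 'warning' or 'error'.

    warn_percent is an assumed warning level, not a value taken from AWS.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive number")
    if used >= limit:
        return "error"      # limit reached: new resources cannot be created
    if used * 100 >= limit * warn_percent:
        return "warning"    # approaching the limit: request an increase
    return "ok"


# Using the VPC example: a limit of 5 VPCs per region.
for vpcs in (3, 4, 5):
    print(vpcs, limit_status(vpcs, 5))
```

With a limit of 5, the fourth VPC takes usage to 80% and the fifth reaches the limit itself.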
You can set up email alerts for different elements of the Trusted Advisor checks, so that results go to different people in your organisation as the checks are run.
Those are the main components of the Monitoring and Analytics section of AWS Cloud Practitioner Essentials. While none of the topics are explored in any great depth, you do get a good understanding of the tools and services available in AWS to observe whether your configurations are secure, performant and economically efficient.
You can use Hava diagrams to visualise your network topology and gain context when these AWS services are highlighting potential problems. The automatically generated diagrams show the resources deployed, the AZs they are deployed in and any load balancing in place, which is useful when investigating performance or fault tolerance issues.
Hava's security group diagrams will help you quickly zero in on any security or open port issues reported by Trusted Advisor.
You can view sample data or connect your own AWS account to a free 14-day trial of the fully featured Hava "Teams" plan to view the different network infrastructure and security diagrams auto-generated by Hava.