cvwiki

CloudWatch

Nov 5, 2022

# CloudWatch

Ingest, store and manage metrics

# CloudWatch - Data

# CloudWatch Logs

# CloudWatch Events and EventBridge

Default Event Bus

# Misc. Notes

# Basic Monitoring and Detailed Monitoring

Many AWS services offer basic monitoring by publishing a default set of metrics to CloudWatch with no charge to customers. By default, when you start using one of these AWS services, basic monitoring is automatically enabled. For a list of services that offer basic monitoring, see  AWS services that publish CloudWatch metrics.

Detailed monitoring is offered by only some services. It also incurs charges. To use it for an AWS service, you must choose to activate it. For more information about pricing, see  Amazon CloudWatch pricing.

Detailed monitoring options differ based on the services that offer it. For example, Amazon EC2 detailed monitoring provides more frequent metrics, published at one-minute intervals, instead of the five-minute intervals used in Amazon EC2 basic monitoring. Detailed monitoring for Amazon S3 and Amazon Managed Streaming for Apache Kafka means more fine-grained metrics.

In different AWS services, detailed monitoring also has different names. For example, in Amazon EC2 it is called detailed monitoring, in AWS Elastic Beanstalk it is called enhanced monitoring, and in Amazon S3 it is called request metrics.

Using detailed monitoring for Amazon EC2 helps you better manage your Amazon EC2 resources, so that you can find trends and take action faster. For Amazon S3 request metrics are available at one-minute intervals to help you quickly identify and act on operational issues. On Amazon MSK, when you enable the PER_BROKERPER_TOPIC_PER_BROKER, or PER_TOPIC_PER_PARTITION level monitoring, you get additional metrics that provide more visibility.

The following list shows the services that offer detailed monitoring.

# #sysops Scenarios

Question: A SysOps team is in the process of automating tasks to expedite the time to recover Amazon instances in the event of underlying hardware failure. The team must ensure that the attached Elastic IP and the private IP address of the original instance are retained after the instance is recovered. Additionally, an automated email notification should be in place to inform everyone in the SysOps team once a recovery process is triggered to run. Answer: You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically recovers the instance if it becomes impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair. Terminated instances cannot be recovered. A recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata. If the impaired instance has a public IPv4 address, the instance retains the public IPv4 address after recovery. If the impaired instance is in a placement group, the recovered instance runs in the placement group. When the StatusCheckFailed_System alarm is triggered, and the recovery action is initiated, you will be notified by the Amazon SNS topic that you selected when you created the alarm and associated the recover action. During instance recovery, the instance is migrated during an instance reboot, and any data that is in-memory is lost. When the process is complete, information is published to the SNS topic you’ve configured for the alarm. Anyone who is subscribed to this SNS topic will receive an email notification that includes the status of the recovery attempt and any further instructions. You will notice an instance reboot on the recovered instance. It’s worth noting here that the StatusCheckFailed_Instance metric in CloudWatch just monitors the software and network configuration of your individual instance. It does not detect if the underlying hardware failed.