You can create a
CloudWatch alarm that monitors an Amazon EC2 instance and automatically recovers the instance if it becomes
impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair. Terminated instances cannot be recovered. A
recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata. If the impaired instance is in a placement group, the recovered instance runs in the placement group.
From the EC2 console, select the
Monitoring tab, then select
When creating the alarm, there are quite a few actions to choose from, as shown in the image, but here we will set up auto recovery only. When creating the alarm, we configure:
- Send notification if action is triggered (
- What action to perform when monitored condition is met
- The monitoring metric and threshold
- Name of the alarm
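For readers who prefer scripting over the console, the four settings above map fairly directly onto the parameters of CloudWatch's PutMetricAlarm API. The following is a minimal sketch, not the exact alarm the console creates: the function names, the alarm name pattern, and the choice of the Maximum statistic with a threshold of 1 are assumptions made for illustration, and the optional SNS topic ARN is a placeholder you would supply yourself.

```python
def build_recover_alarm(instance_id, region, sns_topic_arn=None):
    """Build keyword arguments for CloudWatch's PutMetricAlarm call.

    Assumption: names and the Maximum/threshold choice are illustrative only.
    """
    # What action to perform: the built-in EC2 recover action for this region.
    actions = [f"arn:aws:automate:{region}:ec2:recover"]
    # Send a notification when the action is triggered (optional SNS topic).
    if sns_topic_arn:
        actions.append(sns_topic_arn)
    return {
        # Name of the alarm (hypothetical naming pattern).
        "AlarmName": f"auto-recover-{instance_id}",
        "AlarmActions": actions,
        # The monitored metric: the system status check of one instance.
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        # The threshold: failing for 2 consecutive periods of 60 seconds.
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }


def create_recover_alarm(instance_id, region, sns_topic_arn=None):
    """Create the alarm; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    cloudwatch.put_metric_alarm(
        **build_recover_alarm(instance_id, region, sns_topic_arn)
    )
```

Keeping the parameter builder separate from the API call makes it possible to inspect or log the alarm definition before anything is created in the account.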
Note that for auto recovery, the CloudWatch alarm monitors
System Status Checks rather than
Instance Status Checks. AWS uses these two checks to determine the health of an instance.
System Status Checks: Check the AWS systems that the instance runs on, such as network connectivity and the underlying host. If this check fails, part of the underlying infrastructure has failed.
Instance Status Checks: Check whether the OS is accepting traffic. If this check fails, something is wrong at the OS level.
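To make the difference between the two checks concrete, here is a small sketch that classifies one entry of an EC2 DescribeInstanceStatus response. The function name and the returned labels are assumptions for illustration; the `SystemStatus` and `InstanceStatus` fields and the `impaired` and `ok` values come from the EC2 API.

```python
def classify_health(status_entry):
    """Classify one item from the "InstanceStatuses" list of a
    DescribeInstanceStatus response.

    The returned labels are hypothetical, chosen for this example only.
    """
    system = status_entry["SystemStatus"]["Status"]
    instance = status_entry["InstanceStatus"]["Status"]
    if system == "impaired":
        # Problem on the AWS side: this is what the recover alarm watches.
        return "system-check-failed"
    if instance == "impaired":
        # Problem inside the instance (OS level): recovery is not triggered.
        return "instance-check-failed"
    if system == "ok" and instance == "ok":
        return "healthy"
    # For example "initializing" or "insufficient-data".
    return "unknown"


# Sample entry shaped like the API response (values made up for the example).
sample = {
    "InstanceId": "i-0123456789abcdef0",
    "SystemStatus": {"Status": "impaired"},
    "InstanceStatus": {"Status": "ok"},
}
print(classify_health(sample))
```

With boto3, such entries would come from `boto3.client("ec2").describe_instance_status(InstanceIds=[...])["InstanceStatuses"]`.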
In the finished view, click the alarm’s name to go to the CloudWatch Alarms page.
Once on the CloudWatch Alarms page, select the alarm to see more detailed information. The alarm can be modified using the
Actions menu at the top of the window.
Once the alarm is configured, it can also be seen under the
Monitoring tab on the EC2 Instances page.
And that’s it. CloudWatch will keep monitoring the
System Status Checks metric and if the check fails continuously for two minutes (the threshold used in this post), the CloudWatch alarm will trigger the
Recover this instance action and migrate the instance to a different host (in effect, a stop followed by a start).
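The "fails for two minutes continuously" condition can be pictured as a check over the most recent one-minute datapoints. The sketch below is a simplified model of that rule, not CloudWatch's actual evaluation logic (which also handles missing data, among other things); the function name and the 0/1 datapoint encoding are assumptions.

```python
def should_recover(datapoints, evaluation_periods=2, threshold=1):
    """Return True if the last `evaluation_periods` datapoints all breach.

    `datapoints` is a list of per-minute values of the system status check
    metric, oldest first: 0 means the check passed, 1 means it failed.
    """
    if len(datapoints) < evaluation_periods:
        # Not enough data yet to decide.
        return False
    recent = datapoints[-evaluation_periods:]
    return all(value >= threshold for value in recent)


# One failed minute on its own is not enough to trigger recovery.
print(should_recover([0, 0, 1]))
# Two consecutive failed minutes trigger the recover action.
print(should_recover([0, 1, 1]))
```

Requiring consecutive failures avoids recovering an instance because of a single transient failed check.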
Recover Your Instance