

EC2 Spot Interruptions - AWS Fault Injection Simulator
source link: https://dev.to/aws-builders/ec2-spot-interruptions-aws-fault-injection-simulator-31i2

Abstract
- AWS Fault Injection Simulator (FIS) now supports Spot interruptions, so you can use it to trigger the interruption of an Amazon EC2 Spot Instance.
- With FIS, you can test the resiliency of your workload and validate that your application reacts to the interruption notices that EC2 sends before terminating your instances.
- This post guides you step by step through creating FIS experiment templates with AWS CDK.
🚀 Overview of EC2 spot instance
- Amazon EC2 Spot Instances can cut costs by up to 90% compared with On-Demand prices, but EC2 can interrupt or reclaim them at any time with a two-minute warning.
- We can use aws-node-termination-handler to ensure that the Kubernetes control plane responds appropriately to events that can cause an EC2 instance to become unavailable.
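Outside Kubernetes, the same two-minute notice can also be read from the instance metadata service. The sketch below assumes the `spot/instance-action` metadata document, a small JSON object with `action` and `time` fields. The helper name `seconds_until_interruption` and the sample payload are illustrative only, and the HTTP polling itself (including the IMDSv2 token) is left out.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(instance_action_json, now):
    """Return (action, seconds left) from a spot/instance-action document."""
    doc = json.loads(instance_action_json)
    # The time field is reported in UTC, e.g. 2017-09-18T08:22:00Z.
    deadline = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return doc["action"], (deadline - now).total_seconds()


# Hypothetical payload, shaped like the metadata response.
sample = '{"action": "terminate", "time": "2017-09-18T08:22:00Z"}'
now = datetime(2017, 9, 18, 8, 20, 0, tzinfo=timezone.utc)
action, remaining = seconds_until_interruption(sample, now)
print(action, remaining)
```

A real poller would fetch this document every few seconds and start draining work as soon as it stops returning 404.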
🚀 Architecture for simulating Spot interruptions
- Start the FIS experiment, which runs the send-spot-instance-interruptions action.
- A CloudWatch event rule catches the EC2 Spot Instance Interruption Warning event and triggers a Lambda function that sends a Slack notification.
- The aws-node-termination-handler Kubernetes DaemonSet also takes action when it catches the event.
Now let's create the CDK stacks.
🚀 Create Lambda function - send Slack message
- The Lambda handler parses the event and sends a Slack message containing the event detail-type, instance ID, and action.
app.py

    import json
    from datetime import datetime

    import requests


    def send_slack(msg):
        """ Send payload to slack """
        webhook_url = "https://hooks.slack.com/services/******"
        footer_icon = 'https://cdkworkshop.com/images/new-cdk-logo.png'
        color = '#36C5F0'
        level = ':white_check_mark: INFO :white_check_mark:'
        curr_time = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
        payload = {
            "username": "Test",
            "attachments": [{
                "pretext": level,
                "color": color,
                "text": f"{msg}",
                "footer": f"{curr_time}",
                "footer_icon": footer_icon
            }]
        }
        requests.post(webhook_url, data=json.dumps(payload),
                      headers={'Content-Type': 'application/json'})


    def handler(event, context):
        detail_type = event.get('detail-type', '')
        instance_id = event['detail']['instance-id']
        action = event['detail']['instance-action']
        message = f'{detail_type}\nresource: {instance_id}, action: *{action}*'
        send_slack(message)
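To check the handler's parsing locally without calling Slack, the message formatting can be exercised on its own. This is a minimal sketch: `build_message` is a hypothetical helper that repeats the handler's formatting, and the sample event is modeled on the EC2 Spot Instance Interruption Warning event with placeholder account and instance IDs.

```python
def build_message(event):
    """Format the Slack text the same way the handler does."""
    detail_type = event.get('detail-type', '')
    instance_id = event['detail']['instance-id']
    action = event['detail']['instance-action']
    return f'{detail_type}\nresource: {instance_id}, action: *{action}*'


# Illustrative event; the account number and instance ID are placeholders.
sample_event = {
    "version": "0",
    "detail-type": "EC2 Spot Instance Interruption Warning",
    "source": "aws.ec2",
    "account": "123456789012",
    "region": "ap-northeast-1",
    "detail": {
        "instance-id": "i-1234567890abcdef0",
        "instance-action": "terminate",
    },
}

message = build_message(sample_event)
print(message)
```

Keeping the formatting separate from the HTTP call is one way to make the handler testable before it is zipped into `lambda-code/app.zip`.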
- Lambda stack

lambda.ts

    const send_slack = new lambda.Function(this, 'slackLambda', {
      description: 'Send Event message to slack',
      runtime: lambda.Runtime.PYTHON_3_8,
      code: lambda.Code.fromAsset('lambda-code/app.zip'),
      handler: 'app.handler',
      functionName: 'send-slack-spot-event'
    });
🚀 Create event rule of spot interruption
- The event rule listens for the EC2 Spot Instance Interruption Warning event and triggers the Lambda function above.

event.ts

    const spot_event = new event.Rule(this, 'SpotEventRule', {
      description: 'Spot termination event rule',
      ruleName: 'spot-event',
      eventPattern: {
        source: ['aws.ec2'],
        detailType: ['EC2 Spot Instance Interruption Warning'],
        detail: {
          'instance-action': ['terminate']
        }
      }
    });
    spot_event.addTarget(new event_target.LambdaFunction(send_slack));
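The rule's pattern can be reasoned about with a small matcher. The sketch below is a simplified model of event pattern matching that only handles exact values (a list means "any of these", a nested object is matched recursively), so it is not the full matching logic of the service. `matches` and `make_event` are hypothetical helpers, and the instance ID is a placeholder.

```python
def matches(pattern, event):
    """Simplified matcher: lists mean "any of", dicts are matched recursively."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# The rule's pattern in JSON form (CDK's detailType is rendered as detail-type).
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Spot Instance Interruption Warning"],
    "detail": {"instance-action": ["terminate"]},
}


def make_event(action):
    """Build an illustrative interruption warning with the given action."""
    return {
        "source": "aws.ec2",
        "detail-type": "EC2 Spot Instance Interruption Warning",
        "detail": {"instance-id": "i-1234567890abcdef0", "instance-action": action},
    }


terminate_matches = matches(pattern, make_event("terminate"))
stop_matches = matches(pattern, make_event("stop"))
print(terminate_matches, stop_matches)
```

Because the pattern filters on `instance-action`, warnings for Spot Instances configured to stop or hibernate on interruption would not trigger the Lambda function.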
🚀 Create FIS service role
- The IAM role grants AWS FIS the permissions it needs to act on the target resources, which here are EC2 instances.

fis_role.ts

    const fis_role = new iam.Role(this, 'FisRole', {
      roleName: 'spot-fis-test',
      assumedBy: new iam.ServicePrincipal('fis.amazonaws.com')
    });

    const ec2_policy_sts = new iam.PolicyStatement({
      sid: 'SpotFisTest',
      effect: iam.Effect.ALLOW,
      actions: [
        'ec2:DescribeInstances',
        'ec2:StopInstances',
        'ec2:SendSpotInstanceInterruptions'
      ],
      resources: ['arn:aws:ec2:ap-northeast-1:*:instance/*'],
      conditions: {
        'StringEquals': {'aws:RequestedRegion': props?.env?.region}
      }
    });
    fis_role.addToPolicy(ec2_policy_sts);
🚀 Create FIS Experiment Template
- The experiment template includes:
  - Action: send-spot-instance-interruptions, parameter: durationBeforeInterruption PT2M
  - Targets:
    - Resource type: aws:ec2:spot-instance
    - Resource filters: State.Name=running
    - Selection mode: COUNT(1)
- Stack

fis.ts

    const target: fis.CfnExperimentTemplate.ExperimentTemplateTargetProperty = {
      resourceType: 'aws:ec2:spot-instance',
      resourceTags: {'eks:nodegroup-name': 'eks-airflow-nodegroup-pet'},
      selectionMode: 'COUNT(1)',
      filters: [{
        path: 'State.Name',
        values: ['running']
      }]
    };

    const action: fis.CfnExperimentTemplate.ExperimentTemplateActionProperty = {
      actionId: 'aws:ec2:send-spot-instance-interruptions',
      parameters: {'durationBeforeInterruption': 'PT2M'},
      targets: {'SpotInstances': 'spot-fis-target'}
    };

    const fis_exp = new fis.CfnExperimentTemplate(this, 'FisExperiment', {
      description: 'Spot Interruption Simulate',
      roleArn: fis_role.roleArn,
      tags: {
        'Name': 'spot-interrupt-test',
        'cdk': 'fis-stack'
      },
      stopConditions: [
        {source: 'none'}
      ],
      targets: {'spot-fis-target': target},
      actions: {'send-spot-instance-interruptions': action}
    });
🚀 Start experiment template
- Start
- Complete
- Slack is notified of the event, and aws-node-termination-handler takes action as well.
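The experiment can also be started from a script instead of the console. The sketch below assumes a boto3 FIS client (`boto3.client("fis")`) and uses its `start_experiment` and `get_experiment` calls. `run_experiment`, the `FakeFisClient` stand-in, and the IDs are hypothetical and exist only so the example runs offline, and the set of final states is an assumption to verify against the FIS API reference.

```python
import time
import uuid

# Assumed final states of an experiment.
FINAL_STATES = ("completed", "stopped", "failed", "cancelled")


def run_experiment(fis_client, template_id, poll_seconds=5):
    """Start an FIS experiment and poll until it reaches a final state."""
    started = fis_client.start_experiment(
        experimentTemplateId=template_id,
        clientToken=str(uuid.uuid4()),  # idempotency token
    )
    experiment_id = started["experiment"]["id"]
    while True:
        current = fis_client.get_experiment(id=experiment_id)
        status = current["experiment"]["state"]["status"]
        if status in FINAL_STATES:
            return experiment_id, status
        time.sleep(poll_seconds)


class FakeFisClient:
    """Offline stand-in for boto3.client("fis"); not part of boto3."""

    def __init__(self):
        self._statuses = iter(["initiating", "running", "completed"])

    def start_experiment(self, experimentTemplateId, clientToken):
        return {"experiment": {"id": "EXP-demo"}}

    def get_experiment(self, id):
        return {"experiment": {"state": {"status": next(self._statuses)}}}


result = run_experiment(FakeFisClient(), "EXT-demo", poll_seconds=0)
print(result)
```

With real credentials, passing `boto3.client("fis")` and the ID of the deployed template in place of the fake client should run the same loop against the live experiment.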
🚀 Conclusion
- This kind of FIS experiment helps us test the Spot interruption scenario and check both aws-node-termination-handler and the fault tolerance of the application.
- We should also be aware of FIS pricing: AWS FIS costs $0.10 per action-minute.
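As a rough worked example of that price, the sketch below multiplies action-minutes by the quoted rate. It assumes that the single action in this template runs for about the two minutes set by durationBeforeInterruption (PT2M) and ignores any rounding of partial minutes. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

PRICE_PER_ACTION_MINUTE = Decimal("0.10")  # USD, the rate quoted in this post


def estimate_cost(action_minutes):
    """Estimate the FIS charge for a given number of action-minutes."""
    return PRICE_PER_ACTION_MINUTE * Decimal(action_minutes)


# One action running for roughly two minutes (PT2M).
cost = estimate_cost(2)
print(cost)
```

Under these assumptions, one run of the experiment costs about $0.20, on top of the cost of the Spot Instance itself.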