This guide walks you through upgrading EC2 instance types for generic EC2 Auto Scaling Groups (ASGs) behind a load balancer. You’ve got two main paths:
  • Option 1 (recommended for traffic-facing apps): Blue/Green deployment. Create a parallel “green” environment with new instance types, validate it, then cut over via DNS.
  • Option 2: Instance Refresh deployment. Update an existing ASG in place using Instance Refresh and a new Launch Template version.
If you’re coming from the MilkStraw AI recommender, you’ll plug the recommended instance type(s) into the Launch Template steps below.

Before you start

Make sure:
  • You can log in to the AWS Console and have IAM permissions for:
    • EC2, Auto Scaling, ELB/ALB/NLB, IAM, and Route 53 (if you manage DNS).
  • You understand your app’s ingress:
    • Which load balancer you use: ALB / NLB / Classic.
    • Which Target Group(s) your ASGs register into and what health checks they use.
    • Where your DNS (e.g., api.example.com) points today.

Option 1 · Blue/Green deployment (parallel environment)

You’ll create a parallel green environment (new Target Group, Load Balancer, ASG with new instance types), validate it, then update DNS to send traffic to green. Rollback is just switching DNS back.

Step 1 · Create the parallel (green) environment

1.1 Create a new Target Group (green)

In the Target Groups section (EC2 console):
  • Target type
    • Usually instance for Auto Scaling Groups.
  • Health checks
    • Protocol/port: match your app (e.g., HTTP:80 or HTTPS:443).
    • Path: realistic health endpoint (e.g., /health, /status).
    • Thresholds: set conservative values for prod (e.g., slightly more lenient than default until you’re confident).
  • Deregistration delay
    • Typical values: 60–300 seconds.
    • This controls how long in-flight connections are allowed to drain after instance removal.
This is your green Target Group.
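
If you script this step instead of clicking through the console, the settings above map onto two API calls. The following is a minimal sketch, assuming boto3's elbv2 client; the function names, the target group name, and the threshold values are illustrative choices, not recommendations from this guide.

```python
def green_target_group_params(name, vpc_id, port=80, health_path="/health"):
    """Build keyword arguments for elbv2.create_target_group (boto3)."""
    return {
        "Name": name,
        "Protocol": "HTTP",
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "instance",          # usual choice for Auto Scaling Groups
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPath": health_path,
        "HealthCheckIntervalSeconds": 30,  # illustrative values; tune for your app
        "HealthyThresholdCount": 3,
        "UnhealthyThresholdCount": 3,
    }


def deregistration_delay_attribute(seconds):
    """Build the target group attribute that sets the connection draining window."""
    if not 0 <= seconds <= 3600:
        raise ValueError("deregistration delay must be between 0 and 3600 seconds")
    return [{"Key": "deregistration_delay.timeout_seconds", "Value": str(seconds)}]


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# elbv2 = boto3.client("elbv2")
# tg = elbv2.create_target_group(
#     **green_target_group_params("production-api-green-tg", "vpc-0123456789abcdef0"))
# elbv2.modify_target_group_attributes(
#     TargetGroupArn=tg["TargetGroups"][0]["TargetGroupArn"],
#     Attributes=deregistration_delay_attribute(120))
```

The deregistration delay is set as a target group attribute after creation, which is why it lives in a separate helper.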

1.2 Create a new Load Balancer (green)

In the Load Balancers section:
  • Choose ALB (HTTP/HTTPS) or NLB (TCP/UDP/TLS) to match your app.
  • Subnets
    • Use the same subnets/AZs as your current (blue) environment for high availability.
  • Security group
    • For ALB: mirror rules from the blue ALB; review any source IP allowlists or security boundaries.
  • Listeners & routing
    • Recreate the listeners you have on the blue LB (ports, protocols, SSL certs).
    • Forward listener traffic to the green Target Group you created above.
    • Any path-based routing rules from blue should be mirrored.
This LB + Target Group pair forms the front door of your green environment.
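
The listener that forwards to the green Target Group can be sketched as below, assuming boto3's elbv2 client. The helper name and its arguments are hypothetical; it covers only a single default forward action, so any path-based rules from blue would still need to be recreated separately.

```python
def green_listener_params(load_balancer_arn, target_group_arn, certificate_arn=None):
    """Build keyword arguments for elbv2.create_listener (boto3).

    With a certificate ARN the listener is HTTPS on 443, otherwise HTTP on 80.
    """
    params = {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "HTTPS" if certificate_arn else "HTTP",
        "Port": 443 if certificate_arn else 80,
        # Default action: send everything to the green Target Group.
        "DefaultActions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }
    if certificate_arn:
        params["Certificates"] = [{"CertificateArn": certificate_arn}]
    return params


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("elbv2").create_listener(
#     **green_listener_params(green_lb_arn, green_tg_arn, certificate_arn=cert_arn))
```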

1.3 Create or update a Launch Template

You’ll now create a new Launch Template version for the new instance type.
  1. Go to EC2 → Launch Templates.
  2. Find your current template used by the blue ASG.
  3. Choose Create new version.
In the new version:
  • Instance type
    • Set this to the new instance type, for example:
      m6a.xlarge
      
    • If you’re using MilkStraw, plug in the MilkStraw-recommended instance type.
  • Root volume size
    • Adjust if your application needs more disk space:
      • Typical: 50–100 GiB.
      • More if you store large logs/cache/data on the instance.
  • User data
    • Adjust only if needed (e.g., different bootstrap logic, AMI, or agent configuration).
    • Keep it as close as possible to the blue environment to reduce variables.
Save this as Launch Template Version N+1.
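
For reference, creating Version N+1 from an existing version can be sketched as follows, assuming boto3's EC2 client. The helper name is hypothetical, and the root device name (/dev/xvda here) is an assumption that depends on your AMI.

```python
def new_template_version_params(template_id, source_version, instance_type,
                                root_gib=None, root_device="/dev/xvda"):
    """Build keyword arguments for ec2.create_launch_template_version (boto3).

    Settings not listed in LaunchTemplateData are inherited from source_version.
    """
    data = {"InstanceType": instance_type}
    if root_gib is not None:
        # root_device is an assumption; check the root device name of your AMI.
        data["BlockDeviceMappings"] = [
            {"DeviceName": root_device, "Ebs": {"VolumeSize": root_gib}}
        ]
    return {
        "LaunchTemplateId": template_id,
        "SourceVersion": str(source_version),
        "VersionDescription": f"instance type {instance_type}",
        "LaunchTemplateData": data,
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("ec2").create_launch_template_version(
#     **new_template_version_params("lt-0123456789abcdef0", 3, "m6a.xlarge", root_gib=100))
```

Basing the new version on the current one keeps user data, AMI, and security groups identical to blue, so the instance type is the only variable that changes.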

1.4 Create a new Auto Scaling Group (green)

  1. Go to EC2 → Auto Scaling Groups.
  2. Click Create Auto Scaling group.
Configure:
  • Name
    • Use your naming convention, e.g.:
      production-api-asg-v2
      
  • Launch template
    • Select your existing Launch Template and explicitly choose Version N+1 (the one with the new instance type).
  • VPC & Subnets
    • Use the same VPC and subnets as the current (blue) ASG.
  • Attach to load balancer
    • Attach the new ASG to the green Target Group you created in Step 1.1.
    • Make sure it’s not accidentally pointing to blue’s Target Group.
  • Scaling settings
    • Example for prod:
      Desired capacity: 2
      Minimum capacity: 1
      Maximum capacity: 10
      
    • Align with your expected load and existing scaling policies.
  • Health checks
    • Enable ELB / Target Group health checks.
    • Set Health check grace period realistically (e.g., 60–120s) to allow app startup.
  • (Optional) Instance warm-up
    • Set a value close to your real cold-start time (e.g., app launch + cache warm).
  • (Optional) Mixed instances / Spot
    • Only use mixed instances or Spot if you understand interruption behavior and scaling policies.
Create the ASG and wait for instances to launch and register as healthy in the green Target Group.
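
The same ASG configuration can be expressed as an API request. This is a minimal sketch assuming boto3's Auto Scaling client; the helper name, IDs, and default values are illustrative and mirror the example settings above.

```python
def green_asg_params(name, template_id, version, subnet_ids, target_group_arn,
                     desired=2, minimum=1, maximum=10, grace_seconds=120):
    """Build keyword arguments for autoscaling.create_auto_scaling_group (boto3)."""
    if not minimum <= desired <= maximum:
        raise ValueError("capacity must satisfy minimum <= desired <= maximum")
    return {
        "AutoScalingGroupName": name,
        # Pin the explicit version so the ASG launches the new instance type.
        "LaunchTemplate": {"LaunchTemplateId": template_id, "Version": str(version)},
        "MinSize": minimum,
        "MaxSize": maximum,
        "DesiredCapacity": desired,
        "VPCZoneIdentifier": ",".join(subnet_ids),  # comma-separated subnet IDs
        "TargetGroupARNs": [target_group_arn],      # must be the green Target Group
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": grace_seconds,
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("autoscaling").create_auto_scaling_group(
#     **green_asg_params("production-api-asg-v2", "lt-0123456789abcdef0", 4,
#                        ["subnet-aaa", "subnet-bbb"], green_tg_arn))
```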

1.5 Validate the green environment

Use the green ALB’s DNS name directly (e.g., internal-xyz-green-123456.us-east-1.elb.amazonaws.com) to test. Validate:
  • Application behavior
    • Run smoke tests and key user flows.
  • Target Group health
    • All expected instances show as healthy.
  • Metrics
    • HTTP 5xx rate.
    • Latency and throughput.
    • CPU/memory (CloudWatch).
  • Network & security
    • TLS certificates and ciphers (if HTTPS).
    • Redirects (HTTP → HTTPS, domain redirects).
    • WAF rules (if present).
    • Access logs, if configured.
Only proceed once the green environment is stable and looks production-ready.
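
The Target Group health check can be automated as a gate before cutover. The sketch below assumes you pass it the response of boto3's elbv2 describe_target_health call; the function names are hypothetical, and it covers only the health portion of the validation list, not smoke tests or metrics.

```python
def unhealthy_targets(response):
    """Return IDs of targets whose state is anything other than 'healthy'."""
    return [
        d["Target"]["Id"]
        for d in response.get("TargetHealthDescriptions", [])
        if d["TargetHealth"]["State"] != "healthy"
    ]


def green_is_ready(response, expected_count):
    """True when at least expected_count targets are registered and all are healthy."""
    registered = len(response.get("TargetHealthDescriptions", []))
    return registered >= expected_count and not unhealthy_targets(response)


# Example with a hand-written response in the shape describe_target_health returns.
sample = {
    "TargetHealthDescriptions": [
        {"Target": {"Id": "i-aaa", "Port": 80}, "TargetHealth": {"State": "healthy"}},
        {"Target": {"Id": "i-bbb", "Port": 80}, "TargetHealth": {"State": "initial"}},
    ]
}
print(unhealthy_targets(sample))   # prints ['i-bbb']
print(green_is_ready(sample, 2))   # prints False
```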

Step 2 · DNS preparations

You’ll cut over by changing DNS to point to the green ALB.
  1. In Route 53 (or your DNS provider):
    • Find the public record, for example api.example.com.
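    • Lower the TTL to 60–300 seconds.
    • Do this at least one full old-TTL period before the cutover (15–30 minutes is enough only if the previous TTL was that short), so that resolvers which cached the record under the old TTL have expired it by then.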
  2. If you front the ALB with CloudFront or another CDN:
    • Plan to update the Origin or CNAME there as well.
    • Allow for CDN cache and DNS TTLs in your timing.
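
The timing rule behind the TTL change is simple arithmetic: a resolver may have cached the record just before you lowered the TTL, and it will keep that answer for the full old TTL. A small sketch of the calculation (the function name is hypothetical, and CDN caching is not modeled):

```python
from datetime import datetime, timedelta


def earliest_safe_cutover(ttl_lowered_at, old_ttl_seconds):
    """Earliest time at which every resolver has picked up the lowered TTL.

    Worst case: a resolver cached the record right before the change and
    holds it for the full old TTL.
    """
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


# Old TTL of 1 hour, lowered at 12:00: do not cut over before 13:00.
print(earliest_safe_cutover(datetime(2024, 1, 1, 12, 0), 3600))
# prints 2024-01-01 13:00:00
```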

Step 3 · DNS cutover to green

Once validation is complete and TTLs are lowered:
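  1. Update the DNS record for your public hostname (e.g., api.example.com) to point to the green ALB’s DNS name. This is a CNAME for a subdomain; a zone apex cannot hold a CNAME, so use a Route 53 alias record there.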
  2. Wait a few TTLs for propagation.
  3. Confirm traffic is hitting green:
    • Check green Target Group request counts.
    • Compare logs and metrics against blue’s previous baseline.
Rollback is simply:
  • Change the CNAME back to the blue ALB.
  • Wait for TTL to expire.
  • Confirm traffic is back on blue.
Keep the blue environment running for a bake period so you can easily revert if needed.
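
Cutover and rollback are the same record change with a different target, which the sketch below makes explicit. It assumes Route 53 with a plain CNAME record and boto3's route53 client; the helper name, hosted zone ID, and load balancer hostnames are placeholders.

```python
def cname_change_batch(record_name, lb_dns_name, ttl=60, comment=""):
    """Build the ChangeBatch for route53.change_resource_record_sets (boto3).

    UPSERT creates the record if it is missing and overwrites it otherwise.
    """
    return {
        "Comment": comment,
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": lb_dns_name}],
                },
            }
        ],
    }


# Placeholder hostnames for the two load balancers.
GREEN_LB = "green-lb.example.elb.amazonaws.com"
BLUE_LB = "blue-lb.example.elb.amazonaws.com"

cutover = cname_change_batch("api.example.com", GREEN_LB, comment="cut over to green")
rollback = cname_change_batch("api.example.com", BLUE_LB, comment="roll back to blue")

# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("route53").change_resource_record_sets(
#     HostedZoneId="Z0000000EXAMPLE", ChangeBatch=cutover)
```

Preparing the rollback batch ahead of time keeps the revert path to a single call during the bake period.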

Option 2 · Instance Refresh deployment (in-place ASG rollout)

If you don’t want to create a second ASG and LB, you can use Instance Refresh to roll the existing ASG to a new instance type by updating the Launch Template version. This is often suitable for:
  • Worker fleets.
  • Internal services.
  • Apps where a controlled, in-place rollout is acceptable.
This carries more risk than Blue/Green because you’re modifying the production environment directly, but Instance Refresh with a high minimum healthy percentage can still be quite safe.

Step 1 · Prepare a new Launch Template version

Just like in Option 1:
  1. Go to EC2 → Launch Templates.
  2. Find the template used by your existing ASG.
  3. Create new version:
    • Set Instance type to the new value (e.g., m6a.xlarge).
    • Adjust root volume size and user data if needed.
  4. Save as Version N+1.
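Then update your ASG (in its Launch template settings) to use Launch Template Version N+1. If the ASG tracks the template’s Default or Latest version rather than a fixed version number, confirm that it now resolves to N+1.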
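
Pointing the existing ASG at the new version is a single update call. A minimal sketch, assuming boto3's Auto Scaling client; the helper name and IDs are hypothetical.

```python
def asg_template_update_params(asg_name, template_id, version):
    """Build keyword arguments for autoscaling.update_auto_scaling_group (boto3).

    version may be a number, or the strings "$Latest" or "$Default".
    """
    return {
        "AutoScalingGroupName": asg_name,
        "LaunchTemplate": {"LaunchTemplateId": template_id, "Version": str(version)},
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("autoscaling").update_auto_scaling_group(
#     **asg_template_update_params("production-api-asg", "lt-0123456789abcdef0", 4))
```

Changing the version alone does not replace running instances; it only affects instances launched afterwards.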

Step 2 · Start Instance Refresh

  1. In Auto Scaling Groups, select the ASG.
  2. Choose Instance refresh → Start instance refresh.
Configure:
  • Min healthy percentage
    • Example: 90–100% for a cautious rollout.
    • This means the ASG will ensure 90–100% of the desired capacity stays healthy during the refresh.
    • Lower only if your app can handle reduced capacity during rotation.
  • Warm-up time
    • Use your app’s realistic startup time + load balancer health check grace period.
    • Example: 120 seconds or more for heavier apps.
  • (Optional) Skip matching
    • Enable this if you only want to replace instances that don’t match the new config; instances that already match are left in place.
    • Leave it off if you want all instances replaced, even those that already match the new template.
Start the refresh.
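
The refresh settings above correspond to the Preferences of the start call. The following is a sketch assuming boto3's Auto Scaling client; the helper name and defaults are illustrative.

```python
def instance_refresh_params(asg_name, min_healthy_percentage=90,
                            warmup_seconds=120, skip_matching=False):
    """Build keyword arguments for autoscaling.start_instance_refresh (boto3)."""
    if not 0 <= min_healthy_percentage <= 100:
        raise ValueError("min_healthy_percentage must be between 0 and 100")
    return {
        "AutoScalingGroupName": asg_name,
        "Strategy": "Rolling",
        "Preferences": {
            "MinHealthyPercentage": min_healthy_percentage,
            "InstanceWarmup": warmup_seconds,
            # True leaves instances that already match the desired
            # configuration in place; False replaces every instance.
            "SkipMatching": skip_matching,
        },
    }


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# resp = boto3.client("autoscaling").start_instance_refresh(
#     **instance_refresh_params("production-api-asg", min_healthy_percentage=100))
# refresh_id = resp["InstanceRefreshId"]
```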

Step 3 · Monitor and validate

While Instance Refresh runs:
  • Monitor:
    • ASG health status.
    • Load balancer Target Group health.
    • Application errors and latency.
    • Capacity and scaling alarms.
If issues appear:
  • You can Cancel instance refresh.
  • Any instances already replaced remain, but the ASG stops further replacements.
  • If needed, revert the ASG back to the previous Launch Template version and plan a new rollout.
Once the refresh completes:
  • Confirm all instances are running the new instance type.
  • Run smoke tests and verify that error rates and performance look good.
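
Monitoring can be scripted by polling the refresh status. The sketch below interprets a response in the shape returned by boto3's describe_instance_refreshes; the function name and the three outcome labels are hypothetical simplifications of the status values the API reports.

```python
# Statuses after which the refresh will make no further progress.
STOPPED_STATUSES = {"Failed", "Cancelled", "RollbackFailed", "RollbackSuccessful"}


def refresh_outcome(response, refresh_id):
    """Classify one instance refresh as 'done', 'stopped', 'running', or 'unknown'."""
    for refresh in response.get("InstanceRefreshes", []):
        if refresh["InstanceRefreshId"] != refresh_id:
            continue
        status = refresh["Status"]
        if status == "Successful":
            return "done"
        if status in STOPPED_STATUSES:
            return "stopped"
        return "running"  # e.g. Pending, InProgress, Cancelling
    return "unknown"


# Example with a hand-written response.
sample = {
    "InstanceRefreshes": [
        {"InstanceRefreshId": "r-1", "Status": "InProgress", "PercentageComplete": 40},
        {"InstanceRefreshId": "r-0", "Status": "Cancelled"},
    ]
}
print(refresh_outcome(sample, "r-1"))  # prints running
print(refresh_outcome(sample, "r-0"))  # prints stopped

# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# resp = boto3.client("autoscaling").describe_instance_refreshes(
#     AutoScalingGroupName="production-api-asg", InstanceRefreshIds=[refresh_id])
# print(refresh_outcome(resp, refresh_id))
```

A "stopped" outcome is the cue to check which Launch Template version the ASG references before planning a new rollout.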
