(Illustration: Screenshot of AWS Health Dashboard at 2025-10-20 12:51 PDT. Image source: Ernest.)
✳️ tl;dr
- The following content comes from an official AWS report 1. AWS Community Hero Ernest 2 has segmented and highlighted it from the perspective of a developer and technical manager, aiming to stay close to the facts and to base all reasoning and extended learning on them.
- Through studying this report, we hope that both parties (AWS and us as AWS customers) can accumulate experience and continue to improve together, whether in the cloud or on-premises.
- Unless otherwise specified, all times below are in Pacific Daylight Time (PDT, UTC-7), the local time at AWS headquarters in Seattle on the US West Coast.
- This note begins with a knowledge graph, followed by a breakdown of the original official report in four sections: Amazon DynamoDB, Amazon EC2, Network Load Balancer (NLB), and Other AWS Services.
- If you have the budget to adjust your architecture for cross-region high availability but not enough time for major architectural changes, consider AWS services with “global” in their name. For example, “Amazon DynamoDB Global Tables”, from the same DynamoDB family, was almost unaffected during this incident.
- We wanted to provide you with some additional information about the service disruption that occurred
- in the N. Virginia (us-east-1) Region 3
- on October 19 and 20, 2025.
- While the event started at 11:48 PM PDT on October 19 (2025-10-20 14:48 Taipei time, UTC+8)
- and ended at 2:20 PM PDT on October 20 (2025-10-21 05:20 Taipei time, UTC+8),
- there were three distinct periods of impact to customer applications.
- First, between 11:48 PM on October 19 and 2:40 AM on October 20, Amazon DynamoDB experienced increased API error rates in the N. Virginia (us-east-1) Region.
- Second, between 5:30 AM and 2:09 PM on October 20, Network Load Balancer (NLB) experienced increased connection errors for some load balancers in the N. Virginia (us-east-1) Region.
- This was caused by health check failures in the NLB fleet, which resulted in increased connection errors on some NLBs.
- Third, between 2:25 AM and 10:36 AM on October 20, new EC2 instance launches failed and, while instance launches began to succeed from 10:37 AM, some newly launched instances experienced connectivity issues which were resolved by 1:50 PM.
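
To make the timeline above easier to cross-check, here is a minimal Python sketch that converts the PDT timestamps quoted from the report into Taipei time and computes how long each impact window lasted. The names `IMPACT_WINDOWS`, `pdt`, `to_taipei`, and `summarize` are illustrative and do not come from the AWS report. The sketch assumes fixed offsets (PDT = UTC-7, Taipei = UTC+8), which hold for these dates.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets are an assumption that is valid for October 2025:
# PDT is UTC-7 and Taipei is UTC+8 (no daylight saving time).
PDT = timezone(timedelta(hours=-7), "PDT")
TAIPEI = timezone(timedelta(hours=8), "UTC+8")


def pdt(month, day, hour, minute):
    """Build a timezone-aware 2025 timestamp in PDT."""
    return datetime(2025, month, day, hour, minute, tzinfo=PDT)


# Start and end times exactly as quoted in the report (all PDT).
IMPACT_WINDOWS = {
    "Overall event": (pdt(10, 19, 23, 48), pdt(10, 20, 14, 20)),
    "DynamoDB API errors": (pdt(10, 19, 23, 48), pdt(10, 20, 2, 40)),
    "NLB connection errors": (pdt(10, 20, 5, 30), pdt(10, 20, 14, 9)),
    "EC2 launch failures": (pdt(10, 20, 2, 25), pdt(10, 20, 10, 36)),
}


def to_taipei(moment):
    """Convert a timezone-aware timestamp to Taipei time."""
    return moment.astimezone(TAIPEI)


def summarize(windows):
    """Return (label, start in Taipei, end in Taipei, duration) rows."""
    fmt = "%Y-%m-%d %H:%M"
    return [
        (label,
         to_taipei(start).strftime(fmt),
         to_taipei(end).strftime(fmt),
         end - start)
        for label, (start, end) in windows.items()
    ]


for label, start, end, duration in summarize(IMPACT_WINDOWS):
    print(f"{label}: {start} to {end} (Taipei), lasting {duration}")
```

Listing the windows side by side also shows that the three impact periods overlap: EC2 launch failures began at 2:25 AM, before the DynamoDB error window closed at 2:40 AM, and NLB connection errors began at 5:30 AM while EC2 launches were still failing.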