21 Oct 2025 by alex

AWS us-east-1 Outage Disrupted Major Apps; Amazon Cites DNS Issue With DynamoDB (Now Resolved)

A widespread incident in Amazon Web Services' us-east-1 region raised error rates and latency across multiple services, temporarily disrupting major apps including Snapchat, Venmo, Lyft, and Fortnite, as well as Amazon's own Alexa. AWS attributed the trigger to DNS resolution issues affecting DynamoDB endpoints. Amazon later said services had returned to normal operations, with backlogs clearing through the afternoon and evening.

Key timeline

3:11 AM ET: AWS reports elevated errors/latencies in us-east-1.
~5:01 AM ET: Root cause identified as a DNS resolution issue affecting DynamoDB API endpoints; mitigations begin.
6:35 AM ET: DNS issue mitigated; residual impacts persist (notably new EC2 instance launches).
8:48–10:14 AM ET: Progress continues; AWS rate-limits new EC2 instance launches to aid recovery.
6:01 PM ET (3:01 PM PT): AWS states services have returned to normal operations; backlogs processing.
Evening update: Amazon notes resolution of widespread errors and latencies.

Why this mattered

us-east-1 is among AWS's most heavily utilized regions. When DNS resolution for DynamoDB endpoints failed, many apps could no longer reach their data and control planes, even where those backends were otherwise healthy, and the failures cascaded into dependent services. An outage in a single hyperscale region can ripple across large portions of the internet.

What we’re watching

How organizations revisit multi-region or multi-cloud strategies for critical workloads.
Improvements to DNS resilience, resolver caching, and failover patterns.
Operational backlogs and delayed deployments following rate-limited EC2 launches.
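
To make the resolver-caching idea concrete, here is a minimal Python sketch of a "serve stale on failure" lookup cache. It is an illustration under stated assumptions, not a description of how AWS or any particular resolver behaves: the class name StaleTolerantResolver, the ttl and max_stale parameters, and the injectable lookup function are all hypothetical. Only socket.getaddrinfo is a real standard-library call.

```python
import socket
import time


class StaleTolerantResolver:
    """Caches lookups for `ttl` seconds and, if a refresh fails, keeps serving
    the last known answer for up to `max_stale` seconds past expiry.
    (Hypothetical helper for illustration only.)"""

    def __init__(self, lookup=None, ttl=60.0, max_stale=3600.0, clock=time.monotonic):
        self.lookup = lookup or self._system_lookup
        self.ttl = ttl
        self.max_stale = max_stale
        self.clock = clock
        self.cache = {}  # hostname -> (addresses, fetched_at)

    @staticmethod
    def _system_lookup(hostname):
        # Ask the operating system's resolver; port 443 is an assumption.
        infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
        return sorted({info[4][0] for info in infos})

    def resolve(self, hostname):
        now = self.clock()
        cached = self.cache.get(hostname)
        if cached and now - cached[1] < self.ttl:
            return cached[0]  # still fresh
        try:
            addresses = self.lookup(hostname)
            if not addresses:
                raise OSError("empty DNS answer")
        except OSError:
            # Refresh failed: fall back to the stale answer if it is not too old.
            if cached and now - cached[1] < self.ttl + self.max_stale:
                return cached[0]
            raise
        self.cache[hostname] = (addresses, now)
        return addresses
```

A stale answer only helps if the addresses behind it are still reachable, so a cache like this reduces exposure to resolution failures but does not replace regional failover.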

Builder takeaways

Design for regional failure: evaluate multi-region architectures where RTO/RPO demand it.
Avoid hard-coding to specific Availability Zones; enable flexible placement and failover.
Implement exponential backoff, circuit breakers, and graceful degradation.
Regularly test disaster recovery, run chaos drills, and review DNS and dependency maps.
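
The backoff, circuit-breaker, and graceful-degradation takeaway can be sketched in a few lines of Python. This is a minimal illustration rather than a production library: the names CircuitBreaker, retry_with_backoff, and call_with_fallback, along with every threshold and delay value, are assumptions chosen for the example.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected untried."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then allows a trial
    call once `reset_after` seconds have passed."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("dependency marked unhealthy")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.opened_at = None
        return result


def retry_with_backoff(fn, attempts=4, base=0.1, cap=2.0,
                       sleep=time.sleep, rng=random.random):
    """Retry `fn`, sleeping a random time up to min(cap, base * 2**n)
    between tries (exponential backoff with full jitter)."""
    for n in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # do not hammer a dependency the breaker has given up on
        except Exception:
            if n == attempts - 1:
                raise
            sleep(rng() * min(cap, base * 2 ** n))


def call_with_fallback(load, breaker, fallback, **retry_opts):
    """Graceful degradation: return `fallback` if the dependency stays down."""
    try:
        return retry_with_backoff(lambda: breaker.call(load), **retry_opts)
    except Exception:
        return fallback
```

For example, call_with_fallback(load_profile, breaker, cached_profile) would return the cached value once retries are exhausted or the breaker opens, where load_profile and cached_profile stand in for an application's own loader and last known good data. The jitter matters during a regional recovery because it spreads retries out instead of having every client retry at the same instant.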

Discussion: Will this incident meaningfully accelerate multi-region adoption, or do cost/complexity barriers still outweigh the risk for most teams?
