AWS us-east-1 Outage Disrupted Major Apps; Amazon Cites DNS Issue With DynamoDB (Now Resolved)
A widespread incident in Amazon Web Services’ us-east-1 region on October 20 (ET) caused increased error rates and latency across multiple services, temporarily disrupting major apps including Snapchat, Venmo, Lyft, Fortnite, and even Amazon’s own Alexa. By the afternoon, AWS said most services had returned to normal operations, with request backlogs clearing into the evening, and Amazon later confirmed that the issue had been resolved.
What happened
AWS identified the trigger as a DNS resolution issue affecting the DynamoDB API endpoints in us-east-1. As a result, many applications could not reliably reach their databases, leading to timeouts, elevated API error rates, and cascading failures across dependent services.
Timeline highlights (ET)
- 3:11 AM: AWS reports increased error rates/latencies for multiple services in us-east-1.
- ~5:01 AM: Root cause identified as a DNS resolution issue affecting the DynamoDB APIs; mitigations begin.
- 6:35 AM: DNS issue mitigated; residual impacts remain, especially for new EC2 instance launches.
- 8:48–10:14 AM: Progress continues; AWS rate-limits new EC2 launches to stabilize recovery.
- 3:01 PM: AWS reports services back to normal operations; backlogs processing.
- 6:53 PM: Amazon notes resolution of the widespread errors and latencies.
Why it mattered
us-east-1 is one of AWS’s most heavily used regions. When DNS cannot resolve a critical service endpoint like DynamoDB’s, applications are effectively cut off from their data, and the impact ripples across a large portion of the internet, hitting consumer apps and enterprise systems alike.
Reportedly affected during the incident
- Alexa voice requests and routines
- Snapchat, Venmo, Lyft
- Fortnite, Roblox
- Streaming and media apps (e.g., Disney+) and numerous websites
Builder takeaways
- Evaluate multi-region designs where RTO/RPO demand regional resilience; avoid single-region dependencies on critical data paths (see the global-table failover sketch after this list).
- Avoid hard-coding deployments to specific Availability Zones to maximize failover flexibility.
- Harden clients with exponential backoff, circuit breakers, and graceful degradation (see the retry and circuit-breaker sketch after this list).
- Review DNS and resolver caching strategies; regularly run DR and chaos drills.
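For the multi-region point, one common pattern is to replicate the critical table with DynamoDB global tables and fail reads over to a replica region when the primary is unreachable. The sketch below is a minimal illustration, not AWS reference code: the `orders` table name, key schema, and region pair are hypothetical, and boto3’s `standard` retry mode already applies exponential backoff with jitter on each client.

```python
import boto3
from botocore.config import Config
from botocore.exceptions import BotoCoreError, ClientError

# Bounded timeouts plus botocore's built-in retries (exponential backoff with jitter).
_CFG = Config(connect_timeout=2, read_timeout=5,
              retries={"max_attempts": 4, "mode": "standard"})

# Hypothetical setup: an "orders" global table replicated to us-east-1 and us-west-2.
_CLIENTS = [
    boto3.client("dynamodb", region_name="us-east-1", config=_CFG),  # primary
    boto3.client("dynamodb", region_name="us-west-2", config=_CFG),  # replica
]

def get_order(order_id: str) -> dict:
    """Read from the primary region; fall back to the replica if it is unreachable."""
    key = {"order_id": {"S": order_id}}
    last_err = None
    for client in _CLIENTS:
        try:
            return client.get_item(TableName="orders", Key=key)
        except (ClientError, BotoCoreError) as err:
            last_err = err  # try the next region's replica
    raise RuntimeError("orders table unreachable in all configured regions") from last_err
```

Cross-region reads against a replica are eventually consistent, so this trade-off only makes sense for data paths where serving slightly stale data beats serving an error.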
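For the client-hardening point, here is a minimal sketch of capped exponential backoff with full jitter, a simple circuit breaker, and a stale-cache fallback for graceful degradation. The class, function names, and thresholds are illustrative; `fetch_fn` stands in for whatever call reaches the flaky dependency.

```python
import random
import time

class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are failed fast."""

class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; allows a trial call after `reset_after` seconds."""

    def __init__(self, max_failures: int = 5, reset_after: float = 30.0):
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures, self.opened_at = 0, None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result

def with_backoff(fn, attempts: int = 5, base: float = 0.2, cap: float = 5.0):
    """Retry `fn` with capped exponential backoff and full jitter between attempts."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise  # no point retrying while the breaker is open
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))

_cache: dict = {}  # last known good values, used for graceful degradation

def get_profile(user_id: str, fetch_fn, breaker: CircuitBreaker):
    """Try the live dependency; on failure, serve the last cached value if one exists."""
    try:
        value = with_backoff(lambda: breaker.call(fetch_fn, user_id))
        _cache[user_id] = value
        return value
    except Exception:
        if user_id in _cache:
            return _cache[user_id]  # stale but usable while the dependency recovers
        raise
```

The point of the fallback is that during an incident like this one, returning a slightly stale profile (or a reduced feature set) usually beats returning a hard error to the user.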
Discussion: Will this outage accelerate multi-region or multi-cloud adoption, or do cost and complexity still outweigh the resilience benefits for most teams?
