AWS US‑EAST‑1 Outage Disrupts Major Apps, Services (Oct 20, 2025)
On Oct. 20, 2025, a significant outage in Amazon Web Services’ US‑EAST‑1 region caused widespread disruptions across many popular websites, apps and services. Companies and consumers reported slow responses, failed transactions and error pages for services including Venmo, Lyft, Snapchat, Canva, Fortnite and even Amazon’s Alexa.
According to AWS status updates, the root cause was a DNS resolution issue affecting the DynamoDB API, first noted in the early hours of the morning. AWS applied mitigations, and by mid‑afternoon many services were returning to normal, though some customers still faced elevated error rates and delayed recoveries due to backlogs and EC2 instance launch limitations.
- Root cause: DNS resolution problems affecting the DynamoDB API in the US‑EAST‑1 (Northern Virginia) region.
- Impact: Intermittent or full outages for apps that rely on AWS (payments, social apps, games, streaming, home assistants).
- Mitigation: AWS applied DNS fixes, rate‑limited new EC2 instance launches, and recommended that customers not tie new deployments to a single Availability Zone, to aid recovery.
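To make the root cause above concrete: when DNS resolution fails, a client cannot find the address of an API endpoint even if the service behind it is healthy. The following is a minimal sketch of a resolution probe, not AWS tooling. The function name `can_resolve` and its `resolver` parameter are hypothetical, introduced only for illustration.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if `hostname` resolves to at least one address.

    `resolver` defaults to the standard library lookup; it is a parameter
    only so the probe can be exercised without network access.
    """
    try:
        # Port 443 is assumed here because API endpoints are served over HTTPS.
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        # Name resolution failed: the symptom clients see during a DNS outage.
        return False
```

Calling `can_resolve("dynamodb.us-east-1.amazonaws.com")` would check the regional DynamoDB endpoint from the caller's network. A `False` result only says that the name did not resolve from that vantage point, not why.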
Why this mattered: Many companies use US‑EAST‑1 because of its capacity and features, so an outage there can ripple across the internet. Experts noted that while data remained intact, services effectively lost access to it for hours, causing the visible downtime.
For more details, see the AWS Service Health Dashboard and the original coverage at Engadget.
Key timeline (selected):
- ~3:11 AM ET: AWS reports increased error rates/latencies in US‑EAST‑1.
- ~5:01 AM ET: AWS identifies a DynamoDB DNS resolution issue as a contributor.
- ~6:35 AM ET: AWS reports DNS issue fully mitigated; follow‑on issues persist.
- By ~4:30 PM ET: Many affected services (Venmo, Lyft, etc.) appear to be recovering.
This event highlights the risks of centralizing critical infrastructure with a small number of cloud providers. Redundancy across regions and multi‑cloud strategies can reduce single‑region impact, though they add complexity and cost.
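As a rough illustration of regional redundancy, the sketch below tries an operation against an ordered list of regions and falls back to the next one on failure. It is a simplified pattern under stated assumptions, not a description of how any affected company operates. `call_with_region_failover`, `AllRegionsFailed` and `fake_read` are hypothetical names, and the sketch assumes the data is already replicated to every listed region, which is where much of the real complexity and cost lies.

```python
from typing import Callable, Dict, Sequence, TypeVar

T = TypeVar("T")


class AllRegionsFailed(Exception):
    """Raised when the operation failed in every configured region."""


def call_with_region_failover(
    regions: Sequence[str], operation: Callable[[str], T]
) -> T:
    """Run `operation(region)` for each region in order; return the first success."""
    errors: Dict[str, Exception] = {}
    for region in regions:
        try:
            return operation(region)
        except Exception as exc:  # a real client would catch narrower error types
            errors[region] = exc
    raise AllRegionsFailed(f"all regions failed: {errors}")


def fake_read(region: str) -> str:
    """Stand-in for a real service call; simulates an outage in one region."""
    if region == "us-east-1":
        raise ConnectionError("simulated regional outage")
    return f"served from {region}"


result = call_with_region_failover(["us-east-1", "us-west-2"], fake_read)
print(result)
```

The fallback only helps if the secondary region can serve the request on its own. Keeping data and configuration in sync across regions is the added complexity the paragraph above refers to.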
Discussion: How do you think companies should balance resilience versus complexity? Have you been affected by today’s outage?
