Is AWS Down?
Real-time status check for aws.amazon.com
About AWS Status
BlueMonitor checks AWS (aws.amazon.com) by sending automated requests to its servers. If the service responds within a normal timeframe and returns a successful status code, it's marked as operational. Response times over 3 seconds indicate the service is slow, and connection failures or server errors indicate the service may be down.
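The check described above can be sketched in a few lines of Python. This is a minimal, hedged version, not BlueMonitor's actual implementation: it assumes the rules are exactly as stated (success code within 3 seconds is operational, over 3 seconds is slow, connection failures or server errors are down) and uses a single stdlib HTTP request as the probe.

```python
import time
import urllib.error
import urllib.request

SLOW_THRESHOLD = 3.0  # seconds, per the description above


def classify(status_code, elapsed):
    """Map one probe result to a status label."""
    if status_code is None or status_code >= 500:
        return "down"          # connection failure or server error
    if elapsed > SLOW_THRESHOLD:
        return "slow"          # responded, but not in a normal timeframe
    return "operational"       # successful response within 3 seconds


def check(url="https://aws.amazon.com", timeout=10):
    """Send one automated request and classify the response."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            code = resp.status
    except urllib.error.HTTPError as e:
        code = e.code          # server answered with an error status
    except OSError:
        code = None            # DNS failure, refused connection, timeout
    elapsed = time.monotonic() - start
    return classify(code, elapsed), code, elapsed
```

For example, `classify(200, 0.4)` returns `"operational"`, while `classify(503, 0.4)` returns `"down"`.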
Recent Incidents
Service is operating normally: [RESOLVED] Increased Error Rate and Latency
Starting May 7 at 4:20 PM PDT, we experienced an increase in impaired EC2 instances and degraded EBS volumes in a single facility (data center) within a single Availability Zone (use1-az4) in the US-EAST-1 Region. The issue was caused by a thermal event that resulted in a loss of power. As part of our recovery effort, we shifted traffic away from the impacted Availability Zone for most services at 5:06 PM PDT on May 7. AWS services that depend on the affected EC2 instances and EBS volumes in this Availability Zone, such as Elastic Load Balancing, Elastic Kubernetes Service, ElastiCache, Redshift, OpenSearch, and Managed Streaming for Apache Kafka, also experienced elevated error rates and latencies for some workflows and/or configurations. Our main mitigation effort during the event was to restore cooling system capacity. By May 8 at 1:50 PM, we had stabilized cooling system capacity at pre-event levels, which allowed us to restore the majority of the impaired EC2 instances and EBS volumes. A small number of instances and EBS volumes remain impaired, and we continue to work to recover all remaining affected resources. We will communicate with customers who are still impacted via the Your Account view of the AWS Health Dashboard. Customers who require further assistance with this event may contact AWS Support through the AWS Management Console or the AWS Support Center.
Service impact: Increased Error Rate and Latency
We have begun to see improvements in the overall number of affected EC2 instances and degraded EBS volumes in a single Availability Zone (use1-az4) in the US-EAST-1 Region. The steps taken to supply additional cooling capacity have been showing steady signs of progress. Some EBS Volumes and EC2 instances affected by the issue will continue to experience impairments while we continue to drive these efforts. We continue to recommend that customers who require immediate recovery restore from EBS snapshots and/or replace affected resources by launching new replacement resources. In parallel, we have seen some improvements in Amazon Managed Streaming for Apache Kafka as a result of the parallel mitigation efforts being performed. We are still experiencing timeouts to partitions but are seeing continued progress. We do anticipate that recovery will still take several hours. We will provide an additional update by 7:30 PM or sooner if we have new information to provide.
Service impact: Increased Error Rate and Latency
We continue to work towards the recovery of the impaired EC2 instances and degraded EBS volumes in a single Availability Zone (use1-az4) in the US-EAST-1 Region, though efforts are slower than we had previously anticipated. We are taking measured steps to ensure that cooling capacity is brought online in a safe and controlled manner. As a result, EBS volumes and EC2 instances affected by the issue will continue to experience impairments. We continue to recommend that customers who require immediate recovery restore from EBS snapshots and/or replace affected resources by launching new replacement resources. Full recovery is still expected to take several hours. We will provide an additional update by 4:00 PM or sooner if we have new information to provide.
Service impact: Increased Error Rate and Latency
We are experiencing an increase in timeouts to Amazon Managed Streaming for Apache Kafka partitions on a subset of clusters as a result of the ongoing issue in a single Availability Zone (use1-az4) in the US-EAST-1 Region. We are working in parallel to determine a path towards mitigation for affected clusters. We will provide an additional update by 12:30 PM or sooner.
Service impact: Increased Error Rate and Latency
We have observed complete recovery of increased error rates and query failures for Redshift clusters in the US-EAST-1 Region. We were able to resolve the impact independently of the ongoing efforts to recover the affected hardware in the use1-az4 Availability Zone. The issue affecting Redshift has been resolved and the service is operating normally. We will provide an additional update regarding the efforts towards hardware restoration by 12:30 PM or sooner.
Service impact: Increased Error Rate and Latency
We continue our efforts to work towards the recovery of the impaired EC2 instances and degraded EBS volumes in a single Availability Zone (use1-az4) in the US-EAST-1 Region. We are making progress towards the restoration of the cooling system capacity that is required to recover the affected hardware in the impacted zone. Some customers will continue to see their affected EC2 instances and EBS volumes as impaired until the affected racks are recovered. We continue to recommend that customers who require immediate recovery restore from EBS snapshots and/or replace affected resources by launching new replacement resources in one of the unaffected zones. As part of our parallel investigation, we have identified the root cause of the increased error rates and query failures for Redshift clusters in the US-EAST-1 Region. This has been confirmed to be related to impact from an upstream dependency. Affected customers may continue to see errors for resume and restart workflows, failover operations, and impact to general availability. We are actively working to resolve the issue. Our timeline for full recovery is still expected to take several hours and will be incremental as we bring racks online in phases. We will provide an additional update by 12:30 PM or sooner if we have new information to provide.
Service impact: Increased Error Rate and Latency
We continue working to resolve the impaired EC2 instances and degraded EBS volumes in a single Availability Zone (use1-az4) in the US-EAST-1 Region caused by a thermal event. During this event, servers automatically shut down when temperatures exceeded operating thresholds in order to protect the hardware. We are actively working to bring additional cooling system capacity online, which will enable us to recover the remaining affected hardware in the impacted zone. Some customers will continue to see their affected EC2 instances and EBS volumes as impaired until we achieve full recovery. If immediate recovery is required, we recommend customers restore from EBS snapshots and/or replace affected resources by launching new replacement resources in one of the unaffected zones. In parallel, we are investigating increased error rates and query failures for Redshift clusters in the US-EAST-1 Region. During this time, affected customers may see errors for resume and restart workflows, as well as failover operations and availability issues. Our engineers are actively working to resolve this issue. Full recovery is still expected to take several hours. We are prioritizing this issue and will provide another update by 9:00 AM PDT or sooner if additional information becomes available.
Service impact: Increased Error Rate and Latency
We continue to make progress towards resolving the impaired EC2 instances and degraded EBS volumes in a single Availability Zone (use1-az4) in the US-EAST-1 Region. At this time, we wanted to provide some more details on the issue. Beginning on May 7 at 4:20 PM PDT, we began experiencing an increase in instance impairments within the affected zone due to the loss of power during a thermal event. Engineers were automatically engaged within minutes and immediately began investigating multiple mitigations. By 9:12 PM PDT, we restored power to a subset of the affected infrastructure and observed some signs of recovery, which have remained stable. We continue working to bring additional cooling system capacity online, which will enable us to recover the remaining affected hardware in the impacted zone in a controlled and safe manner. Some AWS services, such as IoT Core, ELB, NAT Gateway, and Redshift, continue to see significant improvements in the recovery of their workflows. However, some customers will continue to see their affected EC2 instances and EBS volumes as impaired until we achieve full recovery. If immediate recovery is required, we recommend customers restore from EBS snapshots and/or replace affected resources by launching new replacement resources in one of the unaffected zones. Based on our current mitigation efforts, we expect full recovery to take several hours. We are prioritizing this issue and will provide another update by 6:30 AM PDT or sooner if additional information becomes available.
Service impact: Increased Error Rate and Latency
Mitigation efforts remain underway to resolve the impaired EC2 instances and degraded EBS volumes in a single Availability Zone (use1-az4) in the US-EAST-1 Region. These EC2 instances and EBS volumes were impacted due to a loss of power during the thermal event. The work to bring additional cooling system capacity online, which will enable us to recover the remaining affected infrastructure in a controlled and safe manner, is taking longer than we had initially anticipated. Some services, such as IoT Core, ELB, NAT Gateway, and Redshift, have seen significant improvements in the recovery of their workflows. However, some customers will continue to see their affected EC2 instances and EBS volumes as impaired until we achieve full recovery. While we do not currently have an ETA for full recovery, we are prioritizing this issue and will provide another update by 3:30 AM PDT or sooner if additional information becomes available.
Service impact: Increased Error Rate and Latency
We continue to make progress in resolving the impaired EC2 instances in the affected Availability Zone (use1-az4) in the US-EAST-1 Region, and are working towards full recovery. We are actively working to bring additional cooling system capacity online, which will enable us to recover the remaining affected racks in a controlled and safe manner. In the impacted Availability Zone, EC2 Instances, EBS Volumes, and other AWS Services may continue to experience elevated error rates and latencies for some workflows. Customers will continue to see some of their affected EC2 instances and EBS volumes as impaired until we achieve full recovery. We will provide an update by May 8, 1:30 AM PDT, or sooner if we have additional information to share.
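The recovery guidance repeated throughout the updates above, restoring from EBS snapshots and launching replacement resources in an unaffected zone, can be scripted. A rough boto3 sketch follows; it assumes boto3 is installed and AWS credentials are configured, and the helper `pick_unaffected_az`, the snapshot ID, and the zone names are illustrative, not part of the incident report.

```python
def pick_unaffected_az(all_azs, impaired_az):
    """Choose any Availability Zone other than the impaired one."""
    candidates = [az for az in all_azs if az != impaired_az]
    if not candidates:
        raise ValueError("no unaffected Availability Zone available")
    return candidates[0]


def restore_volume_from_snapshot(snapshot_id, target_az, region="us-east-1"):
    """Create a new EBS volume from a snapshot in an unaffected AZ.
    Requires boto3 and valid AWS credentials; not executed here."""
    import boto3  # deferred import: only needed when actually restoring
    ec2 = boto3.client("ec2", region_name=region)
    resp = ec2.create_volume(SnapshotId=snapshot_id, AvailabilityZone=target_az)
    return resp["VolumeId"]


# Example: steer the replacement volume away from the impaired zone use1-az4.
target = pick_unaffected_az(["use1-az1", "use1-az2", "use1-az4"], "use1-az4")
```

After the volume is created, it can be attached to a replacement instance in the same target zone; the same "unaffected zone" selection applies when launching replacement EC2 instances.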
Frequently Asked Questions
Is AWS down right now?
This page shows the real-time status of AWS. The status is checked automatically by sending requests to AWS's servers. If the status shows "Down", it means AWS is currently experiencing issues.
Why is AWS not working?
AWS may not be working due to server outages, scheduled maintenance, network issues, or high traffic. Check the current status above for real-time information.
How do I check if AWS is down for everyone?
BlueMonitor checks AWS's servers from our monitoring infrastructure. If the status shows "Down" here, it's likely down for everyone. If it shows "Up" but you can't access it, the issue may be on your end.
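To tell those two cases apart from your own machine, a quick local probe can help. This is a rough sketch, not a definitive diagnostic; it assumes a DNS lookup failure points to a problem on your end (network or DNS), while any HTTP response, even an error status, means the service is reachable from your side.

```python
import socket
import urllib.error
import urllib.request


def diagnose(host="aws.amazon.com", timeout=5):
    """Rough local diagnosis: DNS failure suggests a local/network
    problem; an HTTP response (even an error code) means the host
    is reachable from this machine."""
    try:
        socket.getaddrinfo(host, 443)
    except socket.gaierror:
        return "dns-failure (check your network or DNS settings)"
    try:
        urllib.request.urlopen(f"https://{host}", timeout=timeout)
        return "reachable"
    except urllib.error.HTTPError:
        return "reachable (server returned an error status)"
    except OSError:
        return "unreachable"
```

If this reports a DNS failure or "unreachable" while the status above shows "Up", the problem is most likely on your end.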
What should I do if AWS is down?
If AWS is down, you can: wait a few minutes and try again, check the AWS Health Dashboard or AWS's official social media for updates, clear your browser cache, or try using a different network connection.