 
    Inside the Azure outage: What went wrong and what it reveals about Cloud’s weak spots
 
					
			
Just 10 days after Amazon Web Services (AWS) was struck by a major disruption, Microsoft’s cloud arm suffered an outage last night. Tracking portal Downdetector showed that Microsoft Azure users faced issues with Outlook connectivity and add-ins, Microsoft Copilot functionality, and Microsoft 365 Admin Center access. Minecraft login, Xbox Live multiplayer, and account services were disrupted as well.
What caused the outage
The recent Microsoft Azure outage occurred due to a misconfiguration in its global infrastructure, specifically within the Azure Front Door (AFD) network — the content delivery and edge application layer that routes traffic across Microsoft’s cloud.
“This outage differs from the AWS incident: it is global and has widespread impact across Microsoft Cloud. It also appears linked to the October 9 Azure Front Door global outage,” said Alessandro Galimberti, Senior Director — Analyst, Gartner. 
On October 9, around 7:40 AM UTC, Microsoft’s Azure Front Door (AFD) network, which helps deliver and speed up access to services like Microsoft 365 and Azure cloud portals, started having problems. The outage mainly affected users outside the United States, especially in Europe, the Middle East, and Asia.
A monitoring firm, ThousandEyes, found that there was heavy data loss inside Microsoft’s network, which made it hard for users to connect. Many people saw timeouts or error messages when trying to use Microsoft services.
“In the Azure outage, the impacted service was Azure Front Door, a global application load balancer and edge/CDN platform. AFD handles traffic at the edge, including load balancing and routing. A misconfiguration in the routing layer can propagate quickly across multiple regions globally due to cached DNS, global edge networks, and shared control planes,” said Rohan Gupta, VP Cloud, Security & DevOps, R Systems, a digital service provider.
Back-to-back public cloud disruptions

The successive public cloud outages are reminiscent of the Microsoft-CrowdStrike disruption of July 2024, which was comparable to last week’s AWS disruption mainly in its scale of impact. Multiple sectors were hit by that technical failure involving Microsoft and cybersecurity firm CrowdStrike. It affected businesses not only in India but also in Australia, Germany, the United States, and the UK. Reports indicated that millions of Microsoft Windows users encountered the "Blue Screen of Death" (BSOD), which can cause abrupt system restarts and potential loss of unsaved data.
Microsoft, at the time, attributed the issue to a "configuration change" in its Azure backend, disrupting connections between storage and compute resources and impacting Microsoft 365 services. Several Indian airlines, including Air India, IndiGo, Akasa Air, and SpiceJet, reported delays due to the outage. Users also struggled to access Microsoft apps like Microsoft 365, Microsoft Teams, and Microsoft Azure.
“The frequency of such outages is not surprising. Cloud environments have become enormously complex, with billions of interconnected processes. As AI, IoT, and enterprise workloads scale, these systems are under unprecedented strain,” said Aniket Tapre, Founder & CEO of AI enterprise solutions provider Neural Arc.
The recent Microsoft Azure disruption wasn’t just a technical glitch but a stress test that exposed the fragility of centralized cloud dependence, said Dhiraj Udapure, CTO of IT consultancy firm SCS Tech. When a major cloud provider faces disruption, it’s not just one platform that goes dark; an entire chain of enterprises, public systems, and digital services momentarily lose their footing, he added.
“General perception is that Cloud disruptions may be more frequent as the causes are beyond technological parameters. It will be interesting to see how the providers like AWS/Azure/Google tackle this, because applications like SAP (on which several Indian public sector undertakings run) have promised their customers about 99.99% availability,” said Raja Mishra, public sector business, digital services provider UiPath.
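To put the 99.99% figure in perspective, the short calculation below converts an availability percentage into the downtime it permits over a year. It is a minimal sketch rather than any provider's SLA formula: it assumes a 365-day year and a simple uptime ratio, and the function name is illustrative. Under those assumptions, 99.99% leaves room for roughly 52.6 minutes of downtime a year.

```python
# Minutes in a non-leap year (assumption: SLA windows and leap years are ignored).
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Return the yearly downtime, in minutes, permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Compare a few common availability targets side by side.
    for pct in (99.9, 99.99, 99.999):
        print(f"{pct}% availability -> {allowed_downtime_minutes(pct):.1f} min/year")
```

An outage lasting several hours would therefore use up many years' worth of that budget in a single incident.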
Notes for Cloud and tech leaders
Data centre company NxtGen Cloud’s CEO and MD, AS Rajgopal, a proponent of tech autonomy, said that the convenience of public cloud has come at the cost of architectural sovereignty.

“When one company’s core control plane decides the uptime of half the internet, we’ve concentrated too much power and risk in one layer. True digital independence comes from a federated model: multiple sovereign clouds, interoperable but autonomous, each capable of operating even when another fails,” he said. To be sure, NxtGen in July launched its sovereign cloud for the BFSI sector.
Further, Gartner and major cloud providers have published prescriptive recommendations and best practices for building resilient architectures that can withstand outages. 
Gartner advised customers of hyperscale cloud providers to focus on resilience and carefully manage service dependencies for cloud-native and cloud-optimized applications, said Galimberti. The firm recommends distributing applications across multiple availability zones and ensuring they can quickly fail over to an alternative region when needed.
For other types of applications, Gartner suggested planning for potential regional cloud outages, maintaining backups in a secondary region, and ensuring that disaster recovery to that region can be carried out effectively.
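The failover recommendation can be illustrated with a small sketch. It is not drawn from Gartner or any cloud provider's tooling: the region names, the `pick_region` helper, and the simulated health check are all hypothetical. The idea is simply to walk an ordered list of regions and route to the first one that passes a health check.

```python
from typing import Callable, Optional, Sequence


def pick_region(
    regions: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first region, in preference order, that passes the health check."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except Exception:
            # A health check that raises is treated the same as an unhealthy region.
            continue
    return None  # No region is available; the caller must handle a full outage.


# Hypothetical preference order: primary first, then the standby region.
PREFERRED_REGIONS = ["primary-region", "secondary-region"]


def simulated_health(region: str) -> bool:
    """Stand-in for a real probe; here the primary region is pretended to be down."""
    return region != "primary-region"


if __name__ == "__main__":
    print(pick_region(PREFERRED_REGIONS, simulated_health))
```

In practice the health check would probe a real endpoint, and the standby region would need the backups and tested recovery procedures described above for the switch to be useful.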
