In the era of digital miracles, a comprehensive, thought-out, robust and tested back-up plan in the event of a technology failure is as important as ever. Alicia Crisp, of advisory firm MHA, outlines why.
………………………………….
INDUSTRIES critical to everyday life – including airlines, finance, healthcare and shipping – have for the most part slowly recovered from the global impact of the damaging software update issued by CrowdStrike which, according to Microsoft, affected some 8.5 million devices worldwide.
While this number accounts for less than 1% of all Windows machines, the effect was catastrophic. Within hours, over 5,000 flights had been cancelled around the world and, in the UK, the government activated its COBRA emergency team.
This incident brings to mind the failure of the National Air Traffic Services (NATS) system on August 28 last year, which affected over 700,000 passengers.
According to the Civil Aviation Authority, the failure was triggered by the inability of the flight plan processing system to manage flight plan data for a specific flight from Los Angeles to Paris. Both primary and secondary systems generated critical errors and entered maintenance mode, preventing data transfer to air traffic controllers.
The workaround involved manually inputting flight data, reducing capacity to only 60 flight plans per hour, compared to the usual 800. With NATS managing around 2.5 million flights annually in UK airspace, the impact was severe.
How could both Plans A and B fail, leaving NATS reliant on a Plan C that operated at only 7.5% of normal capacity?
In recent years, several tech-related failures have illustrated the same issue. In early 2023 in the USA, service outages disrupted both United and Hawaiian Airlines, and a failure of a Federal Aviation Administration database triggered a national ground stop that halted all take-offs.
Meanwhile, a back-up failure at the New York Stock Exchange led to abnormal market swings when systems incorrectly continued the previous day’s trading on January 25.
In November, an IT failure left almost half of Australia without phone service when Optus, the country’s second-largest telecom provider, was down for 12 hours. More recently, BT was fined £17.5 million after a network fault lasting more than ten hours left thousands of 999 calls unanswered.
All of these incidents point to one clear conclusion: a reliable Plan B, one that is failsafe-tested regularly, is too often missing.
In many offices, the default advice for any IT problem is a standing joke: “Have you tried turning it off and on again?” But this humorous response reflects a much larger issue.
For non-critical infrastructure, restarting a device might work as a back-up plan. However, when the stakes are national, affecting emergency services or critical infrastructure, an effective Plan B is a necessity, not a luxury.
With technology advancing exponentially, it seems that organisations – even entire nations – have come to trust these systems almost implicitly. This reliance is likely to increase as AI technologies are integrated more widely.
Already, AI is seen as a cure-all that promises to foresee and forestall a host of issues, perhaps leaving less room for investment in robust, destruction-tested Plan Bs.
In the era of digital miracles, it might seem old-fashioned to advocate for robust Plan Bs. However, as more of our lives and livelihoods depend on digital infrastructure, we are likely to see these failures and their impacts grow.
A proper Plan B – well-designed, tested and reliable – is more essential than ever.
Alicia Crisp is an audit and advisory partner at MHA.