Australians are being warned it could take up to two weeks to fully resolve the disruption to computer systems affected by a global tech outage on Friday.
Microsoft says the outage was triggered by a faulty software update from CrowdStrike, a security tool used by many large organisations to block malware and cyber attacks.
The fault caused Windows computers to display the so-called ‘blue screen of death’ and left them trapped in what experts call a ‘recovery boot loop’.
It also hit Microsoft’s Azure Cloud, one of the major suppliers of cloud computing, and failures there led to additional breakdowns around the world.
An automatic update by US software firm CrowdStrike crashed millions of computers around the world, bringing chaos to transport, retail, media and many other areas.
Home Affairs Minister Clare O’Neil said it would be some time before all systems were operational and “teething issues” may be seen for one to two weeks.
However, O’Neil said there was no impact to critical infrastructure or government services.
Microsoft said the 8.5 million computers affected by the outage amounted to less than 1 per cent of its machines.
Mark Gregory, a senior telecommunications and network engineering academic from RMIT University, said that despite the low percentage of computers affected, those computers were “typically enterprise customer computers and those enterprise customers are the ones that affect much of our lives”.
“They’re the banks, the airlines, the critical services and other organisations,” he said. “And what we know is that the update went wrong. Those computers all crashed.”
What went wrong?
David Glance is the director of the Centre for Software and Security Practice at the University of Western Australia.
He said a significant part of the problem is that users were happy to let CrowdStrike take care of security without much in the way of checks and balances.
“I think everybody had become complacent in believing that we could trust the companies like CrowdStrike to do the right thing. And clearly, this was a massive failing on their part.
“This is where I am curious to see what litigation follows because it’s just pure negligence on their part to actually release something without the appropriate testing.”
Tom Worthington, an honorary lecturer in the School of Computing at the Australian National University, said he was surprised CrowdStrike hadn’t picked up that there was an issue in their security update before it was rolled out to millions of computers around the world.
“That’s a normal software development practice that we teach to undergraduate students: you don’t go from fixing it to releasing it to all your customers. You go through a number of test stages, but there’s no way you can eliminate every possible problem with the software.
“These things will happen from time to time, but you’ve got to make sure it doesn’t take out everything.”
Worthington says if someone’s entire business depends on one particular software product working, they need to make sure they have alternatives.
The issues exposed by the crisis
Gregory said the incident is a consequence of slack security practices and has highlighted the potential for further chaos.
“The idea that this has happened in 2024 is frightening because there’s the potential for significant damage to occur, not just to things like airlines and to banks, but also within hospitals and to critical care equipment and systems,” he said.
Glance said organisations often rely solely on single providers for software, so if something goes wrong with that software, there’s no backup.
He said the more ubiquitous a software product is, the more likely it is that any fault will have major consequences.
“One of the considerations when people are looking for software to use — and it doesn’t matter what software it is going for — the most popular and the most widely used is not necessarily always going to be a good choice in future, because of this very problem, and to have variety and also as we said earlier, potentially taking precautions against just automated updates from people, and trusting third parties.”
Worthington agrees it’s unwise to rely on one sole provider, however well-known they are.
“Because in the main, the systems have worked so reliably, we’ve fallen into a false sense of security. We’ve said, ‘Okay, well, if you just buy from the major supplier everybody uses, it’ll be fine.’ But if everybody buys from them and then you have a problem, everybody has a problem.”
What does the government need to do?
Gregory said he has been calling for the Australian government to impose minimum performance standards on tech corporations.
“Australia is a third-world country when it comes to technology and telecommunications and IT. We need to move ourselves away from this third-world thinking. What we have is legislation and regulation that is fit for the 20th century.
“It’s not fit for the current century. We have thinking within government that corporations will do right by the nation when we know that they don’t,” Gregory said.
Glance said that the Australian government has been so focused on foreign threats in cybersecurity that “they’re blinded to all of the other issues that are going on that are possibly greater threats”.
“When it comes to things like critical infrastructure the focus is completely on some foreign states acting to take this out when we really should be considering all the other risks, including, as I said, the sort of variety, the updates, the way people do business.”
Glance said the Australian government may need to alter legislation to manage these risks in the future and that this kind of widespread issue may not be an isolated event.
“I think that people may get better at protecting themselves against it, but I think they just need to get smarter about how they do their IT and cybersecurity infrastructure.”
With the Australian Associated Press.