Experts warn of coming threats
A slow recovery from the worldwide IT failure has begun, and experts are warning of future risks. A faulty update from cybersecurity company CrowdStrike hit airports, businesses and medical facilities in many countries in what has been called the “largest outage in history.”
Services began to resume on Friday evening after an IT failure that caused chaos around the world. But full recovery could take weeks, experts say, after airports, healthcare facilities and businesses were hit by the “largest disruption in history.”
Flights and hospital appointments were cancelled, payroll systems crashed and TV channels went off air after a botched software update hit the Microsoft Windows operating system, The Guardian reports.
The update came from US cybersecurity firm CrowdStrike and left staff facing a “blue screen of death” and computers that were unable to boot. Experts said each affected computer may need to be fixed manually, but some services had started to recover by Friday evening.
As the recovery continues, experts say the outage underscored concerns that many organizations are poorly prepared to put contingency plans into action when a single point of failure, such as an IT system or a piece of software within it, goes down. Such outages will keep happening, they warn, until networks are built with more redundancy and organizations keep better backups.
In the UK, Whitehall crisis officials have been coordinating the response through the COBRA emergency committee. Ministers have been in touch with their sectors to deal with the impact of the IT disruption, with Transport Secretary Louise Haigh saying she was working “in lockstep with the industry” after trains and planes were affected.
A Microsoft spokesman said on Friday: “We are aware of an issue affecting Windows devices due to an update to a third-party software platform. We expect a resolution to be put in place shortly.”
Texas-based CrowdStrike confirmed that the outage was due to a software update for one of its products, not a cyberattack.
Its founder and chief executive George Kurtz said he was “deeply sorry for the impact we have had on customers,” adding that there was a “negative interaction” between the update and Microsoft's operating system.
CrowdStrike's share price fell sharply during the day, dropping by as much as 13% at some points in trading.
Elon Musk, owner of Tesla, said the outage caused a “stagnation in the auto supply chain,” while banks in Kenya and Ukraine reported problems with their digital services and supermarkets in Australia faced payment problems.
Govia Thameslink Railway (GTR), the parent company of Southern, Thameslink, Gatwick Express and Great Northern, has warned passengers of possible delays.
According to service monitoring site Downdetector, users in the UK have reported problems with services from Visa, BT, major supermarket chains, banks, online gaming platforms and media outlets.
Sky News and CBBC also went off air temporarily in the UK before resuming broadcasts, while Australia's ABC was also affected.
In financial services, Metro Bank reported problems with its phone lines in the UK, and Santander said the outage “may impact card payments.” Monzo said some customers had reported problems, some JP Morgan bankers were unable to log into their systems, and the London Stock Exchange said there were problems with its news service.
Troy Hunt, a leading cybersecurity consultant, said the scale of the IT failure was unprecedented.
“I don’t think it’s too early to call it: this will be the biggest IT failure in history,” he tweeted.
“Essentially, this is what we were all worried about with Y2K, except this time it actually happened,” he added, referring to the Millennium Bug, which worried IT experts in the lead-up to 2000 but ultimately did not cause serious damage.
BCS, the UK's chartered institute for IT, said restoring systems could take days or weeks, although some fixes would be easier to implement.
“In some cases, a fix can be applied very quickly,” said Adam Leon Smith, a research fellow at BCS. “But if computers have reacted in a way that causes blue screens and endless loops, recovery can be difficult and may take days or weeks.”
Alan Woodward, professor of cyber security at the University of Surrey, said the fix requires manually rebooting affected computers and “most normal users don't know how to follow the instructions.” Organizations with thousands of computers distributed in different locations face a more complex challenge, he said.
“It's just numbers. For some organizations this could certainly take weeks,” he said.
From Amsterdam to Zurich, Singapore to Hong Kong, airport operators have noted technical problems that have hampered their operations. While some airports have suspended all flights, in others airline staff have had to check passengers in manually.
Among the companies affected on Friday was Ryanair, Europe's largest airline, which said on its website: “Possible network disruption due to a global outage of a third-party system… We advise passengers to arrive at the airport three hours before departure to avoid any disruption.”
Heathrow, Europe's largest airport, said it was “working hard” to get passengers “on their way.”
A Heathrow spokesman said: “We continue to work with our colleagues at the airport to minimize the impact of the global IT disruption on passenger travel. Flights are continuing to operate and passengers are advised to check with their airlines for the latest flight information.”
In the US, flights were grounded due to communications problems believed to be related to the outage. Among the affected carriers were American Airlines, Delta and United Airlines.
Berlin Airport temporarily suspended all flights on Friday. Aviation analytics firm Cirium said 5,078 flights (4.6% of those scheduled) were canceled worldwide on Friday, including 167 departures from the UK and 171 arrivals.
GPs in the UK said that they were unable to access patient records or make appointments. GP surgeries posted on social media that they were unable to access the web-based EMIS system.
It is understood the outage did not affect 999 services, but the Royal Surrey NHS Trust in the south of England declared a critical incident and canceled radiotherapy appointments scheduled for Friday morning. The National Pharmacy Association confirmed that services in the UK could be affected.
Keir Starmer's spokesman said they were not aware the issue was having any impact on public services, but added that they recognized the impact it was having more broadly. Reports from the Netherlands also suggest there may be problems in the health care system.
Israel's Health Ministry said a “global disruption” affected 16 hospitals, and in Germany, Schleswig-Holstein University Hospital in the north of the country said it had canceled all planned surgeries in Kiel and Lübeck.
Ted Wheeler, mayor of Portland, Oregon, issued an emergency declaration saying the outage affected some critical city services, including emergency communications.
Woodward, of the University of Surrey, said the outage was caused by an IT product called CrowdStrike Falcon, which monitors the security of large networks of PCs and installs a piece of monitoring software on each machine.
“The product is used by large organisations that have a significant number of computers to ensure that they have monitoring across the board. Unfortunately, if they lose all the computers they won't be able to work on them or their service levels will be significantly reduced,” Woodward said.
Steven Murdoch, professor of security engineering at University College London, said many organizations may find it difficult to resolve the issue quickly.
“The issue occurs before the computer is connected to the internet, so there is no way to resolve it remotely. Someone has to go out and fix it in person,” Murdoch said. He added that companies and organizations that have cut IT staff or outsourced their IT work will find it harder to resolve the problem.
However, Ciaran Martin, former chief executive of the National Cyber Security Centre, said that, unlike adversarial cyber attacks, this problem has already been identified and a solution has been found.
“Recovery is not about coping with the situation but about getting back to normal. I think it's unlikely to be newsworthy in terms of continued disruptions this time next week,” he said.
The challenges for US businesses were compounded by problems with Microsoft's Azure cloud computing service, which occurred on Thursday.