Protecting our clients during the largest IT outage in history
Five lessons we learned from the recent CrowdStrike outage crisis
Many of the world’s largest companies faced a series of significant difficulties last week when a faulty software update shut down operations at hundreds of airlines, banks, government agencies and retailers that run on the Windows operating system. We understand the impact that system downtime can have on a business, and some IT experts have called the CrowdStrike situation the largest outage in history.
At DXC Technology, we pride ourselves on being a trusted global partner for many of the world’s largest businesses across the public sector, financial services, automotive and manufacturing, and healthcare and life sciences sectors. That means we work closely with our clients not just on their modernization journey, but also when they face challenges such as the ones brought about by the CrowdStrike issue.
Thankfully, there are ways to minimize the impact a major disruption like this can have on a business. As we continue to guide our customers through the outage, here are some key takeaways to consider:
Contingency planning is critical
As service is being restored, industry-wide discussions on vulnerabilities, data safeguards, the impact on supply chains and other issues have emerged.
Given our deep experience assisting clients with these issues, we had a team assembled an hour after the outage became known. That team operated as command and control, starting from a plan built on prior experience.
In situations like this, you simply can’t do everything at once. Prioritization is key: focus on what is most critical for the business and repair that first. In our case, we had most critical systems repaired within the first 72 hours.
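As a rough illustration of that kind of prioritization, here is a minimal Python sketch of a repair queue ordered by business criticality. It is not a description of DXC's tooling: the `AffectedSystem` type, the 1-to-3 criticality scale, the example systems and the repair estimates are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectedSystem:
    """A system hit by the outage (all fields are illustrative assumptions)."""
    name: str
    business_criticality: int      # hypothetical scale: 1 = most critical
    estimated_repair_hours: float  # rough estimate supplied by the response team


def build_repair_queue(systems):
    """Return systems ordered so the most business-critical are repaired first.

    Ties on criticality are broken by the shorter estimated repair time,
    so quick wins within the same tier are restored earlier.
    """
    return sorted(
        systems,
        key=lambda s: (s.business_criticality, s.estimated_repair_hours),
    )


if __name__ == "__main__":
    # Hypothetical inventory of affected systems.
    affected = [
        AffectedSystem("intranet wiki", 3, 1.0),
        AffectedSystem("flight check-in", 1, 4.0),
        AffectedSystem("payment gateway", 1, 2.0),
        AffectedSystem("reporting dashboard", 2, 3.0),
    ]
    for position, system in enumerate(build_repair_queue(affected), start=1):
        print(f"{position}. {system.name} (criticality {system.business_criticality})")
```

In practice the criticality ranking would come from the business itself, agreed before an incident rather than during one.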
Organizations should reevaluate accepted practices for deploying software and granting update rights. The CrowdStrike incident underscores the need for robust testing, risk assessment and defined communication channels to prevent widespread disruptions and minimize the damage.
This also means factoring your entire supply chain into contingency planning exercises, as third-party risk could impact your business during an outage or cyber threat.
Around-the-clock commitment
IT outages don’t necessarily occur on convenient, nine-to-five schedules. The incident reinforced the importance of maintaining a vigilant, 24/7 response capability to manage unforeseen emergencies.
A commitment to continuous network monitoring, rapid incident response and resource management ensures timely restoration for affected customers.
A human touch is essential to solving problems
While technical solutions are imperative, particularly as the industry embraces an AI-led technology world, the human element still plays a pivotal role. This outage highlighted how the IT industry is struggling to incorporate best practices for cloud-based IT infrastructure while keeping humans in the loop for testing the technology.
At DXC, our technicians are engaging directly with end users, guiding them through the complex restoration process.
Fixing the issue remotely was not always an option. In some cases, we had to work over the phone with non-technical users, which exemplifies the patience and empathy required during incidents like the CrowdStrike event.
Vendor relationships matter
DXC boasts a global ecosystem of partners. Collaborating closely with our providers allows us to address the issue swiftly.
Regular engagement with vendors outside of a crisis, understanding their update processes, and having direct lines of communication are also critical for effective incident response.
Effective communication channels are key
Clear communication is essential during a crisis. We’ve witnessed the importance of promptly informing customers about the situation, providing updates and managing expectations. Establishing reliable communication channels helps ensure transparency and minimizes confusion.
Even if an outage is short-term, its effects can linger, impacting how your customers view your response. Hearing from customers directly about their experience during the incident is especially useful for refining one’s response strategies to be better prepared for the next time.
Keeping our clients up and running is always our priority, and I am thankful for the dedication our teams at DXC displayed in restoring our clients’ operations as quickly as possible.
For example, we worked with a regional airline with speed and urgency. While they had many delays, they were able to complete all flights and transport all passengers with minimal missed connections. By Friday afternoon of the incident, they were almost back to normal operation and on-time performance, with some follow-up actions to recover remaining non-critical services.
It’s these instances that further reinforce our focus as a globally trusted partner for our clients.
Chris Drumgoole is Managing Director of Cloud & Infrastructure and Security Services for DXC Technology. He leads international teams within DXC, focusing on the development, marketing, and delivery of essential services to our valued customers. Additionally, Chris is responsible for DXC’s Public Sector business globally, ensuring strategic growth and effective solutions.