CIO POV: CrowdStrike Incident Offers 3 Digital Resilience Lessons

August 13, 2024 Omer Grossman

On July 19, 2024, organizations around the world began to experience the “blue screen of death” in what would soon be considered one of the largest IT outages in history. Early rumors of a mass cyberattack were quickly quashed: it seemed a minor software update was to blame for countless shopping excursions cut short, airline flights grounded and critical surgeries postponed.

Nearly three weeks later, the world is still reeling from the faulty CrowdStrike update, and new details are emerging about what went wrong. On August 6, the company published an in-depth technical root cause analysis and acknowledged shortcomings in its software testing processes.

As the dust settles and we continue to learn more, here are three observations and lessons in digital resilience that every organization can take away from the incident:

1. Prepare for Your Worst Day

The global CrowdStrike outages highlighted the risks of vendor lock-in (over-reliance on any one vendor) and even left some organizations questioning their cloud strategies altogether. This scrutiny is important, but it needs to be balanced with practicality.

Virtually every organization today relies on cloud services for some aspect of its business. While keeping “crown jewels” on-prem and distributing workloads across different providers can sometimes limit failures, it also adds complexity and cost. All these factors must be carefully considered when building an anti-fragile organization with a solid IT infrastructure.

As security leaders, we must prepare our organizations to function with limited digital capacity in the event of an outage or service degradation. Anything can happen: in May 2024, Google Cloud deleted a customer account in an isolated misconfiguration accident, causing two weeks of downtime for 647,000 users.

As the saying goes, plans are nothing; planning is everything. Evaluate existing disaster recovery and business continuity plans with fresh eyes. Run and stress-test playbooks regularly for a wide range of scenarios. Go through the entire exercise of bringing backups online to see what’s working and what isn’t. Then do it all over again, and again.

2. Ask Hard Questions

The CrowdStrike incident has prompted many organizations to examine their third-party dependencies to better understand how vendor outages could affect their operations. Now is also a good time to review critical vendor vetting processes for both existing and future partners. For instance, until last month, you may not have considered the importance of phased software updates. Does the vendor give you the option to roll out updates gradually (first testing the patch on a test server, then deploying it to a small test group of users to ensure things are working properly) and, if necessary, stop midway if there’s an issue before it affects the entire organization? How robust are the vendor’s secure development lifecycle and quality assurance processes? How do they test and validate their updates before sending them out into the world? What security certifications do they have to back up these claims?
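To make the idea of a phased update concrete, here is a minimal Python sketch of a ring-based rollout that halts as soon as a ring fails its health check. It is an illustration only, not any vendor's actual mechanism: the `Ring` type, the `phased_rollout` function, the `deploy` and `healthy` callbacks and all host names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Ring:
    name: str         # hypothetical ring label, e.g. "test-server"
    hosts: List[str]  # hosts that receive the update in this ring


def phased_rollout(
    rings: List[Ring],
    deploy: Callable[[str], None],   # assumed hook: push the update to one host
    healthy: Callable[[str], bool],  # assumed hook: post-update health check
) -> Tuple[List[str], Optional[str]]:
    """Deploy ring by ring and stop at the first ring with an unhealthy host.

    Returns (names of rings that completed, name of the ring that halted
    the rollout, or None if every ring succeeded).
    """
    completed: List[str] = []
    for ring in rings:
        for host in ring.hosts:
            deploy(host)
        # Check the whole ring before widening the blast radius.
        if not all(healthy(host) for host in ring.hosts):
            return completed, ring.name
        completed.append(ring.name)
    return completed, None


# Illustrative run: the pilot group surfaces a problem, so the
# organization-wide ring never receives the update.
rings = [
    Ring("test-server", ["test-01"]),
    Ring("pilot-group", ["pc-01", "pc-02"]),
    Ring("everyone", ["pc-03", "pc-04", "pc-05"]),
]
updated: List[str] = []
failing = {"pc-02"}  # simulated host that breaks after the update
completed, halted_at = phased_rollout(
    rings, updated.append, lambda host: host not in failing
)
```

In this run the rollout stops at the pilot group, and the three hosts in the final ring are never touched. Whether a vendor exposes this kind of control to customers is exactly the question worth asking.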

Asking the right questions is critical to building trust. The more customers know, the better they can prepare for the unknown. On the vendor side, clearly defined customer expectations can drive process and quality improvements and, ultimately, ensure more resilient systems.

3. Communicate Openly

Consistent, transparent communication is critical during a crisis. On July 19, in the early hours of the incident, CrowdStrike’s communications were tightly coordinated and centralized—no speculation or mixed messages to be found on social media. Company leadership quickly took responsibility, apologized and kept customers in the loop as they worked to remediate the problem. Despite widespread issues, people respected this transparent approach, and many customer organizations have publicly voiced their continued loyalty to CrowdStrike.

Organizations can apply these valuable crisis communications lessons to their own DR/BCP contingency planning efforts. How will security leaders keep business stakeholders apprised of an unfolding situation? Are the right communication channels in place to quickly mobilize internal teams and get systems back online? What are the best ways to keep customers and partners in the know? What’s our corporate social media policy—and who is authorized to speak with members of the press during an incident?

There Will Be Another Black Swan

The CrowdStrike incident surfaced critical questions around software testing and update quality assurance that must be addressed. It also reinforces the inherent dangers of a technological world that, in the words of Thomas Friedman, “we’ve taken from connected to interconnected to interdependent.” This interdependency means that every organization will experience a black swan event at some point. It may come in the form of a critical vendor outage, a ransomware attack or something else. By embracing an “assume breach” mindset and continuously stress-testing contingency plans and processes, your team will be better prepared—mentally and operationally—to face a crisis, respond rapidly and emerge even stronger.

Omer Grossman is the global chief information officer at CyberArk. You can check out more content from Omer on CyberArk’s Security Matters | CIO Connections page.

Editor’s Note: For more insights from CyberArk CIO Omer Grossman on this topic and beyond, check out his appearance on CyberArk’s Trust Issues podcast episode, “Trust and Resilience in the Wake of CrowdStrike’s Black Swan.” The episode is available on most major podcast platforms.


