Rebuilding Trust: CrowdStrike’s Short and Long-Term Recovery Tactics

Last week, a software update from CrowdStrike caused a widespread outage, halting operations at numerous organizations and affecting thousands of customers globally. The incident began at 04:09 UTC on July 19, 2024, when CrowdStrike released a sensor configuration update for Windows systems. This update inadvertently triggered a logic error, leading to system crashes and the dreaded blue screen of death (BSOD) across impacted systems.

Although CrowdStrike quickly rolled back the changes, not all systems have been fully restored even two days post-incident. This debacle took a toll on CrowdStrike’s stock, which plummeted over 11%—a single-day drop of more than $38 (CRWD -11.10%).

Cybersecurity experts and financial analysts are now assessing the fallout. According to a Microsoft blog, the faulty update impacted approximately 8.5 million Windows devices, less than 1% of all Windows machines. However, because CrowdStrike serves many critical infrastructure providers, the ripple effect was substantial. Raj Joshi from Moody’s Ratings highlighted potential large liability claims and a negative impact on operational performance.

The incident also sparked discussions about whether competitors might benefit from CrowdStrike’s misstep. Elon Musk announced, “We just deleted CrowdStrike from all our systems,” indicating a significant loss of trust. Meanwhile, in a recent TechCrunch article, Eric Grenier, cybersecurity threat detection and exposure analyst at Gartner, suggested that the event might influence other executives to consider alternatives. Yet Chirag Mehta, VP and principal analyst at Constellation Research, noted the universal vulnerability: “Other vendors are fortunate that they were not affected this time.”

CrowdStrike is now in full crisis management mode, working tirelessly to resolve the issues and restore service for all customers. As the situation unfolds, executives are undoubtedly scrutinizing every detail to mitigate business impacts and navigate the company through this storm. Intrigued by the strategic implications, I consulted ChatGPT for insights on short and long-term strategies for CrowdStrike, and the detailed response offered thoughtful guidance on navigating the company’s path forward.

Short-Term Actions

  1. Immediate Communication and Apology:
    • Action: Issue a public apology through a press release and on all company communication channels, acknowledging the incident, its impact, and the company’s responsibility.
    • Objective: Demonstrate transparency, accountability, and empathy towards affected customers, showing that the company takes the issue seriously.
  2. Rapid Response Team and Customer Support:
    • Action: Establish a dedicated crisis management and customer support team to handle inquiries and provide support to affected customers, particularly critical businesses like hospitals and emergency services.
    • Objective: Offer immediate assistance and solutions to mitigate the disruption caused by the outage, reinforcing the company’s commitment to customer care.
  3. Technical Investigation and Remediation:
    • Action: Conduct a thorough technical investigation to identify the root cause of the outage. Implement immediate fixes and improvements to ensure such an incident does not recur.
    • Objective: Restore customer confidence by demonstrating that the company has a clear understanding of the problem and has taken steps to prevent future occurrences.
  4. Customer Compensation and Assurance:
    • Action: Provide compensation or special offers to affected customers, such as free service extensions, discounts, or enhanced support packages.
    • Objective: Show goodwill and appreciation for customers’ patience and loyalty, helping to mitigate dissatisfaction and potential churn.
  5. Public Relations and Media Management:
    • Action: Engage with the media to provide updates on the situation, the steps taken to resolve the issue, and measures implemented to prevent future incidents. Consider arranging interviews with key company executives.
    • Objective: Manage the narrative and convey the message that the company is proactive and dedicated to resolving the issue and improving its services.

Long-Term Actions

  1. Enhanced Quality Assurance Processes:
    • Action: Overhaul the software development and release processes to include more rigorous testing, quality assurance, and rollback mechanisms.
    • Objective: Prevent future incidents by ensuring that all updates are thoroughly vetted and tested in real-world scenarios before release.
  2. Customer Engagement and Feedback Loop:
    • Action: Establish regular communication channels with customers to gather feedback on products and services, including user groups, surveys, and dedicated customer advisory boards.
    • Objective: Foster a closer relationship with customers, ensuring their needs and concerns are continuously addressed and incorporated into future developments.
  3. Investment in Security and Resilience:
    • Action: Invest in strengthening the company’s infrastructure, security measures, and disaster recovery plans. This includes hiring additional cybersecurity experts and implementing advanced monitoring tools.
    • Objective: Enhance the overall security posture and resilience of the company’s offerings, reassuring customers of the reliability and safety of their systems.
  4. Transparency and Reporting:
    • Action: Implement regular, transparent reporting on the company’s progress in improving software quality and reliability. This can include quarterly updates, white papers, and technical briefings.
    • Objective: Build long-term trust by keeping customers informed about the company’s ongoing efforts to enhance its products and prevent similar incidents.
  5. Strategic Partnerships and Alliances:
    • Action: Form strategic partnerships with other industry leaders, regulatory bodies, and certification organizations to align with best practices and standards in cybersecurity.
    • Objective: Position the company as a leader committed to industry standards and best practices, enhancing its reputation and credibility in the market.

By systematically addressing both the immediate fallout and implementing long-term improvements, the company can rebuild trust, restore its reputation, and emerge stronger from the crisis.
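To make the first long-term action more concrete, the "more rigorous testing" and "rollback mechanisms" it mentions are often paired with a staged (ring-based) rollout, where an update reaches a small canary group first and only proceeds if that group stays healthy. The following is a minimal Python sketch of that general idea, not a description of CrowdStrike's actual release process. The names `Ring`, `staged_rollout`, `apply_update`, `is_healthy`, and `rollback` are hypothetical, as is the 1% failure threshold, and the sketch assumes hosts remain reachable after a bad update so that a rollback can be delivered.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Ring:
    """A hypothetical deployment ring: a named group of hosts."""
    name: str
    hosts: List[str]


@dataclass
class RolloutResult:
    """Summary of how far the rollout got."""
    completed_rings: List[str] = field(default_factory=list)
    rolled_back: bool = False
    failed_ring: str = ""


def staged_rollout(
    rings: List[Ring],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    max_failure_rate: float = 0.01,  # assumed threshold, for illustration only
) -> RolloutResult:
    """Update rings in order; stop and roll back everything on an unhealthy ring."""
    result = RolloutResult()
    updated: List[str] = []

    for ring in rings:
        # Push the update to every host in the current ring.
        for host in ring.hosts:
            apply_update(host)
            updated.append(host)

        # Check the ring's health before touching the next, larger ring.
        failures = sum(1 for host in ring.hosts if not is_healthy(host))
        if ring.hosts and failures / len(ring.hosts) > max_failure_rate:
            # Undo the update on every host touched so far, newest first.
            for host in reversed(updated):
                rollback(host)
            result.rolled_back = True
            result.failed_ring = ring.name
            return result

        result.completed_rings.append(ring.name)

    return result
```

In this sketch, a failure in an early ring stops the rollout before later rings are touched, which limits how many hosts a faulty update can reach. The ring sizes, the health check, and the threshold would all need to be chosen for the real environment.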

While most of the short-term actions were already visible as early as Friday on various social media platforms, news channels, and CrowdStrike’s website, we will likely see the long-term actions rolled out once the issue is resolved and the company shifts its focus to winning back customers’ trust.

References:

https://blogs.microsoft.com/blog/2024/07/20/helping-our-customers-through-the-crowdstrike-outage

https://www.crowdstrike.com/blog/falcon-update-for-windows-hosts-technical-details/

https://www.morningstar.com/news/marketwatch/20240719348/airlines-grounded-banks-and-retailers-experiencing-outages-tied-to-crowdstrike-issue

https://techcrunch.com/2024/07/19/crowdstrikes-rivals-stand-to-benefit-from-its-update-fail-debacle/
