MontyCloud Automates BSOD Fix for the CrowdStrike Outage for AWS EC2
Mitchell Ivanicki : Jul 19, 2024 7:09:12 PM
The CrowdStrike Falcon content update for Windows has caused widespread system crashes and "blue screen of death" (BSOD) errors, leaving IT teams scrambling to bring systems back online.
This blog post explains the challenges organizations face in resolving the issue, how to fix it manually, and how MontyCloud's automated solution can help you fix it rapidly at scale and bring the affected systems back online.
Customer Challenges
- Time-Consuming Manual Fixes: Fixing each affected EC2 instance manually is labor-intensive, taking hours or days.
- Need for Scale: Manual fixes at scale are impractical and exhausting for overstretched IT teams.
- Technical Complexity: Managing multiple AWS services to deploy fixes requires specialized knowledge and effort.
How to Manually Fix the Issue Based on AWS and CrowdStrike Guidance
AWS published the following manual steps to delete the CrowdStrike Falcon agent file on affected instances:
1. Create a snapshot of the EBS root volume of the affected instance
2. Create a new EBS volume from the snapshot in the same Availability Zone as the affected instance
3. Launch a new instance in that Availability Zone using a different version of Windows
4. Attach the EBS volume from step 2 to the new instance as a data volume
5. Navigate to the \windows\system32\drivers\CrowdStrike\ folder on the attached volume and delete "C-00000291*.sys"
6. Detach the EBS volume from the new instance
7. Create a snapshot of the detached EBS volume
8. Create an AMI from the snapshot by selecting the same volume type as the affected instance
9. Call replace root volume on the original EC2 instance, specifying the AMI just created
10. Reboot the instance
11. Confirm that the "C-00000291*.sys" file in the \windows\system32\drivers\CrowdStrike\ folder has a timestamp of 10:27 PM PDT on July 18th, 2024 (05:27 UTC on July 19th) or later
Customers need to repeat these steps for every impacted instance, which can take hours to days.
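The file-deletion step above is the core of the fix, so it is worth seeing how small it is once the volume is attached. The following is a minimal Python sketch of that one step, not AWS's or CrowdStrike's tooling. The function name `delete_channel_files`, the `dry_run` flag, and the drive letter in the usage note are illustrative assumptions; only the folder and the file pattern come from the guidance.

```python
from pathlib import Path

# File pattern and folder named in the manual guidance.
CHANNEL_FILE_PATTERN = "C-00000291*.sys"
CROWDSTRIKE_DIR = Path("Windows", "System32", "drivers", "CrowdStrike")


def delete_channel_files(volume_root, dry_run=True):
    """Return the matching channel files under volume_root.

    The files are only deleted when dry_run is False, so a first pass
    can show what would be removed. (Hypothetical helper for illustration.)
    """
    target_dir = Path(volume_root) / CROWDSTRIKE_DIR
    if not target_dir.is_dir():
        # Volume is not mounted here, or the sensor is not installed on it.
        return []
    matches = sorted(target_dir.glob(CHANNEL_FILE_PATTERN))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches
```

On the helper instance, assuming the affected volume is mounted as `D:\`, a call such as `delete_channel_files("D:\\", dry_run=False)` would remove the matching files. Running it first with the default `dry_run=True` lists them without deleting anything.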
MontyCloud's Automated Solution
MontyCloud provides a no-code solution that uses the AWSSupport-StartEC2RescueWorkflow runbook and AWS Systems Manager to automate the rollout of the fix. Here's how it helps:
- Automated Rollout: MontyCloud uses AWS Systems Manager Agent, Automation, and Run Command to automate fixes across all affected EC2 instances, eliminating manual intervention on each server.
- No Code Required: Customers can perform the remediation without writing any code or logging in to an AWS account for each instance.
- Rapid Deployment: Connecting AWS accounts to MontyCloud DAY2 takes minutes. MontyCloud then detects all EC2 instances in the target accounts and enables remediation at scale.
- Simple Execution: Customers can execute the "Crowdstrike_Falcon_BSOD_Fix" task from MontyCloud's Task Library, which securely operates within the customer's VPC to fix the issue and bring the affected EC2 instances back online.
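For readers curious what an automated rollout over the AWSSupport-StartEC2RescueWorkflow runbook might look like underneath, here is a rough Python sketch. It is not MontyCloud's implementation, which this post does not describe. The parameter names `InstanceId` and `OfflineScript`, the Base64 encoding of the script, and the `EC2RESCUE_OFFLINE_DRIVE` environment variable are assumptions that should be checked against the runbook's documentation. The helper names are hypothetical, and the runbook may require further parameters such as a subnet or an IAM role.

```python
import base64

RUNBOOK_NAME = "AWSSupport-StartEC2RescueWorkflow"

# PowerShell to run against the offline root volume.
# Assumption: the runbook exposes the offline drive as EC2RESCUE_OFFLINE_DRIVE.
OFFLINE_SCRIPT = (
    'Remove-Item -Path "$env:EC2RESCUE_OFFLINE_DRIVE\\Windows\\System32'
    '\\drivers\\CrowdStrike\\C-00000291*.sys" -Force'
)


def build_rescue_request(instance_id, offline_script=OFFLINE_SCRIPT):
    """Build keyword arguments for SSM start_automation_execution.

    Assumption: the runbook takes a Base64-encoded OfflineScript parameter.
    """
    encoded = base64.b64encode(offline_script.encode("utf-8")).decode("ascii")
    return {
        "DocumentName": RUNBOOK_NAME,
        "Parameters": {
            "InstanceId": [instance_id],
            "OfflineScript": [encoded],
        },
    }


def start_rescue(ssm_client, instance_ids):
    """Start one automation execution per instance; return the execution IDs."""
    execution_ids = []
    for instance_id in instance_ids:
        response = ssm_client.start_automation_execution(
            **build_rescue_request(instance_id)
        )
        execution_ids.append(response["AutomationExecutionId"])
    return execution_ids
```

With the third-party boto3 package installed and credentials configured, `start_rescue(boto3.client("ssm"), ["i-0123456789abcdef0"])` would submit one execution for a placeholder instance ID. Passing the client in as an argument keeps the request-building logic testable without touching an AWS account.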
How the MontyCloud Solution Works
In MontyCloud, you will select the Automation Task “CrowdStrike Falcon BSOD Fix” from the Task Library.
Then you will select the impacted Windows EC2 instances to resolve the issue. There is no need to go into the AWS Console, and fixing all your impacted instances this way saves hours to days of operational work.
Value of the Solution
By automating the remediation process, MontyCloud empowers organizations to recover from the CrowdStrike issue swiftly and efficiently, minimizing the disruption and enabling businesses to return to normal operations faster.
- Time Savings: Automating the fix saves hours to days of manual work per server.
- Accessibility: Non-technical personnel can safely and efficiently perform the remediation.
- Security & Scalability: The solution executes securely within the customer's VPC and allows for rapid deployment at scale, reducing downtime and operational impact.
If MontyCloud can help you, please contact us.