From what I've read, this isn't the first time something like this has happened. A similar incident hit Linux systems earlier, but since that market share is so small it wasn't widely reported until the Windows outage sort of linked the two together.
That's interesting. I'd heard about the McAfee issue with the current CrowdStrike CEO, but I hadn't heard that there was an issue with Linux systems. I did a quick session with ChatGPT and it confirmed this and gave me the following websites. Thanks for bringing this up! God bless! www.theregister.com/2024/07/21/crowdstrike_linux_crashes_restoration_tools/ and www.neowin.net/news/crowdstrike-broke-debian-and-rocky-linux-months-ago-but-no-one-noticed/
I'm about to push out a Debian image that will allow you to boot, run a script, and automatically remove the CrowdStrike files that have been causing the problem. I'm learning how to use Debian live-build anyway, and I want to help, since this particular issue did not affect my company.
@@NetworkAdminLife No, you would need a recovery key for that, but anyone in dev can build a shell script that fetches the key if you have any supporting infrastructure for that.
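For anyone curious what the cleanup step in a live-boot approach like this might look like, here is a minimal Python sketch. It assumes the Windows volume has already been unlocked and mounted (the /mnt/windows mount point is an assumption), and it targets the channel-file pattern named in CrowdStrike's public remediation guidance (C-00000291*.sys under Windows\System32\drivers\CrowdStrike); verify the paths against your own environment before trusting it.

```python
#!/usr/bin/env python3
"""Minimal sketch: remove the bad CrowdStrike channel files from a mounted
Windows volume booted from live media. Assumes the volume is already
unlocked (BitLocker) and mounted read-write at MOUNT_POINT."""
from pathlib import Path

MOUNT_POINT = Path("/mnt/windows")   # assumption: where the NTFS volume is mounted
CS_DIR = MOUNT_POINT / "Windows" / "System32" / "drivers" / "CrowdStrike"
PATTERN = "C-00000291*.sys"          # pattern named in CrowdStrike's remediation guidance

def remove_bad_channel_files() -> int:
    if not CS_DIR.is_dir():
        print(f"No CrowdStrike directory at {CS_DIR}; nothing to do.")
        return 0
    removed = 0
    for f in CS_DIR.glob(PATTERN):
        print(f"Removing {f}")
        f.unlink()
        removed += 1
    return removed

if __name__ == "__main__":
    count = remove_bad_channel_files()
    print(f"Removed {count} file(s); reboot to confirm the machine comes up clean.")
```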
Excellent point. Mainframes are never going to be obsolete. In this day and age of having powerful computers in our pockets and around our wrists, you would think mainframes would be gone. But no. They are stable and very safe from most viruses, attacks, and bad updates.
It makes me so happy to find a good godly man in a random tech video. God bless you, dear brother in Christ. Peace and grace to you and your family. Amen.
Thank you brother! So glad it encourages you. It's a huge encouragement to me to find so many godly viewers out there! And it's fun to annoy the atheists. I know that's a character flaw on my part. I'll pray about that. God bless!
@@NetworkAdminLife You're being a testimony, and you're planting seeds, so that's not a flaw, but the gift of boldness. Keep piercing them with the sharp word of God.
Northeast hospital Systems guy here...similar story...it was a crazy night...mine started at 3:30AM and went until Saturday. Also, fix that VNX! it has an amber light on the top shelf.
Yeah it was fun wasn't it. Every department thought they were most important and that we should start with them. FYI it went ED, Pharmacy, Radiology, OR, ICU, and then whoever screamed the loudest from there. I loved the folks who would say stupid stuff like "can you get me a new mouse while you're here?" Sure. I got nothing better to do. Not for nothing but you do realize the whole hospital is down right? God bless!
We were very fortunate in my IT group, we had only tested Crowdstrike a few months ago on a few test machines, but decided not to implement it. We did have to clean out a few Windows registries to clear out the remnants, but nothing like you guys had to do. All of our laptops have Bitlocker turned on, but not our workstations. I was very glad we didn't have Crowdstrike, as I was the night phone tech on duty on Friday!
Being the on-call guy when this hit was no fun at all. The clean up would have been really quick if not for bitlocker. On. Every. Machine. Ugh. God bless!
They pushed a driver file (CS is installed as a driver and reads its code from those .sys files; it's kind of a backdoor) with all zeros in it, and that caused the issue. Normally the deployment process should be secured in a thousand ways and software should be tested against ten thousand scenarios before carefully deploying it, region by region. To push untested software (in every test this would have popped up) to 8.5 million production systems belonging to customers who put an enormous amount of trust in your company (since you have privileged access to their core systems) is such a sign of incompetence and a complete lack of internal processes or controls that I would kick CS to the curb immediately and terminate the contract. There is no excuse for this.
100% agree. Compared to CS, my company has a tiny number of people receiving automatic updates from us. Before any update goes out, it is tested for multiple days on several computers. Then we send it only to a small subset of users who have volunteered to be beta testers. It runs there for a week or two before we push it out to everyone else. Pushing out a worldwide update to millions of computers with little or no testing and no small-scale beta testing is insanely irresponsible. In addition to that, it's irresponsible for CIOs to allow a company to push kernel-mode updates without their knowledge, permission, and testing. What if CS had been hacked and ransomware was pushed out?
I'm still keeping corporate sabotage in the back of my mind. What if they did employ all the safeguards and the file got filled with zeros just prior to deployment? We'll probably never know what went down. God bless!
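Whatever the root cause turns out to be, the staged-rollout process described above (internal machines first, then volunteer beta users, then everyone, with a health gate between rings) is easy to picture in code. Here is a hedged sketch; nothing in it reflects CrowdStrike's actual tooling, and the ring names, soak time, publish_to(), and ring_is_healthy() are all invented for illustration.

```python
"""Hypothetical sketch of a ring-based (canary) rollout gate.
Ring sizes, the health check, and publish_to() are invented for illustration."""
import time

RINGS = [
    ("internal-test", 0.001),   # lab / VM fleet
    ("beta-volunteers", 0.01),  # customers who opted in to early updates
    ("general", 1.0),           # everyone else
]

def publish_to(ring_name: str, fraction: float) -> None:
    # Placeholder: hand the update to whatever distribution system you use.
    print(f"Publishing update to ring '{ring_name}' ({fraction:.1%} of fleet)")

def ring_is_healthy(ring_name: str, soak_seconds: int) -> bool:
    # Placeholder: real code would wait out the soak period and poll
    # crash telemetry / heartbeat data for the ring before returning.
    print(f"Soaking '{ring_name}' for {soak_seconds}s and checking crash telemetry...")
    time.sleep(0)   # the sketch skips the actual wait
    return True

def staged_rollout(soak_seconds: int = 3600) -> bool:
    for ring_name, fraction in RINGS:
        publish_to(ring_name, fraction)
        if not ring_is_healthy(ring_name, soak_seconds):
            print(f"Ring '{ring_name}' unhealthy; halting rollout and rolling back.")
            return False
    return True

if __name__ == "__main__":
    staged_rollout()
```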
For the 100th time in 3 days, for those of you who apparently still don't get it: it wasn't Microsoft's update, it was CrowdStrike's update... They pushed it into production rather than testing it first.
But Linux.... Yeah, I love all these people saying I should be using Linux servers as if 1) that's my decision to make, 2) that I even manage the server environment (I don't), 3) that all large corporations have a crew of Linux admins just waiting in reserve to be sent in to save the day. God bless!
Yes, but everyone saw the MS bluescreen on the news. And my sympathy is lacking, as MS enabled this CrowdStrike problem by allowing a signed driver to execute pseudo-code masquerading as an unchanging part of the kernel... so it may be CrowdStrike's fault, but MS will take the hit as well.
@@joerhodes658 You give users WAY too much credit. Executives will just uninstall Crowdstrike for their organization - because the news said CS did it.
Most of our computers are Azure-joined, so it was really easy to find the BitLocker key. Problem is I'm dyslexic, and I had to enter that long key and kept messing it up 😭
Yeah we could find most of them in AD but not all. And then some PC's weren't labelled with their names so that was a pain. I hate bitlocker. God bless!
I decided to change careers after getting my associate's. I'm 29, doing CompTIA A+ and the whole trifecta (Sec+ and Net+). Going to do the Azure cert too, but I love learning this stuff, hoping to land a helpdesk job here in Portland, get my feet wet, and continue to work my way up. Thanks for posting these kinds of videos for another perspective on the environment. I love Mike Myers, but you can only watch so many videos before wanting to be hands-on. Thank you
LOL! True. However, I was on call so I would be the right guy to call. And I called the server guy in. Luckily it was still early and he was still awake. It truly was a case of, we rebooted the server and fixed the network. Funny how many people were asking us how long it would take to fix the network. There were gritted teeth behind my smile. God bless!
@@NetworkAdminLife This would be a great feature for CS. I have submitted a feature request for this capability several times. I feel for you; I walked in at 5:00 AM to 110 end-user computers and all of my servers BSODed. We had everything going by 11:00 AM, only having to restore one MSSQL server from last night's backup. I have had Windows updates do far worse damage; as far as kicking CS to the curb, not happening. I still suspect internal sabotage or revision errors.
I work in IT at Intel Ireland. I knew it was bad when higher-ups at the site were waiting for me at my desk when I walked in at 8 AM. That was a fun 14 hours of overtime.
You know the best news I got was when I turned to the BBC News website (because ABC/NBC/Fox/CNN hadn't woken up yet) and learned that it was a worldwide problem. My first thought was, "Oh, thank God! I didn't cause this!" And from a Dutchman who is part Irish (isn't everyone), Éirinn go Brách and God bless!
I still find it amazing that a hospital can no longer provide ANY essential services when IT goes down (name your incident here). I worked IT at a SoCal hospital and we had a network outage hit us one time. Same result...they closed the entire hospital down. Amazed how far medically we've advanced and yet everything is held in place by one pin/Jenga piece. You'd think that they have a backup system (backup to how to provide medical services with manual record keeping, not an IT backup) in place in case something serious happened over an extended amount of time.
The problem is records. What you suggest would lead to massive amounts of paper being printed daily and having to update them continuously just in case you ever need them. Furthermore, MRI and Xray machines all run on computers. So do modern dispensaries, autoclaves, lab equipment, ....
Of course they can still provide essential services. There are doctors and nurses to attend to patients. They are doing non-elective surgery. It's just hard to register new patients and do accounting.
This affected the user's desktops. What should the backup plan be for desktops? Our servers were back online within a couple of hours. They'll wait longer in the waiting room than that. God bless!
I appreciate the fact that you were willing to come in and help with the situation. I know your title says network admin, and yeah, the network was fine; this is an OS-level issue, aka software-based. But you were still willing to drive in and offer what you could to help. I've been at places where if it's not a certain admin's role and they aren't affected, they SIMPLY will not help or even assist. I give you kudos!
Bitlocker is a pain in the arse! Network admins should have a global common code for all machines, and plugging in a USB key with it on should be enough. Fast access for just this scenario! Then run a script to delete the offending files.
Well, this Network Admin doesn't work on desktops, so if such a beast exists it would be mighty handy to have. If you know how to generate that, I'd be interested in knowing. That would have been awesome. God bless!
More power to you guys. I truly feel blessed I was laid off a few weeks ago before this happened. Proud of all the IT folks currently working this mess! I miss my job, but not that much right now. :)
Clearly not enough to disregard pretesting, roll out on a Thursday night for Friday, and to roll out an update at a mass level instead of in waves. Complete and utter failure of quality control and complete idiocy
I understand where you're coming from but what happened can't just be swept under the rug. I don't see how Crowdstrike stays in business after this. The lawsuits alone... God bless!
The worst part, other than software meant to protect against cyberattacks perpetrating the worst cyberattack in history, is that people died due to the multiple hospital shutdowns that required reroutes.
Yeah, that sucks. I have family that works for a major utility that was affected. Not being able to respond to gas leaks and other shenanigans is also a yikes scenario.
Yeah, that's why we were working so hard to restore hospital systems. I mean, what if it was my wife or child that was on the way to MY hospital? This was akin to the sheepdog killing all the sheep. God bless!
@@aman4everchanged......129 Agreed. Whole lot of nightmare scenarios that most of us never consider. Shows we should always have a backup plan. God bless!
Fun Friday. I work in IT for a company that owns a lot of fast food style restaurants… thank goodness we have a great team and had all machines back up before lunch. Won’t forget that one lol
Really glad I quit IT this year in January after 10 years. Days like July 19 will not be missed!! IT is already awful on a normal day, but days like this are just hell. On top of that, my company sucked so hard. A merger with another 3 companies made it just worse. Only our boss got something out of it... until 50% quit.
@@amano22 I work as a welder and mechanic again. I do manage some IT stuff at my new company, but it is just easy stuff. It is just one company with an on-premises system I built. They used to be a customer of mine.
Am I wrong in my thinking? At every company we have a sysadmin team that manages the patching for software, so new versions aren't pushed to production until they're tested in a small testing group. If that were the case here, wouldn't all these issues have been avoided anyway?
We do that for Windows Updates. However, Crowdstrike manages that themselves. They push out updates in real time and there's no way to roll it out in phases. Sucks. God bless!
Also, my son's birthday was on the 19th, and as the security engineer I too pulled a triple with our team. This is a really bad look for all security teams. When I worked for a hospital, it often took months to deploy patches to ensure due diligence was maintained before reaching critical systems. CrowdStrike has some serious reputation repair in store.
I'm not your son, but I turned 42 on July 19th, the CrowdStrike Catastrophe Day. Happy birthday to him. Funny enough... I am in medical I.T. and yeah.... busy busy day it was.
Well happy birthday to you too! Man, your birthday present sucked! LOL! We're still digging out and hope you've got your operation back up and running. God bless!
Thanks for making this video; this is the first time I've come across your channel, and I was really interested to see how this situation was handled. I do have one question though: I never saw one of these bluescreens, but sometimes they show the driver file. Did it show anything related to CrowdStrike? From everything I was hearing from a lot of people, it didn't, and that must have been terrifying. Massive respect to everyone who has had to deal with this situation; you guys are the true heroes.
This is the first time it hasn't been Microsoft. Okay, I think maybe Norton AV screwed up maybe 15-20 years ago too. But usually we're backing out MS patches. God bless!
@@joebleed Yep. It's been a while though. Last one I remember was either Norton or Symantec AV and we had to run around and remove some virus definition files. God bless!
From all the videos I've seen, not a single one has had an easy fix like running a script to delete the file. This is one for the history books. Glad to see you're up and running.
I would've written a script. I would've created a little PXE boot to load a small Linux with an NTFS driver: boot the little Linux, mount the NTFS system partition, delete the file, restart, and voilà. And you can query Entra ID to get the keys from AAD. I'm a true engineer/developer because I am too lazy to do things by hand.
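For the "query Entra ID for the keys" part, Microsoft Graph exposes BitLocker recovery keys under its information-protection endpoints. The sketch below shows roughly how that lookup could work with the requests library; treat the endpoint path, the deviceId filter, the BitLockerKey.Read.All permission, and the token acquisition (done elsewhere) as assumptions to check against the current Graph documentation.

```python
"""Hedged sketch: look up a device's BitLocker recovery keys via Microsoft Graph.
Endpoint and permission names are from memory of the Graph docs -- verify before use.
Requires an access token with BitLockerKey.Read.All (obtained elsewhere)."""
import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def get_recovery_keys(access_token: str, device_id: str) -> list[str]:
    headers = {"Authorization": f"Bearer {access_token}"}
    # 1) Find key IDs for the device (filter syntax is an assumption to verify).
    resp = requests.get(
        f"{GRAPH}/informationProtection/bitlocker/recoveryKeys",
        headers=headers,
        params={"$filter": f"deviceId eq '{device_id}'"},
        timeout=30,
    )
    resp.raise_for_status()
    keys = []
    for entry in resp.json().get("value", []):
        # 2) Fetch the actual key material; $select=key is needed to return it.
        detail = requests.get(
            f"{GRAPH}/informationProtection/bitlocker/recoveryKeys/{entry['id']}",
            headers=headers,
            params={"$select": "key"},
            timeout=30,
        )
        detail.raise_for_status()
        keys.append(detail.json()["key"])
    return keys
```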
So, what's the most important lesson that you and everyone else learned from this? There are multiple lessons here for those who want to learn them. There are also a few that, it would seem, people need to relearn. The primary one here would be that automatic updates are a rolling failure cascade waiting to happen.
@Agnemons I have over 700 endpoints in eMAR across Canada. We had to disable the Windows Update service by changing the Run As to the disabled .\Guest user. We force them into UWF lockdown mode and stage our updates. The benefit is Windows 10 Enterprise LTSC runs like a toaster at optimum performance throughout the life-cycle of the device. No phone calls since 2016! Each image takes about 30-60 hours to develop which we do every quarter and is Level 2 CIS and HIPAA compliant.
We don't know why the update didn't work or why it was sent out without being thoroughly vetted. Could still be a hack. Crowdstrike will probably never tell us. God bless!
But, but, but....! They offered a $10 Uber Eats gift card to all of their affected clients! There, all better now, right? Right? Hello? Anyone? God bless!
Now that things have settled down, have you given any thought to ditching CrowdStrike? Issues such as this must make you question whether its features outweigh the problems it has caused.
@@NetworkAdminLife We store the BitLocker keys for clients in their password manager, so you can get them using the web client. We do not use it on servers. Servers run 24/7, so BitLocker makes no sense there. And if someone took hardware from your server rooms, you've got bigger problems...
Thanks for sharing your experience during this historic meltdown. It was very interesting to hear how a massive organization like a hospital handled this situation.
Thanks for this fascinating insight into a day in the life of a Network Admin. You definitely had a day of hell there with the CrowdStrike fiasco. I remember going through some Bitlocker hell myself a few years ago. Not fun!
Love the war story, brother. I was doing a VCF SDDC NSX update Thursday night when this fiasco went down. I thought I had taken everything down doing that, until I saw stuff outside of our SDDC affected and going down in Site24x7.
Oh man, I bet you were pooping bricks for a moment or two! I tell you, I was SO relieved when I heard this was a worldwide problem. I kept thinking we had a virus or something. Had me wondering if I had opened something up on the firewall. God bless!
We had all hands on deck. I work on another, non-IT engineering team and fixed 80 servers and PCs; I can't imagine what the rest of our IT team had to do. Wake-up call at 4:55 AM.
We were lucky, it only hit our managed ERP. First call was 3:45 AM Eastern, and the PROD system was back online at 9:45 PM Eastern, 17 hours later, but we are back up. Keep up the good work, Brother.
I forget where I heard this and can't find it, but "It takes about 6 weeks for the general public to forget a tragedy." I think it holds rather true in IT as well. Almost every large enough tech firm has had major screwups, and they're all (for the most part) still here. More often than not, the tech firms that go out of business are the ones who forget to evolve and adapt to technological advancements.
What a nightmare! Bravo for getting a handle on the cause and fix so quickly. Exhausting though! Well done. Did you have to produce a Lessons Learned report to management? Best.
Well, management was out there with us. I don't particularly like my CIO but I have a lot of respect for him now. He was out in the trenches with us and put in more hours than I did. I think he's writing that lessons learned document on his own. I'm sure we'll have an after-action meeting. God bless!
Great overview from a real-life network admin. Can't get any clearer; you've got your point across with respect to the criticality of what happens when your servers go down. Especially in hospitals.
My employer (thousands of MS endpoints) is all hands on deck. All weekend, still ongoing (can you say BitLocker). Kind of makes me glad I have a broken foot.
Dude, so all I had to do was break my foot?! Wish I had known! Yeah, I'm still cleaning up the odd server here and there on the Sunday following. Crazy days. God bless!
Being a recently retired Cyber Security Professional, all I can say is that Friday's events show a major computer system vulnerability in the World. I won't go into one of the possible scenarios, but it's not good.
@@HopelessAutistic No, it isn't Windows. Microsoft had to allow access to the kernel after 3rd-party companies whinged that they didn't have access to it to write their drivers. This is the consequence of that. CrowdStrike released a dodgy update for Linux a month ago, so no platform is 100% bulletproof.
@@emilianoabarca I disagree. Microsoft made a cesspool of operating systems; this particular case may be someone else's fault, but the underlying issue is that Windows is a joke and a fragile operating environment. You can argue literally all you want, but you won't change my mind. :P
For me, our security desk called me at 7:30 in the morning because their access control software couldn't connect. I was in the office at 8 and saw half of the PCs in our office building showing a blue screen, around 400 machines. For some strange reason, all 4 laptops in the IT office were fine even though they run 24/7, but I could not access much since the DHCP, DNS, and other servers were offline, so I needed to configure static settings just to start figuring out what exactly wasn't working. We created a huge BitLocker export file that day and shared it with people in the office and working from home so that everyone could fix their own machine - this was the only scalable option, and we rotated the keys afterwards.
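For shops whose machines are joined to on-prem AD rather than Entra, the "huge BitLocker export file" described above can be built by reading the msFVE-RecoveryInformation objects stored under each computer object. Here is a hedged sketch using the ldap3 library; the server name, credentials, and base DN are placeholders, and you should verify the attribute names and your delegated read permissions against your own directory.

```python
"""Hedged sketch: bulk-export BitLocker recovery passwords from on-prem AD.
Recovery passwords live in msFVE-RecoveryInformation child objects under each
computer object (attribute msFVE-RecoveryPassword). Server/DN/creds are placeholders."""
import csv
from ldap3 import Server, Connection, SUBTREE

def export_bitlocker_keys(out_path: str = "bitlocker_export.csv") -> None:
    server = Server("dc01.example.local", use_ssl=True)            # placeholder DC
    conn = Connection(server, user="EXAMPLE\\svc_reader",
                      password="***", auto_bind=True)              # placeholder creds
    conn.search(
        search_base="DC=example,DC=local",                         # placeholder base DN
        search_filter="(objectClass=msFVE-RecoveryInformation)",
        search_scope=SUBTREE,
        attributes=["msFVE-RecoveryPassword"],
    )
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["computer_dn", "recovery_password"])
        for entry in conn.entries:
            # The recovery object is a child of the computer object, so the
            # parent portion of the DN identifies the machine.
            computer_dn = entry.entry_dn.split(",", 1)[1]
            writer.writerow([computer_dn, str(entry["msFVE-RecoveryPassword"])])
    print(f"Wrote {len(conn.entries)} keys to {out_path}")

if __name__ == "__main__":
    export_bitlocker_keys()
```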
So many young people have been complaining and dodging work. This is the time to be the hero! Yeah people always blame IT but if you do the right thing people recognize it over time
We did get some recognition this time around. But not until they realized it wasn't our fault. Our cafeteria sent over free pizzas, so there's that. God bless!
Cheers for the cool story. Must have been a crazy day for the CIO, fighting in the trenches, having to direct the troops, and dealing with the politics inside the organization.
Happy Blue Screen Day! 😄 Lucky You! We have over 5000 Servers in our Data Center and over 1200 Microsoft Azure Servers running various Windows Servers OS's WITH Bitlocker enabled. What a headache! 😰😡😫
Although my company was unaffected directly, some of our 3rd-party service providers were impacted and we implemented some business continuity plans. Dodged a bullet. Already set up a start-of-day meeting on Monday to go over continued impacts. Also need to review our change control policy as well as our software inventory to see if we have a similar IT risk that isn't properly described and managed.
Yeah, I don't think anyone saw this coming. Business continuity plans have to now account for "what we don't know". Glad I only have to follow orders. God bless!
Programming errors happen; that's forgivable. Pushing out a worldwide update without extensive testing and without doing a small test update to a subset of customers is not forgivable. My company would never do that. Never.
They didn't even have to do it to a subset of customers; they could have just done it to a bunch of VMs and collected any low-level debugging logs. Like, I'm not in IT and even I can do that.
Ban crowdstrike.. charge them for every billion the world lost
And that's why I don't believe it was an update. This was a hack, reported as a failed update.
@@chris895 100% agree. The cost of lost business, rescheduled airline flights, and all the human time to reboot computers is enormous.
@@mrsheabutter I'm not sure what it was, but I find it hard to believe this happened by mistake. The amount of incompetence needed would be staggering, even before this update that brought it to light. It wasn't even a bug. The updated file was all zeroes; clearly corrupted somehow, and it was a data file, not code. That also means a) the already existing program does not do any signature or hash checking of updates, or somehow CrowdStrike signed an all-zeroes file, and b) the already existing driver crashes on a faulty data file.
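To make the "no hash or signature checking" point concrete, here is a minimal hedged sketch of the kind of pre-load sanity check the commenter is saying was apparently missing: refuse an all-zero file and refuse anything whose SHA-256 doesn't match a trusted manifest. The manifest idea and file names are invented for illustration, not CrowdStrike's actual design.

```python
"""Hedged sketch of a pre-load integrity check for an update/content file.
The manifest concept is hypothetical; the point is simply: never hand a file
to a kernel component without verifying it first."""
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def validate_content_file(path: Path, expected_sha256: str) -> None:
    head = path.read_bytes()[:4096]
    if not head or all(b == 0 for b in head):
        raise ValueError(f"{path.name}: file is empty or all zeroes -- rejecting")
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(f"{path.name}: hash mismatch ({actual} != {expected_sha256})")
    # Only after these checks would the file be handed to the component that parses it.

# Usage sketch: the expected hash would come from a signed manifest shipped with the update.
# validate_content_file(Path("C-00000291-example.sys"), expected_sha256="...")
```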
Took us 7 hours to get the majority of stuff working, but altogether 48 hours with no sleep to get us stabilized. I'm also in healthcare IT. It was a nightmare.
Yep. We're still at it on the Monday following. All critical care machines were fixed Friday morning. Clinic machines over the weekend. Administrative machines are being done today. God bless!
After decades of dealing with Windows servers, unless it's absolutely necessary, we avoid Windows in mission-critical roles. Even our AD managing the end-user comps isn't on Windows.
We were not affected because we run Ubuntu Christian Edition on servers and TempleOS on the client side. Every morning we start our work with a prayer. That is the best endpoint security an IT department can have.
Amen! The first tool I pull out of my toolbox is prayer. Then I get to work. God bless!
We dropped CrowdStrike months ago. Glad we made the right decision. Also, BitLocker shouldn't be used on a Windows server as long as the physical machine is in a secure location. BitLocker just complicates things at the OS level.
Unfortunately, compliance specifically requires data at rest to be encrypted. We use hosts with system-volume auto-unlock and keep the virtual hard disks on a volume that requires unlocking on reboot to meet this.
@@Wahinies Yep. I've read about this as well. It would make the most sense to use it on data drives. Leave the OS drive unencrypted.
Luckily we don't use BitLocker on our servers. Firstly, most are virtual, and the physicals are locked up tight. God bless!
@@Darkk6969 you risk data leakage when leaving the OS disk unencrypted. Logs, cache, memory dumps on crashes, etc.
@@Wahinies you might consider encrypted blob storage for data that falls within the cold/archive range. (90 and 180+ days respectively)
I also work in a hospital as an end-user support tech, and I went in thinking I was going to have an easy Friday since I usually try to finish my tasks and incidents throughout the week, so I even went out for drinks....I was hung over AF coming into work, and even my machine was down, so my entire Friday was just non-stop lol.
Yeah, the server guy's machine is still dead, couldn't find his bitlocker key. Oddly enough my desktop was unaffected even though it had Crowdstrike. My laptop was affected. It's weird that only about 80% of all our systems were affected. Not sure why it didn't kill 100%. All had Crowdstrike installed. This makes me wonder if this was a dry-run. God bless!
5 days now of getting paid $100/hr to turn off and turn on computers all day.
Ditto. That was my life. Taking a long weekend this weekend to decompress. God bless!
Oh boy you did the method of exploiting the reboot race condition. Oh man.
Their core software design is to run uncertified code shipped as .dat files within the kernel, passed off as WHQL-certified code. And they very clearly have no automated testing environment progression or canary rollout strategy either. Just, wow.
Well, they claim everything is rigorously tested before being deployed. I kind of believe him. My gut tells me this may have been industrial sabotage. No proof though. God bless!
@@NetworkAdminLife I've spent too long in SW development to believe the "we test everything thoroughly" rhetoric without specific details backing up the claim, because no business owner who's just taken the world down is going to admit that they lick it, flick it, and stick it when they fancy throwing updates out. Testing is usually the first sacrifice on the altar of "speed", and this company has done this exact thing several times before across multiple operating systems, so their testing is clearly somewhere between paper thin and non-existent. Never attribute to malice that which can plausibly be explained by incompetence... All they needed to do was build and run 3 tests: CanLoadSignalFile(), CanValidateSignalFile(), and CanRunSignalFileCode(), and gate the deployment off those (and all the others!) passing. It's literally that simple.
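A hedged sketch of what those gates could look like as pytest checks, using the commenter's three names. The sample file path and parse_channel_file() helper are hypothetical stand-ins, since nobody outside CrowdStrike knows the real content format; a CI pipeline would run these (and many more) and block the release on any failure.

```python
"""Hypothetical pytest gates along the lines the comment above suggests.
parse_channel_file() and the sample file are invented stand-ins."""
from pathlib import Path
import pytest

SAMPLE = Path("channel_files/C-00000291-sample.sys")   # hypothetical release candidate

def parse_channel_file(path: Path) -> dict:
    """Stand-in parser: real code would decode the vendor's content format."""
    data = path.read_bytes()
    if len(data) == 0 or all(b == 0 for b in data):
        raise ValueError("empty or all-zero channel file")
    return {"size": len(data)}

def test_can_load_signal_file():
    assert SAMPLE.is_file() and SAMPLE.stat().st_size > 0

def test_can_validate_signal_file():
    # Raises on a corrupt or zeroed file, failing the gate.
    assert parse_channel_file(SAMPLE)["size"] > 0

def test_can_run_signal_file_code(tmp_path):
    # Stand-in: a real pipeline would load the candidate into the sensor on a
    # throwaway VM and assert the OS survives a reboot. Here we at least confirm
    # the parser refuses the failure mode everyone saw: an all-zero file.
    bad = tmp_path / "all_zeroes.sys"
    bad.write_bytes(b"\x00" * 1024)
    with pytest.raises(ValueError):
        parse_channel_file(bad)
```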
IT admins are so underappreciated. Glad I got out of that line of work on days like today. You and your team saved the day sir! Kudos!
Days like last Friday make me wonder why I'm still doing this some times. Oh yeah, the money is good. I'll take the red pill by the way. God bless!
Is the CIO thinking of switching away from CrowdStrike? Nice job getting the hospital up and running so quickly.
Oh man, if it were only my decision to make. That decision will come down from the county level. They will probably do a review of our security posture and ultimately nothing will change. It's government so... God bless!
Working at an IT firm. Thankfully we run only Debian servers, and our workstations are Ubuntu and Mint. None were affected. VMs running Win Server were also safe because, for whatever reason, we still don't use CrowdStrike, so I guess it was our lucky day.
You were CrowdMissed.
I don't know how any of this matters, since CrowdStrike Falcon can run on pretty much every distribution of Linux. Just because you guys are running Linux doesn't mean you're not going to be affected. You're not affected because you didn't download the software. Just another mysterious case of a YouTube keyboard warrior trying to sound smart for no reason and doing the exact opposite.
@@KB-nt7eg Linux doesn't allow apps to have kernel access. MSFT Windows does, and CS pushed an update that had null driver files to millions of Windows PCs. And here we are.
@@KB-nt7eg No. On Linux, there are ways to do what CrowdStrike does on Windows without having to run in kernel space (the most privileged part of an Operating System). Even if something went wrong, it wouldn't bork the whole thing like on Windows.
It's good to be you and it sucked to be us! LOL! God bless!
So I’m an MS Azure (former nuance) engineer in healthcare, and I’ve probably worked with you at some point over the decades. Yeah I repaired about 25k-30k servers that night.
I've worked with CrowdStrike on and off since 2015 or so, and their engineers were top notch. However, I have noticed a lot of corners being cut in the past few years.
So what CrowdStrike does is include an executable binary portion in their updates. The driver doesn't validate or do any checking on this code; it just blindly jumps execution to this block. Normally this isn't a problem (it's pretty dangerous, but) as issues would be caught by the OS. However, to avoid being bypassed, CrowdStrike sets their driver with the highest tier of criticality in the OS, which tells it to halt if the driver is not loaded successfully (usually reserved for very basic hardware like basic display driving or USB). Well, in this update they jump to a string version of a memory address (probably a simple code mistake) followed by all zeros. Basically a null pointer exception. So the kernel hits a null pointer, panics, and is told to halt because the critical driver is not loaded.
Our server guy is our dedicated Azure resource so I bet you've worked with him at one time or another. Thanks for that really clear description of what they do. In a perfect world Crowdstrike's approach is probably wonderful. Until it isn't. God bless!
@@NetworkAdminLife Oh I've been supporting US and Canadian hospitals and clinics for the past 25 years in many many different situations from Transcription to PAX to EMRs to hardware devices, which is why I say that.
Yeah I think CS got too big too fast as a 2010's unicorn, they got ahead of their processes and procedures.
The company I work for has several layers of infrastructure, a significant amount of which lives in Azure.
Azure VM's were a nightmare to repair. My team and I worked 20 hours beginning at 3 AM on Friday. Rested a bit, worked 20 additional hours, then worked 10 hours Sunday. We had bitlocker issues, Azure permissions issues, unmanaged disk issues, issues with detached disks reattaching with the wrong drive letters, Share permissions being lost, etc etc.
It was insanity.
I am currently learning DevOps and networking and I really don't understand why a company would even use this stuff. Why not just Alibaba, IBM, and Google, and NO GOD DAMN WINDOWS?
Also, why the heck won't you just use eBPF for network metrics/monitoring? This is just nuts: paying millions for enterprise stuff, but just getting f'ed for doing so.
I also don't get why you don't have a backup strategy and, in this case, just roll back your bare metal automatically with Puppet and Foreman????
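On the eBPF suggestion above: for plain network telemetry (as opposed to full EDR), toolkits like bcc attach probes from user space, and a bad program is rejected by the in-kernel verifier at load time rather than blue-screening anything. A minimal sketch, assuming a Linux host with the bcc toolkit installed and root privileges, counting outbound IPv4 TCP connects per process:

```python
#!/usr/bin/env python3
"""Minimal eBPF/bcc sketch: count outbound IPv4 TCP connects per PID.
Requires Linux with the bcc toolkit installed and root privileges."""
import time
from bcc import BPF

PROGRAM = r"""
#include <uapi/linux/ptrace.h>

BPF_HASH(connect_count, u32, u64);

int trace_connect(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 zero = 0, *count = connect_count.lookup_or_try_init(&pid, &zero);
    if (count) {
        (*count)++;
    }
    return 0;
}
"""

b = BPF(text=PROGRAM)
b.attach_kprobe(event="tcp_v4_connect", fn_name="trace_connect")

print("Counting outbound TCP connects per PID... Ctrl-C to stop.")
try:
    while True:
        time.sleep(5)
        for pid, count in b["connect_count"].items():
            print(f"pid {pid.value}: {count.value} connect(s)")
except KeyboardInterrupt:
    pass
```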
Yeah, that's why I don't work at places that jam shit in Azure, personally... MS themselves told us "do not use Hyper-V at your scale, it won't be good"... so how is Azure going to be any good if they turn down a single large enterprise with less than 100k VMs? Plus they dropped the ball big time: they knew Hyper-V was a bust 15 years ago and never started developing a true hypervisor, and if they had, they would have eaten up the entire market overnight when Broadcom pulled their bullshit and hit all the VMware customers with 10x+ increases.
@@DommageCollateral There are many reasons. But the biggest is throughput. We process a large amount of live data. As you may know, a symmetrical connection with enterprise uptime is extremely expensive.
You question our OS without knowing our industry? Hmm. A little advice since you say you're still learning: open your mind a bit. Otherwise this is not the field for you.
You use what you use because it is the best way to get the job done. We have Linux boxes, we have Windows boxes, we have mainframe software, and we have onsite equipment with its own software and quirks.
Embrace it. It's the way it is.
Y'all know the CrowdStrike CEO was also responsible for the McAfee debacle, back in 2010 I think. Similar event. Except now he's a billionaire who buys race cars, I think. js.
And the Russiagate thing: Hillary paid them to lie. That company is full of shit and snake oil, and people love to buy snake oil.
He was questioned about the incident and gave no real answer, just said that in cybersecurity it's hard to be proactive and stay ahead of the criminals, and that people should be patient and go to the help portal. What an asphole. I hope they sue the shiite out of them for negligence.
He's actually a pretty good endurance racer. He's participated in the 24 Hours of Le Mans and the 24 Hours of Daytona a number of times.
@@Volk_ sounds like you are a fan… well he’s a bonafide tool with little regard for public welfare.
Amazing how incompetence is rewarded in today's world. God bless!
Feel your pain and that of others out there who have had to deal with this fiasco. I got called at 3 AM here in VA. Server guy here at a large regional clinic. We recovered all of our servers in about 14 hours. Had enough resiliency that our hospitals remained open and able to serve our patients. Field services are still cleaning up endpoints.
Yeah, the server guy and I did a divide-and-conquer approach with the VMs. He started at the top of the list and I started at the bottom and we met in the middle. A couple of servers had to be restored from snapshot, but we got it done. Then we started helping with endpoint clean-up. God bless!
My rage at everyone downplaying this for CrowdStrike is immeasurable. This is a billion-dollar company, with a B, trusted by critical government, public, and private services, and they shafted each and every one of them. The lack of outrage from our authorities is absolutely disgusting. Speaks a lot to the state of cybersecurity and tech in general.
Preach brother! God bless!
Trusting AI to push an update? Shheesh CrowdStrike! (If it's true)
I've dealt with several large incidents in my career (including security incidents). If there's any good from this one, it's that a good chunk of IT pros all over the world got to share the experience together. If there is a glass-half-full way to look at it, we'll all get to hone our response plans together and compare notes. While this was just a bad push, it was also a good dress-rehearsal for a supply chain attack (albeit with a simple, but tedious, resolution).
I'm a CIO myself, and was behind the keyboard with my folks on this one. They inspired the heck out of me, how well they worked under pressure on this.
Beyond just giving us in IT an annoying workload, I'm sure we'll learn that real human damage was done. Stories like this involving hospitals certainly point to such damage. I don't want to take anything away from that, but anyone not taking the opportunity to extract great lessons and measure the effectiveness of their response is missing out.
Charge CrowdStrike for the overtime. They only seem to care about their money, not their users' satisfaction.
I bet if they want to keep many of their user base they will have to bear the cost of overtime. Or should. God bless!
We went through the exact same thing as you described for our hospital. We have about 6000+ desktops and it took us 3 to 4 days to get most of them done. We are still dealing with a lot of remote users who have to ship their laptops in. Thanks for the video.
Heh, yeah. Isn't it fun to be us? God bless!
That's a solid CIO. Staying in the trenches with the team and not asking for an ETA from the ivory tower. Wish I had leadership like that when I worked in healthcare IT.
He's still out there right now. I'm supplying Bitlocker keys to the guys in the field even today, 3 days later. God bless!
Been retired from IT for eleven years now (after a 28 year career) - sooooooooooo glad I don't have to deal with this BS anymore! The IT world is literally insane and only going to get worse. We are doomed. Have a great day.
I've been doing it for 40. Almost done.
After 20 years in support roles, I’m trying to get into admin roles.
Yep. We're all slowly dying. God bless!
You and me both. God bless!
Best of luck to you! God bless!
That's what happens when IT managers build very nice PowerPoints explaining why we do not need onsite IT staff and why it is best to outsource everything because it is so much cheaper.
Yes. This can all be fixed by the folks overseas. Yeah, sure it can. I got laid off from my last job due to outsourcing. Don't get me started. God bless!
I.T. Professional here and have been working on restoring systems all day!!!!!!! Companies are about to get a reality check paid out to their I.T. staff in overtime!!!!!! I held a webinar to fix end users systems and one user alone took 3 and a half hours to get straight!!!!!! Plus, I had to get bitlocker encryption keys for every user multiple times for the same user. I DO NOT agree with zero day updates or automatic updates. I have that option turned off on all my personal systems. It was only a matter of time before an exploit affected the "cloud" and our outsourcing I.T. infrastructure eventually showed cracks!!!!!!!! This is going to be a multibillion dollar cyber security clean up. I can't believe the markets and especially CrowdStrike didn't crash!!!!!!! Time to short!!!!!
I'm glad my team and I had to deal with only 250 users. Guess what happened Monday this week: we're going back to self-hosting, baby. Fuck the cloud. Thank god our boss is not a moron and acknowledged outsourcing would bite him at some point. Had some old servers around, deployed Proxmox, made VMs and TrueNAS. Orders for better servers were placed yesterday, but yeah, we managed to fix everything during the weekend with a staff of 3, including myself.
Good luck to all my brethren; a lot of sweat and overtime happened over the past 5 days for a lot of you.
Yeah, I clocked almost 30 hours of OT in this pay cycle. Multiply that across my hospital, then across the other county departments, then the nation, then the world. Billions is right. And Congress wants to investigate Delta for cancelling flights? Seriously?? God bless!
Good on you for having a smart boss. Glad you got things sorted out so quickly. We were still digging out as of yesterday (Tuesday following). God bless!
@@NetworkAdminLife Well, that investigation will probably show a lot of outsourcing and an overwhelmed and overworked IT staff. That might be productive in the bigger picture in the long run. One can hope anyway.
My wife works at a local hospital. From her description, there were a lot of mid-level managers running around with a USB dongle that they used to "fix" broken systems. That was early Friday morning so I assume they were rebooting/reimaging, not running a mitigation script. The hospital was mostly working by early afternoon.
Ditto. We used the USB drives to boot into safe mode w/networking so we could log in as a local admin and delete the affected file. We even had the CIO out there most of the day. Same here: critical departments were up by noon, 80% of the rest were up by Friday afternoon. We did mop-up work on Saturday. God bless!
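For anyone curious what that deletion step actually looks like once you're in safe mode as a local admin, here's a minimal sketch. The directory and the C-00000291*.sys filename pattern follow the widely circulated remediation guidance; most shops did this with a one-line batch command rather than a script, so treat this purely as an illustration.

```python
# Minimal sketch of the safe-mode fix: delete the corrupted CrowdStrike
# channel file(s). Run as administrator from Safe Mode.
import glob
import os

CHANNEL_DIR = r"C:\Windows\System32\drivers\CrowdStrike"

def remove_bad_channel_files(directory: str = CHANNEL_DIR) -> None:
    """Delete the C-00000291*.sys channel files that caused the boot loops."""
    pattern = os.path.join(directory, "C-00000291*.sys")
    matches = glob.glob(pattern)
    if not matches:
        print("No matching channel files found; nothing to do.")
        return
    for path in matches:
        print(f"Removing {path}")
        os.remove(path)

if __name__ == "__main__":
    remove_bad_channel_files()
```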
This is for sure a golden example of what can happen when you don't follow golden rule #1 in software development: ALWAYS TEST BEFORE YOU DEPLOY!
I wanna hear the RCA. Did they create a deployment ticket? Was it signed off as tested??
I’m curious whether these fortune 100 companies all use the same systems management company.
Amen! God bless!
I do too. I hear this took down 25% of computers world wide. God bless!
I work in I.T. I went on vacation on the 18th, then this happened. My email was blowing up; they were begging me to come in. I was already at the airport waiting for my new flight schedule because mine had been cancelled. Man, I was ticked off. lol.
Criminy! I can imagine. My boss was supposed to come back from a vacation in Europe. He's still there. Flights keep getting cancelled. Lucky him! God bless!
Is it just me, or is "CrowdStrike" an extremely fitting name for a company that pulled a blunder like this?
@raylab77 Their cover story does not match the aftermath. It was planned!!!! You're on the money. Global strike.
In the EU, the CrowdStrike EULA states that it MUST NOT be used in mission-critical infrastructure. I wonder why? And I wonder why they sold it to mission-critical customers, then. Anyway, I wonder if the company will survive the lawsuits they get.
If the EULA states the software mustn't be used for those systems, isn't crowdstrike protected?
@@Wavepush Yes and no; it depends on many factors. The main factor is, if they sold it to those industries, was that even legal when it must not be used there?
Sounds like legalese. If we all read our EULA we'd find that no one is responsible for anything. God bless!
@@NetworkAdminLife I am no lawyer, but I will buy popcorn and definitely follow the lawsuits.
@@NetworkAdminLife Oh, there will be plenty of lawsuits. Ultimately billions have been lost and someone has to pay. Insurance companies will cover the lost revenue for their customers, but as anyone knows, insurers are like sharks for blood, except the blood is money. They will absolutely do whatever they can to claw back the money they've had to pay out and will move heaven and earth to hold CrowdStrike accountable. I expect a lot of undisclosed settlements with CrowdStrike over the next year or two. Their liability insurer is probably stressed as well. Will CrowdStrike survive this? Time will tell. Hopefully they will, as it was arguably a negligent mistake, not an intentionally harmful one. Accidents happen, and some poor guy feels awful for pushing the button on that update. Microsoft will be dragged into this as well. Their accommodation of other countries and companies when it comes to allowing kernel-level access is coming back to bite them, especially compared to Apple, who don't give such access. Microsoft were warned of these risks but didn't want to address them, so some companies will come after them as complicit: they were warned many times this could happen and knowingly allowed these risks without putting safeguards in place. Regardless, this will cost Microsoft financially, even if not through direct settlements or lawsuits.
From what I've read, this isn't the first time something like this has happened. It happened before but affected Linux systems, and since Linux's market share is so low it wasn't widely reported until the Windows outage sort of linked the two together.
That's interesting. I'd heard about the McAfee issue with the current CrowdStrike CEO, but I hadn't heard that there was an issue with Linux systems. I did a quick session with ChatGPT and it confirms this and gave me the following websites. Thanks for bringing this up! God bless!
www.theregister.com/2024/07/21/crowdstrike_linux_crashes_restoration_tools/
and
www.neowin.net/news/crowdstrike-broke-debian-and-rocky-linux-months-ago-but-no-one-noticed/
I'm about to push out a Debian image that will let you boot, run a script, and automatically remove the CrowdStrike files that have been causing the problem. I'm learning how to use Debian live-build anyway, and I want to help, since this particular issue did not affect my company.
Will it work on drives encrypted with Bitlocker? I'm guessing no. God bless!
@@NetworkAdminLife No, you would need the recovery key for that, but anyone in dev can build a shell script that fetches the key if you have any supporting infrastructure for it.
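Roughly what a live-image remediation script could look like, assuming the dislocker FUSE tool is included on the image and you already have the machine's 48-digit BitLocker recovery key. The device name, mount points, and exact dislocker options here are illustrative and may differ by distro; this is a sketch of the idea, not anyone's production tooling.

```python
# Sketch: unlock a BitLocker volume with dislocker, mount it, and remove the
# bad CrowdStrike channel files, all from a Linux live environment.
import glob
import os
import subprocess
import sys

def remediate(windows_partition: str, recovery_key: str) -> None:
    os.makedirs("/mnt/dislocker", exist_ok=True)
    os.makedirs("/mnt/windows", exist_ok=True)

    # Unlock the BitLocker volume; dislocker exposes it as a plain file.
    subprocess.run(
        ["dislocker", "-V", windows_partition,
         f"--recovery-password={recovery_key}", "--", "/mnt/dislocker"],
        check=True,
    )
    # Loop-mount the decrypted image read-write so the file can be deleted.
    subprocess.run(
        ["mount", "-o", "loop,rw", "/mnt/dislocker/dislocker-file", "/mnt/windows"],
        check=True,
    )

    # Same filename pattern as the safe-mode fix.
    pattern = "/mnt/windows/Windows/System32/drivers/CrowdStrike/C-00000291*.sys"
    for path in glob.glob(pattern):
        print(f"Removing {path}")
        os.remove(path)

    subprocess.run(["umount", "/mnt/windows"], check=True)
    subprocess.run(["umount", "/mnt/dislocker"], check=True)

if __name__ == "__main__":
    # e.g. /dev/sda3 and the 48-digit recovery key as arguments
    remediate(sys.argv[1], sys.argv[2])
```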
And all government facilities run on old IBM systems. It's pretty obvious why at this point. They won't be affected by their own corruption.
Excellent point. Mainframes are never going to be obsolete. In this day and age of powerful computers in our pockets and around our wrists, you would think mainframes would be gone. But no. They are stable and very safe from most viruses, attacks, and bad updates.
Absolutely. No one knows how to write code for older IBM or even DEC VAX systems. Crowdstrike's days are numbered, I believe. God bless!
@@onlythestrongsurvive Yep. Newer isn't always better. But it is faster. God bless!
Never rely on one security company
It makes me so happy to find a good godly man in a random tech video. God bless you, dear brother in Christ. Peace and grace to you and your family. Amen.
Thank you brother! So glad it encourages you. It's a huge encouragement to me to find so many godly viewers out there! And it's fun to annoy the atheists. I know that's a character flaw on my part. I'll pray about that. God bless!
@@NetworkAdminLife You're being a testimony, and you're planting seeds, so that's not a flaw, but the gift of boldness. Keep piercing them with the sharp word of God.
@@Klementhor Hey brother I appreciate that more than I can tell you. Steel is sharpening steel at this very moment. For Christ's Crown and Covenant!
Northeast hospital Systems guy here...similar story...it was a crazy night...mine started at 3:30AM and went until Saturday. Also, fix that VNX! it has an amber light on the top shelf.
Yeah, it was fun, wasn't it? Every department thought they were the most important and that we should start with them. FYI, it went ED, Pharmacy, Radiology, OR, ICU, and then whoever screamed the loudest from there. I loved the folks who would say stupid stuff like "Can you get me a new mouse while you're here?" Sure. I've got nothing better to do. Not for nothing, but you do realize the whole hospital is down, right? God bless!
We were very fortunate in my IT group, we had only tested Crowdstrike a few months ago on a few test machines, but decided not to implement it. We did have to clean out a few Windows registries to clear out the remnants, but nothing like you guys had to do. All of our laptops have Bitlocker turned on, but not our workstations. I was very glad we didn't have Crowdstrike, as I was the night phone tech on duty on Friday!
Being the on-call guy when this hit was no fun at all. The clean up would have been really quick if not for bitlocker. On. Every. Machine. Ugh. God bless!
They pushed a driver file (CS is installed as a driver and reads its code from those .sys files; it's kind of a backdoor) with all zeros in it, and that caused the issue. Normally the deployment process should be secured in a thousand ways and software should be tested against ten thousand scenarios before being carefully deployed, region by region. To push untested software (in every test this would have popped up) to 8.5 million production systems belonging to customers who put an enormous amount of trust in your company (since it has privileged access to their core systems) is such a sign of incompetence and a complete lack of internal processes or controls that I would kick CS to the curb immediately and terminate the contract. There is no excuse for this.
100% agree. Compared to CS, my company has a tiny number of people receiving automatic updates from us. Before any update goes out, it is tested for multiple days on several computers. Then we send it only to a small subset of users who have volunteered to be beta testers. It runs there for a week or two before we push it out to everyone else. Pushing out a worldwide update to millions of computers with little or no testing and no small-scale beta testing is insanely irresponsible. In addition to that, it's irresponsible for CIOs to allow a company to push kernel-mode updates without their knowledge, permission, and testing. What if CS had been hacked and ransomware was pushed out?
I'm still keeping corporate sabotage in the back of my mind. What if they did employ all the safeguards and the file got filled with zeros just prior to deployment. We'll probably never know what went down. God bless!
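For what it's worth, the all-zeros failure mode described a couple of comments up is exactly the kind of thing even a trivial pre-flight check would catch. Here's a toy sketch of such a release gate; this is not CrowdStrike's actual pipeline, and the minimum-size threshold is an arbitrary assumption.

```python
# Toy release gate: refuse to ship a content update that is suspiciously
# small or consists of nothing but zero bytes.
from pathlib import Path

def artifact_is_sane(path: Path, min_size: int = 1024) -> bool:
    """Reject obviously broken content updates before they ever ship."""
    data = path.read_bytes()
    if len(data) < min_size:
        print(f"{path}: suspiciously small ({len(data)} bytes)")
        return False
    if data.count(0) == len(data):
        print(f"{path}: file is nothing but zero bytes")
        return False
    return True

# Gate the rollout on every artifact passing, e.g.:
# assert all(artifact_is_sane(p) for p in Path("channel_files").glob("*.sys"))
```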
For the 100th time in 3 days, for those of you who apparently still don't get it: it wasn't Microsoft's update, it was CrowdStrike's update... They pushed it into production rather than testing it first.
But Linux.... Yeah, I love all these people saying I should be using Linux servers as if 1) that's my decision to make, 2) that I even manage the server environment (I don't), 3) that all large corporations have a crew of Linux admins just waiting in reserve to be sent in to save the day. God bless!
@@NetworkAdminLife Linux can have its own problems.
@@NetworkAdminLife CrowdStrike has affected Linux as well, just not this time.
Yes, but everyone saw the MS blue screen on the news. And my sympathy is lacking, as MS enabled this CrowdStrike problem by allowing a signed driver to execute pseudo-code masquerading as an unchanging part of the kernel... so it may be CrowdStrike's fault, but MS will take the hit as well.
@@joerhodes658 You give users WAY too much credit. Executives will just uninstall Crowdstrike for their organization - because the news said CS did it.
Hehe, every time some doofus claims to me that VGA is dead, I point to the massive server room and laugh.
I don't care one way or another which cable, just as long as it works as expected.
Yep. And most of them still have serial ports too. God bless!
Most of our computers are Azure AD joined, so it was really easy to find the BitLocker key. Problem is, I'm dyslexic, and I had to enter that long key and kept messing it up 😭
Yeah, we could find most of them in AD, but not all. And then some PCs weren't labelled with their names, so that was a pain. I hate BitLocker. God bless!
I decided to change careers after getting my associate's. I'm 29, doing CompTIA A+ and the whole trifecta (Sec+ and Net+). Going to do the Azure cert too, but I love learning this stuff, hoping to land a helpdesk job here in Portland, get my feet wet, and continue to work my way up. Thanks for posting these kinds of videos for another perspective on the environment. I love Mike Meyers, but you can only watch so many videos before wanting to be hands-on. Thank you.
You can do it. Just keep going. God bless!
must be the network, call the network guy lol
LOL! True. However, I was on call so I would be the right guy to call. And I called the server guy in. Luckily it was still early and he was still awake. It truly was a case of, we rebooted the server and fixed the network. Funny how many people were asking us how long it would take to fix the network. There were gritted teeth behind my smile. God bless!
Love that the CIO was willing to get his hands dirty with you guys.
Yes, we all thanked him on the after-action meeting we had. God bless!
Compared to CS, my company has a tiny number of people receiving automatic updates from us. Before any update goes out, it is tested for multiple days on several computers. Then we send it only to a small subset of users who have volunteered to be beta testers. It runs there for a week or two before we push it out to everyone else. Pushing out a worldwide update to millions of computers with little or no testing and no limited-scale beta testing is insanely irresponsible.
In addition to that, it's irresponsible for CIOs to allow a company to push kernel-mode updates without their knowledge, permission, and testing. What if CS had been hacked and ransomware was pushed out?
That's what we do for Microsoft patches. REALLY wish we could do that with Crowdstrike. God bless!
@@NetworkAdminLife This would be a great feature for CS. I have submitted this ability as a feature request several times. I feel for you; I walked in at 5:00 AM to 110 end-user computers and all of my servers BSOD'd. We had everything going by 11:00 AM, only having to restore one MSSQL server from the previous night's backup. I have had Windows updates do far worse damage, so as far as kicking CS to the curb: not happening. I still suspect internal sabotage or a revision error.
I work in IT at Intel Ireland. I knew it was bad when higher-ups at the site were waiting for me at my desk when I walked in at 8am. That was a fun 14 hours of overtime.
You know the best news I got was when I turned to the BBC News website (because ABC/NBC/Fox/CNN hadn't woken up yet) and learned that it was a worldwide problem. My first thought was, "Oh, thank God! I didn't cause this!" And from a Dutchman who is part Irish (isn't everyone?), Éirinn go Brách and God bless!
Lol, as a network engineer working in a hospital, hearing this was like I'd asked him to tell my story haha
Ha! You're welcome! Kind of a weird feeling isn't it? God bless!
The medical doctors have a crash cart, and the IT doctors have a crash cart. Nice!
Yeah, seemed appropriate to call it that. God bless!
It's obvious that due diligence was not performed by CrowdStrike in testing that update. I hope you have your legal dept filing an action.
That or corporate sabotage. God bless!
I still find it amazing that a hospital can no longer provide ANY essential services when IT goes down (name your incident here). I worked IT at a SoCal hospital and we had a network outage hit us one time. Same result...they closed the entire hospital down. Amazed how far medically we've advanced and yet everything is held in place by one pin/Jenga piece. You'd think that they have a backup system (backup to how to provide medical services with manual record keeping, not an IT backup) in place in case something serious happened over an extended amount of time.
The problem is records. What you suggest would lead to massive amounts of paper being printed daily and having to update them continuously just in case you ever need them. Furthermore, MRI and Xray machines all run on computers. So do modern dispensaries, autoclaves, lab equipment, ....
Of course they can still provide essential services. There are doctors and nurses to attend to patients. They are doing non-elective surgery. It's just hard to register new patients and do accounting.
This affected the user's desktops. What should the backup plan be for desktops? Our servers were back online within a couple of hours. They'll wait longer in the waiting room than that. God bless!
I appreciate the fact that you were willing to come in and help the situation. I know your title says network admin, and yeah, the network was fine; this was an OS-level issue, i.e., software-based. But you were still willing to drive in and offer whatever help you could. I've been at places where if it's not a certain admin's role and they aren't affected, they SIMPLY will not help or even assist. I give you kudos!
Situations like that it's all hands on deck. God bless!
Bitlocker is a pain in the arse! Network admins should have a global common code for all machines, and plugging in a USB key with it on should be enough. Fast access for just this scenario! Then run a script to delete the offending files.
"global common code" that would defeat the whole purpose of bitlocker.
Well, this network admin doesn't work on desktops, so if such a beast exists it would be mighty handy to have. If you know how to generate that, I'd be interested in knowing. That would have been awesome. God bless!
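For shops that back their BitLocker keys up to Active Directory, the realistic alternative to a "global" key is a per-machine lookup. Below is a hedged sketch using the ldap3 Python library; the DC name, base DN, and service account are placeholders, and reading msFVE-RecoveryPassword requires permissions that are normally delegated only to a few accounts.

```python
# Sketch: look up a machine's BitLocker recovery password(s) stored in AD as
# msFVE-RecoveryInformation child objects of the computer account.
from ldap3 import Server, Connection, SUBTREE

def get_recovery_passwords(computer_name: str) -> list:
    # Placeholders: your DC, a read-only service account, and your base DN.
    server = Server("dc01.example.local")
    conn = Connection(server, user="EXAMPLE\\svc_bitlocker_read",
                      password="change-me", auto_bind=True)

    # Find the computer object first...
    conn.search("DC=example,DC=local",
                f"(&(objectClass=computer)(cn={computer_name}))",
                SUBTREE, attributes=["distinguishedName"])
    if not conn.entries:
        return []
    computer_dn = conn.entries[0].entry_dn

    # ...then read the recovery-information objects stored beneath it.
    conn.search(computer_dn, "(objectClass=msFVE-RecoveryInformation)",
                SUBTREE, attributes=["msFVE-RecoveryPassword"])
    return [str(entry["msFVE-RecoveryPassword"]) for entry in conn.entries]

# Example: print(get_recovery_passwords("ER-WS-01"))
```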
More power to you guys. I truly feel blessed I was laid off a few weeks ago, before this happened. Proud of all the IT folks currently working this mess! I miss my job, but not that much right now. :)
When I finally got home I told my wife if we have one more night like this I'm retiring. Getting too old for this crap! God bless!
Imagine the pressure those crowdstrike engineers must be under.
The pressure to.... release their updates prematurely?
Especially considering they probably won’t have a job at all after everyone and everybody sues crowdstrike into oblivion
Its the engineers who had systems thrown into a bootloop ruining their weekend that are under pressure. Crowdstrike is in damage control.
Clearly not enough to disregard pretesting, roll out on a Thursday night for Friday, and to roll out an update at a mass level instead of in waves. Complete and utter failure of quality control and complete idiocy
Pressure to test their updates instead of releasing them willy nilly.
How much you want to bet nothing will happen to crowdstrike
They will lose a lot of customers next year... So they will suffer
I understand where you're coming from but what happened can't just be swept under the rug. I don't see how Crowdstrike stays in business after this. The lawsuits alone... God bless!
@@SRPH1 Absolutely. God bless!
@SRPH1 They'll give another vendor a chance to brick their devices?
@NetworkAdminLife I agree, they will hurt but will it be from small customers or conglomerates? I have no idea
Why don't they change their company name to "Missile Strike"?
Or Skynet LOL.
Ooh, that's a good idea. We should have a new name contest for Crowdstrike. Winner becomes Internet famous on my channel only! God bless!
@@NetworkAdminLife Hey, btw, it's a memorable day for me too. It was my birthday last Friday.
Call them Clownstrike
We absolutely do! God bless!
The worst part, other than software meant to protect against cyberattacks perpetrating the worst cyberattack in history, is that people died due to the multiple hospital shutdowns that required reroutes.
Yeah, that sucks. I have family that works for a major utility that was affected. Not being able to respond to gas leaks and other shenanigans is also a yikes scenario.
Yeah, that's why we were working so hard to restore hospital systems. I mean, what if it was my wife or child on the way to MY hospital? This was akin to the sheepdog killing all the sheep. God bless!
@@aman4everchanged......129 Agreed. Whole lot of nightmare scenarios that most of us never consider. Shows we should always have a backup plan. God bless!
This is why AI can't take over: we'll always need folks to do the heavy lifting.
The AI can't fix itself if it blue screens. God bless!
the way night shift workers do it is blackout curtains
Black out curtains and a fan/noise machine
@@rmo9808 fair enough I just still use them as day shift so I didn't list them
Fun Friday. I work in IT for a company that owns a lot of fast food style restaurants… thank goodness we have a great team and had all machines back up before lunch. Won’t forget that one lol
That is awesome! Wish we had our back up that fast. God bless!
really glad i quit IT this year in january after 10 years. those days like july 19 will not be missed!!
IT already is aweful on a normal day but days like this are just hell. on top of that my company sucked so hard. a merge with another 3 companies made it just worse. only our boss got something out of it... until 50% quit.
awful
so what job u doing right now?
@@amano22 i work as a welder and mechanic again. I do manage some IT stuff in my new company but it is just easy stuff.
It is just one company with an on premise system i build.
They used to be a customer of mine.
That's awesome! God bless!
08:50 "Hospitals shout down" How many people died?
Don't know. They were diverted to an area hospital that did not use Crowdstrike. God bless!
Am I wrong in my thinking? At every company we have a sysadmin team that manages software patching, so new versions aren't pushed to production until they've been tested on a small test group. If that had been the case here, wouldn't all these issues have been avoided?
We do that for Windows Updates. However, Crowdstrike manages that themselves. They push out updates in real time and there's no way to roll it out in phases. Sucks. God bless!
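The ring-based approach described here for Windows patches boils down to something like the toy sketch below. deploy_to and ring_is_healthy are placeholders for whatever your management tooling actually exposes; nothing here is a CrowdStrike, WSUS, or SCCM API.

```python
# Toy sketch of a staged (ring-based) rollout: pilot group first, broad
# deployment only if the earlier rings stay healthy.
import time

# Placeholder ring definitions; in practice these would be AD groups or
# endpoint-management collections.
RINGS = {
    "pilot": ["test-pc-01", "test-pc-02"],
    "early": ["dept-a-workstations", "dept-b-workstations"],
    "broad": ["everyone-else"],
}

def deploy_to(members, update_id):
    print(f"Deploying {update_id} to {members}")  # placeholder

def ring_is_healthy(members):
    # Placeholder: check crash/BSOD telemetry, helpdesk ticket volume,
    # and agent check-in rates before declaring the ring healthy.
    return True

def staged_rollout(update_id, soak_hours=24):
    for name, members in RINGS.items():
        deploy_to(members, update_id)
        time.sleep(soak_hours * 3600)  # let the ring soak; a real system would schedule this
        if not ring_is_healthy(members):
            print(f"Halting rollout: ring '{name}' looks unhealthy")
            return
    print("Rollout complete")

# staged_rollout("channel-update-291")
```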
It was also my son's birthday on the 19th, and as the security engineer I too pulled a triple with our team. This is a really bad look for all security teams. When I worked for a hospital it often took months to deploy patches to ensure due diligence was maintained before reaching critical systems. CrowdStrike has some serious reputation repair in store.
Well said. Happy birthday to your son! God bless!
I'm not your son, but I turned 42 on July 19th, the CrowdStrike Catastrophe Day. Happy birthday to him.
Funny enough... I am in medical I.T. and yeah.... busy busy day it was.
Well happy birthday to you too! Man, your birthday present sucked! LOL! We're still digging out and hope you've got your operation back up and running. God bless!
Thanks for making this video; this is the first time I'm coming across your channel, and I was really interested to see how this situation was handled. I do have one question, though: I never saw one of these blue screens, but sometimes they show the driver file. Did it show anything related to CrowdStrike? From everything I was hearing from a lot of people, it didn't, and that must have been terrifying. Massive respect to everyone who had to deal with this situation; you guys are the true heroes.
"dave's garage" channel ex microsoft engineer has a great technical description of how it caused the issue
If you review the dump of the error, issue appears to be an object was referenced that doesn't exist.
Glad you got something out of the video. God bless!
This is Soooo classic; if you've been in IT for 20+ years.
There's nothing classic about this. All you did was just tell people you've not been in IT
@@KB-nt7eg This really isn't the first time an antivirus program has screwed up computers. I think that might be what they are referring to.
This is the first time it hasn't been Microsoft. Okay, I think maybe Norton AV screwed up maybe 15-20 years ago too. But usually we're backing out MS patches. God bless!
@@KB-nt7eg It was certainly new to the Crowdstrike CEO. Bro' was about to puke on live TV. God bless!
@@joebleed Yep. It's been a while though. Last one I remember was either Norton or Symantec AV and we had to run around and remove some virus definition files. God bless!
From all the videos I've seen, not a single one has had an easy fix like just running a script to delete the file. This is one for the history books. Glad to see you're up and running.
Microsoft just came out with a patch tool.
I would've written a script. I would've created a little PXE boot setup to boot a small Linux with an NTFS driver, mount the NTFS system partition, delete the file, restart, and voilà. And you can query Entra ID to get the keys from AAD.
I’m a true engineer/developer because I am too lazy to do things by hand.
As already mentioned they did come out with a repair tool. Sort of. You'll still need the Bitlocker key. God bless!
As another tech in support... I was the only one in the company doing the restoration... FRIDAY!!! Not a hard fix... just time-consuming.
Must not have bitlocker enabled. I hope. God bless!
Crowdstrike CEO was probably losing millions in stock by the minute. No wonder he felt like throwing up.
Yeah, I've got to think there is going to be a class action lawsuit or two out of this. I don't see how Crowdstrike survives. God bless!
So, what's the most important lesson that you and everyone else learn from this?
There are multiple lessons here for those that want to learn them. There are also a few that, it would seem, people need to relearn. The primary one here would be that automatic updates are a rolling failure cascade waiting to happen.
I’m confused as to how companies allow updates without testing or phased rollout.
Apparently the way the package was configured bypassed the setting for "don't install the latest update"... at least that's what I've seen elsewhere.
@Agnemons I have over 700 endpoints in eMAR across Canada. We had to disable the Windows Update service by changing the Run As to the disabled .\Guest user. We force them into UWF lockdown mode and stage our updates. The benefit is Windows 10 Enterprise LTSC runs like a toaster at optimum performance throughout the life-cycle of the device. No phone calls since 2016! Each image takes about 30-60 hours to develop, which we do every quarter, and is Level 2 CIS and HIPAA compliant.
I'd like to hear the answers too! God bless!
I hope our county thinks about that. God bless!
When I read the early alerts, it made my head hurt... grateful this wasn't a real hack. Cheers! Thank you for sharing.
We don't know why the update didn't work or why it was sent out without being thoroughly vetted. Could still be a hack. Crowdstrike will probably never tell us. God bless!
I wonder how many companies will sue.
Bring on the popcorn and shots😂😂
But, but, but....! They offered a $10 Uber Eats gift card to all of their affected clients! There, all better now, right? Right? Hello? Anyone? God bless!
Now that things have settled down, have you given any thought into ditching CrowdStrike? Issues such as this must make you question whether it's features outweigh the problems that it has caused.
Not my decision. I doubt the county will ditch it. But I'm sure many others will. God bless!
“Turned on my trusty laptop….blue screened….it does that sometimes”. Enough said!
It dual boots to Linux Mint but at that point I figured I better high-tail it to work anyway. God bless!
Modern-day heroes don't have rainbow hair or lumberjack beards, but they do go to the heat tunnel *ON* Sundays.
We do what we can. God bless!
We were affected too but used it as a learning exercise. Did it suck, yes but we learned something that day.
I learned that I hate Bitlocker. I REALLY wish they had a master key that you could use to decrypt any disk. The NSA probably has that. God bless!
@@NetworkAdminLife We store the BitLocker keys for clients in their password manager, so you can get them using the web client. We do not use it on servers. Servers run 24/7, so BitLocker makes no sense there. And if someone takes hardware from your server room, you've got bigger problems...
My company wouldn't allow me to even take a pic anywhere in the data center.
I've worked in those places as well. God bless!
That was fascinating to listen to, thank you 👍
Glad you enjoyed it! God bless!
Thanks for sharing your experience during this historic meltdown. It was very interesting to hear how a massive organization like a hospital handled this situation.
Glad you enjoyed it. The message of the day was all hands on deck. God bless!
We're a new CS customer...def a day for the books (IT for 25 years).
Then you should know you should not be using it.
@@paulbryan6716 Those decisions were made WAY above my paygrade!
Well, welcome to the family. Therapy sessions are on Thursday nights. Bring cookies. God bless!
Thanks for this fascinating insight into a day in the life of a Network Admin. You definitely had a day of hell there with the CrowdStrike fiasco. I remember going through some Bitlocker hell myself a few years ago. Not fun!
Yeah, it was fun. Still is. We're still doing mop up on the Monday after. God bless!
Love the war story, brother. I was doing a VCF SDDC NSX update Thursday night when this fiasco went down. I thought I had taken everything down doing that, until I saw stuff outside of our SDDC affected and going down in Site24x7.
Oh man, I bet you were pooping bricks for a moment or two! I tell you, I was SO relieved when I heard this was a worldwide problem. I kept thinking we had a virus or something. Had me wondering if I had opened something up on the firewall. God bless!
We had all hands on deck. I work on another non IT engineering team and fixed 80 Servers and pcs - I can't imagine what the rest of our IT team had to do. Wake up call at 4:55am.
It was definitely a terrible, horrible, no-good, very bad day. God bless!
oh look another boomer IT dude. IT WAS YOUR FAULT. Own up to it, and apologize to your business.
We were lucky; it only hit our managed ERP. First call was 3:45 AM Eastern and the PROD system was back online at 9:45 PM Eastern. Seventeen hours, but we are back up. Keep up the good work, brother.
Thank you brother! God bless!
How long will crowdstrike stay in business?
I forget where I heard this and can't find it, but "It takes about 6 weeks for the generic public to forget tragedy."
I think it holds rather true in IT as well. Almost every large enough tech firm has had major screwups, and they're all (for the most part) still here. More often than not, the tech firms that go out of business are the ones who forget to evolve and adapt to technological advancements.
@@AKJordansKids2009 afaik Boeing is still in business
We'll see after the lawyers get done with them. God bless!
Dump CrowdStrike!
Wish I could. God bless!
So glad to be retired! I feel your pain, been there done that many times. So now have a few meetings and make it better to prepare for the next one.
I'm looking forward to the day when I never have to worry about Crowdstrike again. God bless!
What a nightmare! Bravo for getting a handle on the cause and fix so quickly. Exhausting though! Well done. Did you have to produce a Lessons Learned report to management? Best.
Well, management was out there with us. I don't particularly like my CIO but I have a lot of respect for him now. He was out in the trenches with us and put in more hours than I did. I think he's writing that lessons learned document on his own. I'm sure we'll have an after-action meeting. God bless!
this is not a networking problem, send this ticket back to help desk !
I said something very similar. The joke did not go over well. God bless!
Great overview from a real-life network admin. Couldn't be any clearer; you got your point across about how critical it is when your servers go down, especially in hospitals.
No matter how secure and redundant your systems are, there is always a vulnerability that no one has thought of. God bless!
My employer (thousands of MS endpoints) is all hands on deck. All weekend, still ongoing (can you say BitLocker?). Kind of makes me glad I have a broken foot.
Dude, that's all I had to do was break my foot??! Wish I had known! Yeah, I'm still cleaning up the odd server here and there on the Sunday following. Crazy days. God bless!
Thanks for the video - important for historic and educational reasons!
I should at least get a T shirt for this. God bless!
@@NetworkAdminLife For sure! "07/19/24 - I survived the strike"
Being a recently retired Cyber Security Professional, all I can say is that Friday's events show a major computer system vulnerability in the World. I won't go into one of the possible scenarios, but it's not good.
I’ll answer “one” for you. It’s windows!
@@HopelessAutistic No, it isn't Windows. Microsoft had to allow access to the kernel because 3rd-party companies whinged that they didn't have access to write their drivers. This is the consequence of that. CrowdStrike released a dodgy update for Linux a month ago, so no platform is 100% bulletproof.
My mind runs wild with possibilities on how much worse the timing could have been. God bless!
@@emilianoabarca I disagree. Microsoft made a cesspool of operating systems - this case maybe someone else’s. But the underlying issue is windows is a joke and is a fragile operating environment. You can get all literally all you want but you won’t change me. :P
@@HopelessAutistic that’s totally cool. I don’t have the slightest interest in changing your mind.
For me, our security desk called me at 7:30 in the morning because their access control software couldn't connect. I was in the office at 8 and saw half of the PCs in our office building showing a blue screen, around 400 machines. For some strange reason, all 4 laptops in the IT office were fine although they run 24/7, but I could not access much because the DHCP, DNS, and other servers were offline, so I needed to configure static settings just to start figuring out what exactly wasn't working.
We created a huge BitLocker key export file that day and shared it with people in the office and working from home so that everyone could fix their own machine - this was the only scalable option, and we rotated the keys afterwards.
This is one of those, "I remember where I was during the great CrowdStrike fiasco of 2024". We'll all remember. God bless!
So many young people have been complaining and dodging work. This is the time to be the hero! Yeah people always blame IT but if you do the right thing people recognize it over time
We did get some recognition this time around. But not until they realized it wasn't our fault. Our cafeteria sent over free pizzas, so there's that. God bless!
Cheers for the cool story.
Must have been a crazy day for the CIO: fighting in the trenches, having to direct the troops, and dealing with the politics inside the organization.
Yeah, I think he grew up that day. God bless!
Happy Blue Screen Day! 😄
Lucky You!
We have over 5000 servers in our data center and over 1200 Microsoft Azure servers running various Windows Server OSes, WITH BitLocker enabled.
What a headache! 😰😡😫
So you'll get some sleep, what, next month? Hang in there my friend. God bless!
Although my company was unaffected directly, some of our 3rd-party service providers were impacted and we implemented some business continuity plans. Dodged a bullet. We've already set up a start-of-day meeting on Monday to go over continued impacts. We also need to review our change control policy as well as our software inventory to see if we have a similar IT risk that isn't properly described and managed.
Yeah, I don't think anyone saw this coming. Business continuity plans have to now account for "what we don't know". Glad I only have to follow orders. God bless!
If you are an IT person, your job or mission is to repair everything broken or not working + the broken coffee machine as well.
I almost started to say something against that, but then I remembered I've actually driven the coffee machines out for repair multiple times... doh
We actually have a broken coffee machine at work. But the cafeteria is right across the sidewalk so we just go there for coffee. God bless!
Definitely one of the most stressful times of my IT career... Crowdstrike really dropped the ball on this one.
Amen brother. God bless!
We got off CrowdStrike recently so we dodged that shit lol
Good on you for that! God bless!