The Crowdstrike Falcon Apocalypse - Here's how my night went

  • Published Nov 6, 2024

Comments • 355

  • @2GuysTek
    @2GuysTek  3 หลายเดือนก่อน +1

    For all the armchair quarterbacks out there who are interested in how Crowdstrike came to make such a disastrous mistake, they’ve released a preliminary post-incident report. It’s worth the read: www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/

  • @dosmaiz7361
    @dosmaiz7361 3 หลายเดือนก่อน +168

    RIP to all IT people dealing with bitlockered systems.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +18

      @@dosmaiz7361 Best have those recovery keys handy!

    • @xerr0n
      @xerr0n 3 หลายเดือนก่อน +3

      Yeah, I took BitLocker down globally until I could get a better handle on it; after setting up a better active backup system and multi-key storage, I'm ready to bring it back up again.
      I work at a tiny place though, so it's not too bad.

    • @slomotrainwreck
      @slomotrainwreck 3 หลายเดือนก่อน +10

      @@2GuysTek I'm sorry, I have to say this.😆 "Oh! No problem! The BitLocker recovery keys are on a Post It stuck to the bottom of my keyboard"... 🤣🤣🤣

    • @S3ndIt13
      @S3ndIt13 3 หลายเดือนก่อน

      @2GuysTek yeah, they're on an offsite system that is also experiencing this issue. 😂🫠

    • @syte_y
      @syte_y 3 หลายเดือนก่อน +1

      A reference to lawnmower man ❤️

  • @colin2utube
    @colin2utube 3 หลายเดือนก่อน +83

    As a retired IT worker who went through Y2K and is still irritated by all those who said the work we put in to prevent this sort of outcome was a waste of time and resources, I feel pained for those doing the firefighting. I hope they get the recognition they deserve, but I suspect they'll be partly scapegoated for not having (unaffordable) automated backup and recovery systems in place.

    • @kirishima638
      @kirishima638 3 หลายเดือนก่อน +16

      Yes this annoys me too. I hear people saying Y2K was a scam, that nothing happened etc. Yes, because we worked hard to prevent it!
      The people who anticipate and mitigate potential issues such that they never happen NEVER get any credit.

    • @Paul_Wetor
      @Paul_Wetor 3 หลายเดือนก่อน +5

      The alarm had to be sounded so that action was taken. And because action was taken, nothing happened. Which made the problem seem to be much ado about nothing. Prevention happens out of public view. (The old Harry Chapin song "The Rock" is a perfect example).

    • @muhdiversity7409
      @muhdiversity7409 3 หลายเดือนก่อน +3

      So glad I was put in a position where I decided that retirement was better than dealing with more of this crap. I spoke to a former colleague and he said everything is down. This is a company with over 300K employees and the vast majority do not have physical machines because they are all accessed via VDI in massive data centers. Never has being retired felt so good. I hope they had a good strategy for all those bitlocker keys. lmao.
      I used to work with a guy who was forever causing production issues. Because things always failed in his code he was the one invariably fixing the bugs of his making. Unlike y2k this guy was seen as a hero because he was always saving the day. Humans are stupid. Seriously. Especially the management kind.

    • @kirishima638
      @kirishima638 3 หลายเดือนก่อน

      @@muhdiversity7409 this is like people who are lauded for making the most commits on gitlab, when in reality they are fixing hundreds of mistakes.
      Good developers make fewer commits, not more.

    • @mrlawilliamsukwarmachine4904
      @mrlawilliamsukwarmachine4904 3 หลายเดือนก่อน +1

      Maybe this was the Great Reset and we’re in 2000 now. 😝

  • @johnp312
    @johnp312 3 หลายเดือนก่อน +53

    Working for a company that's still recovering from a cyberattack (we switched to CrowdStrike in the wake of that), this was the worst nightmare I could wake up to.

  • @msmithsr01
    @msmithsr01 3 หลายเดือนก่อน +26

    I have to admit, this took me back to my IT days 20 years ago and I literally had chills up and down my back. My heart goes out to all the keyboard warriors who had to go through what they've gone through and those who are still plugging away. There's got to be a better way, and of course that's the same thing we said 20 years ago...

  • @JustWatchMeDoThis
    @JustWatchMeDoThis 3 หลายเดือนก่อน +52

    And suddenly, we went from no cash to a paper sign that says cash only.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +14

      I love this statement. There was a moment, during the *WTF IS GOING ON?!* stage of this whole thing, that it felt like things were going all Mr. Robot reeeal fast.

    • @todd.mitchell
      @todd.mitchell 3 หลายเดือนก่อน +3

      Brilliant observation

    • @realfingertrouble
      @realfingertrouble 3 หลายเดือนก่อน +5

      That actually happened in shops here. ATMs were out and the merchant systems were borked. Cash.
      We really need to keep cash around.

    • @peppigue
      @peppigue 3 หลายเดือนก่อน

      @@realfingertrouble Two or more vertically disjunct payment infrastructures would suffice. Not trivial, but we're talking essential infrastructure here. An analog backup infrastructure would be much more expensive to set up.

  • @lampkinmedia
    @lampkinmedia 3 หลายเดือนก่อน +15

    I appreciate you guys and gals. I work for a small IT company out of Alaska. We were not affected thankfully, but I can't imagine the mass scale of this outage and the tedious manual work that will have to be implemented in order to get all systems back up. I'm in the sales dept; I bring in new clients for the company I work for. I've always had respect for tech. I'm somewhat of a nerd/sales guru and appreciate the work behind the scenes that keeps systems running smooth and safe.

  • @MicroAdventuresCA
    @MicroAdventuresCA 3 หลายเดือนก่อน +50

    We weren’t affected, but in the end it’s just dumb luck that we chose a different EDR product. Lots of lessons learned today, and I’m sure there are going to be a lot of great discussions in the coming weeks. Sleep well! You deserve it.

    • @quantumangel
      @quantumangel 3 หลายเดือนก่อน +3

      Just use Linux.
      Seriously, it's way better and doesn't have this kind of cr*p.

    • @ichangednametoamorecringyo1489
      @ichangednametoamorecringyo1489 3 หลายเดือนก่อน +9

      ​@@quantumangel bro this isn't a Linux > Windows issue.

    • @J-wm4ss
      @J-wm4ss 3 หลายเดือนก่อน +4

      @@quantumangel CrowdStrike broke Linux machines a while back lol

    • @quantumangel
      @quantumangel 3 หลายเดือนก่อน +2

      @@ichangednametoamorecringyo1489 No, this is a "Windows is so bad it broke critical infrastructure" issue.
      It's a "Windows is a terrible OS" issue.
      It's an anything > Windows issue.

    • @quantumangel
      @quantumangel 3 หลายเดือนก่อน +2

      @@J-wm4ss ​ and yet it didn't disrupt half the world like windows did; even though *most servers run Linux.*

  • @gloryhalleluiah
    @gloryhalleluiah 3 หลายเดือนก่อน +1

    It’s a Nightmare, you wouldn’t want it on your worst enemy…

  • @dragraceold2new
    @dragraceold2new 3 หลายเดือนก่อน +1

    Crowdstrike is a proud world economic forum company. The plan for a global cyber attack isn't to have someone attack you but for the main cyber attack prevention company to actually be the point of failure in replacement of a "global cyber attack"

  • @Reformatt
    @Reformatt 3 หลายเดือนก่อน

    Seeing so many machines down all at the same time was definitely the craziest thing I have ever seen in my 25+ years in IT

  • @foxale08
    @foxale08 3 หลายเดือนก่อน +25

    Something like this was inevitable in a world where enterprises don't control/delay updates and leave it to the vendors.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +5

      This isn’t the first time I’ve heard this since the incident. I remember back in the day with Symantec Enterprise that we’d have to ‘bless’ the definitions released. But these days that’s just not how EDR/AV works. It’s part of the promise of these next-gen tools is that you benefit from their quick actions in response to events happening in the security space. And yes, that’s exactly how we got here unfortunately.

    • @Craig-ml8nw
      @Craig-ml8nw 3 หลายเดือนก่อน +7

      I'm pretty sure this was not in a sensor update. This was something CrowdStrike pushed out to everyone. The bulk of my environment is set for N-2 sensor updates, so theoretically I should have avoided this. That didn't happen; we got zapped everywhere except the PCs that were turned off overnight.

    • @shaochic
      @shaochic 3 หลายเดือนก่อน +3

      When I used to run McAfee EPO back in the day, we had a standard 12 hour delay from new DAT released to the first push to our environment. Saved us once on a bad update deleting system files on servers.

  • @TomNimitz
    @TomNimitz 3 หลายเดือนก่อน +4

    The first thing I thought about when I heard about the outage was the line "Skynet IS the virus".

  • @Craig-ml8nw
    @Craig-ml8nw 3 หลายเดือนก่อน +25

    My night/day started at 11:41 PM central time. Finished at 7:30PM tonight

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +2

      @@Craig-ml8nw you earned your rest!

  • @druxpack8531
    @druxpack8531 3 หลายเดือนก่อน +18

    right there with you...worked from 3:30AM to 8:00PM, with another long day tomorrow....

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +3

      Stay strong, sleep well!

    • @simonmasters3295
      @simonmasters3295 3 หลายเดือนก่อน

      Give it up! You worked hard for one day, you didn't have a clue what was happening and you were stressed. And you have the same tomorrow...
      ...welcome to our world.

  • @RobertFoxL
    @RobertFoxL 3 หลายเดือนก่อน +1

    OUCH! 😮 All in the name of "Security" . . . Now we see how fragile our systems really are - and how dependent we are in today's technology! 🤐 Now we also know that major events aren't always caused by bad actors!! 😬 Stay Vigilant and Keep Safe !! 😷 #majorincident #dependencies #securityglitch

  • @freddiecrumb77
    @freddiecrumb77 3 หลายเดือนก่อน +7

    We had Azure VMs and Microsoft suggested doing a disaster recovery process, which includes changing the region, etc. I was determined not to mess with the VMs because they are set up very specifically for the apps we had - basically I was hoping Microsoft would be willing to fix it, since I didn't do anything to break it. 45 minutes later it came back on; I was glad that I waited.

  • @raydall3734
    @raydall3734 3 หลายเดือนก่อน +64

    Yesterday was a big win for CrowdStrike. Finally a virus protection program that disabled the most prolific spyware program on the internet - Microsoft Windows.
    No Linux/Mac products were harmed.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +8

      Ba dum tis! 😂🤣

    • @franknoneofya9585
      @franknoneofya9585 3 หลายเดือนก่อน +4

      No, it didn't happen, but it could have. CrowdStrike has agents for all major OSes. Also, if you have any type of reputable people, that push would never have occurred. No matter what company, nothing gets pushed into your environment without thorough testing.

    • @reaperinsaltbrine5211
      @reaperinsaltbrine5211 3 หลายเดือนก่อน +4

      Old adage: "Windows is not a virus: viruses DO work"

    • @reaperinsaltbrine5211
      @reaperinsaltbrine5211 3 หลายเดือนก่อน

      Not as if Android or IOS would do any narcing on their users......

    • @tibbydudeza
      @tibbydudeza 3 หลายเดือนก่อน +7

      Actually it happened to Linux servers in April as well … did not make the same headlines.

  • @noahz
    @noahz 3 หลายเดือนก่อน +11

    Happy to learn that I'm not the only one who thought of Lawnmower Man 🤓

  • @boringNerd
    @boringNerd 3 หลายเดือนก่อน +5

    Thanks for sharing your experience. I am not working in IT, but I do work in the tech industry. I am lucky my company doesn't use CrowdStrike, and so far in my country only the airport and a few airlines were affected. I am not travelling, so I just went about my day. To everyone involved in fixing this mess, thank you. I try my best to explain to my friends and family what is going on, and I have been emphasizing the situation the sysadmins and IT helpdesk staff are facing, with the occasional F U to CrowdStrike. Everyone in IT should be appreciated more, and I can only hope this can be studied everywhere and something can be done to prevent the same thing from happening again. Remember, it is not just the good guys learning from this outage; I am pretty sure the bad guys, the threat actors, are also learning from this.

  • @Cysecsg
    @Cysecsg 3 หลายเดือนก่อน +3

    Every organisation's IT team finds solace in each other's demise, thinking they are not alone in facing this sh!t lol

  • @shaunerickson2858
    @shaunerickson2858 3 หลายเดือนก่อน +3

    The rather large global company I work for doesn't use CrowdStrike, so, amazingly, I've been able to sit back and eat popcorn, all while saying "there but for the grace of God go I". I'm so glad I dodged this bullet.

  • @Vdiago
    @Vdiago 3 หลายเดือนก่อน +39

    Lets move everything to the cloud!!! Great idea! 😂😂😂

    • @cbdougla
      @cbdougla 3 หลายเดือนก่อน +7

      It's not just the cloud. It was just about every Windows server running Crowdstrike including local, physical machines.

    • @mrman991
      @mrman991 3 หลายเดือนก่อน +4

      While I agree that using one tool for all things is a bad idea, this isn't that kind of situation.
      Mistakes and accidents happen; expecting anything to be 100% all the time isn't realistic.
      How people react and fix the issues is always what matters way, way more.
      "Cloud" has its uses, mostly if you want burst or global infrastructure, or just have money you want to get rid of.
      This impacted on-prem just as much as cloud stuff though.

    • @quantumangel
      @quantumangel 3 หลายเดือนก่อน +3

      Actually, the cloud runs mostly on Linux, which is why it was largely unaffected.

    • @Vdiago
      @Vdiago 3 หลายเดือนก่อน +2

      @@quantumangel I was mainly referring to Office 365. Anyway, you are right. There is no fucking cloud. It's only someone else's Linux server.

  • @jcpflier6703
    @jcpflier6703 3 หลายเดือนก่อน +7

    Fantastic video! I loved how you told the story of how it began at your org. That's exactly how I felt. What I suspect is going to come out of this is that CrowdStrike may have to segment its solution updates by industry. This failure was massive, and even though they will check it countless times going forward, my guess is that companies will come forward and say that this cannot happen again. Just the fact that they test, test, test, and re-test going forward will not be enough. We saw a lot of side-channel attacks coming in the form of phishing. While the event was not malicious, it had somewhat the same net result, and that is unavailability of services. What also got put front and center is all the major companies that use CrowdStrike. I am sure that going forward they will test the hell out of their updates, but segmentation by industry is probably where they will need to go for rolling out global updates, so that things like healthcare or travel are not impacted at the same time in the event of an issue. This will also place more focus on Microsoft Azure. While it's not Microsoft's fault, we will need to start planning for DR scenarios that we once thought were pie-in-the-sky issues. Now those pie-in-the-sky scenarios are real. Very real.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +1

      I appreciate it! We also saw a MASSIVE jump in targeted Crowdstrike phishing emails-don't let any good tragedy go to waste, I suppose. I also agree that there needs to be more opt-in for updates, like the update rings for betas of Windows, for example. It's a tough position to be in because, in the event of aggressive malicious attacks, you wanna get your protection ASAP, and that typically means pushing it as soon as you can. But the reach and the fact that Crowdstrike only services businesses (and big ones at that) means that from a supply chain perspective, it's also an incredibly massive risk, as we've now seen first-hand.

    • @kennethjones8683
      @kennethjones8683 3 หลายเดือนก่อน +3

      CrowdStrike should have eaten their own dog food before releasing this to the public. Play stupid games - win stupid prizes...

  • @TraciD512
    @TraciD512 3 หลายเดือนก่อน +5

    This is an EXCELLENT RECAP
    Thank YOU for explaining it for regular folks.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      So glad you liked it! I appreciate the comment!

  • @robrjones99
    @robrjones99 3 หลายเดือนก่อน +7

    It was a rough day for many, for sure. I work for a university and it impacted a LOT of our production systems. I agree it's the best on the market. I think there may be some diversification in the platforms companies use after this, but I can't see there being any mass defections. Just my less than 2 cents. I'm tired too and I hope the weekend gets much better for everyone. I appreciate your content and have learned a lot.

  • @Banzai431
    @Banzai431 3 หลายเดือนก่อน +8

    We worked into Saturday morning, man. It was brutal. When it was first happening I was like you; I immediately thought it was some kind of attack. Then I went online and saw it was happening everywhere. CrowdStrike's valuation is going to be interesting on Monday when trading starts again. There are still a whole bunch of people on remote laptops that are locked out due to BitLocker. This is going to be such a PITA.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +3

      100% with ya! Not a great way to start a weekend - I hope you get some sleep!

  • @keithstarks1433
    @keithstarks1433 3 หลายเดือนก่อน +4

    This was like listening to some strange dude describe my Friday.

  • @jimward204
    @jimward204 3 หลายเดือนก่อน +1

    As a non-tech person, I knew it looked bad yesterday when I started looking at flight delays. Hats off to you and all of the other IT folks dealing with this nightmare!

  • @shawnadams1965
    @shawnadams1965 3 หลายเดือนก่อน +4

    I feel your pain. Symantec Corporate Antivirus did this to me years ago... I had to boot over 100 computers using a USB stick with a fix on it and I was the only IT person in the company at the time. We got rid of it within the next week and switched to Sophos. "Knock on wood" that never happens again.

  • @OriginalWaltSilva
    @OriginalWaltSilva 3 หลายเดือนก่อน +4

    Great video and thanks for sharing your "horror story". I'm a retired IT guy, and still very much like to keep an ear to the ground on IT matters. When this event happened, I was hoping someone who was in the weeds in handling this event within their org would post their story so I can hear details. Having been through similar (albeit less widespread impactful) events, I can 100% relate. Still morbidly curious, I guess.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +1

      You're on the flip side! You get to enjoy it from the pure curiosity perspective!

  • @Consequator
    @Consequator 3 หลายเดือนก่อน +7

    If you have VMware, you could fire up a VM that has PowerCLI, shut down all the Windows VMs, then loop through every disk image to mount it on the new VM, delete that file, and unmount it again.
    PROBABLY best to use a Linux VM for this, as Linux is far 'nicer' when it comes to mounting and unmounting drives.
    I'm GUESSING other hypervisors have similar tools.
    Hyper-V shops will probably be an even bigger level of screwed, with data loss due to the hypervisors themselves resetting.
    Then there's next-level screwed if BitLocker is in use.
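
    Roughly what that could look like in PowerCLI - a sketch only, with the vCenter address, VM name filters, and the rescue VM all illustrative; the actual mount and delete of C-00000291*.sys still happens inside the Linux rescue guest:

      # Sketch: power off affected Windows VMs and attach each one's OS disk to a Linux rescue VM.
      Connect-VIServer -Server 'vcenter.example.com'

      $rescue    = Get-VM -Name 'linux-rescue-vm'          # helper VM that will mount the guest disks
      $brokenVMs = Get-VM -Name 'WIN-*'                     # adjust the filter to your affected VMs

      foreach ($vm in $brokenVMs) {
          Stop-VM -VM $vm -Confirm:$false                   # disk must be offline to attach it elsewhere
          $osDisk   = Get-HardDisk -VM $vm | Select-Object -First 1
          $attached = New-HardDisk -VM $rescue -DiskPath $osDisk.Filename   # attach the existing VMDK

          # ...on the rescue VM: mount the NTFS volume, delete
          # \Windows\System32\drivers\CrowdStrike\C-00000291*.sys, then unmount...

          Remove-HardDisk -HardDisk $attached -Confirm:$false   # detach from the rescue VM (keeps the VMDK)
          Start-VM -VM $vm
      }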

  • @jeffolson4731
    @jeffolson4731 3 หลายเดือนก่อน +8

    The company I work for was hit hard. To make matters more fun, some of our smaller locations, like mine, don't have any IT personnel. That meant that two of us who know computers had to get all of them working once we were given instructions. On my computer I have local admin rights, so on that computer I was able to boot into Safe Mode and delete the file. On all other computers we either had to PXE boot or go through the Microsoft menus to get to the command prompt.
    We still have 3 computers with BitLocker that we cannot access. No one in the company has the correct access code.
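
    For anyone who hasn't seen it spelled out, the hands-on fix most of us were doing boils down to a single delete from Safe Mode or the recovery environment's command prompt, roughly along these lines (the drive letter can differ inside WinRE, and BitLocker-protected machines need the recovery key first):

      # Sketch: from Safe Mode / WinRE, remove the bad CrowdStrike channel file and reboot.
      Remove-Item -Path 'C:\Windows\System32\drivers\CrowdStrike\C-00000291*.sys' -Force
      Restart-Computer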

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +1

      We had a few workstations in that same boat as well.

  • @DaleCunningham_DBA
    @DaleCunningham_DBA 3 หลายเดือนก่อน +8

    Got my PROD SQL Servers back online within 6 hours! 300 TB Apocalypse!!!

  • @usefulprogrammer9880
    @usefulprogrammer9880 3 หลายเดือนก่อน +5

    To provide greater clarity, the primary issue here is that CrowdStrike Falcon runs at kernel level and has internet-connected privileged binaries so as to provide rolling updates to threat protection offerings. This was and always has been a massive attack vector that's yet to be exploited at a high level. It just so happens to have exposed its vulnerability to the world through incompetence. If Microsoft were half intelligent they'd step in and implement something similar to Apple, who provides a system extension abstraction layer so as to not give direct kernel access. They will not do this, because they are stupid, arrogant, and lazy. Instead, I think most critical organizations should steer clear of CrowdStrike and similar kernel-level applications with internet-dependent binaries, even if that leaves you slightly more vulnerable to other threats.

    • @melindagallegan5093
      @melindagallegan5093 3 หลายเดือนก่อน

      System extension abstraction layer? Is this also a reason why Mac OS is less prone to acquiring viruses?

    • @zemm9003
      @zemm9003 3 หลายเดือนก่อน

      It's even worse because now everybody in the world knows that CS has this vulnerability and all hackers will be looking into hacking Crowdstrike because by doing so they can easily access your PC.

    • @felixkraas2425
      @felixkraas2425 หลายเดือนก่อน +1

      I am responsible for our infrastructure where I work. And I was one of the people with a box of popcorn and a big "told you so" grin on his face when this went down. What you said is exactly the reason we will never have something like Falcon on our systems. Something I had to have lengthy arguments with the boss about.

  • @stevemeier7876
    @stevemeier7876 3 หลายเดือนก่อน +12

    I work for a large company that used to use CrowdStrike... we removed it and replaced it with the Microsoft E5 license security stack when we went to E5 as a cost thing... But we had 40 machines still affected, as even though it was removed, we had had instances of CrowdStrike reinstalling itself... real bizarre... Those 2 servers have been fixed, and the other PCs for users will be fixed when desktop staff come in Monday, as they were at remote sites... what a mess...

    • @xerr0n
      @xerr0n 3 หลายเดือนก่อน +2

      OK, so now I am thinking that CrowdStrike really is more malware at this point; I don't like sticky stuff.

    • @jrr851
      @jrr851 3 หลายเดือนก่อน +2

      I'd audit your GPO. Might be an old policy kicking around reinstalling Falcon.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +1

      @@jrr851 This is exactly what I thought too, or SCCM collection.

  • @SkandiaAUS
    @SkandiaAUS 3 หลายเดือนก่อน

    I'm a contractor to a large Australian retailer and I have to say they have their shit together. They had thousands of POS systems back up over the weekend and then hundreds of support laptops by mid-morning today. It came down to them being easily able to report on the BitLocker keys and admin passwords needed to gain access to the CS folder.
    I have a newfound respect for disaster preparedness and recovery.
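
    For shops that escrowed keys to Active Directory, that kind of BitLocker report is only a few lines of PowerShell. A minimal sketch, assuming the RSAT ActiveDirectory module is available and recovery passwords were actually backed up to AD; the OU path and output file are illustrative:

      # Sketch: export BitLocker recovery passwords that were escrowed to Active Directory.
      Import-Module ActiveDirectory

      # OU and output path below are illustrative - adjust for your domain.
      $report = foreach ($computer in (Get-ADComputer -Filter * -SearchBase 'OU=Workstations,DC=corp,DC=example')) {
          # Recovery info lives in child msFVE-RecoveryInformation objects under each computer object.
          $keys = Get-ADObject -SearchBase $computer.DistinguishedName `
                               -Filter 'objectClass -eq "msFVE-RecoveryInformation"' `
                               -Properties 'msFVE-RecoveryPassword'
          foreach ($key in $keys) {
              [PSCustomObject]@{
                  Computer         = $computer.Name
                  RecoveryPassword = $key.'msFVE-RecoveryPassword'
              }
          }
      }
      $report | Export-Csv 'C:\Temp\bitlocker-keys.csv' -NoTypeInformation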

  • @sinisterpisces
    @sinisterpisces 3 หลายเดือนก่อน +1

    Thank you for taking the time to put this together when you're clearly so exhausted. This is exactly the kind of content we need after an event like this, from real professionals who were there. There are too many "lol why didn't they just ..." Reddit Keyboard Commandos running around making me wish I had a way to smack people through my screen. Respect.
    One thing I still don't understand is how this actually got out of Crowdstrike's development systems (?) and into their production deployment systems. I realize that one of the big benefits of CS is its rapid response nature, I think backed in part by ML (?), but surely there should have been some sort of internal unit test package or an equivalent to catch a killer update before it was pushed?
    High speed automation with no QC is an invitation to exactly this sort of disaster.
    Given how many medical/hospital systems were impacted, including in some locations EMS helicopter service, CS will be lucky if they don't get sued for contributing to someone's death or injury.

  • @timwhite264
    @timwhite264 3 หลายเดือนก่อน

    Great recap of your very real world experiences! Thank you for sharing sir!

  • @bbay3rs
    @bbay3rs 3 หลายเดือนก่อน

    That sucks, sorry to hear. I'm glad I didn't have to deal with it, but I've been through a malware event and know the feeling. Hang in there!

  • @SuburbanBBQ
    @SuburbanBBQ 3 หลายเดือนก่อน +2

    My heart goes out to everyone affected. Way to hang in there, Rich! We were not impacted, but know quite a few shops here in Houston that were.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +2

      We live in unprecedented times my friend!

  • @maryjohansson3627
    @maryjohansson3627 3 หลายเดือนก่อน

    Very clear and concise. Thanks so much for the video explaining what has to happen to delete the file. What a task to have to touch every machine!

  • @mrman991
    @mrman991 3 หลายเดือนก่อน +6

    Sure was a day for us.
    Luckily, by the time we got to the office the fix was known.
    It could have been a hell of a lot worse.
    Our DCs were good; a few servers went down but came back quickly.
    Then we trawled through all the desktops over an 8-hour stint.
    As IT, all we got from the business was praise, because we had constant comms and updates running throughout the day and offloaded a bunch of the reporting responsibilities to the team leads rather than having all users make lots of noise reporting.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +2

      We had great support from our leadership as well - of course, it wasn't our fault either, which means blaming IT wouldn't have done anyone much good. 😂

  • @pauldunecat
    @pauldunecat 3 หลายเดือนก่อน +3

    Best vid so far on this topic. We had a long day as well, I got called at 3am my local time, got to take a nap at 5pm.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +2

      I appreciate it. There are a lot of videos out about this on YouTube, and it's pretty obvious that many of the people talking about it don't have first-hand experience actually working it. Get some rest, you earned it!

  • @lak1294
    @lak1294 3 หลายเดือนก่อน +1

    Really great boots on the ground, "this is how it went (down)" account. Surreal. Thank you!

  • @GaryL3803
    @GaryL3803 3 หลายเดือนก่อน +4

    I understand from reliable sources, IT techs that understand servers, laptops and such, that the reason only Microsoft systems were affected is that Falcon must communicate with the Windows kernel, the lowest level of the operating system. Corruption in the kernel causes failure of the whole OS, which must then be restored in safe mode. Linux and macOS operating systems do not allow applications to communicate directly with the kernel in the same way and so cannot be corrupted by a wayward application. Only the wayward application will fail on Linux and macOS, and it could easily be rolled back to the original application.
    macOS was rewritten some years ago to a Linux/Unix-type architecture. I wonder if there is any kernel protection that will be implemented in the Microsoft OS anytime soon.

    • @zemm9003
      @zemm9003 3 หลายเดือนก่อน

      Windows provides awesome backdoors for the NSA/CIA to exploit and I believe this is the reason it was never changed.

    • @GaryL3803
      @GaryL3803 3 หลายเดือนก่อน

      @@zemm9003 You neglected to mention the FBI, DIA, DHS and ??? 🙂

    • @zemm9003
      @zemm9003 3 หลายเดือนก่อน

      @@GaryL3803 maybe. But I wouldn't know about that and mentioning them would be controversial. Therefore I mentioned only agencies that are well known to have exploited Windows' vulnerabilities in the past.

  • @edwinrosales6322
    @edwinrosales6322 3 หลายเดือนก่อน

    Thanks for sharing your experience; hope you can get some sleep!

  • @markusdiersbock4573
    @markusdiersbock4573 3 หลายเดือนก่อน +1

    Really, the World-Wide Fail wasn't caused by the bad update.
    The FAIL was in rolling out the update to EVERYONE at once -- a blitz. Had they done a canary deployment and slow roll, the problems would have been tiny.
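
    As an illustration of the idea (not CrowdStrike's actual pipeline), a staged rollout loop might look something like this; Get-RingHosts, Push-ContentUpdate, Test-RingHealthy, and Invoke-RingRollback are hypothetical stand-ins for real deployment plumbing:

      # Sketch: canary / ring-based rollout instead of pushing to everyone at once.
      # All four helper functions below are hypothetical placeholders.
      $update = 'channel-file-291'                         # illustrative package name
      $rings  = @('canary', 'ring1-1pct', 'ring2-10pct', 'ring3-everyone')

      foreach ($ring in $rings) {
          $targets = Get-RingHosts -Ring $ring             # resolve the hosts in this ring
          Push-ContentUpdate -Targets $targets -Package $update
          Start-Sleep -Seconds (2 * 3600)                  # soak time before widening the blast radius
          if (-not (Test-RingHealthy -Ring $ring)) {       # e.g. crash/boot-loop telemetry check
              Invoke-RingRollback -Ring $ring -Package $update
              throw "Rollout halted at ring '$ring'"
          }
      }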

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      100% I suspect there will be some significant changes to their deployment process for updates moving forward.

  • @Bookcity300
    @Bookcity300 3 หลายเดือนก่อน

    Good job -first person POV. Thanks!

  • @heliozone
    @heliozone 3 หลายเดือนก่อน

    Automated updates are a time bomb waiting to explode. Good luck, because you will need it!

  • @shlomomarkman6374
    @shlomomarkman6374 3 หลายเดือนก่อน +1

    I'm an employee at a small company and had this crap on Friday morning. Our IT guy was abroad on vacation and he got the choice between finding the next flight or sending each employee the admin password by private e-mail. He chose to give us the passwords and change them when he comes back.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      That's a way to do it. Desperate times call for desperate measures I suppose! :-)

  • @gold_7944
    @gold_7944 3 หลายเดือนก่อน +1

    The ITaaS company I work for fielded 140 calls and had 4 simultaneous war rooms, one of which wrapped up at about 1 AM this morning.

  • @kirishima638
    @kirishima638 3 หลายเดือนก่อน +2

    This was the result of Crowdstrike’s incompetence but Microsoft is also to blame for not adequately protecting their OS at the kernel level.
    No 3rd party software, no matter the source, should be able to root-kit an OS like this, requiring manual deletion of files like it’s the 1990s.
    This kind of failure is not possible on MacOS for example.

  • @nathangarvey797
    @nathangarvey797 3 หลายเดือนก่อน +2

    I was lucky that my org is currently only using crowdstrike for a few non-critical systems. I feel for you, and hope you get some recovery rest
    It will be interesting to see what lessons the industry (and world in general) take from this.

  • @jacksonrodrigobraga3942
    @jacksonrodrigobraga3942 3 หลายเดือนก่อน +3

    4k VMs, 3,5k notebooks, 4 gallons of coffee, 3 pizzas aaaaaannnddddddd counting........... (oh.. 2 keyboards too, RIP F8 key 😞)

    • @reaperinsaltbrine5211
      @reaperinsaltbrine5211 3 หลายเดือนก่อน

      Why don't you use one of them old HP keyboards? Those last for eternity and a day :D

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      I was on a steady diet of Redbull myself! Best of luck and I hope you get some sleep soon!

  • @ChriaraCass
    @ChriaraCass 3 หลายเดือนก่อน

    Really relate to your story. I'm in Australia so it hit our IT team at Friday 3pm. I'm in development (not ops) and my Surface crashed, then my colleagues' did. I reported it to my boss, then walked away while it rebooted. In the bathroom a woman from a totally different company sharing the building with us told me "all our PCs crashed so we're leaving for early drinks at the pub 😂". I was so relieved! If I'd seen the SQL servers drop first before my laptop, that wouldn't have been a fun moment to live in at all.

  • @linh8997
    @linh8997 3 หลายเดือนก่อน +4

    Lol. I am just a retired low-level PC tech. When I got out of bed yesterday I heard the words 'bad update' on TV, so I ran in and quickly ran rstrui (System Restore) on my own desktop. But apparently I was not even affected anyhow. But the machines at the hospital where I work as a housekeeper are, and they are all bitlockered VMs. And they have a tiny IT department with inexperienced techs. That hands-on solution is not too difficult, as long as you can get the key. Should I offer them my services on a per-system basis? More fun than cleaning toilets! 😅
    (Ouch. Poor you!!)

    • @mikedee8876
      @mikedee8876 3 หลายเดือนก่อน +1

      I am also retired IT, currently cleaning stalls at a nearby dude ranch... nothing could drag me away from this cushy, no-brain, paid-exercise job in paradise... to return to a workplace of pure stress... Don't go back, is my suggestion.

  • @CitiesTurnedToDust
    @CitiesTurnedToDust 3 หลายเดือนก่อน +1

    FYI, if you were creating new VMs and attaching disks, you could have scripted that part. Think about using PowerCLI next time.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      Agreed. There are now numerous ways to recover from the situation, but at the time this was the guidance directly from Crowdstrike.

  • @Vfnumbers
    @Vfnumbers 3 หลายเดือนก่อน

    Worked through from 11pm to 10am.
    Was right in the middle of it as all our monitors, VMs, PCs, and DBs all BSOD'd.

  • @tma2001
    @tma2001 3 หลายเดือนก่อน +4

    The more I read about it the more insane it becomes - it wasn't a simple bug but literally a driver file of all zeros! How the hell wasn't it hash-checked by the client update process before installation and requesting a reboot? Or how the hell did ClownStrike's update push server not also do basic sanity checks before sending it to _every_ customer at the same time? /facepalm
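
    That kind of pre-install sanity check is genuinely cheap. A sketch, assuming the vendor shipped an expected SHA-256 alongside the content file; the staging paths and manifest format are made up for illustration:

      # Sketch: refuse to install a staged content update that fails a hash check or is all zeroes.
      $file         = 'C:\ProgramData\Staging\channel-update.sys'          # illustrative staging path
      $expectedHash = (Get-Content 'C:\ProgramData\Staging\manifest.sha256' -TotalCount 1).Trim()

      $actualHash = (Get-FileHash -Path $file -Algorithm SHA256).Hash
      $bytes      = [System.IO.File]::ReadAllBytes($file)
      $allZero    = -not ($bytes | Where-Object { $_ -ne 0 })

      if ($allZero -or ($actualHash -ne $expectedHash)) {
          Write-Error 'Update failed validation (null content or hash mismatch); not installing.'
          return
      }
      # ...only now hand the file to the installer and request the reboot...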

    • @mikedee8876
      @mikedee8876 3 หลายเดือนก่อน +1

      to an X86 Machine coder....00 is a NoOp......NoOp means no operation, and the hardware processor goes to the next instruction, which would also be a NoOp and so on until it hits code that does something. NoOps are often used in software timers where the processor loops on NoOps for a certain count, such as a millisecond delay to de-bounce a keyboard.......a processor looping on NoOps will do so forever at full speed until the power goes off.......NoOP is built into the hardware, and I assume, still cannot be changed.................not sure if this helps

    • @tma2001
      @tma2001 3 หลายเดือนก่อน

      @@mikedee8876 On x86, opcode 0x00 is an ADD instruction; NOP is 0x90.
      According to the CrowdStrike blog technical update, the .sys file is not a kernel driver but a data/config file. What little detail they give is not very clear as to the actual logic error that caused the crash (others have mentioned file-parsing logic that resulted in a null-pointer memory reference in the crash dump log).

  • @dirkfromhein
    @dirkfromhein 3 หลายเดือนก่อน +3

    The only big question is why drivers for Windows are not signed and then validated by the OS before being installed/loaded? That just seems so obvious. You would think there would also be a transaction log for the most recently installed "system" stuff, and if more than three reboots are triggered it would roll back the install log and quarantine the most recently installed items. That's how I've designed patching systems in the past for mission-critical infrastructure software systems. I had no idea this many people still used Windows! I've not touched Windows for 25 years.
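
    On the signing point, Windows can already report whether a binary carries a valid Authenticode signature; a minimal sketch of gating an install on that check (paths are illustrative, and the install log is just the hypothetical bookkeeping described above, not an existing Windows mechanism):

      # Sketch: check the Authenticode signature before accepting a kernel-adjacent component.
      $candidate = 'C:\ProgramData\Staging\new-sensor-component.sys'   # illustrative path
      $sig = Get-AuthenticodeSignature -FilePath $candidate

      if ($sig.Status -ne 'Valid') {
          Write-Error "Refusing to install: signature status is '$($sig.Status)'"
          return
      }

      # Hypothetical install log that a boot-failure watchdog could use to roll back
      # and quarantine whatever was installed most recently.
      Add-Content -Path 'C:\ProgramData\Staging\install-log.txt' `
                  -Value "$(Get-Date -Format o)  INSTALLED  $candidate"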

    • @reaperinsaltbrine5211
      @reaperinsaltbrine5211 3 หลายเดือนก่อน +3

      Agreed on the patch and update part. Even though this is not specific to Windows, I believe. As for why so many use it... well, there is a huge swath of programs that only run on DOS/Windows. I've seen DOS software still in production that was written over 20 years ago. We still have XP and W7 machines because they are connected to costly equipment whose drivers don't work with anything newer :(. Also, most of the consulting and law firm people have tons of Excel files so filled with custom macros that switching is pretty much a nonstarter :/ This amount of technical debt will bite us in the bottom sooner rather than later.

    • @dirkfromhein
      @dirkfromhein 3 หลายเดือนก่อน +1

      ​@@reaperinsaltbrine5211 Oh I have the same actually - I have two XP machines (not networked) that run the software to control my SAAB and another one to connect to my 350Z. 😆 I'm pretty sure the SAAB software is not going to be updated.

  • @joelrobert4053
    @joelrobert4053 3 หลายเดือนก่อน +2

    Yesterday was like a digital nuclear bomb going off

  • @mwatson536
    @mwatson536 3 หลายเดือนก่อน +1

    Did not have issues at my site since we don't use Falcon, but my son's company did. I did not know that Azure doesn't have a proper console - that sucks - and I was just toying with looking into it more. Whew, I dodged a bullet there. This reminds me of the MS service pack nightmare, which I did have to pull long days for. Good luck, folks. I hope you all get some sleep soon. Cheers.

  • @mathesonstep
    @mathesonstep 3 หลายเดือนก่อน +1

    I wonder how long this will take to fully recover from. Think about all the laptops that will probably have to be shipped back to the office, and the kiosks that are hard to gain physical access to.

  • @stevengill1736
    @stevengill1736 3 หลายเดือนก่อน +1

    Wow, you got a front row seat....to me the fact that this was even possible indicates how fragile things are. What do you think we can do to prevent such a catastrophe in the future?

  • @joelrobert4053
    @joelrobert4053 3 หลายเดือนก่อน +1

    I’m a cybersecurity analyst and my company was on high alert yesterday as we assumed initially this was a cyberattack

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +1

      Maaan, for a while we all were! I was about to call together our IRT!

  • @richt6353
    @richt6353 3 หลายเดือนก่อน +1

    Thank You for this EXCELLENT ANALYSIS!

  • @DawnofFab
    @DawnofFab 3 หลายเดือนก่อน +3

    Co. I work for still has over 12k users to bring back online. It's been a nightmare going on 48 hrs now

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      I feel for you! Here's hoping for a speedy recovery!

  • @SkaterJay789
    @SkaterJay789 3 หลายเดือนก่อน

    Man this is my NIGHTMARE, we dodged this being on S1 instead but the description of figuring it out. Nightmare fuel

  • @enviropediaxr6007
    @enviropediaxr6007 3 หลายเดือนก่อน

    Loved hearing this story! Unbelievable this could not be handled without direct IT people intervening. What are you suggesting for VMs for startups that want to colocate who do not want to get locked into the Broadcom ecosystem?

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      Tough question to answer in a simple paragraph, but the short answer I would say is this - if you're a startup and you have the freedom to choose your deployments and want to take advantage of the 'cloud', I'd recommend you look for services that are cloud-first and not necessarily running on fat VMs in someone else's datacenter. But if you're looking to run VMs in a colo or in a hyperscaler like Azure, looking for providers that can offer you console access to your VMs is a huge benefit (something MS doesn't do.)

  • @MrZhods
    @MrZhods 3 หลายเดือนก่อน +4

    My question is, having worked as an update/patching admin: why would you push an update on a Thursday/Friday? Best practice is Patch Tuesday, for those who know 🤔

    • @reaperinsaltbrine5211
      @reaperinsaltbrine5211 3 หลายเดือนก่อน +2

      The usual answer for that, if you followed the Reddit comment stream or here, would be the usual "because EDR/AV/DLP/whatever is different" and "zero days" etc. My hunch is that they may have realized they had a vuln in their kernel driver and wanted it gone before clients got wind of it. Or they might have had a supply chain attack...

    • @MrZhods
      @MrZhods 3 หลายเดือนก่อน +1

      @reaperinsaltbrine5211 I assume it was something that would have been catastrophic if it hadn't been addressed immediately before any vulnerability was published but I stand by my no patch Fridays from personal experience

    • @reaperinsaltbrine5211
      @reaperinsaltbrine5211 3 หลายเดือนก่อน +1

      @@MrZhods Agreed. And it is still just a hunch.....

  • @TITAN2K
    @TITAN2K 3 หลายเดือนก่อน +1

    Man, I feel your situation and 100% feel for you guys. Luckily this time I'm sitting pretty with S1 today.

  • @ernstlemm9379
    @ernstlemm9379 3 หลายเดือนก่อน

    Great video.
    The life of an IT guy.
    Also, like almost every IT person, no self-reflection on something like "system design" and fallback scenarios.

  • @jonboy2950
    @jonboy2950 3 หลายเดือนก่อน +1

    Not to mention the people with bitlocker and no access to their keys.

  • @softwarephil1709
    @softwarephil1709 3 หลายเดือนก่อน +1

    Is your company going to continue to use CrowdStrike?

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      Short answer is yes. Crowdstrike is still hands-down the best EDR software on the market today. And while this screw up will go down in the books as (hopefully) a once-in-a-lifetime event, to make the knee-jerk reaction to pull it for a worse product is short-sighted. I think you'll find that most companies impacted will likely stay with CSF.

  • @sergheiadrian
    @sergheiadrian 3 หลายเดือนก่อน +1

    Will you guys continue using CrowdStrike Falcon software? Also, CrowdStrike suggested a few reboots (up to 15!) may give their software the chance to pull the latest update and fix the problem. Did that work in practice?

    • @jamespong6588
      @jamespong6588 3 หลายเดือนก่อน +1

      Technically I don't think you can really even truly uninstall it without reformatting the hard drive

    • @selvestre
      @selvestre 3 หลายเดือนก่อน

      @@jamespong6588 You get an uninstall key from infosec/it or whoever administers CS. I think it's specific to each host. When you uninstall the software, it will ask for that key. I've uninstalled/installed many times. In my environment it interferes with product testing.

  • @LaskoviyMayOfficialChannel
    @LaskoviyMayOfficialChannel 3 หลายเดือนก่อน

    Respect

  • @petermatthews-ob1dg
    @petermatthews-ob1dg 3 หลายเดือนก่อน

    so one man, on his own, stopped the world, we are doomed!

  • @Zekrom569
    @Zekrom569 3 หลายเดือนก่อน

    RIP, in some airports they did go back to giving handwritten boarding passes

  • @Juhsentuh
    @Juhsentuh 3 หลายเดือนก่อน

    I’m an IT System Analyst, crazy how much we rely on computers. Yesterday was insane. By the way in healthcare.. FUN

  • @adamcadd
    @adamcadd 3 หลายเดือนก่อน

    Years ago, I worked on a support desk and watched a whole bank screech to a halt after Symantec was rolled out. As soon as it loaded the first time, it ran a full scan, pegging the I/O and CPUs and making devices unusable for hours; if someone restarted, it started over.

  • @cantbuyrespect
    @cantbuyrespect 3 หลายเดือนก่อน

    I work in IT for a large company. I personally repaired about 50 of these via the phone to the users in the past two days. It is a mess for sure.

  • @pikeyMcBarkin
    @pikeyMcBarkin 3 หลายเดือนก่อน

    Worst thing that happened to me at my job was UKG being down for ~9 hours. No one could clock in/out, etc.

  • @Seandotcom
    @Seandotcom 3 หลายเดือนก่อน

    I just spent an hour (50 min on hold, they are SLAMMED) on the phone with my IT guy to get the bitlocker key for my workstation. I told him he’s fighting the good fight and to get some sleep.

  • @LaziestTechinCyberSec
    @LaziestTechinCyberSec 3 หลายเดือนก่อน

    So I wonder what happens to the executive who chose Crowdstrike in each company?
    And more generally, how do you actually select an EDR/EPP/XDR?

  • @RonaldBartels
    @RonaldBartels 3 หลายเดือนก่อน

    Great video. Did any one post a packet capture of what was happening on the wire?

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      Not that I'm aware of. The situation was well documented and communicated by CS, so there was little question about what was happening. I will say however that there still is a bit of ambiguity as to why some systems were able to self-recover and others weren't.

  • @markbeck2236
    @markbeck2236 3 หลายเดือนก่อน +1

    Sorry that you got affected. Sounds like a hell of a thing.

  • @JustWatchMeDoThis
    @JustWatchMeDoThis 3 หลายเดือนก่อน +2

    I do wonder why huge companies are not diversified more. It is crazy that this can happen all over the world and shut so much down. Even CrowdStrike should be fully diversified, I would think, so that under no circumstances does everything go out to all streams on the same day. I am not in IT so I don't know the terminology, but I think you know what I mean.
    Even in this case it seems redundancy would not have helped, because it was a single bad file that went to every computer. But if, out of 5000 computers, they had 5 different services or versions of a service that are getting a different set of downloads on each stream in a single day, this would have only shut down a fifth of them, leaving only a fifth of them disabled for a bit and 80% still operational and able to find the problem before the world comes to a halt.
    I feel like this incident has opened floodgates, and we can and likely will see maliciousness on this scale or worse sooner rather than later.
    What are your thoughts?

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน +2

      You're right to question their approach. At a minimum, you'd have expected a phased approach to pushing patches, but clearly that's not how CS was operating. A bigger question for me is how did this get past QA? This patch took down almost 100% of the clients who received it, with the exception of older Windows 7/Server 2008 R2. With a failure rate that high, it should have gotten caught immediately, and the fact that it wasn't says more about how they test and validate than anything.
      Someone in another comment also mentioned they were surprised that CS doesn't break down update deployments by industry, which is a good idea. Maybe businesses in critical sectors, like hospitals, don't get cutting-edge update releases the instant they're available?

  • @erichunsberger869
    @erichunsberger869 3 หลายเดือนก่อน

    For Azure VMs hopefully you have snapshots enabled. Restore a snapshot of the OS disk prior to the event, then do an OS disk swap. Boot the machine.
    Worked most of the time, once in a while NLA complained, but restoring a different snapshot typically worked.
    We have separate data disks for our apps so data loss wasn't a concern
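
    For anyone scripting the same recovery, a rough Az PowerShell sketch of that snapshot-to-new-disk-to-OS-disk-swap flow (resource group, VM, and snapshot names are illustrative; disk SKU and zone details are omitted):

      # Sketch: swap an Azure VM's OS disk for one rebuilt from a pre-incident snapshot.
      $rg   = 'prod-rg'
      $vm   = Get-AzVM -ResourceGroupName $rg -Name 'app-vm-01'
      $snap = Get-AzSnapshot -ResourceGroupName $rg -SnapshotName 'app-vm-01-osdisk-pre-incident'

      # Build a new managed disk from the snapshot.
      $diskConfig = New-AzDiskConfig -Location $snap.Location -SourceResourceId $snap.Id -CreateOption Copy
      $newDisk    = New-AzDisk -ResourceGroupName $rg -DiskName 'app-vm-01-osdisk-restored' -Disk $diskConfig

      # Deallocate, swap the OS disk, push the change, and boot.
      Stop-AzVM -ResourceGroupName $rg -Name $vm.Name -Force
      Set-AzVMOSDisk -VM $vm -ManagedDiskId $newDisk.Id -Name $newDisk.Name | Out-Null
      Update-AzVM -ResourceGroupName $rg -VM $vm
      Start-AzVM -ResourceGroupName $rg -Name $vm.Name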

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      I'm glad you found a faster way to restore services!

  • @jrr851
    @jrr851 3 หลายเดือนก่อน +1

    Thanks for this. My MSP uses a competitor for our MDR, but my previous employer wasn't that lucky. Nuts.

  • @Quizidomo
    @Quizidomo 3 หลายเดือนก่อน +1

    Did the Azure storage outage cause Crowdstrike processes to download a null file stub meaning the mass BSOD outage was triggered by Microsoft but caused by Crowdstrike not parsing input files safely?

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      They were unrelated.

    • @Quizidomo
      @Quizidomo 3 หลายเดือนก่อน

      @@2GuysTek Who said though?

  • @lazarusblackwell6988
    @lazarusblackwell6988 3 หลายเดือนก่อน +1

    People: "waaah waaah I can't browse memes because of this!!"
    People in hospitals on a death bed: ...... - - ........

  • @RJARRRPCGP
    @RJARRRPCGP 3 หลายเดือนก่อน

    2:43 was the CrowdStrike driver? That stop code otherwise, usually means loss of RAM stability, such as badly-seated RAM, or dirty RAM slots.

  • @scooterfoxx
    @scooterfoxx 3 หลายเดือนก่อน

    So does this affect only people with CrowdStrike installed? I'm not seeing any full actual details of how it affects you entirely, what it is that causes you to get it, etc. Is it CrowdStrike that you download on your PC? Or is it Windows Defender? What exactly? Since I can only pause Windows updates for so long.

  • @BalfourDeclaration1917
    @BalfourDeclaration1917 3 หลายเดือนก่อน +3

    The Chinese have been developing and deploying software for their own mission-critical systems for years now. When Huawei was banned from certain Western tech, they decided to write their own mobile OS (now in beta testing and will be stable by next year). Next, they will produce their own PC chips, server hardware, and storage systems. The aim of China is to be fully independent from a lot of foreign tech and avoid trouble like this.
    Also, the fact that US big tech companies get a free pass and a slap on the wrist after causing data theft due to incompetence in cybersecurity (e.g the Cambridge Analytica scandal involving Facebook), and now, global outages which cost precious time and billions of dollars, while the foreign ones are accused of data collection, user tracking, espionage, and election interference is pure hypocrisy.
    These are just my observations.

  • @robertbeyers2825
    @robertbeyers2825 3 หลายเดือนก่อน

    Infosec pro here. 100%. My first thought after watching colleagues respond (we were not impacted) was: OMG, this is it, here is the ransomware event we all fear :[[[[

  • @benizraadacudao3020
    @benizraadacudao3020 3 หลายเดือนก่อน

    Thank you for sharing your experience sir.😊

  • @bobsidog
    @bobsidog 3 หลายเดือนก่อน

    I live in Austin TX and never heard of this company whoaaaa

  • @davidc5027
    @davidc5027 3 หลายเดือนก่อน

    Some companies are working with machines directly connected to the network, and during the reboot those machines will have 10 to 12 seconds during which you can push a UNC update to delete the .sys file. Not surefire by any means, but some are finding some success doing this. It seems to work best for machines that are just boot-looping and are directly connected to the network. Edit - You may be able to leverage the CrowdStrike dashboard, which will say what machines in your environment were affected. This is also helping enterprises identify which machines went down and have not come back up. Keep in mind "enterprise" means thousands of machines, some of which can be scattered in various locations and even different countries. What's my take? There are advantages and disadvantages to running in kernel mode. The big advantage is visibility. We now know the possible disadvantage, but what happened to the days when three failed boots and the system automatically goes into Safe Mode? Microsoft has some explaining to do as well.
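
    A rough sketch of that race-the-boot-loop trick: repeatedly hitting the C$ admin share until the machine is briefly reachable and the channel file can be deleted. The hostname and timing are illustrative, and it assumes admin rights and a reachable admin share on the target:

      # Sketch: keep trying to delete the bad channel file over the C$ admin share
      # while an affected machine cycles through its short network-up window.
      $target = 'PC-FRONTDESK-01'                          # illustrative hostname
      $path   = "\\$target\c`$\Windows\System32\drivers\CrowdStrike\C-00000291*.sys"

      while ($true) {
          try {
              if (Test-Path -Path $path) {
                  Remove-Item -Path $path -Force -ErrorAction Stop
                  Write-Host "Deleted channel file on $target - let it finish booting."
                  break
              }
          } catch {
              # Host not reachable yet, or the file is locked; keep trying.
          }
          Start-Sleep -Milliseconds 500
      }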

  • @jimmichael3857
    @jimmichael3857 3 หลายเดือนก่อน +9

    Thanks for sharing your story. We don’t use CrowdStrike, but I 100% relate to your gut punch moment when you initially thought “this is it.”
    I’ve had that pit-of-the-stomach feeling upon seeing weird traffic in my own network several times over the last few years (so far we’re unscathed) and it truly sucks.
    I’m 30 years into my IT career, and the scourge of ransomware has turned what was once a fun job into an almost constant state of dread. IT is now almost 100% defense, and I hate that.

    • @2GuysTek
      @2GuysTek  3 หลายเดือนก่อน

      This! It's a constant shadow we live under. Mainly because we all know it's a matter of time, regardless of how on-point your cybersecurity game is.