The Internet was supposed to be a Web where the failure of one thing wouldn't cause catastrophe, but now, ironically, it relies very heavily on a mere handful of major companies just to not implode.
The difference between making a backup and insurance is that making a backup is fast, cheap, and easy. However, insurance is designed to milk every penny out of you, provide generational debt the moment disaster strikes, and do nothing to actually help you in your time of need. Insurance is literally a scam and companies know it. That's why they send ludicrously sized fees for medical to insurance providers so they can make a bit extra on each service.
I figured this out quite a few years ago and never used any type of personal insurance again. The money I save over time just goes into an index fund which keeps growing in size.
The funny part will be when Google points them to the End User License Agreement they clicked on that says Google is not liable for any damages from this kind of thing. But people are still in love with "the cloud" for some reason.
You'd be surprised how much better the EULAs that companies big enough to have their own legal departments can negotiate compared to the boilerplate take-it-or-leave-it agreements "regular" businesses and individual consumers have to agree to.
I find it amazing that this was 100% Google's fault. If, for example, UniSuper had set it up incorrectly, blame could still be attributed to Google for having weird deletion priorities and being unable to restore deleted cloud environments. But seeing as this was an internal tool, there was no "recycling bin" and they even pressed the button to deploy.
Good thing they weren't on AWS, Amazon would have just denied it was their fault for as long as humanly possible, done the bare minimum to fix it once they couldn't deny it any longer, and then forced UniSuper to sign an NDA in order to learn what went wrong.
I am so glad they had backups! Having a destructive option as the default was a major f-up. I wonder if the people involved in writing the tool were still around when it was released, hard to imagine how that issue could have gone unnoticed during review or at release time.
The notion of doing anything financially important online is just plain stupid. I computerized my spouse's business, years ago, and our company computers ran Quickbooks OFFLINE. Always offline. The program was installed from a disc. NO ONE had any way to get into our computers. Invoices were printed out and mailed to customers. Assuming computers have to be on the internet while in use is like assuming you can only drive a car on expressways and never back roads.
I'm glad I wasn't the only one here that loved the outro on this one. Great job on both the outro & the content (as always). My wife actually had super with UniSuper up until about 5 months ago, small world.
I'm a UniSuper customer. Although it wasn't their fault they still did a shit job communicating this out. And when it was resolved they sent an email squarely blaming Google and even had them put out a statement.
6:01 As a human being myself, I love seeing these kinds of posts when they turn out to be wrong. It's like seeing Wile E Coyote run confidently into a brick wall.
@@ahmoin /reads/ Note to self: Don't make any travel plans this Thursday, just in case... But yea, it's actually quite interesting that the empty file wasn't the actual root cause, and it was really all about an off-by-one index... And regexes. ADD: And... Did they just call a bunch of regexes "advanced AI?"
“Who TF have multi cloud back up?”
“The guy that saved this company”
Trueee
yeah i am amazed
Opportunity costs. What's cheaper: having a zero-fail backup, or losing everything?
The "cloud" is just some random servers in a single building somewhere just waiting to get destroyed by some natural disaster. If you don't back it up yourself then it won't be backed up.
@@tripplefives1402 so your own server can be destroyed by natural disaster if we go this way
"if you don't back it up yourself then it won't be backed up." the story of this video literally refutes your point
Was so shocked that the company followed the 3-2-1 rule.
well it's better to pay more for backups than pay for the lawsuits from 600k people...
@@MidnightMidas many managers are just so short-sighted, if their company fxxked up, they just move to another company. So they don't want to spend extra for the long run
That was a good decision
Whoever architected that is earning their salary.
@@guacaprole and also deserves a raise for that one.
Nice to know that a company was actually saved from this situation because they had actually followed best practices about IT... sadly rare
If I had to guess it's probably mandatory by law since they are handling retirement funds. Not sure though
@@promero14 and you would be correct. Lots of compliance.
@@promero14 Yep. IT finance here. Required by law to keep back up, back up of back up, and a whole ass DRC server
Never ever forget that "the cloud" is just a fancy name for someone else's computer and you're at their mercy.
Somehow for companies nowadays having anything is seen as the biggest sin (having own engineers, own infrastructure, own cleaning ladies and doormen, own furniture, own cars...).
@@YS_Production yeah man, economies of scale have advantages
Unless your name is Kamala, in which case you believe it’s something that is floating above you ;)
@@TelemarketerOO7 lmao, yes that is the pretence under which outsourcing gets into companies, but the reality is that the employees who stay make up for the shortcomings of the external workers or services. And speaking of cloud services, those are almost always more expensive than running on-site servers, for example in a manufacturing plant that does not need outside access to its IT services. Yet cloud gets forced into such use cases. Ridiculous.
@@TelemarketerOO7 one more on how people think these days, I'll just throw approx. numbers as this story is 2 yrs old: A user needed an Adobe Acrobat license. I said I would buy it for them for 120 USD and their boss needed to approve the purchase. The boss said "Why the heck are you buying something for 120 USD if you can get this for 10 USD a month?" And sent me a link to the Adobe Acrobat subscription. I laid out to him that in 12 months, the subscription will have cost more than the licence. This person I had to teach the basic maths and economy was more experienced, educated and much better paid than me. Yet he was blind to this! All he saw was "10
at least they had sensible defaults like deleting everything with no warning after 1 year
It probably messaged some dev that left 2 years ago
@@askjacob haha just imagining someone who got fired a few months ago, on the toilet, getting a notification "It's been 1 year since [Google] backup was created, delete it now to free up space?" and just going "hehe 🫣😗😘😜 :3" and pressing "Yes" 😭🤣
Well if by internal tool they mean "a local dev clone of the repository" (just used to develop and test)... Who cares. Then they just should have never used it for production in the first case
@@WilcoVerhoef That was unprofessional on their side. It's not a single engineer handling that task; surely others were with him when he issued that command, and they didn't notice it. And how the F does Google use an experimental tool in a working environment? I'm a single web developer and I test my web apps for weeks before sending them to the client
@@arduinoguru7233 *Because you have a strong ethical sense. OTOH, Google...*
whoever at UniSuper pushed for double backups needs an 8 figure check rn
Yep - definitively should be a percentage of the funds saved :)
@@pureabsolute4618 bean counters definitely won't be asking why those other backups are needed
make them king/queen of Australia while you're at it, I guarantee they could not possibly do a worse job at governing than the "elected" aussie govt.
shoutout to TheJuiceMedia
Whoever? Wow, it's really that bad out there that you gotta rely on a maverick to do something sensible?
@@BatkoNashBandera774 what don't you like about the current government, what policies do you not agree with
The outro slaps harder than the Crowdstrike outage
Can't wait for the Kevin Fang video on that
yeah it slaps damn
Too soon
Pure fire 🔥
@@wallyrogers2371 ratio
Typically we Australians don't exclaim "Crikey" during a major tech outage. We use more traditional technical language including words like "shit" and "fuck".
Next time I have a major outage at work, I need to remember to yell out "Crikey" now - after I've included all the swears!
Shicrifuck
Good to hear their grounded approach, despite not having any boots on the ground 😉
How bad does it have to get before Australians have to resort to the technical expression of punching through their monitor?
why not a "crikey-shit-fuck" for good measure?
why reverse a linked list when you can reverse a lifetime's work
😂
in rust!
(rust lifetimes reference)
underrated comment
@@justinliu7788 Camomo says hi and also that you're still not welcome back to that server.
@@BatkoNashBandera774 nobody cares shut up
To be fair, I can easily see how this happens. It's unfortunately very common for engineers to build some tooling for testing or debugging in narrow engineering use cases. Then sales or another department comes along with a problem they need fixed WiTh tHe HigHeSt pRioRiTy and "borrow" the tool "just for this once". A year later the entire department is misusing the tool, or even some bastardized version of it, for production use on actual customer data. No one fully understands what the tool does or what it was even meant to do, but it saves some time so everyone just uses it without a second thought.
Sadly true
Used to call this "Move Fast (into walls)"
"It's unfortunately very common for engineers to build some tooling for testing or debugging in narrow engineering use cases"
True, sadly.
I've seen a company try to make their backups faster by backing up less data. More specifically: by only backing up production databases. Fair enough, but they used a naming convention to determine whether something was a production database. The problem is that they did not communicate this new requirement to the rest of the company and just assumed everyone already knew.
So a few months go by where a couple of "misnamed" databases are not backed up, old backups expire and are erased, and then the server starts running out of disk space. Instead of giving the server more space they decide that "hey, we know which databases are production, so we can simply drop everything else". And they nuked whatever did not follow the new, unknown naming convention. Four customers lost their entire database, forever.
Then basically the entire company was assigned to manual data entry to get those customers back to at least some state of operation.
And all because they didn't want to buy a $50 hard disk.
common in saas
Nothing is ever being used only “temporarily”
I am actually impressed that they managed to get back up and running in two weeks' time after having their entire environment deleted. That's a great job by the UniSuper IT guys, hats off, kudos and respect for a job well done.
Having worked at Google, I'm a bit surprised that the DB didn't get moved to (essentially) the recycle bin for a couple weeks before the space was recovered. The group I was in dropped production DBs a couple of times and had them restored within minutes.
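For anyone wondering what a "recycle bin" for deletes looks like in practice, here is a minimal sketch. It is a toy in-memory store, not how Google or any real database implements it; the class name `SoftDeleteStore` and the 14-day retention are assumptions picked to match the "couple weeks" mentioned above.

```python
from datetime import datetime, timedelta, timezone


class SoftDeleteStore:
    """Toy in-memory store where delete() only moves items to a trash area."""

    def __init__(self, retention: timedelta = timedelta(days=14)):
        self.retention = retention  # assumed grace period before real deletion
        self._live = {}
        self._trash = {}  # name -> (value, deleted_at)

    def put(self, name, value):
        self._live[name] = value

    def get(self, name):
        return self._live[name]

    def delete(self, name, now=None):
        # Nothing is destroyed here; the item is just no longer visible.
        now = now or datetime.now(timezone.utc)
        self._trash[name] = (self._live.pop(name), now)

    def restore(self, name):
        value, _ = self._trash.pop(name)
        self._live[name] = value

    def purge_expired(self, now=None):
        # Only items that have sat in the trash for the full retention
        # period are removed for good. Returns the purged names.
        now = now or datetime.now(timezone.utc)
        expired = [n for n, (_, t) in self._trash.items()
                   if now - t >= self.retention]
        for n in expired:
            del self._trash[n]
        return expired
```

The point of the design is that the fast, easy operation (`delete`) is reversible, and the irreversible one (`purge_expired`) only ever runs after a delay.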
@@darrennew8211 Yeah, it lends more credence to this being an internal development tool's failsafe cleanup routine.
@@darrennew8211 No delete protection probably.
This guy is meatriding for real
Outro had no business being as fire as it was😂😂
It was hella fire, but I think the singing was from Suno ai
@@khaelkugler dang how did you spot that
@@Blox117 I've used it a bunch. It's really sick, but once you've heard some you can kinda recognize the scratchier sounding voice and chord sequences!
Thank you i was going to close the video before the outro ❤
I also agree with khaelkugler, however I'd say the line at 8:59 is what gave it away for me. If you use Suno a bit you'll notice a lot of lyrics come out like this. I'm open to being proven wrong, but I'm pretty sure it's AI generated using Suno.
We all know how this would have been resolved had the deletion been to something worth substantially less than $135B
$10 gift card? 😅
@@everyhandletaken 🤣🤣
I like to think that some of the google employees that provisioned this were in Australia and they were with unisuper so they were extra motivated
@@everyhandletaken Nope, not even a response to the ticket. Google doesn't talk to peasants (people with a net worth less than 1B USD).
In that case Google wouldn't have set up unique infrastructure just for you a year ago and nothing would have happened.
The thing about Australians is that black swans are just regular swans here.
Jokes aside, I'm not even surprised that UniSuper had that many redundant backups. Universities take their pension money seriously.
It was originally for university staff (thus the name and thus why a subsection of funds are called ‘industry funds’), but these days, it is open to anyone to join.
I was genuinely surprised when I learnt that white swans exist..
@@brydenquirk1176 pffftt... White swans... That's gotta be like a one in a billion mutation.
Yea it must be albino or something
That outro track is a 10
"As a cloud engineer myself, I can confidently say this is complete bullshit, it is 100% UniSuper's fault"
😂 That was my favourite part
@@DespOIcito with all due respect, if you don't have the full context then maybe you shouldn't make statements like "with 100% confidence" 😂
I guess one could argue that this particular issue couldn't have occurred if they hadn't moved to Google, but it's not really a good-faith argument, I think.
"My Dad works at Nintendo"
@@Zxv975 agree speculation when you have no idea is asinine. The person was so confident too.
I remember when this happened and finding out they actually had good backups made me feel vindicated for all the clients I've had over the years who I've beat up over backups. Only problem (which you wouldn't expect in a cloud setup) is they probably never tested them which is probably why it took 2 weeks to get everything restored. But at least they didn't lose their data.
Also that outro =D
Better than the entire Indonesian government, which had all its data lost because of a ransomware attack, and it looks like there was no backup
DR tests are done often in Super IT. It depends how much data needs to be restored and how data feeds into one another. And of course observing that everything works as intended
Keeping in mind just evaluating the loss of resources can take days, and that the backups had to come from a totally different system...
Return of the king
Can't wait for the future CrowdStrike video
When I saw a new video had come out from Kevin, I was surprised CrowdStrike Falcon was not the next video. Perhaps he wants to wait for more information to come out so he gets the whole picture. To me it is pretty clear: the Falcon driver had code which essentially said "execute stuff here" and the patch had a bad file full of zeros. If this had been tested by anyone at the company, their computer would have failed to boot, then they would have looked at it, realized their driver didn't validate what it was trying to run, and then had a laugh before fixing its validation. The fact that it accepted a null file suggests to me there is no validation at all and bad actors can put in whatever they want there and the driver will blindly execute it, so the problem isn't just the bad file bricking computers but that the driver is fundamentally not robust. So I think I get it. But maybe Kevin is waiting just in case there is more to the story. Who knows, maybe there were major mistakes that led up to the big one. So Kevin is waiting to be sure he doesn't miss anything even if it means other people will cover it first.
@@alex_zetsu it's also possible that it *was* tested before being pushed to production and there were no issues, meaning something went wrong during whatever process actually distributes those patches and that's why it wasn't caught. my question would be why they weren't distributing new patches in waves instead of to all customers at once.
@@reapimuhs I don't think it was tested. We know the driver itself isn't being updated in that patch. And unlike some bugs which only display aberrant behavior on certain hardware, their driver doesn't work if it tries to execute a null file on any machine. Maybe it will later be revealed that the test environment in fact can run a null file, but I doubt that.
@@alex_zetsu what I was saying is that it's very possible that the file was *perfectly fine* and not null during testing, but that something went wrong during their process of *distributing* that file to production devices that caused it to become nulled.
In which case, it would have looked perfectly fine during testing and the issue only would have appeared during production.
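The two ideas floated in this thread (check the file before acting on it, and ship in waves instead of to everyone at once) can be sketched roughly like this. It only illustrates what the commenters are proposing, not what actually happened or how any vendor's pipeline works; `looks_valid`, `rollout_in_waves` and the `deploy` callback are hypothetical names.

```python
from typing import Callable, Iterable, List, Sequence


def looks_valid(payload: bytes) -> bool:
    """Cheap sanity check: reject empty or all-zero payloads."""
    return len(payload) > 0 and any(payload)


def rollout_in_waves(
    payload: bytes,
    hosts: Sequence[str],
    wave_sizes: Iterable[int],
    deploy: Callable[[str, bytes], bool],
) -> List[str]:
    """Deploy wave by wave and stop at the first unhealthy host.

    `deploy` is a caller-supplied function that returns True if the host
    is still healthy after the update. Returns the hosts updated so far.
    """
    if not looks_valid(payload):
        raise ValueError("payload failed validation; nothing deployed")
    done: List[str] = []
    remaining = list(hosts)
    for size in wave_sizes:
        wave, remaining = remaining[:size], remaining[size:]
        for host in wave:
            if not deploy(host, payload):
                return done  # halt here; later waves are never touched
            done.append(host)
        if not remaining:
            break
    return done
```

With waves of, say, 1 host and then 3, a bad update takes out at most the hosts in the wave where the failure shows up, rather than the whole fleet.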
more explosions than a Michael Bay trilogy
that outro is fire
kevin's kicking off his music career :O
Can you even imagine how loud the "I _TOLD_ YOU SO!"s must have been when the root cause was discovered? I've been on Google's side of the issue for stuff like this (on a much smaller, but still critical, scale) and it's never fun. I can only hope that whoever ran the command a year prior was already working for some other company by the time this happened, and got a fat consultancy payment for taking the time to review with them.
The blame game is childish. Whoever ran the command was under pressure and had 1000 other things going on. The fault lies with the design of the system, which a ton of people contribute to. Running a command missing a parameter should not result in the cloud being deleted.
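To make the "missing parameter" point concrete, here is a minimal sketch of an interface where forgetting the parameter is an error instead of a silent default. This is not Google's tooling; `provision_private_cloud`, `ProvisionRequest` and `term_days` are made-up names for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvisionRequest:
    customer: str
    # None means "keep until someone explicitly deletes it".
    term_days: Optional[int]


def provision_private_cloud(customer: str, *, term_days: Optional[int]) -> ProvisionRequest:
    """Build a provisioning request (hypothetical sketch).

    term_days is keyword-only and has no default, so leaving it out
    raises TypeError rather than falling back to a fixed expiry.
    """
    if term_days is not None and term_days <= 0:
        raise ValueError("term_days must be positive, or None for no expiry")
    return ProvisionRequest(customer=customer, term_days=term_days)
```

Calling `provision_private_cloud("example")` fails immediately, while `provision_private_cloud("example", term_days=None)` forces whoever runs it to state the "never expires" choice out loud.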
@@bosshog8844 It was probably run by someone far detached from the smelly dev basement it was ill-conceived in.
@@bosshog8844 I agree that blaming a single person isn't accurate or helpful for anyone, but Google is squarely at fault here
You're right that super is similar to a 401k - the main difference is that it's mandatory for all employers to contribute. Similar to the USA, the money is either invested in a mutual fund of some sort, or it's "self managed" where the employee chooses where to invest the money through a brokerage account.
Exactly. It's huge amounts of money and critical for people's retirement as we have had this scheme for decades - $3.5 trillion (AUD) is under management. To give you an idea of how big this is, it's double the country's GDP.
It's refreshing seeing a company having backups on separate systems. Good management of Unisuper's tech department.
Can't wait for your CrowdStrike episode, it's gonna be awesome 🤣
hahahaha
holy shit when the autotune came in i shat my pants, bars
This autotune drop is now my alarm tone for UniSuper project meetings. The office is gonna be lit.
Sounds like Suno AI but might be wrong there
I thought this was a good video until 8:18 and then it became a great video. Then the chorus kicked in and it became the best video of all time
Man, that outro caught me off guard, it didn't have any right to be as good as it was.
GitLab almost having 0 functioning backups will never get old xD
bro cooked with that rap, goddamn
Hm
🤔
Fr
Bro's a full time engineer, part time rapper.
judging by these videos, what they really need is missile and explosion protection
"I have a backup."
"THAT WAS OUR BACKUP!!!"
"Yes but I have another backup. Can't trust humans."
'Can't trust humans' basically sums up my experience so far in this world.
the rap at the end was hilarious 😂
Very funny, but was that AI? Towards the end it sounded slightly fried lol. Kinda scary that I'm even questioning this 😅
@@supercyclone8342 Either that or autotune
@@supercyclone8342 why does it matter? It’s a gag/skit
@@Xamy- I'm not really bothered especially cause this is the perfect use case. I was just curious if I finally caught an AI song in the wild lol.
@@supercyclone8342 i believe it's suno ai
"When one of disaster strikes" Typical insurance scamvider: the policy doesn't cover your case, see the fine print on page 25/60
At the end of the day, all the "cloud" is is just some other random guy's computer. Count on no guarantees of availability.
this is what I've said for ages as well. I've always wondered if there could be a coordinated attack from malware that could break out of containerization or virtualization to hit everything on the metal.
*Everything on cloud*, they said. *It will be much better for everyone*, they said.
Yep “off prem” is just someone else’s prem.
Amazing that you actually did a bad Aussie accent instead of a British one. Big props.
I love the redditor commenting "I'm a cloud engineer, it's 100% impossible to be google's fault" and was entirely incorrect. Classic reddit behavior
6:00 is really representative of the average software "expert".
Solid google moment
Well, Google is American. Never trust these pigs. Better off using China based cloud providers
not a "backups" problem here so much as "disaster recovery". A salutary lesson......
This channel makes me feel better about the mistakes I've personally made and those I've seen at my workplaces.
This channel also makes me feel better about my boring ass IT support pos
@@Edgodful It might seem boring for you but definitely remember that it's the people like you that keep everything from crashing and burning. People never think of how important the maintenance crew is until the building's on fire.
Lyrics:
May 2020, GCVE took flight
Seamless VMware migration to the cloud was in sight
They had dashboards, metrics, operational tools
Got it all figured out, Google devs ain't fools
Then billion-dollar UniSuper comes around
Needs a VMware cloud but their specs weren't found
So they went and asked Google ops to spin that thing up
Except there was a slight catch:
It would soon blow up
Engineers scrambled, their logs blank as the dawn
Their vital infrastructure was just suddenly gone
It was truly an unprecedented black swan event
There was nothing UniSuper could do to prevent
Backups are our lifeline, in backups we trust
When prod is deleted, backups are a must
Three copies, two media, and one off-site
No backups, I'm leaving 'til you make it right
“Kyle, why are we paying for multiple backups?”
he needs to release the full version of Backups on spotify fr!! 🔥🔥🔥
outro is legit
Yeah, this isn’t a backup thing, it’s disaster recovery. We do yearly gamedays where we try to restore our entire company to operations in another cloud in a week (which usually turns to two). Super impressive that they got it up and running in 2 weeks!
didn't expect you to spit bars at the end 🔥
The fact that all of our money is just relying on the internet not crashing for whatever reason is pretty horrific.
The Internet was supposed to be a Web where the failure of one thing wouldn't cause catastrophe, but now, ironically, it relies very heavily on a mere handful of major companies just to not implode.
The difference between making a backup and insurance is that making a backup is fast, cheap, and easy. However, insurance is designed to milk every penny out of you, provide generational debt the moment disaster strikes, and do nothing to actually help you in your time of need.
Insurance is literally a scam and companies know it. That's why they send ludicrously sized fees for medical to insurance providers so they can make a bit extra on each service.
I figured this out quite a few years ago and never used any type of personal insurance again. The money I save over time just goes into an index fund which keeps growing in size.
I just stumbled onto this video in my recommends and now I'm left with the fire outro stuck in my head. Subbed.
Very good having backups with another provider. From my observations that is very rare.
I briefly worked for a company that had the off site backup. I'm not sure if they followed the 321 rule but they at least ran 21.
Came from a review(?) of your video on another channel to leave this comment on the outro. It's wonderful.
i wonder if this incident lowered the cloud service price for UniSuper...
Google:"We know everything."
Also Google:"Fack i lost money by accident."
As an Aussie I can confirm this is how we talk when dealing with a sev 1
The funny part will be when Google points them to the End User License Agreement they clicked on that says Google is not liable for any damages from this kind of thing. But people are still in love with "the cloud" for some reason.
You'd be surprised how much better the EULAs are that companies big enough to have their own legal departments can negotiate, compared to the boilerplate take-it-or-leave-it agreements "regular" businesses and individual consumers have to agree to.
Even for tech giants such as Google, there is potential impact when unnoticed small details snowball into big issues.
when you run prod with test code, excitement is guaranteed
@@TrakoZG Means
I find it amazing that this was 100% Google's fault. If, for example, UniSuper had set it up incorrectly, blame could still be attributed to Google for having weird deletion priorities and being unable to restore deleted cloud environments. But seeing as this was an internal test, there was no "recycling bin" and they even pressed the button to deploy.
DEI
That rap goes hard
crowdstrike video when?
Be a long time before all the details come out about that one
@@Hopgop1 considering the lawsuits, it's going to be a long time
@@Hopgop1 The official explanation was released today with the exact reasons as to how this happened :D
Soon™️
would need a huge render farm for all the explosions, could rival a Michael Bay trilogy
I'm stealing that closing song for awareness training on backups
I'm surprised that a pension fund followed better data storage practices than some IT companies.
Good thing they weren't on AWS, Amazon would have just denied it was their fault for as long as humanly possible, done the bare minimum to fix it once they couldn't deny it any longer, and then forced UniSuper to sign an NDA in order to learn what went wrong.
3:53 First it’s PowerShell for Linux, now Notepad for Linux? Next we’re gonna end up getting Windows Explorer for Linux 💀
the guy bashing UniSuper just being wrong the whole time was funny as hell
This backup song goes hard. Gonna play the chorus before every change.
Man the Outro
Bro
11/10
I am so glad they had backups! Having a destructive option as the default was a major f-up. I wonder if the people involved in writing the tool were still around when it was released, hard to imagine how that issue could have gone unnoticed during review or at release time.
I need a longer version of this song
Never have I ever waited so eagerly for YouTube videos
Great video and very informative. Cheers from the UK! 🎉
Which is why everyone believing that moving all data into someone else's cloud is such a good idea ... it is crazy ...
Already liked and subbed.
When Crowdstrike? :D
Lovely Video.
Time to show this to my company here and restructure some backups
Looks like most of the bad was easily avoided by smart planning
Please release a radio or full length cut of that song, it's great.
Nice coverage of this incident all around.
Backups: On Fire, non-existent
Gitlab: Meh. Whatever
These videos are pure gold, omg 😂
The notion of doing anything financially important online is just plain stupid. I computerized my spouse's business, years ago, and our company computers ran Quickbooks OFFLINE. Always offline. The program was installed from a disc. NO ONE had any way to get into our computers. Invoices were printed out and mailed to customers. Assuming computers have to be on the internet while in use is like assuming you can only drive a car on expressways and never back roads.
I'm glad I wasn't the only one here that loved the outro on this one. Great job on both the outro & the content (as always).
My wife actually had super with UniSuper up until about 5 months ago, small world.
Bro that back up free style at the end was tha bomb!
Eminem has been really quiet since this banger has dropped
Nice video Kevin, you really put a lot of work into this and it shows. I also feel like it was explained in an understandable way, so keep 'em coming!
I'm a UniSuper customer. Although it wasn't their fault they still did a shit job communicating this out. And when it was resolved they sent an email squarely blaming Google and even had them put out a statement.
6:01 As a human being myself, I love seeing these kinds of posts when they turn out to be wrong. It's like seeing Wile E Coyote run confidently into a brick wall.
a kevin fang upload day is a great day
Those are some sick beats you dropping sir. I'm dancing in my chair here
That Outro track kinda went hard 😂
The main thing was they did not have a proper disaster recovery plan, and even if they had one, they did not check that it was working.
5:59
"as a cloud engineer this is 100% unisuper's fault"
We need that outro as a full song on Spotify.
Straight fire outro
first video ive watched from this channel but that random song at the end caught me so off guard i had to subscribe😂👍
0:26 is it just me or does this sound like AI text to speech?
It is just you
It sounds like text to speech at first but as the video goes on it sounds more and more like an actual person.
it does completely
It definitely does
No, just sounds like FireShip
Amazing video mate and loved the outro😂💖
7:52 LMAO NOT THE JOJO REFERENCE
Kevin Fang, awesome video keep up the good content
Stick around for the song guys
The backup song goes SO hard
this is a great video and all but are we gonna cover the recent crowdstrike problems?
but it's such a BOOOOORING issue
He’s probably gonna wait until there’s more reports on it or smth if he covers it
@@konstakuosmanen it wasn't that boring, they recently released the "External Technical Root Cause Analysis" which has some funny stuff in it
Next year
@@ahmoin /reads/ Note to self: Don't make any travel plans this Thursday, just in case... But yea, it's actually quite interesting that the empty file wasn't the actual root cause, and it was really all about an off-by-one index... And regexes.
ADD: And... Did they just call a bunch of regexes "advanced AI?"
I can't wait for your video on the recent Crowdstrike issue, especially Delta's storyline 🤯