How Roblox Went Down For 73 Hours
- Published on Jul 30, 2024
- A look into what happened behind the scenes during the longest outage in Roblox history.
Sources:
blog.roblox.com/2022/01/roblo...
www.hashicorp.com/resources/h...
roblox.fandom.com/wiki/2021_R...
roblox.fandom.com/wiki/Timeli...
news.ycombinator.com/item?id=...
raft.github.io/
www.lmdb.tech/media/20130329-d...
www.lmdb.tech/doc/
db.cs.cmu.edu/mmap-cidr2022/
• Free To Use Gameplay |...
Chapters:
0:00 Intro
0:33 HashiStack Explanation
4:47 Outage Investigation
8:20 Root Causes Found
11:30 Return to Service
12:19 Slow Leaders
15:56 Resolution
Corrections:
- At 9:44: Go's default (unbuffered) channel holds no items; its buffer size is 0. A send on such a channel blocks until another goroutine is ready to receive the value. The illustration in the video shows a *buffered channel of size 1*; however, the overall point still stands.
Music Credits:
- Firecracker by LEMMiNO ( • LEMMiNO - Firecracker ... )
- Impact Prelude by Kevin MacLeod
- We're Finally Landing by Home
Science & Technology
Imagine just doing a hobby project to understand a piece of software and suddenly the complete Roblox infrastructure is built on it.
Open source developers: "Hey guys check out this thing I built in my spare time! It's not perfect but I'm making it freely available so other people can learn from--"
Large corporations: "FREE?? 👀👀🥵🥵👀👀"
i thought this was a joke,
xkcd 2347
@@xelspeth what is xkcd
@@shantilkhadatkar1195 A webcomic. Type that text into Google and you'll get the comic.
Every time I hear that someone's hobby project caused a major outage somewhere, I get the feeling that big corporations should check what software they are built on and support its development/maintenance.
XKCD 2347
Except that software relies on another software, which then relies on another software, which then relies on another software....
It can turn into an endless loop
@@humza890 It can't. Circular dependencies are usually rare, and you can stop following a dependency once you've already seen it. I also didn't mean that every company has to look through all of its dependencies and maintain them all, but picking a few, or auditing some of them every now and then, would benefit not only them but the world as a whole.
The Unix philosophy of "do one thing" and link against a ton of dependencies was a mistake.
It's what FUTO stands for.
the negative 900 million dollars hits hard 😭
Why? Trash game has trash income
@@_GhostMiner its not a game tho
its a game engine and hoster
@@Luna5829 **H O S T*
@@Luna5829 ackhually
@@_GhostMiner plenty of trash games hosted on roblox, plenty of great ones
Turns out, this video could be a great introduction to modern backend architecture and development.
I think all of his videos are a good resource for understanding different architectures and subsequently how fragile they can be lol
I worked at a global e-commerce company a year ago and their platform infrastructure is pretty similar, down to their use of etcd and go channel spaghetti 💀
and a great counter-example for troubleshooting....
the leaks are too
"A massive company with ... -$924 million net income" 💀
"Each minute of downtime costs us negative $1750, this must be fixed ASAP!"
@@klafbang lolol
That is absurd lmao
@@klafbang So does that mean they were earning money when they were down? 🤔
@@zaper2904 No, because they still had expenses (developers trying to fix the servers) but reduced income (no microtransactions available).
Crowdstrike video when?
I still remember the day that it went down, people were blaming Chipotle (american fast casual chain) because they had an event that same day where you could claim a free burrito. People suspected that it was due to a mass influx of people, I knew (and a bunch of people too) that this wasn't an issue with influx of people. At the end of the day, it was a fun journey (more or less with the conspiracies, guessing correctly that it went down for 3 days months before this outage, and youtubers just milking on the outage). Thank you for making a video about this.
Wait. How in tf could Chipotle's traffic affect Roblox's servers? What's the theoretical connection?
@@frezzingaces it was a sort of partnership between chipotle and roblox, so if you installed roblox and did a bunch of stuff you'd get a free burrito. I think that's what it was, roblox has done tons of these
Oh this happened during that time? Man the memes about the roblox crashes during its downtime were so enjoyable
@@baribari1000 Yeah, it was super easy too, you could do it in like 2 minutes on a new account, it gave you a free entrée instead of a free burrito, so you could actually choose most meals you wanted. The few times they did the event with chipotle, I probably earned like 35 or so free entrees, which is pretty decent!
wasnt there also a massive adopt me update at the time which also probably caused a large increase of active accounts
This is one of the biggest challenges of modern programming: depending on various 3rd party packages without knowing what each package is, what it does, or whether it's even reliable, let alone knowing what that package's own dependencies are and whether they are safe.
Also never update anything
*If its not a security fix @@Paulo27
Or HashiCorp, being a multi-billion dollar company, could just maintain the fucking project themselves instead of blindly using a 4-year-old abandoned pet project from some random person's GitHub page and trusting it to work in a large production environment.
And that's before the issues of relying on additional 3rd party companies to supply the correct 3rd party packages. Supply chain issues the whole way down.
Hi Kevin, amazing content as always! One minor correction @9:54 though: a Go unbuffered channel has a capacity of 0, not 1, which means the sender blocks until a receiver takes the value. What the video shows @9:54 is actually a buffered channel with capacity 1 (e.g. the result of make(chan string, 1)).
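For anyone who wants to try the distinction from this comment, here is a minimal Go sketch. It is only an illustration of standard channel semantics, not code from the video or from Consul; `trySend` is a hypothetical helper introduced here that attempts a send without blocking, so the program can report when a send would have blocked.

```go
package main

import "fmt"

// trySend is a hypothetical helper (not from the video): it attempts a
// non-blocking send and reports whether the value was accepted.
func trySend(ch chan string, v string) bool {
	select {
	case ch <- v:
		return true
	default:
		// The send would have blocked, so give up instead of waiting.
		return false
	}
}

func main() {
	unbuffered := make(chan string)  // default channel: capacity 0
	buffered := make(chan string, 1) // what the video illustrates: capacity 1

	fmt.Println(cap(unbuffered), cap(buffered)) // prints "0 1"

	// No goroutine is receiving, so an unbuffered send cannot proceed.
	fmt.Println(trySend(unbuffered, "a")) // prints "false"

	// A buffered channel accepts one value with no receiver present...
	fmt.Println(trySend(buffered, "a")) // prints "true"
	// ...but a second send would block because the buffer is now full.
	fmt.Println(trySend(buffered, "b")) // prints "false"

	// An unbuffered send completes only once a receiver is ready.
	done := make(chan struct{})
	go func() {
		fmt.Println("received:", <-unbuffered)
		close(done)
	}()
	unbuffered <- "c" // blocks here until the goroutine above receives
	<-done
}
```

Either way, a sender that outpaces its receiver ends up waiting, which is why the overall point in the video still holds.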
github repo: "it was a toy project never meant for production"
multibillion dollar company: "YAYEET"
"multibillion dollar company"
_-$924 million net income_
L pfp
This is like the XKCD of all of the world depending on a toy project someone abandoned 10 years ago
probably 2347... as someone mentioned in some comment above.....
Came here to look for crowdstrike, seems like im way too late🤣
Crowdstrike video incoming in 2 years
Imagine how it must feel, starting a free project just as a hobby, and planning to abandon it eventually, then pretty much half the internet starts using it as an important building block to support the web. Now you're just sitting there, and have a choice to make. Stop maintaining the software, and pretty much break half the internet or keep going, getting zero thanks, and zero dollars for your work.
The Crowdstrike video is going to hit pretty hard
IT global outage vid gonna go crazy
Well.. now we know the next video
yo when is the CrowdStrike video coming
Waiting for the Crowdstrike outage video!
Roblox players figuring out about the DNS steering and sharing ips for early access is kinda crazy 💀
You gotta make a video about the CrowdStrike outage
Can't wait for the CrowdStrike episode 😀
It’s crazy how much of the internet as a whole is in the hands of solo developers who made a thing in their spare time for fun
can you do a video about the current CrowdStrike Outage?
WHENS THE CLOUDSTRIKE EPISODE COMING OUT???
Kevin, get busy and make the Crowdstrike video 😂😢
Kevin Fang, big fan here. Please cover the clownstrike incident
This happened in the middle of my friends sleepover, when we were COMPLETELY into Roblox. He pretty much just came to play it. We checked like every 5 minutes if it got better.
We eventually just slept. THROUGH THE WHOLE THING
Edit: Are some of you really watching videos on Roblox and just hate people in the comment section who used to like the game? Find something better to do jeez
Seems like you guys need to find better games
@@ProblematicParag0n Isn't your avatar from a ripoff of Minetest?
@@N30ZUK1 minetest is a clone of minecraft...
@@N30ZUK1 Calling Minecraft a ripoff of Minetest is the most sweaty nerd Redditor thing you could do
@@tbuk8350 Tbh nothing is correct here. Minetest is not trying to be Minecraft; it's trying to be a general-purpose voxel game engine (check out its other game modes, there's some pretty unique cool stuff in there)
This is by far my favorite documentary channel on yt
Whatever it took to make a video about a Roblox server crash and not use the "oof" SFX even once... I salute it.
It's on 6:45
@@MartijnvanBerkel gottem
6:45
@@MartijnvanBerkel I stand corrected. Frankly, using it only once is even more impressive.
i read this 2 seconds before the oof sound played, well done sir
Thank you for all the work you put into making this!!
the kids enter angry
the kids leave confused
Oh shit I was gonna suggest this as an idea, awesome to see that you did it!
Lol nice you're here
is that
@@glefyrhello call of duty black ops guy
Nice technical aspect of the outage!
Great. Hopefully you'll make a video about Windows bsod due to CrowdStrike
I'd imagine programmer Hell is just a bug like this which takes all of Eternity to fix, also it takes down the company's internal issue tracker and communication system.
How fast can you pop out a video? I think there might be something video worthy.
Thank you for this, been waiting for this one for awhile now!
KEVIN FANG JUST DROPPED A VIDEO ABOUT THE HALLOWS OUTAGE OH MY GOD
The burrito incident
@@jakfjfrgnei the slippery cord incident
Roblox is actually a bigger company than most think. Thanks for doing a video on it.
Next video on Crowd Strike update causing global outage!!
It was Crowdstrike, not Microsoft
9:45 "A default channel can only hold one piece of data at a time" It's actually even worse than this: an unbuffered channel also requires that this piece of data is received before a send can complete (!)
> And probably some machine learning and block chain for good measure
lmao nice
Damn I was waiting for this one
I'm glad you made a video on that. I had no idea how it went down behind the scenes! :D
there was a blog post made after the outage
Amazing CGI as always, thanks !!!
new kevin fang video
today is a good day
Man I love your videos, this was a particularly technical one, but still really well presented and interesting.
Thanks! I submitted this in as a suggestion a while ago, never thought it’d be published.
Another super interesting well researched + explained video! As a back end game dev, thanks for the nightmares!
daily appreciation of kevin's visual style, i love how you're able to break down the language i might take for granted and make it easily followable
Great great video, I seriously love the format and I learn so much
Haven't finished the video yet, but this already makes me feel better about the half-day internet outage I fixed at work
Thank you for the great information and entertainment video like always😊
Awesome summary, as always. Thank you! :D
This video is very well executed!
Love these videos please keep them coming!
Saddest day ever for 7-year-olds, I hope they can recover from this 😢
😂😂😂😂😂😂😂😂😂😂😂
developers probably missed out on millions of dollars too!
The oof sound was a chef's kiss to this masterpiece of a video. Great work as always.
Great video!
I like your stuff keep it up make more security related stuff!
Good work, these are both interesting from the tech perspective and just plain fun hah
it's a good day when there is a new kevin fang video
Another great video, I really enjoyed it.
There's probably heaps of outages you can do next, but perhaps you could do a video on the "OpenOffice can't print on Tuesdays" bug?
yo honey wake up, new Kevin Fang video to watch while at work
Tons of love for captioning your videos❤❤
love your vids! please make more
YAY GLAD YOU DROPPED THIS
Ah yes, that day in 2021 that I was working in Studio and the toolbox stopped working, and my ass almost had a heart attack because I thought I got banned.
In life... you have roblox
(another BANGER kevin fang video, cant wait for the next)
Oh man, I love your channel so much. I can't wait to see the XZ backdoor video made by you, it's gonna be fun 😂
I love this series. It's like true crime or airplane disaster videos, but it can be fun, because nobody really gets hurt. Except for big corporations and Roblox players, and well... screw them.
That’s a bit harsh on Roblox players… I mean most of them are like 9 years old
@@hagangray8006 if they aren’t 9 there’s a 50% chance they’re a predator or another kind of scum
@@absoultethings4213 or… just normal people. Big shocker I know
@@apersoniguess_impossible😱😱😱😱😱
unti money some rando get involved, yeahhh its really fun
great video, thanks dude
Always love kevin fang videos... but would you mind using I Home's we're finally landing closer to the end of the video please? Thx
love this video makes everything understandable!
your videos are quality over quantity
Let’s go bro. CrowdStrike is giving free material to your next video.
Well done as always
I love the way that you explain these complex incidents. You deserve a 冰淇淋🍦
these videos make me feel like im watching a some type of CSI crime documentary
Waiting for the crowdstrike video
the goat's back with another banger
Man I remember when this happened this was crazy.
finally, a good video on the infamous outage
Great video but why does it sound like an AI voice when you say "Availability Zone" at the end??
best roblox video
5:04 I also heard avatars broke before the whole game went out, and some players were able to play roblox but most of the scripts were missing so it was pretty unplayable. Is it because the game couldnt fetch those? Wow
Baby, wake up!
New Kevin Fang lore just dropped 🎉
I don't understand a single thing but I'm so incredibly curious that I want to know more
I genuinely really love this for some reason
Nice! I wish Roblox never recover from that!
new kevin fang video finally
Thanks for the explanation dude
Great!
GO mentioned!
So you're telling me that I should continue to be paranoid about how every single line of code of my personal projects is not efficient or secure enough? Deal! Love this thank you!
When I heard "We're Finally Landing" I thought you were going to show some Roblox speedrunning
Finally, a Kevin Fang video about an outage I was witness to.
I thought your new video was going to be about xz utils :(
No way my favorite full stack youtuber talks about my favorite buggy god-cursed lego game
STARTING THE DAY OFF NICE !
I paused the video when that perf screenshot came up. 5 seconds later I'm like "why the hell did nobody check this before?" We love lock contention.