Neighbour: "Why doesn't your son go out and enjoy the sunlight?" Parent: "Because my son is working on code that will change the world" The son: "My left-pad shoots the fastest blanks in the west"
@5:10 "stick with vendor implemented options and you will mostly get the best performance" Reminds me of the joke "How do you write faster Python? Use C."
@ThePrimeagen Bravo! Good to see you addressing the reality of environmental factors on timing benchmarks. With my background in large-scale scientific computing on shared resources, it was the first thing I thought of when I saw you discover (on Twitch) that your live-coded leftpad was "slower". The unpredictability is real. And it's frakkin' annoying.
I Also happen to be a scientist doing the science and math is cool especially when it proves things with charts. Id like to see more videos of whose chart is the fastest please
4:44 well, ackshually, Strager posted a video here on yt where he absolutely destroys cpp and rust's native hashtable speeds by implementing his own hashing function. Of course it was designed for his specific use case, but I think it should be noted that the real take away of this are the friends we made along the way.
that video was crazy, every time he increased the number of consults per seconds you would say "that's hard or impossible to improve" then he proves you wrong and repeat for 10 min more.
@@andrejsk6211 You can also give a different hash function to std::unordered_map, but that's not why it's terrible. The reason is that, in order to prevent iterator invalidation, they made it so that each entry in the hash map points to a linked list containing all the values. That all but guarantees a cache miss on every access, and it's too late to fix. Everyone needing an actual hash table in any context where performance matters will probably use the boost or abseil versions.
Love this full on demonstration of what Prime means when he says mucrobenchmarks lie. Also love that he used the average of medians instead of the average of averages, this man how to measure anything’s. Sending all the signals for more of this.
In case you are curious, leftpadTravvy doubles the size of padding on every iteration, this means that only O(log(n)) string additions is performed. Although I have no idea why it is better to do it this way instead of adding a string of a given length right away. This technique is quite cool because it can be used in various domains, for example to raise numbers to a power of n quickly
Nah because the second I saw that algorithm i’m like “oh shit that’s fast pow!” and immediately knew what it was doing, just from that one dumb bit shift (fast-pow video (TW: python) th-cam.com/video/GrNJE6ogyQU/w-d-xo.html)
Haha same, stopped the video, got pen and paper, opened the mdn documentation for & and >>= and started working out each line. This is how I imagine C-programmers work all day (probably not, I assume this level of functionality has been optimized to all hell and back by people much smarter than me in the last ~50 years)
"adding a string of a given length right away"? Because the number of chars is usually only known at runtime. So there is no string of a given length right away :D
Great video dude, would love to see more videos like this that focus around benchmarking. I really liked your benchmark videos you were doing awhile back with Rust vs TS vs Go. Another thing i'd like to see a video on, is some parsing code to an AST with rust. I've done a good bit of it with babel or jscodeshift, but it feels like another world when i'm trying to do it in Rust. Thanks again dude for all your hard work and long hours put into this stuff.
I like how much JS-Fuckery the Travvy solution managed to build in their. Hats off, I genuinely feel like I learned something just from figuring out why its so damn fast
The classic linux book called Rebel Code by Glyn Moody contains a chapter on "lies and benchmarks". Statistics and benchmarks very often are misused or abused for propaganda and marketing purposes.
This sort of power of two algorithm is really common in cryptography code where you have to do things like modular exponentiation. I think they actually use even wackier versions there with windows for doing multiple bits at once, I'd have to go read a crypto library again to be sure.
Loving the vids Prime! I stumbled upon your videos a few months ago and now I'm watching every one of them. You've inspired me to try and learn rust as well and coming from Python/Golang I feel like I'm going to get destroyed but at least I'll be able to write code that can run on literally anything!!
This was very instructive on why we shouldnt re-re-re-invent the wheel, though I am in favor of trying to re-invent the wheel to understand how some mechanics works, to learn new stuff, but at the end we should stick to what has been done and been proven to be functionnal and fast !
Yes , use ready made wheels to build spaceships. It's why leftpad caught so many off guard because downstream vendors of important material aren't given a thought , or more importantly a single dime.
That's why I like Prime, he is not afraid to be wrong, not afraid to admit that he was wrong, and has the skill to get to the bottom of the issue. Would be even more terrific if the method was explained in more detail (why medians, why sum etc), but probably his disappointing stats on longer videos mean he's forever skewing towards "funner", more dynamic, less involved, style.
Median is a weird metric - it captures an idea like "a normal run" but doesn't capture the data's skew unless the distribution of the differences from the median are symmetric, in which case the median is the same as the average because it has no skew. So, does it?
The median tends to be a lot better in these cases. Especially when it comes to the internet's. At least that is my general experience with things. But everyone has their favorite way of doing things.
It's identical to how the string.repeat (used in the one I posted) and string.padStart built-in functions work in terms of their logic. Mine's slightly slower using .repeat because there's some overhead in the process of using that over the native, but his is effectively identical in performance to the native because they're doing the same thing without that overhead.
I'm thinking would calculating standard deviation also give insight on the algorithms? Is it possible some have inherently more variance than others but it gets hidden when you shrink the results to average of medians? Like perhaps you prefer just slightly slower algorithm if the worst case scenarios are not nearly as bad. I don't know how to interpret the statistics in this sense.
Microbenchmarks are totally not misleading in most cases if you benchmark your own data structure / algorithm with no unknown magic... Like strings as weird trees and stuff are unknown magic - but GC is the biggest unknown magic among all... Of course even then a microbench might be misleading a bit - like maybe an algorithm uses more memory (and thus also more cache), but is faster. When alone... It however incurs the extra memory cost for example - but this is well understood and it always amazes me how hard is to benchmark something in a managed / gc language properly.....
Was there a submission that used the classical version? function leftPad(str, len, ch) { var strlen = str.length; return len > strlen ? new Array(len - strlen).join(ch || ' ') + str : str; } That's what people used before there was .padStart() or .repeat().
ok but I still don't get why primes implementation was slower in the micro benchmarks was it garbage collection boosting the result for the original if so why or rather how, is using requests and manually trigger garbage collection the best way, what would happen if manual garbage collection was turned off, would be see that all the results would be randomized, would change the order of these test change the result, would prime's solution still win ? this video leaves me with more questions than it answers
I want to see an explanation video by whoever the f wrote that leftpadTravvy implementation. I've been writing JS for a decade and I have absolutely no idea what he was doing.
Well this isnt really fair to prime he wrote that thing in 2 seconds on a livestream, im sure the twitter fellas did a few iterations before publishing their results
Listen man, the last time I stuck to vendor implementations I got a new gamepad instance from navigator every frame I held an input down in my electron app.
You could make a funny video shitting on Haskell strings. The default implementation is a linked list of characters, so if you want efficient strings you have to use lazy bytestrings or... There're like 4 implementations of strings which makes it hard to know which one to use
Neighbour: "Why doesn't your son go out and enjoy the sunlight?"
Parent: "Because my son is working on code that will change the world"
The son: "My left-pad shoots the fastest blanks in the west"
xDDD
The lengths this man went to to prove he's right 😂
Science!
When you know you're right! 😂
Just so his wife doesn't think he's a failure
Lengths which are powers of two!
The ultimate way to prove you were only slightly slower than everyone else is to spend 36 hours sending 50 million requests.
The chad never admits defeat 🗿
To please the Wifeagen
@@programmer1356 lmao TheWifeagen!
SPEND 36 HOURS SENDING DEEZ NUTS
@@scheimong I hope he rolled a server in Ligmu
Next competition should be about the most efficient way to measure dicts
Hard to beat this joke. 👏
HEARTY Cackling Chuckle in public.
I think it's already proven that middle out is the most efficient way to measure as well as _blow_ the dicts
@5:10 "stick with vendor implemented options and you will mostly get the best performance"
Reminds me of the joke "How do you write faster Python? Use C."
hahahaha, the comment "this cured my impostor syndrome" has me dying lollll
@ThePrimeagen Bravo! Good to see you addressing the reality of environmental factors on timing benchmarks. With my background in large-scale scientific computing on shared resources, it was the first thing I thought of when I saw you discover (on Twitch) that your live-coded leftpad was "slower". The unpredictability is real. And it's frakkin' annoying.
Very frustrating
I Also happen to be a scientist doing the science and math is cool especially when it proves things with charts. Id like to see more videos of whose chart is the fastest please
4:44 well, ackshually, Strager posted a video here on yt where he absolutely destroys cpp and rust's native hashtable speeds by implementing his own hashing function. Of course it was designed for his specific use case, but I think it should be noted that the real take away of this are the friends we made along the way.
that video was crazy, every time he increased the number of consults per seconds you would say "that's hard or impossible to improve" then he proves you wrong and repeat for 10 min more.
well he means in the context of web, C++ is different.
" destroys cpp and rust's native hashtable speeds"
I don't know about Rust but std::unordered_map is well known to be garbage
@@isodoubIet Rust uses a cryptographically safe hashing function by default, but I believe you can give it a different one.
@@andrejsk6211 You can also give a different hash function to std::unordered_map, but that's not why it's terrible. The reason is that, in order to prevent iterator invalidation, they made it so that each entry in the hash map points to a linked list containing all the values. That all but guarantees a cache miss on every access, and it's too late to fix.
Everyone needing an actual hash table in any context where performance matters will probably use the boost or abseil versions.
you should definitely do more videos like this, they are fun to watch and helps to understand who to become a batter developer
yeah he's the BAST
This reminds me of that time TechSavvyTravvy beat Prime at leftPad.
Love this full on demonstration of what Prime means when he says mucrobenchmarks lie. Also love that he used the average of medians instead of the average of averages, this man how to measure anything’s.
Sending all the signals for more of this.
In case you are curious, leftpadTravvy doubles the size of padding on every iteration, this means that only O(log(n)) string additions is performed. Although I have no idea why it is better to do it this way instead of adding a string of a given length right away.
This technique is quite cool because it can be used in various domains, for example to raise numbers to a power of n quickly
Ah, silly of me to jump to the comments without actually finishing the video
yeah I stopped to work it out too, it’s neat!
Nah because the second I saw that algorithm i’m like “oh shit that’s fast pow!” and immediately knew what it was doing, just from that one dumb bit shift
(fast-pow video (TW: python) th-cam.com/video/GrNJE6ogyQU/w-d-xo.html)
Haha same, stopped the video, got pen and paper, opened the mdn documentation for & and >>= and started working out each line. This is how I imagine C-programmers work all day (probably not, I assume this level of functionality has been optimized to all hell and back by people much smarter than me in the last ~50 years)
"adding a string of a given length right away"? Because the number of chars is usually only known at runtime. So there is no string of a given length right away :D
Great video dude, would love to see more videos like this that focus around benchmarking. I really liked your benchmark videos you were doing awhile back with Rust vs TS vs Go. Another thing i'd like to see a video on, is some parsing code to an AST with rust. I've done a good bit of it with babel or jscodeshift, but it feels like another world when i'm trying to do it in Rust.
Thanks again dude for all your hard work and long hours put into this stuff.
"All benchmarks are wrong, but some are useful."
I like how much JS-Fuckery the Travvy solution managed to build in their. Hats off, I genuinely feel like I learned something just from figuring out why its so damn fast
The classic linux book called Rebel Code by Glyn Moody contains a chapter on "lies and benchmarks". Statistics and benchmarks very often are misused or abused for propaganda and marketing purposes.
Imposter syndrome left uncured
My goal is to make sure everyone feels like they have imposter syndrome
Thank goodness you're coder, that ego was directed into the right place
I respect going this hard about anything.
On the other hand, I love weird incantations.
Good shit Prime.
Also I've come to understand V8 a lot more since this event and still have more to go. I cannot recommend trying it.
You're not a failure to me Prime! Keep em coming. HOT video.
I had to check if the video was running at x1.25 speed by accident.
No Errorbars/Standard deviations though. Can't really judge the data without them.
This sort of power of two algorithm is really common in cryptography code where you have to do things like modular exponentiation. I think they actually use even wackier versions there with windows for doing multiple bits at once, I'd have to go read a crypto library again to be sure.
Making the internet better one function at a time, this is great!
Signal. More of this please. This is gold content
5:20 dude... i.. have no idea what's happening in the code.. and i am programming since i am 15...
GREAT VIDEO SO GLAD YOU ARE BACK TO POSTING HERE
Classic Prime video, keep it up my man
Let's goo finally more second channel uploads
the only thing that matters here is that he proved that people doesn't understand basic statistics... what a great dude
Exponential string concatenation is faster because it's O(log(n))! It only has to iterate over the number's bits.
I like this type of content. It makes me feel as smart as Tom
Loving the vids Prime! I stumbled upon your videos a few months ago and now I'm watching every one of them. You've inspired me to try and learn rust as well and coming from Python/Golang I feel like I'm going to get destroyed but at least I'll be able to write code that can run on literally anything!!
This was very instructive on why we shouldnt re-re-re-invent the wheel, though I am in favor of trying to re-invent the wheel to understand how some mechanics works, to learn new stuff, but at the end we should stick to what has been done and been proven to be functionnal and fast !
Yes , use ready made wheels to build spaceships. It's why leftpad caught so many off guard because downstream vendors of important material aren't given a thought , or more importantly a single dime.
That's why I like Prime, he is not afraid to be wrong, not afraid to admit that he was wrong, and has the skill to get to the bottom of the issue. Would be even more terrific if the method was explained in more detail (why medians, why sum etc), but probably his disappointing stats on longer videos mean he's forever skewing towards "funner", more dynamic, less involved, style.
I guess imposter syndrome is back on the menu
It's 8am and my brain isn't on yet, but this was good to hear, so here's some algo juice as requested.
Median is a weird metric - it captures an idea like "a normal run" but doesn't capture the data's skew unless the distribution of the differences from the median are symmetric, in which case the median is the same as the average because it has no skew. So, does it?
The median tends to be a lot better in these cases. Especially when it comes to the internet's. At least that is my general experience with things.
But everyone has their favorite way of doing things.
It def isn't the ideal way (line charts/histograms). But it probably doesn't matter because the variability/spread is so low.
Nice, my shit talking comment made it into the video 😂 sorry Prime!
And thank you for going an extra mile to enlighten us
travvy's solution is the fastest because it does at most 2*log(N) + 1 concatenations and half of them are destructive.
It's identical to how the string.repeat (used in the one I posted) and string.padStart built-in functions work in terms of their logic. Mine's slightly slower using .repeat because there's some overhead in the process of using that over the native, but his is effectively identical in performance to the native because they're doing the same thing without that overhead.
I really enjoy how you can make more complicated topics fun
While watching this my partner said “this guy sounds like Gru” and now I can’t unhear it
let's goo finally another yt vid from my fav youtuber
Thank you, Mr. ThePrimeagen!
We really need to shout "Charts!!" like Prime did at 03:30 when present in corp
I'm at work primeagen, you can't be that funny at the end 😂
THIS IS A MARKER! Flip nice editing btw
Really great, I love that power of two algorithm and that you mentioned String.repeat does it!
Would love to see a video going into depth about how to apply these techniques to other problems.
Glad your back on the main channel!
Whoever is your editor, pay them more. They earned it with this one.
Appreciate you bro🙏
The left pad incident. I didn't know it was a thing since Prime rewrite left pad and it become a trend kekw
Toms a genius!
I'm thinking would calculating standard deviation also give insight on the algorithms? Is it possible some have inherently more variance than others but it gets hidden when you shrink the results to average of medians? Like perhaps you prefer just slightly slower algorithm if the worst case scenarios are not nearly as bad. I don't know how to interpret the statistics in this sense.
would love to see proportional difference in results between micro and linode(c) benchmarks
to see how much does actually microbenchmarks LIE :3
This is truly one of the moments of all time
blazingly fast 🤩
More of this prime!! Love from brasil
Microbenchmarks are totally not misleading in most cases if you benchmark your own data structure / algorithm with no unknown magic... Like strings as weird trees and stuff are unknown magic - but GC is the biggest unknown magic among all...
Of course even then a microbench might be misleading a bit - like maybe an algorithm uses more memory (and thus also more cache), but is faster. When alone... It however incurs the extra memory cost for example - but this is well understood and it always amazes me how hard is to benchmark something in a managed / gc language properly.....
Was there a submission that used the classical version?
function leftPad(str, len, ch) {
var strlen = str.length;
return len > strlen ? new Array(len - strlen).join(ch || ' ') + str : str;
}
That's what people used before there was .padStart() or .repeat().
Don't be a failure!
Great video :)
doing the stupid thing in good faith! Appreciate you man! :)
There is an extention called ' Blazingly™ ' which replaces words like blazing or blazingly with Blazingly ™
ok but I still don't get why primes implementation was slower in the micro benchmarks was it garbage collection boosting the result for the original if so why or rather how, is using requests and manually trigger garbage collection the best way, what would happen if manual garbage collection was turned off, would be see that all the results would be randomized, would change the order of these test change the result, would prime's solution still win ? this video leaves me with more questions than it answers
Thats why we love you prime
this is the stupid thing i do so u know that this is the type of video i like... thx for the content!
you're currently my favorite youtuber
i would actually say favorite person in the world but it seems a bit too much 🙄
You really got em. You did it bro.
Isnt the "standard" solution also (possibly) different, because it depends on the implementation each interpreter has?
Or am I missing something?
These days people think V8 is the only JavaScript engine there is, so that thought doesn't even occur to them.
3:05 why apache?
Isn't power of two string concatenation faster because the rope data structure is a binary tree?
Have a comment to my like so that the YT magic happens, you know. Be healthy.
The only reason for the leftPad node package to exist was until it was actually implemented by node js std string type.
I want to see an explanation video by whoever the f wrote that leftpadTravvy implementation. I've been writing JS for a decade and I have absolutely no idea what he was doing.
3:10, ooh, a never-nester, huh? 😀
benchmark deez 🥜
Well this isnt really fair to prime he wrote that thing in 2 seconds on a livestream, im sure the twitter fellas did a few iterations before publishing their results
He actually checked out the internal implementation of V8 string Repeat
so if i have a slow benchmark, i should change the way i benchmark to make it faster, i see
/s
Listen man, the last time I stuck to vendor implementations I got a new gamepad instance from navigator every frame I held an input down in my electron app.
What a character development arch.
I love you, Prime 💕
i understood nothing but i wish i was able to talk nerdy like you
I don't know why this was recommended to me, but thanks TH-cam I guess.
"use native", does this law/principle of superior workmanship translate to app development like tauri/flutter against swiftui?
Prime doesn't understand why recursion with tail call optimisation failed because he doesn't know that tail call optimisation only works in Safari.
He's back.
THIS A STUPID SIGNAL GR8 CONTENT I ENJOYED MY STAY
With a name like leftPadTravvy what else could you expect.
I AM SENDING SIGNALS THAT I WANT MOREARORMOAAAAAAAAAAAAAAR
Commenting for the algo to support prime
This Primeagen guy is the real deal, huh
how do you learn the deeper stuff of js? What search terms or black wizard do I have to consult
Awesome video, really enjoyed it. Its funnier if you had saw the moment im the stream xD
i never thought of this, is putting everything on a server the only solution?
You could make a funny video shitting on Haskell strings.
The default implementation is a linked list of characters, so if you want efficient strings you have to use lazy bytestrings or...
There're like 4 implementations of strings which makes it hard to know which one to use
Need more of these sorts of code analysis of your piers..Plz
Loved it! Here is a SIGNAL 🌝
I highly favour your efforts of diving deep into nitty gritty. It can inspire a lot.
I knew there is something wrong with the measuring by the time differentials
Would be very nice a video about performance bottlenecks and how tools that measure function call and whatever can be misleading