Memory, Cache Locality, and why Arrays are Fast (Data Structures and Optimization)

แชร์
ฝัง
  • เผยแพร่เมื่อ 25 มิ.ย. 2024
  • Why is the first loop 10x faster than the second, despite doing the exact same work?
    Follow me on:
    Twitter: / iced_coffee_dev
    Github: github.com/simondevyoutube/
    In this video we talk a bit about memory, the cpu caches (specifically the L1/L2/L3 cache), hardware prefetching, and how all this comes together for arrays. We'll be covering each of these topics in enough detail for you to get a solid understanding, working through real world examples to help illustrate the point. We'll talk about the most fundamental data structure, arrays, how they work, what situations they're great in, and when they suck. We'll also touch on some algorithmic complexity. Finally, we'll be talking about why understanding this is important and how this leads in to more advanced topics and data structures.
    What's covered:
    * How memory allocation works, memory addresses
    * Contiguous memory
    * CPU Caches, L1/L2/L3 cache
    * Hardware prefetching
    * Array operations, what's fast and what isn't
    * Closing thoughts, why this is important to understand, how this relates to more advanced data structures
  • วิทยาศาสตร์และเทคโนโลยี

ความคิดเห็น • 385

  • @simondev758
    @simondev758  3 ปีที่แล้ว +81

    If you enjoyed this, help support me: www.patreon.com/simondevyt
    As for the question at the beginning, as a first hint, look at how the indices for the arrays are generated in both loops. Write out a few, and think about the pattern in memory. Actual answer below.
    Both loops touch all memory exactly once.
    First loop generates indices sequentially (ie. 0, 1, 2, 3, 4, ...), meaning memory accesses will also be sequential.
    Second loop uses a large stride (0, 4000, 8000, 12000, ...), so cache misses will be common.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      @@heinzerbrew It's the indexing, just unroll the loop a few times to confirm. Sequential, predictable memory access vs jumping around, main topic of video. Php? Don't know much about it, other than it's an interpreted language. Guessing the overhead of that drowns out other things.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      @@heinzerbrew Again, you can't use php to test this. And this has nothing to do with memory allocations. Additionally, please don't use the pinned comment for this.

    • @hww3136
      @hww3136 3 ปีที่แล้ว +1

      would a modern c/c++ compiler be smart enough to detect and optimize it?

    • @simondev758
      @simondev758  3 ปีที่แล้ว +10

      This is a pretty contrived example. I tried to think of the absolute, most basic thing that would illustrate the point without needing to build a lot of extra boilerplate code around it. But in this example, using "gcc -O3" doesn't optimize it, but gcc -Ofast does, because it disables strict standards compliance. So it does sorta catch it, but is held back because you have to explicitly tell the compiler that you don't care much about the order. Remember that floating point operations are NOT associative, so it tends to be the compiler reordering them without explicit permission from the programmer is a no-no. V8 doesn't seem to catch it, but can't say for sure why.
      So the difference in timings between the loops should still largely come down the difference in memory access patterns.
      We could make the setup a little more complex with some structs with useless info to pad them out, or an array vs linked list, or something to that effect. Maybe it would have been a better choice? Would be more legit, but harder to parse for beginners. I dunno, I'm learning how to present this as I go heh.

    • @Victor_Marius
      @Victor_Marius 2 ปีที่แล้ว

      @@hww3136 what if you intend to iterate that way? You'll get a bug

  • @WojtekKozowski
    @WojtekKozowski 3 ปีที่แล้ว +481

    “why the code on the top runs faster? watch the video and find out” ...and no mention of the code again. Otherwise nice video.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +218

      Hah yeah, I feel kinda stupid for not looping around, live and learn. I'll continue the series and make sure I actually answer it next time!

    • @metagen77
      @metagen77 3 ปีที่แล้ว +22

      @@simondev758 Nah you managed to explain without relying on code. If anything cut the intro.

    • @bonehelm
      @bonehelm 3 ปีที่แล้ว +85

      One is accessing the array in row major order, and the other is accessing it in column major order. The top one is faster because the memory is being accessed in a contiguous way, which means you'll get a lot of cache hits (very fast, going to main memory much less). The bottom is is NOT accessing memory in a contagious way, meaning every access is a cache MISS. en.wikipedia.org/wiki/Row-_and_column-major_order

    • @tagg1080
      @tagg1080 3 ปีที่แล้ว +10

      @@bonehelm maybe I am blind but the only thing he switched was the + and *

    • @bonehelm
      @bonehelm 3 ปีที่แล้ว +27

      @@tagg1080 That's exactly the only change. He's using a formula to treat a 1D array as if it were a 2D array. But the order of + and * determines whether it's accessed row by row, or column by column.

  • @User-ty2ml
    @User-ty2ml 2 ปีที่แล้ว +21

    Best Director Award goes to Simon, i have never seen such a beautiful explanation without answering question, amazing!!!!

  • @vitalino1981
    @vitalino1981 3 ปีที่แล้ว +73

    "let's head to apple store. I don't actually own one, because they are super expensive and I'm super cheap, but I like to look at them" 😂🤣😆
    Man this was soooo relatable on so many levels 😂

  • @xeridea
    @xeridea ปีที่แล้ว +48

    You didn't explain why the original code posted was 10x slower. The reason being that the method in which the array was effectively traversed was not sequential or easily predictable, causing a lot of cache misses.

  • @sukkrad7797
    @sukkrad7797 3 ปีที่แล้ว +16

    This cleared up so many things that my professor failed to teach me (or even understand it himself)
    We truly need more videos explaining how the cpu works when we code
    Thank you so much!

    • @simondev758
      @simondev758  3 ปีที่แล้ว +2

      Nice, lemme know if there's other subjects I should cover too.

  • @MrLazini
    @MrLazini ปีที่แล้ว +11

    This channel is pure gold man. Although I already know most of these concepts, you explain them pretty clearly and in a very understandable way ! Well done friend, subbed!

  • @Chadderbox
    @Chadderbox 3 ปีที่แล้ว +195

    I feel like this is going to be an awesome series, especially for people like me who learnt programming by themselves and never really had a formal education teaching them about what parts of your code does under the hood.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +93

      I did study computer science, but none of this was covered. This is mostly from years of experience learning at companies and applying these optimizations.

    • @carterearles9528
      @carterearles9528 3 ปีที่แล้ว +1

      Me too!

    • @Ainigma
      @Ainigma 3 ปีที่แล้ว +8

      @@simondev758 i share this experience. in computer science, we also never covered that. only when i had to do projects and facing performance optimization problems, i was motivated to tackle the fundamentals of data structures and how to squeeze more and more efficiancy out of my code.

    • @barddz4646
      @barddz4646 3 ปีที่แล้ว +6

      @@Ainigma wait really? I’m at the first semester of computer engeneering and i’ve alterady learned about this, Idk if the way they teach is different here in Brazil or not

    • @simondev758
      @simondev758  3 ปีที่แล้ว +18

      Different universities teach different curricula. I went to a university that's considered pretty good (ie. often included in top lists), the focus was much more on theory and mathematics. I mentored a few Stanford grads while I was at Google, came off with the same impression. Super strong bases in theory/math.
      Also, computer engineering != computer science. I can totally believe that an engineering oriented curricula would learn way more about the underlying hardware.

  • @Rssks
    @Rssks 3 ปีที่แล้ว +117

    Wow this video is almost 10 minutes but it felt like 1 minute! I really enjoyed it (Y) will definitely rewatch this later today!

    • @simondev758
      @simondev758  3 ปีที่แล้ว +5

      Awesome!

    • @Provitia
      @Provitia 3 ปีที่แล้ว +1

      The video became x10 faster for you

    • @psychopompous489
      @psychopompous489 ปีที่แล้ว

      I was jumping around randomly and it felt like 10 minutes

  • @osambrojevanisam3575
    @osambrojevanisam3575 3 ปีที่แล้ว +18

    This has got to be the best educational video on youtube, fuckimg amazing

    • @simondev758
      @simondev758  3 ปีที่แล้ว +4

      Awesome, glad you enjoyed it!

  • @AllAboutCode
    @AllAboutCode 3 ปีที่แล้ว +34

    4:20 you ain't cheap bruh, your content is way better than expensive internet courses.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +3

      ty! If you have suggestions for future vids, let me know.

    • @AllAboutCode
      @AllAboutCode 3 ปีที่แล้ว +3

      @@simondev758I'd love this series just keep doing it

    • @valencefootball9740
      @valencefootball9740 3 ปีที่แล้ว +1

      hahaha weed number

    • @masterofdizzzaster
      @masterofdizzzaster 3 ปีที่แล้ว +1

      @@simondev758 I hope you will grow so much that you will be able to afford one of those macs without feeling its price, i watch your videos for quite some time now and you deserve it.

  • @cuteswan
    @cuteswan ปีที่แล้ว +2

    I dutifully paused the video at the beginning and took _way_ too long to spot the differences between the code samples. (I've never been much of a programmer.) After getting it I still learned a lot more from your discussion on the caches and all that. Thanks.

  • @EpsilonKnight7
    @EpsilonKnight7 4 หลายเดือนก่อน +1

    Posting this for anyone else out there like me.
    As someone who knows JUST enough code to be dangerous, I feel dense for not getting it sooner.
    My brain absolutely looked at the two for loops, saw them moving through the same numbers, and even though I recognized there was a difference in the last line, my thought process collapsed to "alright so you move through all the same array positions eventually," and I spent ten minutes trying to figure out, on my own, why changing the order of operations would make it faster while ignoring, y'know, the actual result of iteratively running the loops.
    The first comment I saw mentioned row vs column major ordering, and even then, I thought "that's silly, it's a 1D array!" and failed to make the connection.
    Took me seeing someone list the resulting indices in order to finally click. Despite so much of video focusing on contiguousness.
    So for anyone else that did the same, we're a special kind of stupid, but at least not a unique kind.

    • @simondev758
      @simondev758  4 หลายเดือนก่อน

      It's hard, which is why optimization is often a specialty.

  • @kosnk
    @kosnk 3 ปีที่แล้ว +1

    An amazing series start, Simon! Thank you for this dive into memory allocation/access details, very helpful! Surely looking forward to a (contiguous?) continuation and your other quick projects & cool tips! Very kind of you to put in so much work and share it. Again, thanks!

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      np, glad you enjoyed it!

  • @aufkeinsten7883
    @aufkeinsten7883 3 ปีที่แล้ว +2

    you are made for creating videos like these, I could watch this for hours, thanks so much for the content

  • @thegibusguy4969
    @thegibusguy4969 ปีที่แล้ว +2

    Hey, thanks for giving me a good explanation on what cache hits/misses were. I've seen it being tossed around but never knew what it meant until now.

  • @mahmoudhammmad8089
    @mahmoudhammmad8089 3 ปีที่แล้ว +6

    AMAZING!!!
    Waiting to watch the rest of this series

  • @eddiemuller3157
    @eddiemuller3157 3 ปีที่แล้ว +6

    Good Stuff, my man! The explanation is very clear and the dry humor is appreciated. I went into this video not knowing what a contiguous array was and now I have an idea of what it is.

    • @eddiemuller3157
      @eddiemuller3157 3 ปีที่แล้ว

      Question; I understand that the L1, L2, and L3 caches reside on the CPU itself. Is there a typically a controller on the CPU like a northbridge? Or is it the CPU itself that access the caches hence the low latency; no middle man.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      That's awesome! That's right, caches typically reside on the CPU. They're also built with a different kind of memory (SRAM), faster. Northbridge is the interface between CPU & main memory and resides on motherboard. That's my understanding anyway.

  • @katarinazivkovic3102
    @katarinazivkovic3102 3 ปีที่แล้ว +4

    This is explained so clearly, I'm now going to binge watch your videos 😃 Thanks to YT for recommending me your channel. Subscribed! 🌟

  • @tonnose
    @tonnose 3 ปีที่แล้ว +1

    I don't remember I watched this video before until I saw the L1/2/3 Cache example. The drawing is making the video so interesting. Now I come back for the new videos after finishing my algorithm final exam!

  • @danishfayaz8077
    @danishfayaz8077 3 ปีที่แล้ว +1

    Absolutely loved it. Just the right way to explain a concept top to bottom. Keep it up

  • @pranjalagnihotri6072
    @pranjalagnihotri6072 2 ปีที่แล้ว +1

    I love how he goes into depth of things, please continue your great efforts sir 🔥

  • @miteshkumar5557
    @miteshkumar5557 3 ปีที่แล้ว +24

    Arrays are also slow when you have multiple threads, each bound to its own CPU, trying to access a global array. Suppose you have an array of size 4 int32_t, and 4 threads labeled 0 to i. The i th thread accesses the i th index in the array and modifies it repeatedly. This will cause major slowdowns as each modification will mark the cache of the other 3 threads/CPUs as modified, and each thread will have to do extra work to update their cache, even though the value that the thread cares about isn't being modified. This essentially makes the parallel code sequential. A linked list would be much better to use here since the memory locations will not be adjacent to each other, or each thread can modify a local variable and only write to the array at the end.

    • @xeridea
      @xeridea ปีที่แล้ว +5

      I have code that uses multiple threads to generate large arrays using perlin noise, and get near perfect scaling with threads. If you were doing something that multiple threads had tons of modifications, the L3 cache would still be valid, as that is shared between cores, so it wouldn't be that big of an issue. The bigger issue would be multiple threads constantly updating the same data causing race conditions, or bizarre behavior due to how memory reads and writes work.

  • @sva3338
    @sva3338 3 ปีที่แล้ว +3

    Holy f. This is so good, youtube finally recommend me something that worth my time. Please, release more episode!!!

    • @simondev758
      @simondev758  3 ปีที่แล้ว

      Second video in the series is the linked list one.

  • @mahima7638
    @mahima7638 2 ปีที่แล้ว +1

    Loved this video! Will enjoy watching the whole series definitely.

  • @als5321
    @als5321 3 ปีที่แล้ว +1

    This was amazing. THANK YOU. More like this for Javascript. Keep it coming and don't stop!

  • @bakerfx4968
    @bakerfx4968 3 ปีที่แล้ว +2

    I wasn’t ready for it to be over!

  • @nathanp6928
    @nathanp6928 3 ปีที่แล้ว +3

    I have a final tomorrow over this stuff and this video just happened to pop into my recommended. Probably cause I was googling virtual memory, caching, etc. Super informational video, and I was able to study and be entertained at the same time. I’m a fan 👍🏼

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      Good luck on your final!

  • @Greatbubba747
    @Greatbubba747 3 ปีที่แล้ว

    This is a very great high-level view of arrays and CPU caching! I wish I had this video as an introduction when I was taking my Computer Systems class in college!

  • @pauljordan3064
    @pauljordan3064 3 ปีที่แล้ว +1

    Great explanation. Clear and concise. Thanks!

  • @Dionny
    @Dionny 3 ปีที่แล้ว +48

    But why is the code example from the very start of the video faster than the other example? I feel like that wasn't covered or I missed the point.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +21

      Youness covered it well, but I left it open as sort of an "exercise for the viewer". Kinda wanted the video to be in depth enough that you could answer it yourself.

    • @mike.emery.
      @mike.emery. 3 ปีที่แล้ว +42

      @@simondev758 Leaving as an exercise to the viewer is fine, but as a suggestion for the format it would be good to circle back to the question at the end of the video and call it out. The way it's structured now it seems like the question is just forgotten about.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +34

      Great feedback Mike, 100% agree! Tried it out but next time I'll make sure to close the circle.

    • @bonehelm
      @bonehelm 3 ปีที่แล้ว +2

      One is accessing the array in row major order, and the other is accessing it in column major order. The top one is faster because the memory is being accessed in a contiguous way, which means you'll get a lot of cache hits (very fast, going to main memory much less). The bottom is is NOT accessing memory in a contagious way, meaning every access is a cache MISS. en.wikipedia.org/wiki/Row-_and_column-major_order

    • @TiDuZzO
      @TiDuZzO 3 ปีที่แล้ว

      @@bonehelm I still don't get it, both FOR starts looping on x and then y. So both should have the same speed. The only difference here is that on the first one you do the multiplication before the addition... And on the second one the other way around... And it doesn't seems to be a continuous problem here....

  • @honey661
    @honey661 3 ปีที่แล้ว +2

    Even if I know this topic, your video was so interesting that time has passed unnoticed and I watched whole video. Wow

  • @pist5343
    @pist5343 3 ปีที่แล้ว +15

    Awesome! Looking forward to this series :)
    It's really not easy teaching about memory and data structures without the audience falling asleep, but you've done a great job!
    Unfortunately JS arrays are messy and not exactly contiguous... :(

    • @simondev758
      @simondev758  3 ปีที่แล้ว +7

      Hah yeah, I actually had a small section on JS arrays in there just to clarify about them, but cut it due to time (and laziness).
      You're totally right, they're non-contiguous "array-like" objects. The TypedArray is closer to what you'd get in C++ or something.

  • @boxicool
    @boxicool 3 ปีที่แล้ว +2

    Man. Your channel (even i don't use js so much) is freaking awesome!! I would buy your course or book without hesitation! Thank you!

    • @simondev758
      @simondev758  3 ปีที่แล้ว

      No courses at the moment, I just put it all online for free :)

  • @pinekel8987
    @pinekel8987 ปีที่แล้ว +1

    Thank you so much for creating this series, this is what ( even now ) I think more programmers ( and sometimes myself ) should pay attention to.

  • @kik8bg
    @kik8bg 3 ปีที่แล้ว

    A simple and interesting explation of arrays, I came here to watch this video just to make sure I knew exactly how everything works. Having hardware classes with programming coursesreally, does help out a lot to understand computers, both on hardware and software level.

  • @RobertAlexanderParraCiro
    @RobertAlexanderParraCiro 3 ปีที่แล้ว +1

    Problably the best vide have ever watched about how arrays works

  • @tomi.mocnik
    @tomi.mocnik 3 ปีที่แล้ว +2

    Got it as a recommendation, instantly subscribed.

  • @nissieln
    @nissieln 3 ปีที่แล้ว +1

    Superb series! thank you for the content!

  • @legendary9145
    @legendary9145 3 ปีที่แล้ว +1

    your videos deserve a way bigger audience

  • @mehmedkumali1135
    @mehmedkumali1135 3 ปีที่แล้ว +1

    Very helpful video, you explained everything clearly.. Reminded me of my university teacher back in the day. Thank you for the upload!

    • @simondev758
      @simondev758  3 ปีที่แล้ว

      Awesome, if you have suggestions on topics to cover too, let me know

  • @shaurya9055
    @shaurya9055 3 ปีที่แล้ว +3

    If you'd have been my tuition teacher I'd have been a topper in my class
    Your way of explaining is so cool man👌
    I'm inspired

  • @closery
    @closery 3 ปีที่แล้ว +1

    Sir, your channel is a gem! thx for these videos

  • @JNunoCoast
    @JNunoCoast 3 ปีที่แล้ว +1

    Already knew these concepts but still an amazing video, definitely helpful to cs students!

  • @ezequielgarrido51
    @ezequielgarrido51 3 ปีที่แล้ว +1

    Great series!

  • @shivankchopra8552
    @shivankchopra8552 2 ปีที่แล้ว

    Thanks for sharing this amazing tutorial!

  • @stLmpp
    @stLmpp 3 ปีที่แล้ว

    Your videos are amazing! Thank you for this content

  • @djflugel79
    @djflugel79 3 ปีที่แล้ว +3

    8:24 the animation made me smile

  • @BipinOli90
    @BipinOli90 3 ปีที่แล้ว +2

    Loving this series man. Please keep going. Would love to see more of real world optimization like you did previously. But yeah fundamentals need to be cleared up first.
    Btw I remember there was one cpu level hack which exploited this eager loading behavior in cpu. Some private part of memory was located by measuring the access speed, if the access was faster that meant the memory location just after current location was already cached by the cpu even though later on it was restricted to actually use due to other privacy constraints.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      More real world optimization? Sure, I can probably swing that.
      Sounds a lot like Meltdown: en.wikipedia.org/wiki/Meltdown_(security_vulnerability)

    • @BipinOli90
      @BipinOli90 3 ปีที่แล้ว

      @@simondev758 ah yeah, its Meltdown.
      Thanks man. Love your videos.

  • @inconito7281
    @inconito7281 3 ปีที่แล้ว +1

    This video was amazing, you have my subscription

  • @rewrittenbytes1616
    @rewrittenbytes1616 3 ปีที่แล้ว +6

    Arrays are fast, except when they aren’t ;)
    Great video!

  • @xicheung7867
    @xicheung7867 3 ปีที่แล้ว

    Your content PLUS the video production = 🙌 SUBSCRIBED

  • @_DD_15
    @_DD_15 2 ปีที่แล้ว

    Nice video, thanks for that!

  • @afsarzan
    @afsarzan 3 ปีที่แล้ว

    love your explanation and examples..

  • @yac7571
    @yac7571 ปีที่แล้ว +1

    Great job, I like your style. Easy like and sub 💯

  • @Chapali9a
    @Chapali9a 3 ปีที่แล้ว +1

    Learned a lot. Thank you

  • @yourfutureself4327
    @yourfutureself4327 ปีที่แล้ว +1

    6:02 snipe'ns a good job mate

  • @almazmusic
    @almazmusic 3 ปีที่แล้ว +1

    Okey, you got me :) awesome videos!

  • @perelmanych
    @perelmanych 3 ปีที่แล้ว +1

    For all those of you who are wondering why he accessed 1d array like if it is 2d, here is the real world example. Assume that in C++ you write a class that represents a matrix. You can go with 1d array or something like vector< vector< int > >, which you can think of as being 2d array. So why the hack then to use 1d array. There are many pros of using this approach. One of them is much faster work of constructor and destructor. Assume that you need to allocate memory for NxM matrix m. With 1d array it is one call of malloc (which asks the system at allocate a chunk of memory for you). With 2d array you should call malloc N times, once for each row. The same logic applies when you want to free memory. So now once we chose to represent our matrix as 1d array, assume that the user of our class wants to sum all of its elements. The user has no idea how did we choose to represent the matrix under the hood, he only knows that he can access an element of the matrix by using operator() as m(x, y). Your operator() in turn will have to calculate the position of an element by evaluating y * M + x and in the essence this will be exectly the same code as he showed in the video.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      Awesome explanation, exactly what I had in mind when making the example.

  • @binaryburnout3d
    @binaryburnout3d ปีที่แล้ว +1

    Here king, you dropped this. . .

  • @IgorSadovskii
    @IgorSadovskii 3 ปีที่แล้ว +1

    Omg is very useful, thanks a lot !

  • @AviPars
    @AviPars ปีที่แล้ว +1

    Nice video ! Thanks

  • @calebkrauter4027
    @calebkrauter4027 8 หลายเดือนก่อน

    Great video!

  • @jonasls
    @jonasls 3 ปีที่แล้ว +1

    I want more!! Can't wait

  • @giorgibatsiashvili4270
    @giorgibatsiashvili4270 3 ปีที่แล้ว +1

    i found this channel and its briliant

  • @piyushjaware
    @piyushjaware 2 ปีที่แล้ว

    Hi Man. Very nice explanation. I was curious how you made the animations in your video.

  • @NamLe-to2db
    @NamLe-to2db 3 ปีที่แล้ว +1

    I love this. Thank you for sharing.

  • @raimanish4843
    @raimanish4843 3 ปีที่แล้ว +2

    this was very helpful. thank you

  • @bwanamaina
    @bwanamaina 3 ปีที่แล้ว +1

    Well explained 👏👌👍

  • @FullThrottleTaco
    @FullThrottleTaco ปีที่แล้ว +1

    This video kicks ass.

  • @trieutruong2410
    @trieutruong2410 3 ปีที่แล้ว +1

    Great video. I’m your fan for now

  • @DARKCOP2011
    @DARKCOP2011 3 ปีที่แล้ว

    This is super nice!

  • @hardwired89
    @hardwired89 3 ปีที่แล้ว +2

    more js content sir love it❤️ thank you

  • @tuckerbeauchamp8192
    @tuckerbeauchamp8192 3 ปีที่แล้ว

    Your videos are awesome

  • @IrwinRodriguez
    @IrwinRodriguez 3 ปีที่แล้ว +1

    excellent explanation! new sub!

  • @prodsefv
    @prodsefv ปีที่แล้ว +1

    thanks simon

  • @count_of_pizza
    @count_of_pizza 3 ปีที่แล้ว

    You are my guru man

  • @apoorvaranjan9921
    @apoorvaranjan9921 3 ปีที่แล้ว +1

    This blew my mind

  • @amimulahsan1289
    @amimulahsan1289 3 ปีที่แล้ว +6

    Wow, man. I mean literally you just made it simple; not like other people who go through so much technical stuff and use buzz words.

    • @simondev758
      @simondev758  3 ปีที่แล้ว +3

      I mentored a junior guy recently who did that, spoke entirely in jargon and buzzwords. I always wondered if he specifically practiced doing that.

  • @gligoradrian784
    @gligoradrian784 3 ปีที่แล้ว +1

    Took me a good minute to find the difference between the 2 codes.

  • @tomg0
    @tomg0 3 ปีที่แล้ว +2

    Awesome Stuff!

  • @irfanukani
    @irfanukani 3 ปีที่แล้ว

    Awesome content ❤❤

  • @HersonBagay
    @HersonBagay 3 ปีที่แล้ว +5

    My only formal training in programming is an elective course in C in college. I've loved it so much I end up writing device drivers for baremetal code running on top of ARM Cortex M processors as a hobby. There is no substitute for learning the fundamentals. It pains me to see some universities starting Computer Science students with a course in Java...

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      My first exposure to programming was Java in my intro to computer science class heh :P

    • @HersonBagay
      @HersonBagay 3 ปีที่แล้ว +1

      Some people are just naturally good at logic and programming. We can count ourselves in that group

  • @harshitjoshi3082
    @harshitjoshi3082 3 ปีที่แล้ว +1

    Great video 🔥👍

  • @obsoletepowercorrupts
    @obsoletepowercorrupts 3 ปีที่แล้ว

    There are pros and cons to this worth noting. It can be a valid form of redundancy to have an array as a 'look-up' to speed up data coming from a query in a relational database (rather than hitting the query over and over). That helps performance sometimes. However, the cache vulnerabilities can be exploited. An array does not have the same benefits of a linked-list in terms of being able to specify pointers for pass by address versus pass by reference. Also though, in some languages a linked list can be a problem as it needs objects, and that means garbage collection (Java) can happen at anytime, thereby making performance hard to predict and sometimes rather stuttering.

  • @lilwage6910
    @lilwage6910 4 หลายเดือนก่อน

    How does the processor look up data in the L-caches? I thought it was looking not for specific data, but for whatever data is in a specific data address. Do the L-caches keep the data alongside the address of that data in memory, where it came from?

  • @agsystems8220
    @agsystems8220 ปีที่แล้ว +1

    Great video. You may have forgotten to revisit and explain in detail the code at the start though...

    • @simondev758
      @simondev758  ปีที่แล้ว

      Yes... yes I did heh. Answer is pinned.

  • @nkululekomthimkulu1248
    @nkululekomthimkulu1248 2 ปีที่แล้ว

    I just subbed. I will be checking out your videos.

  • @vishwaratna
    @vishwaratna 3 ปีที่แล้ว

    I liked this channel so far. Keep making please...,🤩

  • @staplepin8099
    @staplepin8099 3 ปีที่แล้ว +2

    Amazing channel

  • @mfam98
    @mfam98 2 ปีที่แล้ว +1

    Had a chair on advanced computer architecture, this covered some weeks on a 10 min video. Wish I had this kind of resources back then. Also had to make a simulator with several caching sizes and strategies, good times 😂

    • @simondev758
      @simondev758  2 ปีที่แล้ว +1

      Ooh that simulator sounds kinda fun to play with.

  • @Skyefaux
    @Skyefaux 3 ปีที่แล้ว

    More of this please

    • @simondev758
      @simondev758  3 ปีที่แล้ว

      Working on the next one right now actually.

  • @khusamalfas2121
    @khusamalfas2121 3 ปีที่แล้ว +1

    Thank you)))

  • @hamsam789
    @hamsam789 3 ปีที่แล้ว +1

    This was really clear and easy to understand, thanks

  • @ahn_buguei
    @ahn_buguei 3 ปีที่แล้ว +1

    Thanks! I couldn't understand cache misses or contiguous arrays before your video. It felt like 3Blue1Brown of Computer Science

    • @ahn_buguei
      @ahn_buguei 3 ปีที่แล้ว +2

      I just applied this on my internship and I solved a serious bottleneck!!! Thanks Simon

    • @simondev758
      @simondev758  3 ปีที่แล้ว

      That's really cool! What kind of problem were you solving?

    • @ahn_buguei
      @ahn_buguei 3 ปีที่แล้ว +1

      I optimized some existent numerical C++ code (for signal processing) from 420ms per iteration to 45ms, but the goal was 30ms per iteration. There is a giant nested for loop that was taking 44ms to run. So I applied the change specified in the video to reduce cache misses. Now it's taking 22ms. Maybe I could further optimize it by writing a special data structure (probably just a linked list), but that's good enough for the manager by now and he has other priorities for me.

  • @abdelrahman5094
    @abdelrahman5094 3 ปีที่แล้ว +1

    Great explanation, thanks for your effort
    I hope to see future videos on how javascript garbage collection work behind the scenes

    • @simondev758
      @simondev758  3 ปีที่แล้ว +1

      Ooh that'd be an interesting subject.

    • @BipinOli90
      @BipinOli90 3 ปีที่แล้ว +1

      I think it uses the reachability idea. If there is no way a part of memory can be reached it will be garbage collected.
      They didn't use reference count approach because that doesn't handle 2 pieces referencing each other but are completely isolated and not reachable.

  • @denisrastegaev608
    @denisrastegaev608 3 ปีที่แล้ว +1

    Perfect!

  • @Nostrum84
    @Nostrum84 5 หลายเดือนก่อน

    Never even saw the difference in those highlighted for loops in the beginning, until I saw the answer :(

  • @meto4545
    @meto4545 ปีที่แล้ว

    Great video! But I still do not understand the Intro example, is the array in the bottom a dynamic array or just the top one a hash table?

  • @codeit9502
    @codeit9502 3 ปีที่แล้ว

    May you tell me which recorder and editing software you use, Because mine are completely bonkers.