All about MEMORY // Code Review

  • Published on 23 Nov 2024

Comments • 221

  • @TheCherno
    @TheCherno 2 years ago +185

    Who's excited for part 2?
    Keep exploring at brilliant.org/TheCherno/. Get started for free, and hurry: the first 200 people get 20% off an annual premium subscription.

    • @pushqrdx
      @pushqrdx 2 years ago +2

      Can you please point me to that sweet Visual Studio color scheme you're using?

    • @heman922
      @heman922 2 years ago +3

      Plz do a series of cpu

    • @PhoenixDigitalGamer
      @PhoenixDigitalGamer 2 years ago +1

      Can u make tutorial on creation of game engine Cinematics system. Please :)

    • @mr.anderson5077
      @mr.anderson5077 2 years ago

      yes please

    • @harshsulakhe2720
      @harshsulakhe2720 2 years ago +3

      can u plz make a complete VIDEO ON ASSEMBLER like the one similar to LINKER AND COMPILER

  • @Stowy
    @Stowy 2 years ago +409

    Thanks a lot for looking at my code! For the logging, I was using spdlog, but then I removed it because I wasn't able to import it using FetchContent haha. This is very useful feedback and I can't wait for part 2!

    • @ThePowerRanger
      @ThePowerRanger 2 years ago +4

      Cheers, good luck for your classes!

    • @blazefirer
      @blazefirer 2 years ago +25

      2rd

    • @Stowy
      @Stowy 2 years ago +9

      @blazefirer English is not my first language lol, my b

    • @blazefirer
      @blazefirer 2 years ago +15

      @Stowy it's ok. I saw that there was only one reply and I would be the 2nd so I couldn't resist making the joke

    • @ohmree
      @ohmree 2 years ago +1

      I suggest taking a look at xmake as a replacement for cmake, it probably has spdlog in its repos and is just a pleasure to use in general.

  • @anon_y_mousse
    @anon_y_mousse 2 years ago +217

    As a predominately C developer, I agree with and applaud his choice of adding "pp" to the end of file names to differentiate C and C++ header/source files. They are separate and it should be noted. Arena allocators are a good idea and I've implemented several that I use in my own libraries. Heap allocation need not always be super expensive, even with "vectors", and the mitigation technique I learned years ago that still works beautifully to this day is to scale by a factor of two and always reserve memory starting at some power of two. As to the comment about logging, yes, it is a Windows "feature" to slow down by such a large factor when logging to a console. If you use a Linux distro of nearly any variety you'll be surprised by how quick the terminal updates as compared to Windows.

    • @FREAKBAlT
      @FREAKBAlT 5 months ago +2

      pp

    • @cookiecrumbzi
      @cookiecrumbzi 4 months ago

      She pp behind my file till I core dump

    • @wowyomad
      @wowyomad 4 months ago

      @FREAKBAlT see pp

    • @zahash1045
      @zahash1045 1 month ago +1

      lol pp
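
A minimal sketch of the growth strategy @anon_y_mousse describes at the top of this thread (start at a power of two, double on growth). The helper name `next_capacity` and the default minimum of 16 are assumptions made for illustration; they do not come from the comment or from any library.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: returns the smallest capacity >= required that can be
// reached by repeatedly doubling `minimum` (assumed to be a power of two).
// If `required` is so large that doubling would overflow, the loop stops
// early and the result may be smaller than `required`.
constexpr std::size_t next_capacity(std::size_t required, std::size_t minimum = 16) {
    std::size_t capacity = minimum;
    while (capacity < required &&
           capacity <= std::numeric_limits<std::size_t>::max() / 2) {
        capacity *= 2;  // doubling keeps the number of reallocations logarithmic
    }
    return capacity;
}
```

A container would call something like this when it runs out of room, so that storing n elements triggers roughly log2(n) reallocations instead of n.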

  • @paligamy93
    @paligamy93 2 years ago +7

    @8:13 I would not recommend ever starting a name with _ because it's too easy to make a mistake: "Use of two sequential underscore characters ( __ ) at the beginning of an identifier, or a single leading underscore followed by a capital letter, is reserved for C++ implementations in all scopes."
    @13:31 Not only do you want to be using pointers, but ask yourself "Do I need a hierarchy or do I just need several implementations of void ClassName::Update(float deltaTime)"? Because if you don't need a hierarchy, don't use one! Use type erasure. Yes, it's still a function pointer and a potential cache miss, but it will simplify your code structure. Now you have a folder called Entities instead of a type named Entity that everything derives from, and your type-erased entity type defines the contract a type must fulfill to be an entity instead of saying you HAVE to derive from Entity to be useful here.
    @14:51 Also known as a "cache miss", because the writer was not as concerned about "cache locality".
    @19:59 std::ios_base::sync_with_stdio(false) improves that time considerably, but C++ iostreams are notoriously slow because of all the safeguarding overhead they carry. The console is slow because it has to render, which as you know is bleh. Logging libraries are the way to go in this case, and have them output to files rather than to the console. This is a graphical program so there shouldn't be a "console out" anyway. Create a new global logger named log or something at the very least. There are multithreaded logging libraries that will attempt to put your logs in chronological order if you don't want to split them.
    @25:41 On the virtual part: The operating system will allocate you a "page" of memory when your current page is full, so it's basically the same thing as a small arena allocator, but it's much smaller than what an arena allocator will give you, and the many system calls asking the OS for more "pages" are what make allocation take so long. You're giving your CPU cycles over to the OS, and that's going to mess up your instruction cache because code that's not in your program is being called. malloc or whatever is going to be a function in a dynamic library, a.k.a. a function pointer and more cache misses. Profile your system calls! You may find more than you expect. Also align your types (this adds padding) so that when you ask for a value it doesn't have to fetch 2 cache lines because half of your object is on one line and the other half on another.
    @31:47 It looks like the number you're looking for is already computed with collision pairs as well. You seemed to know you needed to make a vector but made it too early! Make instances as close as possible to where you use them.
    @32:09 I think the multiple solver problem is something that should be handled with a template. From what I saw you don't need to change your solver dynamically at runtime with the same types. Make your solver something the compiler figures out.
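
The template suggestion at @32:09 could look roughly like the sketch below. Everything here is hypothetical: `Body`, `VelocitySolver`, `DampedSolver` and `World` are illustration names and are not taken from the reviewed project. The point is only that the solver type becomes a template parameter, so the call is resolved at compile time instead of through a virtual function.

```cpp
#include <vector>

// Hypothetical 1D body, just enough state to show the idea.
struct Body {
    float position;
    float velocity;
};

// Two hypothetical solver policies sharing the same call signature.
struct VelocitySolver {
    void solve(Body& body, float dt) const { body.position += body.velocity * dt; }
};

struct DampedSolver {
    float damping = 0.5f;
    void solve(Body& body, float dt) const {
        body.velocity *= damping;
        body.position += body.velocity * dt;
    }
};

// The solver is chosen at compile time: no base class, no pointer, no vtable.
template <typename Solver>
class World {
public:
    explicit World(Solver solver = {}) : solver_(solver) {}

    void add(Body body) { bodies_.push_back(body); }

    void step(float dt) {
        for (Body& body : bodies_) solver_.solve(body, dt);
    }

    const std::vector<Body>& bodies() const { return bodies_; }

private:
    Solver solver_;
    std::vector<Body> bodies_;
};
```

The trade-off is that `World<VelocitySolver>` and `World<DampedSolver>` are different types, so this only fits when the solver never needs to change at runtime.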

  • @simonesasso8379
    @simonesasso8379 2 years ago +127

    Yes, implementation and profiling of the optimizations would be super interesting to see!

    • @crumbled9774
      @crumbled9774 2 years ago +4

      yes yes yes. Can't wish for anything better!

    • @ChrisM541
      @ChrisM541 2 years ago +2

      Totally agree, that would be super interesting.

    • @ibrahimmahdi1299
      @ibrahimmahdi1299 2 years ago

      can't wait for a video like that from the best "TheCherno"

  • @crystalferrai
    @crystalferrai 1 year ago +11

    31:45 Good advice about preallocating vectors. If this is a function that runs every frame, I would take it a step further and make the vectors persistent. Clear them at the start of the function and reuse them. This way the memory remains allocated and keeps getting reused. Another option would be to use an auto-resetting frame allocator like you mentioned earlier. However you go about it, the main idea is to not make new heap allocations every frame.
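
A small sketch of the "persistent vector" idea from the comment above, under the assumption of a simplified 1D overlap test. `BroadPhase`, `IndexPair` and `findOverlaps` are hypothetical names, not the reviewed project's API. The scratch vector is a member, so `clear()` resets its size while the capacity from earlier frames stays available.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using IndexPair = std::pair<std::size_t, std::size_t>;

class BroadPhase {
public:
    // Intended to be called once per frame. After the first few frames the
    // vector has grown to its working size and no new heap allocation occurs.
    const std::vector<IndexPair>& findOverlaps(const std::vector<float>& centers,
                                               float radius) {
        pairs_.clear();  // size becomes 0, capacity is kept
        for (std::size_t i = 0; i < centers.size(); ++i) {
            for (std::size_t j = i + 1; j < centers.size(); ++j) {
                if (std::fabs(centers[i] - centers[j]) < 2.0f * radius) {
                    pairs_.emplace_back(i, j);
                }
            }
        }
        return pairs_;
    }

    std::size_t capacity() const { return pairs_.capacity(); }

private:
    std::vector<IndexPair> pairs_;  // persists across frames
};
```

The returned reference is only valid until the next call, which is the usual price of reusing the buffer.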

  • @mr.anderson5077
    @mr.anderson5077 2 years ago +15

    Cherno has a huge backlog of "The topic for another video", please keep it coming. Yes, we do want a CPU cache, memory fragmentation, and whatnot in the multiverse video

  • @miguelguthridge
    @miguelguthridge 2 years ago +10

    At 14:40 where you're talking about cache misses, there's a relevant article which is really good called "Your computer is not a fast PDP-11"

  • @thwKobas
    @thwKobas 2 years ago +40

    I left C++ about 7 years ago, and this brings back so many memories and a smile to my face. I've been watching your videos for a few weeks now and must say good job and keep uploading. :)

    • @tathagatmani
      @tathagatmani 2 years ago +1

      What did you switch to?

    • @matthewe3813
      @matthewe3813 2 years ago +1

      @tathagatmani probably rust or c

    • @thwKobas
      @thwKobas 2 years ago +3

      @tathagatmani Actually I switched first to Objective-C and then Swift :D Doing iOS mobile development now

    • @Alperic27
      @Alperic27 1 year ago

      c++ has evolved a looot … but he seems to be stuck in c++9x style.

  • @ThePhyskid
    @ThePhyskid 2 years ago +20

    I'd really be interested in seeing how you add the optimizations. In particular, I'd be interested in seeing how you clean up the memory used by the arena allocator once you're left with holes.

  • @Mnmn-xi6cj
    @Mnmn-xi6cj 2 years ago +17

    Would love to see you profiling this after your first look at it. I'm sure the stack allocation and growing of the vector each frame hits like a truck. That would also allow you to show some before/after benchmarks!

  • @Klusio19
    @Klusio19 2 years ago +5

    I just started learning C++, currently I (I think) finished learning OOP concepts, and this video is so interesting for me actually! The stuff about the memory and access times is pretty interesting.

  • @jonathangrahl
    @jonathangrahl 2 years ago +8

    Great topic! This has been on my mind in recent weeks while implementing my path tracer and SAH BVH, and the optimisations really add up. Especially referring to objects by index and storing them in a 1D array.

  • @jeffcummings3842
    @jeffcummings3842 2 years ago +6

    You really caught my attention when talking about the CPU cache, as I've done some work with Assembly Language programming WAY in the past, but yeah, understanding how that works is an amazing detail for optimization. OMG, great idea with the logging to a file vs console, I'm just getting to the point in my project where it's starting to become medium sized, and logging is an issue already, so great to know that logging to files is more efficient...plus the macros... it probably helps that I am watching your video at a time when I'm considering re-working my entire codebase for my main project too. LOL OMG, that's amazing that you can pre-allocate memory and pass an allocator to the vector class, I'm totally going to look into this and try it! Great video, thanks for sharing.

  • @viraatchandra8498
    @viraatchandra8498 2 years ago +3

    For simple C++ logging, you can look at the `sync_with_stdio(false)` and `std::cin.tie(NULL)` calls to accelerate your `cout` code a bit. `printf` will in general be faster though because it doesn't deal a lot with multi threaded scenarios. There are even faster ways to output logs, but of course, it's non-trivial overhead.

  • @HLCaptain
    @HLCaptain 2 years ago +12

    What I would like to see is you optimizing a project based on the recommendations you gave in this video, then comparing the results with the unoptimized solution via a profiler. Would be super interesting! Great video though! :)

  • @Basel-ll8fj
    @Basel-ll8fj 2 years ago +4

    this series is really fun to watch and very helpful

  • @sixtenhugosson
    @sixtenhugosson 2 years ago +5

    If anyone wants to learn more about memory arenas, there's a good write-up called "Untangling Lifetimes: The Arena Allocator" by Ryan Fleury.

  • @SkyCityInc
    @SkyCityInc 2 years ago +4

    This is awesome, makes me want to write my own physics engine as an exercise. Can't wait for the next video!

  • @atraps7882
    @atraps7882 2 years ago +2

    im not even a game developer, i just work on the web and the cloud doing backend stuff but this is really interesting to watch. Subbed!!

  • @ShaunYCheng
    @ShaunYCheng 2 years ago +16

    I'm not a game dev but this is still very educational.

  • @dealloc
    @dealloc 2 years ago +1

    The reason it's slow to write to stdout is that things like std::flush, std::endl and new lines ("\n") will flush the contents of the cout buffer to the terminal. This happens immediately because terminals usually have little or no buffering, so the output appears instantly. This also happens with files on disk, although it's perceived as faster because the contents aren't flushed as frequently, due to how the OS buffers them before writing to the file on disk. So it's not that terminals are slow, it's that any I/O is slow in general.
    You can avoid this by flushing the cout buffer less frequently (i.e. outside of loops) but it can be an architectural nightmare and is often not needed, since you're probably more interested in up-to-date info when debugging. Do what Cherno (and many other projects) does and use different levels of logging for more granularity.

  • @wright777
    @wright777 2 years ago

    For a better std::cout -> console performance:
    1. Call std::ios_base::sync_with_stdio(false);
    2. Call std::cin.tie(nullptr);
    3. Use '\n' instead of std::endl

  • @fellypsantos_
    @fellypsantos_ 2 years ago +4

    extremely valuable knowledge passed here, thanks Cherno ♥

  • @StevenMartinGuitar
    @StevenMartinGuitar 2 years ago +2

    Would def love to see you profile this and then implement the optimisations and profile again! (threading, arena, allocators, less heap etc) great video!

  • @squelchedotter
    @squelchedotter 2 years ago +3

    I wouldn't expect that the virtual memory thing matters all that much considering current CPUs don't prefetch across page boundaries anyway. But things like huge pages do have advantages in terms of TLB lookups and hit rates.

  • @mementomori7160
    @mementomori7160 2 years ago +2

    I really liked this video, all in for part 2

  • @uploadschedule
    @uploadschedule 2 years ago +2

    in the moment now i dont have time to watch it. But later i will watch this vid and im sure its interesting because videos about how the hardware components work etc are always a thing i like learning about :D

  • @douglasullman
    @douglasullman 1 year ago +1

    I've been loving your stuff and gotta say the plug for brilliant is brilliant ! I'm going to check that out. Thank you so much Sir.

  • @tolkienfan1972
    @tolkienfan1972 2 years ago +3

    Often the dependencies between chained pointers are more important than the fragmentation. I.e. you could explicitly construct a linked list in contiguous memory, but iterating will still involve the CPU waiting for each load to complete before it can calculate the next pointer. Iterating over the exact same nodes, but using an index instead of the next pointers, will be much faster. The CPU can prefetch the cache lines.
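
To make the comment above concrete, here is a sketch of the two traversals over the same contiguous nodes. `Node`, `makeContiguousList`, `sumByPointer` and `sumByIndex` are hypothetical names. Both functions return the same sum; the difference is only in how the next address is obtained, and any speed difference has to be measured on the target machine.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    int value;
    Node* next;
};

// Builds a linked list whose nodes sit next to each other in one vector.
std::vector<Node> makeContiguousList(std::size_t count) {
    std::vector<Node> nodes(count);
    for (std::size_t i = 0; i < count; ++i) {
        nodes[i].value = static_cast<int>(i);
        nodes[i].next = (i + 1 < count) ? &nodes[i + 1] : nullptr;
    }
    return nodes;
}

// Pointer chasing: the address of the next node is only known after the
// current node has been loaded, so each load depends on the previous one.
long long sumByPointer(const Node* head) {
    long long sum = 0;
    for (const Node* node = head; node != nullptr; node = node->next) {
        sum += node->value;
    }
    return sum;
}

// Index iteration: every address follows from the base pointer and the
// index, so the loads are independent of each other.
long long sumByIndex(const std::vector<Node>& nodes) {
    long long sum = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        sum += nodes[i].value;
    }
    return sum;
}
```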

  • @MrFlyingChip
    @MrFlyingChip 1 year ago

    Haven't seen this in the comments, so will leave it. There's an article called "What Every Programmer Should Know About Memory". It explains in detail how the CPU works with memory, how RAM works, why it's so slow, and why CPU cache memory is so fast. I really recommend reading it (you only need to read the first 3-4 chapters).

  • @IkeVoodoo
    @IkeVoodoo 2 years ago +4

    Great video, though each time I wish that we could see the final optimized version of the project :D

  • @Sebanisu
    @Sebanisu 1 year ago +1

    Just realized you are still doing code reviews and this one had 3 videos. So Now I got my afternoon planned out heh.

  • @-infality
    @-infality 2 years ago +5

    Regarding the slow Windows terminal you may be interested in Casey Muratori's videos about it and his refterm prototype project

    • @Macuyiko
      @Macuyiko 2 years ago

      Was going to mention that as well. He goes into some interesting details about conhost if I remember correctly which is doing a lot of crazy things that make consoles slow on Windows.

  • @nikeedev
    @nikeedev 2 years ago +2

    I read C++ standards 2 months ago, and it said that C++23(C++2b) will support .h file as standard header file. It doesn’t mean that .hpp shouldn’t be used, but .h will be supported because it was before planned to phase it out, but as it was used a lot within C but also C++ they will keep it

    • @ultimatesoup
      @ultimatesoup 1 year ago

      You can actually get rid of headers entirely if you use modules

  • @Spartan322
    @Spartan322 1 year ago

    Terminal logging is slow in C++ because most streams, especially cout, tend to flush constantly, whereas most file logging implementations in C++ don't flush immediately on every input.

  • @jef777
    @jef777 2 years ago +4

    This main function looks so nice. I wish mine could look so inviting.

  • @Beatsbasteln
    @Beatsbasteln 2 years ago +1

    this was fascinating. can you make a video about how to make an arena allocator and then show how you use it when creating vectors?

  • @SC2Villares
    @SC2Villares 2 years ago

    Why is that channel so good? Humanity deserves it? Oh my, what a gift!

  • @disieh
    @disieh 26 days ago

    My background is mobile games. I'll still say any build system is better than just VS/Xcode/Android Studio/plain makefiles/shellscripts. Despite CMake being a pain, I'd still recommend learning it because it tends to be most common and most supported by IDEs and toolchains. Despite the wild west of build systems, it tends to be most common.
    The most important reasons why to use a build system is getting support for new IDEs automatically and the ability to add linters, static analysis, fuzzing and unit tests easily to your project later. I've worked with too many projects where you're stuck with ancient versions of VS, no tests because nobody figured out how to add them (and code being brittle because of that).
    The absolute worst thing you can do is end up with a build process where devs use one process to make local builds and completely separate set of tools to make CI builds.

  • @on-hv9co
    @on-hv9co 2 years ago +1

    I do something very similar with that log macro. It's essentially just an X macro that wraps cerr and uses the ANSI color codes. From there DLOG and RLOG are called and will log their respective debug/(sparse) release

  • @F1nalspace
    @F1nalspace 2 years ago +2

    Nice project and good talk about memory improvements! Memory arenas and transient memory are great and my most used techniques when I do programming these days.
    If you are interested, I have a similar physics project (2D fluid simulation) that is a little bit more complex, due to its multi-threading + integrated benchmark support and 4 versions of C++ styles, where I tried to show the difference between naive/from-the-book C++ programming and data-oriented programming, but didn't get it exactly right, especially the data-oriented part. Just give me a hint and I will send you the details.

  • @cyphre117
    @cyphre117 2 years ago +1

    Would be great to hear you talking about Static vs Dynamic libraries!

  • @DiamondWolfX
    @DiamondWolfX 4 months ago

    A logging setup I've been messing with has the message simply sent to a queue, where a separate thread pulls from the queue and actually logs the thing

  • @thehambone1454
    @thehambone1454 2 years ago +1

    Would love a video about the CPU cache and the related!

  • @aaron6807
    @aaron6807 2 years ago

    FINALLY! I'VE BEEN WAITING FOR THIS EPISODE FOR AGES

  • @enigma7791
    @enigma7791 2 years ago +3

    Yes if you could look at your optimisations and the effect on performance that would be really cool! Often I spend too much time optimising code for very little return. EDIT...but I do note the FPS is massive here anyway so it is difficult to quantify if it's worth it. Maybe throw in something that really puts a strain on the FPS and see the optimisations make it smooth again? Either way great code Stowy and great review Cherno.

  • @shalip
    @shalip 2 years ago +1

    please release a video where you implement your suggestions. It would be so GREAT !!

  • @christopherprobst-ranly6357
    @christopherprobst-ranly6357 1 year ago

    Strong argument to use hpp: A potential user does not need to think about extern "C". If it's .hpp, it can be included only and directly in Cpp. .h leaves a lot of room for speculation. Can you import it from C? Can you import it from Cpp? Do you NEED to call extern "C"? It's there for a reason.

  • @sethmoore5903
    @sethmoore5903 2 years ago +1

    I'm curious how the actual defragmentation process works in a game engine and how it affects performance in a simulation where we have lots of circles dying

  • @ricardopieper11
    @ricardopieper11 2 years ago

    This is the 1th The Cherno video I watch

  • @Overminddl1
    @Overminddl1 2 years ago

    Logging to console in Windows is indeed substantially, like Substantially slower than on Linux. However, there are ways to speed it up as well, both by using Microsoft's new terminal and by buffering in the program instead of flushing every single log immediately. Still not as fast as on Linux, but it helps a ton.

  • @darioabbece3948
    @darioabbece3948 2 years ago +1

    The project: c++ gameplay
    The cherno explanations: c++ lore

  • @MrDenniable
    @MrDenniable 2 years ago

    @19:45 About the huge time consumption of logging... You should check out Trice! It speeds up your logging performance on embedded systems :)

  • @simonkufeld7903
    @simonkufeld7903 2 years ago

    this channel should have more subs

  • @deconline1320
    @deconline1320 2 years ago +2

    We see it often in code, but in C++ it's not a good idea to start a variable identifier with an underscore. Some combinations of single/double underscore identifiers are reserved for the compiler implementation by the C++ standard. I would avoid it completely.

  • @BradenBest
    @BradenBest 2 years ago +1

    I wouldn't worry about fragmentation. It's the heap allocator's job to worry about managing that. And in the general sense, as long as you free memory in the opposite order that you allocated it, fragmentation will not be a problem. I say this as someone who has implemented malloc+free in C. To get a memory leak from allocator fragmentation, you would have to do some insanely stupid things. Of course don't just allocate willy nilly from the heap if you don't have to. Heap allocation carries a performance overhead because when malloc has to get more memory, it has to do so via a system call, which means a context switch, which is slow. That's the `sys` metric given by the `time` command.
    Regarding specifically what is said in the video, where you go into low level machine details like the CPU cache, I especially wouldn't worry about that, because that's premature optimization. Worry about choosing efficient algorithms, not about how the machine accomplishes a task. That's the compiler's job. Turn on that -O3 flag. Or -Ofast if you're not worried about slightly less precise math. Sometimes you can justify low level optimizations, like when the Quake devs implemented the fast inverse square root using low level floating point math. But then look what happened--the chipset manufacturers and compiler vendors caught up. Nowadays, the quake inverse square root is no faster (and sometimes slower) than code that a compiler will generate for a more straightforward algorithm. I do not recommend wasting your time optimizing for hardware. The compiler has already done it and you can save a lot more time by choosing a better algorithm. C (and by extension C++) is not a low level language, and your computer is not a fast PDP.

    • @BradenBest
      @BradenBest 2 years ago +1

      A big problem with that argument is the assumption that the pieces of data necessarily will be fragmented. It's "whataboutism" taken to the extreme. But let's look at an average case where you allocate 100 small objects using a heap allocator: the heap allocator has a free pool of memory, so it slices a chunk off for both the object and the bookkeeping node to manage that memory, and updates the other node to account for the borrow. It does this over and over again until 89 objects in, the pool doesn't have enough memory. So the allocator will do a context switch asking for more memory. The memory comes from the heap, so it will be adjacent to the previous memory, but it will continue to allocate memory until all objects are allocated. The allocator is smart, it doesn't want to waste CPU time by making a bunch of syscalls to allocate tiny blocks of memory, so it does them in bulk. Pages and pools of memory that it marks up and manages. If the addresses were wildly spread out, that would mean the allocator is allocating random pages for every single allocation request, and all those context switches would be a far worse bottleneck than a cache miss. But as it turns out, the heap grows upward. The addresses are all fairly close together.
      Now, you can optimize your code to assume that the allocator allocates a huge chunk of memory that's all close together, or you can optimize it to assume that the addresses will be far apart, but in the end, that's all you're doing: assuming. The standard says nothing about how the allocator is implemented. Don't assume. Write better algorithms. If the compiler thinks your array of structs will be more efficient if it turns it into individual arrays of the one element you access, it will do exactly that. That's the ultimate lesson: the compiler is better at optimizing than you are.

  • @TuxikCE
    @TuxikCE 2 years ago +4

    Pls bring more of these code reviews!

  • @billynugget7102
    @billynugget7102 2 years ago

    C++ ALREADY HAS AN ARENA ALLOCATOR. It works for all std structures/containers, even vector. It's called PMR (std::pmr).
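
A minimal C++17 sketch of what the comment above refers to, assuming a 4 KiB stack buffer is enough for the data. The function name `sumUsingArena` and the buffer size are illustrative choices; `std::pmr::monotonic_buffer_resource`, `std::pmr::vector` and `std::pmr::null_memory_resource` are the standard facilities from `<memory_resource>`.

```cpp
#include <array>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Fills a vector whose storage comes from a fixed stack buffer and returns
// the sum of its elements. With null_memory_resource() as the upstream,
// running out of buffer space throws std::bad_alloc instead of silently
// falling back to the heap.
long long sumUsingArena(int count) {
    std::array<std::byte, 4096> buffer;  // assumed size, pick to fit the data
    std::pmr::monotonic_buffer_resource arena(buffer.data(), buffer.size(),
                                              std::pmr::null_memory_resource());

    std::pmr::vector<int> values(&arena);
    values.reserve(static_cast<std::size_t>(count));  // one bump allocation
    for (int i = 0; i < count; ++i) values.push_back(i);

    long long sum = 0;
    for (int v : values) sum += v;
    return sum;
}  // the arena releases everything at once when it goes out of scope
```

A monotonic resource never reuses memory freed by individual deallocations, which suits per-frame or per-function scratch data better than long-lived containers that grow and shrink.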

  • @Amish_Avenger
    @Amish_Avenger 2 years ago

    you suggested allocating things like rigid body to the stack because of cpu optimizations but shouldn't the programmer worry about space? Are you banking on the fact that vectors allocate on the heap contiguously? Or should there be a specific buffer created or contiguous heap memory?

  • @IgnoreSolutions
    @IgnoreSolutions 1 year ago

    I’m surprised you didn’t mention the fact that variables starting with just an underscore are considered reserved by the language.

  • @rajpootmhm
    @rajpootmhm 1 year ago

    Please make a video on handling big data
    Along with memory management and time complexity

  • @codemastercpp
    @codemastercpp 2 years ago

    For speeding up console output
    You can unsync with stdio
    ```
    #include <iostream>
    // Call once at the start of main(), before any I/O:
    std::ios_base::sync_with_stdio(false);
    std::cin.tie(nullptr);
    ```

  • @kursatyakupkukul7670
    @kursatyakupkukul7670 2 years ago +1

    Wow, really enjoyed this one as a non game/game engine developer!

  • @draco5991rep
    @draco5991rep 1 year ago

    I just started programming in C and I wonder a lot about when to use the heap and when to use the stack. Because I am more comfortable using the stack, I predominantly put all data onto the stack. Is there an easy rule of thumb to when use one or the other?

  • @christopherprobst-ranly6357
    @christopherprobst-ranly6357 1 year ago

    Logging on Linux/macOS: Yes, their terminals are orders of magnitude faster than Windows. The reason is that they are implemented totally differently and the Console on Windows is just slow. I read somewhere why it's hard to change. But files are always faster, that's true.

  • @odarkeq
    @odarkeq 2 years ago

    11:33 The webcam picture quality begins to tank because of the video encoding all the little gaps between so many moving circles. It's interesting to see a non-FPS-related side-effect appear while testing FPS-related benchmarks.

  • @MosiuoaF
    @MosiuoaF 2 years ago

    Thank You!

  • @Thomas_Lo
    @Thomas_Lo 2 years ago

    Cool reference video for quite a lot of topics. Works well as a refresher :-)

  • @0xCAFEF00D
    @0xCAFEF00D 1 year ago

    25:45
    Does it really work like this? That you have fragmentation in any perceivable way? I thought with virtual memory you're not taking any penalty in reading across pages beyond taking more TLB space because you have multiple pages. Is there any gain in having the actual pages be contiguous?

  • @frankhaugen
    @frankhaugen 2 years ago

    The reason why writing to the console is slow is that Windows assumes a window, so it's written through the UI interop, while file writing is just bits on disk

  • @m3taldragon1
    @m3taldragon1 2 years ago

    Certain IDEs require you to use hpp vs just h if you are using any C++.

  • @featherless656
    @featherless656 2 years ago +2

    I wish I could find the motivation and smarts to be able to do stuff like this

  • @davidcmoffatt
    @davidcmoffatt 1 year ago

    There are more benefits to contiguous data storage. Cutting down on TLB misses and VM page misses comes to mind.

  • @SETHthegodofchaos
    @SETHthegodofchaos 2 years ago

    15:20 Is there a difference between a "Entity Component" system and a "Entity Component System" system/architecture? Both can be implemented with a data-oriented memory layout, correct?

  • @sherazali8691
    @sherazali8691 2 years ago +1

    About logging, can we just create a static class and call its function to log something there (through parameters)
    like:
    Logger.Log(_currentFps);
    and in our release build, we just comment out all the statements in that function.
    We would still have an overhead of calling that function and passing parameters, but is it okay to do it like this?

    • @nickgennady
      @nickgennady 2 years ago

      It’s more simple and straightforward to setup sure but you have to keep commenting and uncommenting every time you want to change build type and you have to remember to do that.
      His macro way is much better.

    • @user-dh8oi2mk4f
      @user-dh8oi2mk4f 2 years ago

      I would be quite surprised if your compiler left the function call to an empty function with max optimization

    • @nickgennady
      @nickgennady 2 years ago

      @user-dh8oi2mk4f fair. Did not think of that
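
For comparison with the "comment it out in release" idea in this thread, here is a rough sketch of the macro approach. The names `ENGINE_LOG` and `ENGINE_ENABLE_LOGGING` are made up for this example (the video's actual macro is not reproduced here); the assumption is that the build system defines `ENGINE_ENABLE_LOGGING` for debug builds only.

```cpp
#include <iostream>

// Hypothetical switch: defined by the debug build configuration only.
#ifdef ENGINE_ENABLE_LOGGING
    #define ENGINE_LOG(expr) do { std::clog << expr << '\n'; } while (0)
#else
    // The argument is discarded by the preprocessor, so it is never
    // evaluated and costs nothing in a release build.
    #define ENGINE_LOG(expr) do { } while (0)
#endif

// Stand-in for a value that is costly to compute; it counts its calls so
// the effect of the macro can be observed.
inline int expensiveDiagnostic(int& calls) {
    ++calls;
    return 42;
}
```

With logging disabled, `ENGINE_LOG("fps: " << expensiveDiagnostic(calls));` leaves `calls` untouched. An empty function would usually be optimized away too, as noted in the replies, but its arguments may still have to be evaluated if they have side effects.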

  • @MrSandshadow
    @MrSandshadow 2 years ago +1

    23:50 it's called 'placement new'
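
For readers who have not met the term, a short C++17 sketch of placement new: constructing an object inside storage that was obtained earlier. `Circle` and `CirclePool` are hypothetical illustration types and are not the code shown at 23:50.

```cpp
#include <cstddef>
#include <new>

struct Circle {
    float x, y, radius;
    Circle(float x_, float y_, float r) : x(x_), y(y_), radius(r) {}
};

// Fixed-capacity pool: the raw bytes exist up front, objects are
// constructed into them on demand.
class CirclePool {
public:
    static constexpr std::size_t kCapacity = 64;

    CirclePool() = default;
    CirclePool(const CirclePool&) = delete;
    CirclePool& operator=(const CirclePool&) = delete;

    ~CirclePool() {
        // Objects created with placement new are destroyed by calling the
        // destructor explicitly; the storage itself is not freed here.
        for (std::size_t i = 0; i < count_; ++i) {
            std::launder(reinterpret_cast<Circle*>(storage_ + i * sizeof(Circle)))->~Circle();
        }
    }

    // Returns nullptr when the pool is full.
    Circle* create(float x, float y, float r) {
        if (count_ == kCapacity) return nullptr;
        void* slot = storage_ + count_ * sizeof(Circle);
        ++count_;
        return new (slot) Circle(x, y, r);  // placement new: construct, do not allocate
    }

    std::size_t size() const { return count_; }

private:
    alignas(Circle) unsigned char storage_[kCapacity * sizeof(Circle)];
    std::size_t count_ = 0;
};
```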

  • @roz1
    @roz1 2 years ago

    @Cherno We can do calloc rather than malloc which will be a contiguous allocation .... that can help but still it can't beat the stack memory.

    • @TheCherno
      @TheCherno 2 years ago

      Both calloc and malloc return a contiguous allocation of memory - there’s actually very little difference between how those two work

  • @GautamSharma-un3cr
    @GautamSharma-un3cr 2 years ago

    Please make a video on how to exploit cache lines and CPU cache in order to build blazing fast applications

  • @kuroakevizago
    @kuroakevizago 2 years ago +2

    Thanks, you're giving me a heads up on what to do next. I'm probably going to start making a 2D Physics Engine.
    Thanks btw got your brilliant discount :)

  • @nathantonning
    @nathantonning 2 years ago

    Great code review.

  • @ciCCapROSTi
    @ciCCapROSTi 3 months ago +1

    I have no idea why you'd want a pointer there when you KNOW which implementation you use. Hell, why does the class hierarchy even exist? Just use a member variable, not a pointer.

  • @freandtuber
    @freandtuber 2 years ago

    Maybe it's time to have a look into OpenMP for loading and shaping allocated memory 🤔

  • @mobslicer1529
    @mobslicer1529 2 years ago

    with logging what i do is for stuff that gets called all the time i only log failures so you know what happens with those but don't flood the log.

  • @ValinorFP
    @ValinorFP 1 year ago

    Great video, thank you! In modern C++, is heap memory fragmentation a concern for developers, given that the OS uses virtual memory to map to physical memory? My hypothesis is that even if physical RAM is fragmented, but virtual memory is contiguous, the C++ program's performance will not be affected.

    • @majormalfunction0071
      @majormalfunction0071 1 year ago

      Maybe or maybe not. CPUs don't prefetch across page boundaries, probably because of kernel-side page permissions / residency state. The more pages you access, the more TLB slots you use. TLB misses hurt, but maybe not to the level of framerate problems. It's an extra memory access, paid serially. Huge TLB requires defragmented memory on the kernel-side, and has a system-wide limit. Running kernel code to change page residency really hurts. It's many instructions, and a possible disk access.

  • @lionkor98
    @lionkor98 2 years ago

    You can log into a queue, and then flush the queue on a separate thread
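
    The queue-plus-worker idea can be sketched like this. It is a minimal illustration, not any particular library's design: `AsyncLogger` and its members are hypothetical names, and the "sink" is just an in-memory vector standing in for slow console or file output.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical asynchronous logger: callers enqueue, a worker thread drains.
class AsyncLogger {
public:
    AsyncLogger() : worker_([this] { run(); }) {}
    ~AsyncLogger() { stop(); }

    // Cheap for the caller: lock, push, notify.
    void log(std::string message) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(std::move(message));
        }
        wake_.notify_one();
    }

    // Drains whatever is still queued, then joins the worker. Safe to call twice.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

    // Stand-in for the real sink. Only read this after stop() has returned.
    const std::vector<std::string>& written() const { return written_; }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
            while (!pending_.empty()) {
                std::string message = std::move(pending_.front());
                pending_.pop();
                lock.unlock();  // do the slow write without blocking callers
                written_.push_back(std::move(message));
                lock.lock();
            }
            if (stopping_) return;
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::string> pending_;
    std::vector<std::string> written_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist first
};

// Small usage helper: logs three messages and reports how many were written.
std::size_t log_three_and_count() {
    AsyncLogger logger;
    logger.log("first");
    logger.log("second");
    logger.log("third");
    logger.stop();
    return logger.written().size();
}
```

    The hot path only pays for a mutex lock and a push; the expensive write happens on the worker, which matters when the console itself is the bottleneck.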

  • @TGAPOO
    @TGAPOO 2 years ago

    Leading underscores are reserved in Microsoft code. You should never use a leading-underscore variable name if you expect to work on Windows. Prefer a trailing underscore if you must.

  • @cloud9sl98
    @cloud9sl98 2 years ago

    WORKING thx bro

  • @unkgames-abdullahali4048
    @unkgames-abdullahali4048 2 years ago +1

    Physics engine: is an engine about physics!! 👍👍👍

  • @roykapon181
    @roykapon181 2 years ago +1

    Is a std::vector with preallocated size a decent way to implement this kind of memory management? Or do you need to do it manually? I'm a C++ newbie, so please don't roast me :)
    By the way, a great video! Looking forward to part 2.

    • @Larock-wu1uu
      @Larock-wu1uu 2 years ago

      I am curious about this as well

    • @roykapon181
      @roykapon181 2 years ago

      I forgot to note that it will probably not work well with deleting items (I guess that for this we need a more sophisticated method)...
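
      A rough sketch of what the question above describes, under stated assumptions: `Body` is a made-up element type, and both helper names are illustrative. Calling `reserve` up front means later `push_back` calls do not reallocate until the size exceeds the reserved capacity, and the deletion concern can often be handled with swap-and-pop when element order does not matter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type, for illustration only.
struct Body {
    float x, y, vx, vy;
};

// Counts how many times the vector's capacity changed (i.e. it reallocated)
// while appending `count` bodies.
std::size_t count_reallocations(std::size_t count, bool reserve_first) {
    std::vector<Body> bodies;
    if (reserve_first) bodies.reserve(count);
    std::size_t reallocations = 0;
    std::size_t previous_capacity = bodies.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        bodies.push_back(Body{});
        if (bodies.capacity() != previous_capacity) {
            ++reallocations;
            previous_capacity = bodies.capacity();
        }
    }
    return reallocations;
}

// O(1) removal that keeps storage contiguous but does not preserve order:
// overwrite the removed element with the last one, then drop the last slot.
void remove_unordered(std::vector<Body>& bodies, std::size_t index) {
    bodies[index] = bodies.back();
    bodies.pop_back();
}
```

      Swap-and-pop changes which index an element lives at, so anything holding indices or pointers into the vector has to be updated; that is one reason engines often move to handles or a dedicated pool.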

  • @ByChris
    @ByChris 2 years ago

    How comfortable would you feel about making a C++ Graphics course for udemy?

  • @cjjavellana
    @cjjavellana 2 years ago +2

    What IDE are you using?

    • @felipheallef
      @felipheallef 2 years ago +1

      Visual Studio

    • @dealloc
      @dealloc 2 years ago

      Visual Studio, but I believe an older version like 2019.

    • @rdxrid
      @rdxrid 2 years ago

      Visual Studio 2022

    • @rdxrid
      @rdxrid 2 years ago

      @dealloc Nah, it's 2022

    • @theRPGmaster
      @theRPGmaster 2 years ago +1

      @dealloc It should be noted that Visual Studio is not the same as Visual Studio Code

  • @nullbeyondo
    @nullbeyondo 2 years ago

    If he just uses a clock for the delta time instead of a fixed time-step, that means his physics engine is not deterministic and thus will produce different results every time he runs a simulation.

    • @Stowy
      @Stowy 2 years ago +4

      Yes, I didn't know that at the time, but I'm working on networking at the moment, so I realized that mistake. I'll definitely be careful about that if I ever do something like this again haha
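
      For reference, the fixed time-step approach mentioned above is commonly written with an accumulator. This is a minimal sketch, not the reviewed engine's code: `World`, `step`, and `FixedStepper` are hypothetical names, and the "physics" is a single position update.

```cpp
#include <cassert>

// Hypothetical simulation state, for illustration only.
struct World {
    double position = 0.0;
    double velocity = 1.0;
    int steps = 0;
};

// One simulation step of exactly `dt` seconds.
void step(World& world, double dt) {
    world.position += world.velocity * dt;
    ++world.steps;
}

// Accumulates measured frame time and advances the world only in whole
// fixed-size steps, so every step sees the same dt regardless of frame rate.
class FixedStepper {
public:
    explicit FixedStepper(double fixed_dt) : fixed_dt_(fixed_dt) {}

    void advance(World& world, double frame_time) {
        accumulator_ += frame_time;
        while (accumulator_ >= fixed_dt_) {
            step(world, fixed_dt_);
            accumulator_ -= fixed_dt_;
        }
    }

private:
    double fixed_dt_;
    double accumulator_ = 0.0;
};
```

      In a game loop, `frame_time` would come from the clock each frame. Feeding every step the same `dt` removes the frame-rate dependence; bit-identical results across machines can still require extra care with floating-point behavior.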

  • @xxdeadmonkxx
    @xxdeadmonkxx 2 years ago

    I really want to know how you would deallocate an item from a custom memory pool (arena?)
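
    One common answer, sketched under assumptions: a plain arena (bump allocator) usually does not free individual items at all and instead releases everything at once, whereas a fixed-size pool can recycle single slots through a free list. The `Pool` class below is a hypothetical illustration of the latter, not the code from the video.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool: one up-front buffer, freed slots are chained
// into a singly linked free list stored inside the unused slots themselves.
template <typename T>
class Pool {
    union Slot {
        Slot* next;                                   // used while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // used while it holds a T
    };

public:
    explicit Pool(std::size_t capacity) : slots_(capacity) {
        for (Slot& slot : slots_) {
            slot.next = free_head_;
            free_head_ = &slot;
        }
    }
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    // Pops a free slot and constructs a T in it; returns nullptr when full.
    template <typename... Args>
    T* create(Args&&... args) {
        if (free_head_ == nullptr) return nullptr;
        Slot* slot = free_head_;
        free_head_ = slot->next;
        return new (slot->storage) T(std::forward<Args>(args)...);
    }

    // Destroys the object and pushes its slot back onto the free list.
    void destroy(T* object) {
        object->~T();
        Slot* slot = reinterpret_cast<Slot*>(object);
        slot->next = free_head_;
        free_head_ = slot;
    }

private:
    std::vector<Slot> slots_;
    Slot* free_head_ = nullptr;
};
```

    Both `create` and `destroy` are O(1) and never touch the general-purpose heap after construction. The pool does not track live objects, so anything still allocated when the pool is destroyed will not have its destructor run.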

  • @bu3778
    @bu3778 2 years ago

    damn this was a nice review

  • @stephenkamenar
    @stephenkamenar 2 years ago

    When did the concept of Random Access Memory die? Modern performance programming is literally all about not randomly accessing memory.
    But I know that wasn't the case on older computers.

    • @fleroviux
      @fleroviux 2 years ago +1

      When CPU speeds started to exceed RAM speeds, so cache (small memory that is close to the CPU core and much faster than external RAM) and prefetching became a necessity.
      Without that, RAM would constantly bottleneck the CPU.
      Often, depending on the type of memory used, sequential accesses to memory can also be faster than random/non-sequential access, because the address doesn't need to be transmitted and decoded for every access.

  • @Amitkumar-dv1kk
    @Amitkumar-dv1kk 2 years ago

    Do you also review Java code, or is it only C++?

  • @andreidumitras4237
    @andreidumitras4237 2 years ago

    What color scheme do you use?
    Awesome video, by the way.

  • @cameleon5724
    @cameleon5724 2 years ago

    One content, two languages. What I have now written may have a perfect mirror in another language. You can create a program that searches for the perfect language mirror. Thanks to this, you will be able to speak two languages and perform tasks in the shade. Endless enigmatic book in all languages. You can write a book with mirrors in all languages of the world. You can speak two languages at once, you just need to find the perfect reflection, same content, different translation. Infinite Mirrors. Pi 3.14 XBooks. Hybrid language. The algorithm flows through our heads, endless coding, just take off the chameleon masks. Connect words without spaces and you will find hidden tasks in all languages. Our conversations collide in the process, some words as well as numbers in words. We perform tasks hidden between words. You can create a Python coding language from a spoken language. You just need to find the mirrors. Two tongues glued together.

  • @nuDeltaTech
    @nuDeltaTech 2 years ago

    I use __ for static and single _ for members.