Optimizing Away C++ Virtual Functions May Be Pointless - Shachar Shemesh - CppCon 2023

  • Published on 28 Sep 2024
  • cppcon.org/
    ---
    github.com/Cpp...
    We all know that Virtual Functions Should Be Avoided. A great many tutorials exist for replacing virtual functions with compile-time polymorphism mechanisms, such as std::variant and templates. But is that feeling justified? Are virtual functions truly slower? By how much? Does it matter for your particular use case? What costs do their alternatives carry?
    In this lecture we'll try and understand where that impression came from, what virtual functions do that make them slower and how all of that interacts with modern CPU architectures. We'll also explore the limits of benchmarks for answering those questions.
    This lecture may not supply you with answers, but it will supply you with better questions.
    ---
    Shachar Shemesh
    Shachar Shemesh has been programming computers since the 8-bit era, and still finds passion in it today. Shachar's professional career has taken him to security, networking, storage and video streaming.
    Outside his professional career Shachar is... also programming. He has several open source projects to his name, and is lately working on creating his childhood computers, from scratch, on low-cost FPGAs.
    Shachar also plays the saxophone. Not necessarily well.
    __
    Videos Filmed & Edited by Bash Films: www.BashFilms.com
    YouTube Channel Managed by Digital Medium Ltd: events.digital...
    ---
    Registration for CppCon: cppcon.org/reg...
    #cppcon #cppprogramming #cpp

Comments • 111

  • @sampro454
    @sampro454 6 months ago +96

    10/10 "who am I?" slide

    • @ujin981
      @ujin981 6 months ago

      he's definitely no Casey Muratori th-cam.com/video/tD5NrevFtbU/w-d-xo.htmlsi=yJA2QgKbbV1urmS5

    • @embeddor3023
      @embeddor3023 6 months ago +11

      This is by far the best speaker introduction slide I have ever seen.

    • @X_Baron
      @X_Baron 6 months ago +3

      The ending too: Let's take this outside, bring it on!

    • @garethma7734
      @garethma7734 6 months ago

      11/10 benchmark slide

  • @PaulTopping1
    @PaulTopping1 6 months ago +45

    The important takeaway from this whole talk, IMHO, is that design issues should determine whether you should use virtual functions, not performance considerations. Only when something doesn't perform well enough for the task should you worry about eliminating virtual functions and only as part of a comprehensive performance analysis where that is only one of many strategies under consideration. Makes sense to me.

    • @BlackBeltMonkeySong
      @BlackBeltMonkeySong 6 months ago +5

      If you write code this way, you may find that virtual functions proliferate through your codebase in a way that makes them impossible to remove. For example, Google heavily uses protobufs even though they are at least twice as slow, and sometimes 10x slower, than modern C++ without virtual functions. It's just too hard to fix.

    • @TheOnlyAndreySotnikov
      @TheOnlyAndreySotnikov 6 months ago +3

      On this, I can quote Alexandrescu: "Easy on yourself, easy on the code: All other things being equal, notably code complexity and readability, certain efficient design patterns and coding idioms should just flow naturally from your fingertips and are no harder to write than the pessimized alternatives. This is not premature optimization; it is avoiding gratuitous pessimization."

    • @Evan490BC
      @Evan490BC 6 months ago

      @BlackBeltMonkeySong The question to ask is: do you really need OO programming and runtime polymorphism? Modern C++ is moving away from those concepts and more towards functional programming and composition.

    • @AlfredoCorrea
      @AlfredoCorrea 6 months ago

      @BlackBeltMonkeySong Actually, even heavily OO code bases can benefit from variant, since the two approaches are not contradictory. What I did in the past was to take a (closed) inheritance system and make variants out of the leaf classes. The benefits were immediate, since visit overloads could take advantage of the (static) commonality between alternatives, and the instances could be created without heap allocations and stored in STL containers.
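A minimal sketch of the approach described in the comment above, assuming two made-up leaf types (`Square` and `Rect` are hypothetical, not taken from the commenter's code base). It shows leaf objects held by value in a `std::vector` and dispatched through `std::visit` with a generic lambda:

```cpp
#include <cassert>
#include <variant>
#include <vector>

// Hypothetical leaf classes of a closed hierarchy (illustrative names only).
struct Square { double side; };
struct Rect   { double width, height; };

// The closed set of alternatives takes the place of a base-class pointer.
using Shape = std::variant<Square, Rect>;

// Plain overloads play the role of the virtual function's overrides.
inline double area(const Square& s) { return s.side * s.side; }
inline double area(const Rect& r)   { return r.width * r.height; }

// Shapes are stored by value in one contiguous buffer; no per-object heap allocation.
inline double total_area(const std::vector<Shape>& shapes) {
    double sum = 0.0;
    for (const Shape& s : shapes) {
        // The generic lambda is instantiated once per alternative.
        sum += std::visit([](const auto& leaf) { return area(leaf); }, s);
    }
    return sum;
}
```

For example, `total_area({Square{2.0}, Rect{3.0, 4.0}})` adds 4 and 12. Whether this is faster than a virtual call is exactly the kind of question the talk says to measure in context rather than assume.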

  • @alskidan
    @alskidan 6 months ago +38

    My conclusions: benchmarking is hard, especially when you don’t know what you’re measuring 🤣

    • @CompuSAR
      @CompuSAR 6 months ago +7

      My conclusion was that benchmarks are meaningless. Almost always. It has less to do with how well you know what you're doing and more to do with how complicated the hardware optimizations are. Even taking a piece of code you actually use and isolating it will change its performance.
      So don't benchmark. Profile.

  • @RPG_Guy-fx8ns
    @RPG_Guy-fx8ns 6 months ago +6

    If you are making systems that handle large amounts of data quickly, they ideally should not be object oriented or use virtual functions or inheritance. They should be data oriented and parallelized, like a GPU particle system. Functions and data should be separated, and functions should act on packed arrays of data organized by traits, not objects. Avoiding cache misses is not pointless, and removing inheritance and virtual functions is only part of the solution. This can speed up your code by very significant amounts, especially in video games or large simulations.
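A minimal sketch of the "packed arrays, functions separate from data" layout the comment above describes. The `Particles` type and `integrate` function are hypothetical names chosen for illustration, and nothing here is measured:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle system laid out as a structure of arrays.
struct Particles {
    std::vector<float> x, y;    // positions
    std::vector<float> vx, vy;  // velocities
};

// A free function walks the packed arrays: no per-particle object, no virtual call.
inline void integrate(Particles& p, float dt) {
    assert(p.x.size() == p.vx.size() && p.y.size() == p.vy.size());
    for (std::size_t i = 0; i < p.x.size(); ++i) p.x[i] += p.vx[i] * dt;
    for (std::size_t i = 0; i < p.y.size(); ++i) p.y[i] += p.vy[i] * dt;
}
```

Each loop touches only the arrays it needs, which is the property the comment credits for fewer cache misses.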

  • @ArthurGreen-bw3sb
    @ArthurGreen-bw3sb 6 months ago +16

    The thing about which cases are "natural" for inheritance seems to be an important issue. If you've spent a decade writing Java and using GoF patterns, then inheritance becomes the natural solution for a lot of problems that you might use other techniques for otherwise.

    • @theodoredokos8145
      @theodoredokos8145 6 months ago

      I think this is a very good point. The Java dev’s thoughts could look very different compared to a C library developer, for example.
      The fact that we have modern languages that deliberately eschew inheritance is an important indicator of that subjectivity, as well.

  • @CartoType
    @CartoType 6 months ago

    A very good presentation. Clear, to the point, correctly paced. And a surprising and useful conclusion.

    • @CompuSAR
      @CompuSAR 6 months ago

      I find your use of the word "conclusion" baffling, but thank you anyways.

  • @LarryOsterman
    @LarryOsterman 6 months ago +3

    A *great* exploration. And a graphic demonstration of why doing performance analysis in the world of modern processors is unbelievably hard (edited to remove a reference to benchmarking, since it's a broader area of challenges).

    • @CompuSAR
      @CompuSAR 6 months ago

      Thank you.

  • @colonelmax1
    @colonelmax1 6 months ago +7

    Those benchmarks are not representative at all. No warmup, no CPU power state and frequency locking, ....

    • @CompuSAR
      @CompuSAR 6 months ago +4

      Warmup? They run millions of times. Frequency locking? If you don't frequency lock your actual production code, why would frequency locking the benchmark do anything?
      The premise of this criticism is that you want to use the benchmark to isolate just this one operation out of the whole. But with modern CPUs, the CPU is the one mixing everything together. If you succeed in isolating a single part, the results are going to be even more meaningless than usual.

    • @TheArech
      @TheArech 6 months ago

      Though I gave this a thumbs up, I also have to note that his benchmarks are representative of the actual real-world systems that could execute the code: no one will do a warmup, lock CPU frequency scaling, power handling, etc. just to run some tools.
      So this is exactly as he said: the results are perfectly valid for his own two PCs, and he can neither explain them nor scale them. Doing the benchmarking properly could help to explain the results, but the issue of scaling (probably the most important one) still remains.

    • @許國讚
      @許國讚 6 months ago +2

      You are missing his point. When your program runs in reality, it won't be in such a tightly controlled environment. Hence, whatever you learn from such a tightly controlled environment won't matter.

    • @meneldal
      @meneldal 5 months ago

      @CompuSAR If you want to compare how two things perform, you have to remove external variables that add noise to your measurements.
      The only way to get a noiseless benchmark is to use a hardware simulator, where you can get cycle-accurate counts of how much time something is taking (and you're stubbing all the outside environment factors like temperature). If you can't get that, you could run on bare metal, which frees you of an OS injecting tons of noise into your measurement, and disabling the variable clock in the CPU would then give you something pretty consistent.
      The great thing with hardware simulators is that you can clearly observe the effects of a cache miss on the latency of something like an interrupt call, e.g. the first interrupt was 800ns slower because of a bunch of cache misses (the firq handler instructions, the handler function array and the handler function itself). Obviously you can't do that for every machine, since x86 CPUs aren't really open and licenses from Arm are not exactly cheap either, but if you need to get the most performance at any cost you have to go very low level.

  • @sirhenrystalwart8303
    @sirhenrystalwart8303 6 months ago +3

    Forget about performance. Virtual functions make your code base hard to understand. Only use when absolutely necessary.

  • @Jean-LouisJlLeroy
    @Jean-LouisJlLeroy 6 months ago +1

    Inheritance, virtual functions, that's for boomers, totally uncool ;-)

    • @CompuSAR
      @CompuSAR 6 months ago +1

      And yet you've spent the time, if not watching the video then at least writing this comment.

  • @soumen_pradhan
    @soumen_pradhan 6 months ago +35

    I now know less.

    • @CompuSAR
      @CompuSAR 6 months ago +25

      I feel exactly the same, and I'm the one who gave this talk.

    • @austinscott4695
      @austinscott4695 6 months ago

      congratulations!

    • @bunpasi
      @bunpasi 6 months ago

      Haha. We just realized that we know less than we thought. That is still progress ... I guess.

    • @Adowrath
      @Adowrath 3 months ago

      And that's a great thing to acknowledge!

  • @reneb86
    @reneb86 4 months ago +6

    I think every developer has instinctively felt that the "virtual functions are slow" claim is overly generalized. Virtual functions have a practical purpose, and that practical purpose can outweigh a need for performance. And given a proper context, the correct implementation of virtual functions will also yield performance gains. What is surprising here is that compiler optimization and CPU management seem to be a decade or two ahead of what developers are thinking.

  • @surters
    @surters 6 months ago +11

    Branch target buffers have increased over the years, that might be why it suddenly is a different scenario now.

  • @bescein
    @bescein 6 months ago +11

    If he wanted to prove benchmarks show nothing, he chose his test subject poorly. The main downside of virtual functions is that they cannot be inlined. That's the whole reason people tend to avoid them.

    • @CompuSAR
      @CompuSAR 6 months ago +3

      Can you explain why the answer I gave to that very same question at minute 28:00 in the video doesn't satisfy you?

    • @N....
      @N.... 6 months ago +2

      This is discussed at 27:54 - the point of the benchmark is to compare ways of implementing customization points. You can't always inline customization points, for example if that static function call is from a dynamically loaded library.

  • @vladimir0rus
    @vladimir0rus 6 months ago +13

    The important takeaway from this whole talk, IMHO, is that "If it is fast enough - use it!".
    For some people Python is fast enough.
    There is no way a virtual call can be faster than a direct call, so the rest of the talk was about the author's inability to measure the difference properly.

  • @Roibarkan
    @Roibarkan 6 months ago +8

    10:18 Shachar’s YouTube channel: youtube.com/@CompuSAR
    Trailer: th-cam.com/video/ZUHcmYPELqg/w-d-xo.html
    Great talk!

    • @CompuSAR
      @CompuSAR 6 months ago

      Thank you! Much appreciated.

  • @chrisminnoy3637
    @chrisminnoy3637 6 months ago +16

    Guys, maybe you should study the branch prediction strategies and cache management of modern CPUs. That will give you insight into how specific CPUs completely eliminate the time difference between a direct and an indirect call more than 90% of the time. But smaller code will run faster because more code fits into cache lines, and indirect calls are a bit bigger. So then the performance of cache handling is again the determining factor. To conclude, just use virtual functions when it makes sense.

    • @CompuSAR
      @CompuSAR 6 months ago +7

      The flip side of this argument happens when you look at the alternatives for virtual functions. There are switch/case, where prediction might be actually more difficult for the CPU, and thus slower, and there are templates, which blow up the code size, putting more stress on the caches.
      In fact, that was my original prediction going into this research. What I found out, however, was that even the simplest cases don't behave the way I expected them to.

    • @babgab
      @babgab 5 months ago

      I'd rather not depend on the CPU to be smart if I can avoid it. It makes reasoning about what code needs attention harder if there's only a *probability* (depending on CPU and runtime conditions) that something needs attention. A bit slow but consistent is sometimes better than usually fast but randomly and uncontrollably very slow.

  • @tomkirbygreen
    @tomkirbygreen 6 months ago +6

    Love the take away that design is supreme, but also the energy and enthusiasm with which the topic is explored. Awesome stuff sir. Kudos.

    • @CompuSAR
      @CompuSAR 6 months ago

      Thank you!

  • @ZigiStardust-c7b
    @ZigiStardust-c7b 6 months ago +3

    Some time early in the talk ( 3:53 ) he says he asked experts about possible reasons for the virtual/concrete diffs, a large chunk of the talk was attempts to poke at potential reasons (cache sizes and whatnot), and then at the end ( 31:31 ), in answer to a (smart!) suggestion from the audience about a potential way to isolate the reasons, he cuts the question off and says he absolutely positively doesn't even care about why? It seems needlessly hard for some speakers to give simple answers like "thanks, I didn't think of that".

    • @CompuSAR
      @CompuSAR 6 months ago +7

      31:31 is the end of the video. Did you mean 30:00, which talks about comparing CLANG and GCC outputs? Because if so, it's a completely different question.
      The consulting with experts was to make sure I'm not missing some obvious bug in the benchmark, where what I'm measuring is different than what I think I'm measuring. While there is _some_ benefit to understanding why performance behaves the way it does, at the end of the day you input source code and get assembly, and it's very rare to hand-tweak the output. That's my whole point: you should measure things in the context you use them.

  • @gregthemadmonk
    @gregthemadmonk 6 months ago +2

    Interesting to see how running the benchmark from the StackOverflow question on a modern compiler (thankfully the OP provides QuickBench links) results in std::visit actually being faster than a virtual function call on Clang 17 + libstdc++ (and slower on libc++ :) ). Guess it just further proves the point of the talk :)

  • @matiasgrioni292
    @matiasgrioni292 6 months ago +2

    The talk is good, but the presenter's handling of questions certainly does not make me think that he thinks his talk is not a period.

    • @CompuSAR
      @CompuSAR 6 months ago +2

      That's an interesting point (though, in fairness, there was a clock in front of me saying "you have 30 more seconds to clear the stage"). What do you think I should have done differently?
      I find it a bit sad that this is the most relevant criticism I have received to date. I honestly do want a discussion, but I definitely do want to change the discourse around performance. I think benchmarks that artificially isolate one component may come to a very clear conclusion, but if that conclusion doesn't reflect on the real world then it's pointless.

  • @bboysil
    @bboysil 6 months ago +2

    This talk is a great talk about "which is faster" and how to really measure using statistics: th-cam.com/video/8Ph1l-_0PeI/w-d-xo.html

    • @Roibarkan
      @Roibarkan 6 months ago +1

      Indeed. Another great talk on the same subject, by the same author (Emery Berger), from CppCon: th-cam.com/video/koTf7u0v41o/w-d-xo.html

  • @AlfredoCorrea
    @AlfredoCorrea 6 months ago +2

    The advantage of variant is the value semantics it provides (runtime values), while virtual functions are ideal to interface with “plugins” (unbounded dynamic behavior). Neither use case is performance driven in principle. IMO the good news is that there are red flags for each use case: if you are using get_if or holds_alternative all the time, variant is not the right choice; if you are using dynamic_cast all the time, virtual is not the right choice. Also, it is not a binary choice: there is a whole spectrum of type erasure techniques that happens to include these two cases.

    • @N....
      @N.... 6 months ago

      It's also worth noting that virtual functions and variants are not mutually exclusive - you can have a variant where every type inherits from a base class and keep a base class pointer for virtual function calls into the value held in the variant (ideally wrapped in a proper container for safety though)

    • @AlfredoCorrea
      @AlfredoCorrea 6 months ago

      @N.... Yes, like anything, variant alternative types benefit from having commonality, and it happens that derived classes have a lot of commonality. What you imply is something that I used in the past, in the context of coworkers who are fanatics of virtual functions and have an established inheritance system; I made variants of derived classes (or a thin wrapper of such a variant) and suddenly I could store them in arrays or containers and handle them as values, without extra heap allocations.
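A rough sketch of the combination discussed in this thread: a variant whose alternatives all derive from one base class, with virtual calls made through the base. The types (`Animal`, `Dog`, `Bird`) and the helper `as_base` are hypothetical. As an assumption for safety, the sketch recomputes the base reference on demand instead of storing a pointer, so nothing dangles when the variant is copied or reassigned:

```cpp
#include <cassert>
#include <variant>
#include <vector>

// Hypothetical hierarchy (illustrative names only).
struct Animal {
    virtual ~Animal() = default;
    virtual int legs() const = 0;
};
struct Dog  final : Animal { int legs() const override { return 4; } };
struct Bird final : Animal { int legs() const override { return 2; } };

using AnyAnimal = std::variant<Dog, Bird>;

// Returns the common base of whichever alternative is currently held.
// The reference is valid only while `v` is alive and keeps that alternative.
inline const Animal& as_base(const AnyAnimal& v) {
    return std::visit([](const auto& leaf) -> const Animal& { return leaf; }, v);
}

// Values live inside the vector; the virtual call goes through the base reference.
inline int total_legs(const std::vector<AnyAnimal>& zoo) {
    int sum = 0;
    for (const AnyAnimal& a : zoo) sum += as_base(a).legs();
    return sum;
}
```

Existing code written against `Animal&` keeps working, while the objects themselves are stored by value.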

  • @JSzitas
    @JSzitas 6 months ago +1

    I think unless you show median/minimum timings for a piece of code, benchmarks are indeed meaningless (almost by design).
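A minimal sketch of reporting minimum and median timings, as the comment above suggests. The names `Timing`, `summarize` and `measure` are hypothetical, and this is not the harness used in the talk:

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

struct Timing { double min_ns; double median_ns; };

// Summarizes per-run durations (in nanoseconds) by minimum and median.
inline Timing summarize(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    const double median = (n % 2 == 1)
        ? samples[n / 2]
        : (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
    return {samples.front(), median};
}

// Calls `fn` `runs` times and records each wall-clock duration.
template <typename Fn>
Timing measure(Fn&& fn, std::size_t runs) {
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::nano>(stop - start).count());
    }
    return summarize(std::move(samples));
}
```

The minimum and median are far less sensitive to a single outlier run than the mean is. The speaker's caveat still applies: a clean number for isolated code may not describe the same code inside a real program.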

  • @ruadeil_zabelin
    @ruadeil_zabelin 6 months ago +2

    28:00 Loss of inlining isn't always a problem. Too aggressive inlining pollutes your cache too and can be slower in certain conditions. Coming back to... measure it properly in your real application

    • @TheOnlyAndreySotnikov
      @TheOnlyAndreySotnikov 6 months ago +2

      First, the benefit of inlining is not precisely in the lack of calls but instead in the opportunity to optimize the code, which is otherwise split between several functions. Second, compilers know about the CPU architectures and the drawbacks of aggressive inlining. If functions are inlinable, this gives the compiler a 𝘤𝘩𝘢𝘯𝘤𝘦 to use inlining and optimization wisely. If functions are virtual, these chances are exactly zero.

    • @babgab
      @babgab 5 months ago

      @TheOnlyAndreySotnikov Not *quite* zero: if you're directly using a method that is marked "final", I'd expect the compiler to figure out that it doesn't need the virtual call there and inline. But still.
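A small sketch of the `final` case mentioned above, with hypothetical types. The language allows a compiler to devirtualize the first call because the static type cannot have further overrides; whether a given compiler actually does so, and inlines, has to be checked in its output:

```cpp
#include <cassert>

struct Base {
    virtual ~Base() = default;
    virtual int value() const = 0;
};

// `final` guarantees that no class can override value() any further.
struct Leaf final : Base {
    int value() const override { return 42; }
};

// Static type is Leaf (final): the call target is known at compile time,
// so the compiler may replace the indirect call with a direct one.
inline int through_leaf(const Leaf& l) { return l.value(); }

// Static type is Base: in general this remains an indirect call.
inline int through_base(const Base& b) { return b.value(); }
```

Both functions return the same result for a `Leaf`; only the information available to the optimizer differs.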

  • @Fudmottin
    @Fudmottin 6 months ago +4

    I always thought virtual functions were a nice way to create maintainable code at the cost of a small bit of performance due to the VTABLE lookup for an indirect call. I was willing to accept that cost for functions that did enough work that the call was a trivial portion of it. Now I'm wondering, why worry?

  • @andreaselfving2787
    @andreaselfving2787 4 months ago +1

    I wonder how the results would have been in a bare-metal embedded system, like a Cortex-M or AVR core

  • @videofountain
    @videofountain 6 months ago +1

    Thanks. Cache, Benchmark, Profile. Interesting casting of doubt 🎊 to make more work for programmers and have a community consider more of the total measurement picture.

  • @DerHerrLatz
    @DerHerrLatz 3 months ago +1

    Reading the comments is even better than watching the video. This topic seems to be hotter than tabs vs. spaces.

  • @moshegramovsky
    @moshegramovsky 6 months ago +1

    Great video! I have wondered about all the hate I've seen for virtual functions and inheritance-based polymorphism in the last few years. Some of the proposed solutions are baffling in terms of performance. Especially some type erasure strategies like variants.

    • @CompuSAR
      @CompuSAR 6 months ago +1

      Thank you!

  • @iddn
    @iddn 6 months ago +1

    The number of people who share micro-benchmarks that they’ve run on laptops is crazy. Laptops are useless for this

    • @bunpasi
      @bunpasi 6 months ago +2

      Well, if your customers use primarily laptops ...

  • @TechTalksWeekly
    @TechTalksWeekly 6 months ago +1

    This talk has been featured in the #6 issue of Tech Talks Weekly newsletter.
    Congrats Shachar 👏👏👏

    • @CompuSAR
      @CompuSAR 6 months ago

      Thank you!

  • @nicholastheninth
    @nicholastheninth 5 months ago

    Well, there was one time when I decided not to inline a function inside a loop that gets called a few thousand times per frame, and performance dropped to 60% of what it was when force-inlining it, even though it really shouldn’t have mattered (the function does a non-insignificant amount of work).

  • @younesmdarhrialaoui643
    @younesmdarhrialaoui643 6 months ago +1

    10/10

  • @mariogonzalezmunzon7076
    @mariogonzalezmunzon7076 6 months ago

    Very great talk, deep explanation on hard concepts, will need to rewatch it many times to really understand it.

  • @Dominik-K
    @Dominik-K 6 months ago

    Really good presentation! Performance can be important, and so is not losing oneself in assembly when it isn't necessary.

    • @CompuSAR
      @CompuSAR 6 months ago

      Thank you!

  • @Nors2Ka
    @Nors2Ka 6 months ago

    The 4:00 mark perfectly summarizes why this talk is a waste of time.

    • @CompuSAR
      @CompuSAR 6 months ago +1

      If you do know, please share. If you don't, then I don't think it's a waste of time to map what we thought we knew but don't.

  • @TheOnlyAndreySotnikov
    @TheOnlyAndreySotnikov 6 months ago +15

    The primary source of virtual function slowness is that the compiler can't inline them and thus can't optimize across the function boundaries. A simple non-virtual getter, in most cases, will be optimized to a register operation; a virtual getter, on the other hand, will require an indirect call and will be orders of magnitude slower. You are done if you have such a getter in an often-executed loop.

    • @tikabass
      @tikabass 6 months ago

      most designs are more complex than setters...

    • @tistione
      @tistione 6 months ago +2

      You would be surprised how little overhead such a small virtual getter has. A few % at worst on modern cpus.

    • @CompuSAR
      @CompuSAR 6 months ago +3

      You're comparing two different things. If a non-virtual getter is an option then, yes, putting in a virtual function is a poor replacement (for all sorts of reasons). But if a virtual getter is what you need, the non-virtual alternative is not a non-virtual getter. It's a switch/case with lots of options.
      When comparing optimizations, you need to compare optimizations of constructs with similar power.

    • @Roibarkan
      @Roibarkan 6 months ago

      (Self promotion) In a C++Now talk from 2021 I had some discussion about the cost of virtual function calls, devirtualization and where (if at all) variant can help: th-cam.com/video/YBXRiPKa_bc/w-d-xo.htmlh18m48s

    • @TheOnlyAndreySotnikov
      @TheOnlyAndreySotnikov 6 months ago

      @tikabass Really?!

  • @lol785612349
    @lol785612349 6 months ago +3

    So let's summarize: premature optimization is the root of all evil, and optimization is premature earlier than you think.

    • @CompuSAR
      @CompuSAR 6 months ago +2

      Sounds about right. Next time I submit the talk, I'll say I only need 5 minutes.

    • @TheOnlyAndreySotnikov
      @TheOnlyAndreySotnikov 6 months ago +1

      It's exhausting how often claiming "premature optimization" is used to justify sloppy thinking. Andrei Alexandrescu: "Easy on yourself, easy on the code: All other things being equal, notably code complexity and readability, certain efficient design patterns and coding idioms should just flow naturally from your fingertips and are no harder to write than the pessimized alternatives. This is not premature optimization; it is avoiding gratuitous pessimization."

    • @babgab
      @babgab 5 months ago +1

      @TheOnlyAndreySotnikov Yep. I've seen "premature optimization" used to justify using std::set instead of a simple switch statement, an unordered_map instead of an enum that indexes an array... People overcomplicate things and when you point this out, they claim it's "simpler" and you're asking them to "optimize prematurely." It's bizarre.
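A short sketch of the "enum that indexes an array" idiom mentioned above, in C++17. The `Color` enum, the `Count` sentinel and `name_of` are hypothetical and only illustrate the shape of the lookup:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical enum; Count is a sentinel equal to the number of real values.
enum class Color : std::size_t { Red, Green, Blue, Count };

// One table entry per enumerator, in declaration order.
inline constexpr std::array<std::string_view,
                            static_cast<std::size_t>(Color::Count)>
    kColorNames{"red", "green", "blue"};

// The lookup is a bounds-asserted array index: no hashing, no allocation.
constexpr std::string_view name_of(Color c) {
    const auto i = static_cast<std::size_t>(c);
    assert(i < kColorNames.size());
    return kColorNames[i];
}
```

Compared with an `unordered_map` keyed by the enum, the table is fixed at compile time and arguably no harder to read, which is the commenter's point about simplicity.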

  • @cedricmi
    @cedricmi 6 months ago

    I agree. But in some cases, you do care for performance differences, even if the performance is "good enough". For example, when you're at a scale where an app's electricity cost is a consideration, if not the main consideration.

    • @CompuSAR
      @CompuSAR 6 months ago +1

      I'd argue that if electricity cost is a consideration, then the performance isn't "good enough". "Good enough" is a deliberately subjective term.

  • @נירבןמנחם-ב2ז
    @נירבןמנחם-ב2ז 6 months ago +1

    Revisiting the debate on virtual method performance seems redundant, even if they are not as slow today as some believe.
    The justification for OOP, and therefore virtual methods, has become less pressing as the industry shifts towards composition over inheritance,
    so let's focus on the best practices that have evolved rather than defending outdated ones.
    Benchmarking is definitely not useless; it complements profiling in the same way that unit tests complement integration tests.

  • @johnmcleodvii
    @johnmcleodvii 6 months ago

    Premature optimization is a huge waste of time. Only try to optimize if your program is too slow. Then you need to figure out where in your code it is too slow, and why that section is too slow.
    Two distinct cases I worked on come to mind.
    Case 1. Every access to a file involved opening and closing the file. The solution was to keep the file open. Unfortunately this was done in a garbage collected language, so every instance of the class had to be found, modified, and tested. And there were a couple thousand instances: several weeks of work for the programmers, several more for testing.
    Case 2. The innermost loop was slow in an image editor. Inspection of the assembly language showed that the compiler was reading a byte, changing one bit, writing the byte, reading the same byte, changing the next bit, writing the byte again. This was done 8 times before the next byte was retrieved. The solution was to hand write the assembly code for the innermost loop. We rewrote it such that the byte was read, all 8 bits were modified, and the byte was written. Speedup of around a factor of 8.

    • @CompuSAR
      @CompuSAR 6 months ago +1

      Whenever I come across case 2, I try to figure out why the optimizer didn't pick it up itself. I then try to fix that instead.
      As a pure guess, I'd say that you were dereferencing a pointer somewhere in the loop, and the compiler couldn't prove to itself that the pointer wasn't aliasing the value you were changing. A rather trivial change to another part of the loop may have solved your problem without having to write (and then maintain) assembly.

    • @sirhenrystalwart8303
      @sirhenrystalwart8303 6 months ago

      In 2024, almost all programs are slow. They are slower than they were 25 years ago because everybody has adopted this lazy ideology.

    • @babgab
      @babgab 5 months ago +2

      Depends on what you consider "optimization." By the time you know your program is too slow, it may be too late to optimize, and the slowdowns may be spread through your entire program in a "death by a thousand cuts" situation. Not considering performance up front can create more work than it saved.

  • @r2com641
    @r2com641 6 months ago +2

    You're right, it’s better just to optimize out the whole of C++ and use a sane language. Thank god we are in 2024 and have choices 👌

    • @Lalasoth
      @Lalasoth 6 months ago +4

      I'm confused. If you dislike C++ so much, why waste time watching this? Aren't there better uses of your time? I am wasting mine talking to you, but it's amusing that you hate it so much that you waste your time posting comments about it. Just move on already :)

    • @SqueakyNeb
      @SqueakyNeb 6 months ago +9

      Every language has an equivalent problem to virtual functions. Even if they don't, you'll still write the same behaviour eventually. If you ever write code that looks like "if(thing.type == A) doA(thing) else if(thing.type == B) doB(thing)" then you've invented virtual functions.
      And really, the point of the talk is just "here's another thing that people think is slow, here are some benchmarks that mess with that, stop being afraid of using tools when you don't even know their overhead".
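The equivalence described above can be sketched side by side. All names here (`Tagged`, `dispatch`, `Thing`, and so on) are hypothetical; the first half is the hand-written tag test, the second half is the same customization point expressed with a virtual function:

```cpp
#include <cassert>

// Hand-rolled dispatch: a type tag plus an if/else chain.
enum class Kind { A, B };
struct Tagged { Kind type; int payload; };

inline int do_a(const Tagged& t) { return t.payload + 1; }
inline int do_b(const Tagged& t) { return t.payload * 2; }

inline int dispatch(const Tagged& thing) {
    if (thing.type == Kind::A) return do_a(thing);
    else return do_b(thing);
}

// The same behaviour with the compiler maintaining the "tag" (the vtable pointer).
struct Thing {
    explicit Thing(int p) : payload(p) {}
    virtual ~Thing() = default;
    virtual int run() const = 0;
    int payload;
};
struct ThingA final : Thing {
    using Thing::Thing;
    int run() const override { return payload + 1; }
};
struct ThingB final : Thing {
    using Thing::Thing;
    int run() const override { return payload * 2; }
};
```

Both forms select behaviour at run time from a value stored in the object. Which one is faster in a given program is what the talk argues cannot be settled by an isolated benchmark.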

    • @h4ndle_yt
      @h4ndle_yt 6 months ago +5

      oh no a cultist

    • @CompuSAR
      @CompuSAR 6 months ago +7

      @SqueakyNeb I gave this talk several times, and each time got the impression that nobody got what I was trying to say. It's very obvious to me that you _did_ get what I was trying to say. Thank you.

    • @r2com641
      @r2com641 6 months ago

      @SqueakyNeb Wrong. First of all, you are viewing it as black or white, and this is not the way it works. Your specific example with just two cases will be much faster than using a virtual function. When there are few cases, then it’s always faster; if there are many branching cases and the conditions inside change often, then it’s harder for the compiler to predict the branch and yes, then theoretically a virtual function will be faster, but in any case a virtual function has an extra level of indirection in the form of a virtual table lookup, an overhead which is always there. On top of that, if one uses a switch statement then it can potentially be even faster, since compilers can optimize switch better than if/else. So if a programmer knows how to properly write his branched code in an effective way, he will easily beat the performance of virtual functions. But if he is a noob like you, then yeah, his pure C code will be as slow as if one used virtual functions. I hope it’s clear