C++ MythBusters - Victor Ciura - CppCon 2022

  • Published on Nov 20, 2022
  • cppcon.org/
    ---
    C++ MythBusters - Victor Ciura - CppCon 2022
    github.com/CppCon/CppCon2022
    The C++ community is very large and quite vocal when it comes to controversial issues.
    We’re very fragmented on many topics, based on the breadth of the C++ ecosystem and the background/experience we each bring from our C++ niche.
    From CppCoreGuidelines to opinionated best practices to established idioms, there’s a lot of good information easily available. Mixed up with all of this there are also plenty of myths. Some myths stem from obsolete information, some from bad teaching materials.
    In this presentation, I will dissect a few of the most popular C++ myths to a level of detail not possible on Twitter… and without the stigma of newb/duplicate/eyeroll one might experience when asking these questions on StackOverflow.
    Expect the familiar “Busted”, “Plausible”, or “Confirmed” verdicts on each myth and come prepared to chat about these.
    ---
    Victor Ciura
    Victor Ciura is a senior software engineer on the Visual C++ team, helping to improve the tools he’s been using for years. Before joining Microsoft, he programmed C++ professionally for 20 years, designing and implementing several core components & libraries of Advanced Installer, improving the virtualization and repackaging technologies for MSI/MSIX.
    One of his hobbies is tidying up and modernizing aging codebases, and he has been known to build open-source tools that help this process: Clang Power Tools.
    He’s a regular guest at the Computer Science Department of his alma mater, the University of Craiova, where he gives student lectures & workshops on using modern C++, STL, algorithms and optimization techniques.
    More details: @ciura_victor & ciura.ro & linkedin.com/victor-ciura
    __
    Videos Streamed & Edited by Digital Medium: online.digital-medium.co.uk
    #cppcon #programming #cpp
  • Science & Technology

Comments • 42

  • @mqnc6275
    @mqnc6275 1 year ago +17

    Starts-ish at 15:00

    • @mqnc6275
      @mqnc6275 1 year ago +3

      15:45

    • @gnerkus
      @gnerkus 1 year ago

      Thank you!

  • @Dziaji
    @Dziaji 1 year ago +7

    For the example at 40:58, the solution at 41:35 is clever, but when passing an lvalue, it creates an extra move operation. That's not the end of the world, but when talking about tooling, a function like this may be called millions of times per second. It is possible that the move might be optimized away, but that isn't certain. My preferred way to handle this (which also gives the benefit of accepting types that are explicitly, but not implicitly, convertible to the member type) is to use "universal references", a term coined by Scott Meyers, I believe, which combine templates with something called "perfect forwarding". It goes like this:
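    #include <string>
    #include <utility>

    class Widget
    {
        std::string id;
        std::string name;
    public:
        // Each parameter gets its own deduced type, so lvalues and rvalues can be mixed.
        template <typename type, typename type2>
        Widget(type&& id, type2&& name)
            : id(std::forward<type>(id)),
              name(std::forward<type2>(name)) {}
    };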
    Not only does this avoid the extra move when passing lvalues into the constructor, but in the case that the members aren't std::strings, but some other class that has different explicit constructors for other types, and/or different constructors for const lvalue ref, lvalue ref, rvalue ref, volatile lvalue ref, const volatile lvalue ref (yes, those are all subtly different), the templated forwarding reference combined with std::forward will select whichever member constructor is needed to copy or move every type and every combination of CV qualifiers automatically for you. If you don't want certain combinations, then you can explicitly delete them or overload them; a non-template overload takes precedence over the perfect forwarding template whenever it matches the arguments equally well.
    It is a little more complicated to read at first, if you are not very familiar with universal references, but it will generate code that is PERFECT, and can possibly save you from having to create a bunch of constructors that are all doing essentially the same thing. This template acts as a catch all for types that can explicitly be used to construct the members, and also handles all CV combinations correctly.
    After learning universal references, you will find many uses for them, and you can even perfectly forward variadic parameter lists of universal references like this:
    template <typename... VariadicType>
    void MyFunction(VariadicType&&... parameterList) {
        OtherFunction(std::forward<VariadicType>(parameterList)...);
    }
    all types and CV qualifications will remain exactly the same when passed to OtherFunction as they were when passed to MyFunction.
    If you are really into creating tools like I am, these techniques are INVALUABLE!

    • @JohnDlugosz
      @JohnDlugosz 1 year ago +1

      The community has adopted the term "Forwarding References" for this, with Meyers' endorsement.

    • @Dziaji
      @Dziaji 1 year ago

      @@JohnDlugosz I guess that works because they are usually forwarded, but I like "universal reference" better because it hints at the fact that it binds to everything. The cool thing about them is how they bind to everything, not how they tend to be forwarded.
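A minimal sketch of the trade-off discussed in this thread, assuming C++17. `Counted`, `WidgetByValue`, `WidgetForwarding`, and `compare_constructors` are hypothetical names introduced only for illustration; they are not from the talk. The sketch counts copy and move constructions for the pass-by-value-then-move constructor and for the forwarding-reference constructor.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper (not from the talk): counts copy and move constructions.
struct Counted {
    static inline int copies = 0;
    static inline int moves = 0;
    std::string value;

    explicit Counted(std::string v) : value(std::move(v)) {}
    Counted(const Counted& other) : value(other.value) { ++copies; }
    Counted(Counted&& other) noexcept : value(std::move(other.value)) { ++moves; }

    static void reset() { copies = 0; moves = 0; }
};

// Variant 1: take the argument by value, then move it into the member.
class WidgetByValue {
    Counted id_;
public:
    explicit WidgetByValue(Counted id) : id_(std::move(id)) {}
};

// Variant 2: forwarding reference, constructs the member directly from the argument.
class WidgetForwarding {
    Counted id_;
public:
    template <typename T>
    explicit WidgetForwarding(T&& id) : id_(std::forward<T>(id)) {}
};

// Runs both variants and checks how many copies and moves each one performs.
void compare_constructors() {
    Counted source("widget-1");

    Counted::reset();
    WidgetByValue a(source);                // lvalue: copy into the parameter, move into the member
    assert(Counted::copies == 1 && Counted::moves == 1);

    Counted::reset();
    WidgetForwarding b(source);             // lvalue: a single copy straight into the member
    assert(Counted::copies == 1 && Counted::moves == 0);

    Counted::reset();
    WidgetForwarding c(std::move(source));  // rvalue: a single move
    assert(Counted::copies == 0 && Counted::moves == 1);

    (void)a; (void)b; (void)c;
}
```

One caveat worth keeping in mind: a single-parameter forwarding constructor like the one in `WidgetForwarding` can be a better match than the copy constructor for a non-const lvalue of the class itself, so production code usually constrains the template.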

  • @JohnDlugosz
    @JohnDlugosz 1 year ago +4

    12:18 "iostreams are slow"
    Sometimes this falls in the "it depends" camp.
    This happened in 2007 I believe, on the Microsoft compiler.
    I saw that the project was using a function in their common code library for converting a string to an int, and it was implemented in the manner of the generic implementation of Boost's lexical_cast: stream it into an iostream object and stream it out again.
    I said this is ridiculous, why not use the simple direct standard library functions for this? I was disbelieving when told this was faster.
    I did my own tests, including timing and looking at the generated code. When compiled as the project was, with aggressive inlining selected ("Any available function", basically compiler chooses), the iostream version was completely inlined and all the overhead eliminated, reduced to the istream's internal extraction operator, which was written as a template, _not_ just calling the C library function.
    Even though that implementation looked pretty much the same as the implementation of atoi (or whichever function I looked at), the C library version was not a template: it funneled all the similar functions (operating on different int types) into a master long int form that also had conditional code for whether it was operating on an unsigned type.
    It was the "generic at run time" code that was slower than the C++ template. In particular, the handling of signed and unsigned types in the same function caused a slowdown, even in my test code which would have trained the branch predictor better than in a real program where this common code gets used in different ways.
    Using iostreams in all their glory is slow. But when used surgically, new C++ code beats old C standard library code.
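For readers unfamiliar with the lexical_cast-style conversion described above, here is a minimal sketch, assuming C++17. The function names are hypothetical. `std::from_chars` postdates the 2007 story and is shown only as a present-day non-stream alternative; the sketch makes no claim about which version is faster on a given compiler.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <sstream>
#include <string>
#include <string_view>
#include <system_error>

// Stream-based conversion in the style of a generic lexical_cast:
// stream the text in, extract an int, and require that nothing is left over.
std::optional<int> to_int_stream(const std::string& text) {
    std::istringstream in(text);
    int value = 0;
    if (in >> value && in.eof()) {
        return value;
    }
    return std::nullopt;
}

// Non-stream conversion using std::from_chars (no locale, no allocation).
// Unlike the stream version, it does not skip leading whitespace.
std::optional<int> to_int_from_chars(std::string_view text) {
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec == std::errc{} && ptr == last) {
        return value;
    }
    return std::nullopt;
}
```

Both helpers return `std::nullopt` for input such as `"12x"` or an empty string, so they can be swapped and timed against each other in a benchmark of your own.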

  • @khatdubell
    @khatdubell 1 year ago +1

    Talk starts at 15:45 for those that want to skip the intro.

  • @TimTeatro
    @TimTeatro 1 year ago +2

    Thanks. I appreciate how you have an obvious fondness for functional programming, but not an ideological commitment to it. Also, I especially appreciate the references to other talks.

  • @CartoType
    @CartoType 1 year ago +10

    In fact I’ve given up on this. It’s just too slow moving. The whole thing could be summed up in ten minutes.

    • @OREYG
      @OREYG 1 year ago +1

      1.75x

    • @origamibulldoser1618
      @origamibulldoser1618 1 year ago +4

      A ten minute introduction on what you're going to introduce in the talk is too much.

    • @khatdubell
      @khatdubell 1 year ago

      There is a lot of fluff in this talk

  • @brynyard
    @brynyard 1 year ago +9

    The reason for the "myth" (more of a fact) of C++ not being easy toolable is because of it's syntax (preprocessor + grammar that needs semantics to be parseable + the sheer size of the spec). There are many tools for C++, eventually, but there has been a _lot_ of effort to get there. That there are many tools is _not_ a "proof" that it's "easy", it's more a reflection of how desirable and important they are.

    • @JohnDlugosz
      @JohnDlugosz 1 year ago +1

      s/it's/its/ in first sentence. ("It's" is correct in last sentence).
      English can be hard to tool, too.

    • @brynyard
      @brynyard 1 year ago

      @@JohnDlugosz more a problem of muscle memory and not being enough of a pedant to go back and fix it. Glad you could help me out with that ;)

    • @keshkek
      @keshkek 1 year ago +1

      No, you should check the specification of C#, which is 2856 pages (I just clicked the download PDF button on the Microsoft site), or the Java specification. I saw another C# specification (June/July 2022) which has more pages (500+) about the core language than C++ (~470), but the C# version has no library specification. The rest of the latest C++ working draft is about the library. It's another myth, and I don't understand why you are using the word "fact".

  • @Dziaji
    @Dziaji 1 year ago +6

    The StackOverflow meme at 8:30 is perfect. The number and severity of pedantic, moronic responses you will get on StackOverflow when asking a simple question truly boggles the mind.

  • @khatdubell
    @khatdubell 1 year ago

    The "always pass by const reference" rule is so ingrained in people that I've seen them pass ints by const reference and then make a copy of the int inside the function.

  • @TheNovakon
    @TheNovakon 1 year ago +3

    std::move is actually std::allow_move - permission to move, but you still need someone who really needs to move it
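A minimal sketch of the "permission to move" point, assuming C++17. The function names are hypothetical. `std::move` is only a cast to an rvalue reference; whether anything is moved depends on what consumes the result.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move only changes the value category of its argument.
static_assert(std::is_same_v<decltype(std::move(std::declval<std::string&>())),
                             std::string&&>);
// On a const object the result is a const rvalue, which move constructors cannot bind to.
static_assert(std::is_same_v<decltype(std::move(std::declval<const std::string&>())),
                             const std::string&&>);

// "Moving" from a const string silently falls back to the copy constructor,
// so the source keeps its contents.
std::string source_after_const_move() {
    const std::string source = "a string long enough to avoid the small-string buffer";
    std::string target = std::move(source);  // copy constructor is selected
    (void)target;
    return source;
}

// Binding the cast result to a reference performs no move at all:
// nothing consumed the permission.
std::string source_after_unused_cast() {
    std::string source = "still here";
    std::string&& ref = std::move(source);   // just a reference, no constructor runs
    (void)ref;
    return source;
}
```

In both functions the source string is intact afterwards, because no move constructor or move assignment operator ever ran.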

  • @pawello87
    @pawello87 1 year ago +2

    Rule number one: If you ever hear advice with words "always" or "never" - just ignore it.

  • @sheeftz
    @sheeftz 1 year ago +1

    I just wanted to add in the end: "Before we had feature X, C++ was one step away from its death. You could program in C++ nonetheless before the feature, but nowadays you'd most probably program in Java\C#\Rust without feature X".

  • @franciscogerardohernandezr4788
    @franciscogerardohernandezr4788 1 year ago +1

    That kitty image was just hilarious (I totally relate). And it brings up the core issue with any mature programming language: the gap between novices (which IMHO would have been intermediate level 5 years ago) and the elite tier is just vast, comparable to the one between chess club players and GMs.

  • @peterevans6752
    @peterevans6752 1 year ago

    Victor, enjoyed all the history embedded into this presentation along with the myths!

  • @DmitryShubin-ym4pj
    @DmitryShubin-ym4pj 1 year ago

    Interesting and useful. Thanks, Victor.

  • @andreismyk-8919
    @andreismyk-8919 1 year ago

    Victor, functional code looks rather cluttered in C++ compared to F#. Did you look into optimizing its code representation? :) Where would I use C++ instead of F#? That would be helpful to know.

  • @yash1152
    @yash1152 1 year ago +1

    9:04 > _"covered by ... . Marked as duplicate. - SO"_
    I didn't read the "SO" before, but I easily guessed it was about SO; that's how --- SO really is.

  • @andreismyk-8919
    @andreismyk-8919 1 year ago

    Great idea, Victor! Although, just numbering those myths may not be sufficient down the road. :) MythBusters++ to the rescue?

  • @JohnDlugosz
    @JohnDlugosz 1 year ago

    25:32 get_size(bool)
    It didn't inline the call with true, but the non-inlined function returning either 5 or 0xffffffff depending on the parameter is wonderfully optimized to not require any branching!
    I'll bet this optimization was perfected specifically because of the conditional returns of npos in the standard library string class.
    Though at L5, I wonder why it repeated [RDI+16] instead of using [DX] which it just loaded at the previous line? I suspect it's the latency and dependency, preventing the adjacent instructions from being issued at the same time. The stores could be re-ordered to do the "Hell" part later, and save a few bytes in the generated instructions.

  • @fpskkkk
    @fpskkkk 1 year ago +2

    this one looks like it's going to be awesome...

  • @CartoType
    @CartoType 1 year ago +8

    My advice: skip at least the first 11 minutes. Apart from a funny cartoon about StackOverflow there is very little semantic content. This guy needs to learn the value of brevity.

  • @JohnDlugosz
    @JohnDlugosz 1 year ago +1

    20:50 (\S+)\s*=\s*(\S+)
    Interesting that this would be so slow to compile the regex; perhaps the optimizer is slow.
    I don't think that the caller would be interested in matching "x=y=z=6" and showing that there are 3 different ways it can match and thus bind the captures. This is a common example of not considering that the regex _could_ be passed inputs that are far different from what you are expecting, and under-constraining parts of the match will lead to recursion pits.
    In this case, you don't really mean "anything that's not whitespace", but (at most) "anything that's not an equal sign or whitespace". The quantifiers should be lazy rather than greedy: even if under-specified matching leads to multiple ways to match something, we don't care. So write
    ([^\s=]+)\s*?=\s*?(\S+)
    It should compile to an *efficient* matcher. Though if the optimizer doesn't figure it out, it would try each letter from shortest to longest substring to re-try a string that doesn't match, futilely.
    Given "abcde", it would match the first Plus to the whole thing, optional whitespace? fine (none), equal sign? FAIL retry
    match the first Plus to "abcd" only; optional whitespace? OK (none); match "e" against "=" FAIL retry
    etc.
    Or maybe it's the other way around (tries shortest first), I don't remember.
    It requires some optimization in the regex compiler to realize that the "=" is a required character and anchor around that. Of course, even my rewritten form will take "x=y=z=6" as a match, but only one way. The first token captured cannot contain equal signs so it's (x), but the second one can.
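A minimal sketch of the tightened pattern in use, assuming `std::regex` with its default ECMAScript grammar. `split_assignment` is a hypothetical helper name, not code from the talk. The pattern is compiled once and reused, since constructing a `std::regex` is the expensive step discussed at this timestamp.

```cpp
#include <cassert>
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: splits "key = value" into its two parts using the
// pattern from the comment above. Returns std::nullopt if the whole line
// does not match.
std::optional<std::pair<std::string, std::string>>
split_assignment(const std::string& line) {
    // Compiled once on first use and then reused on every call.
    static const std::regex pattern(R"(([^\s=]+)\s*?=\s*?(\S+))");

    std::smatch match;
    if (!std::regex_match(line, match, pattern)) {
        return std::nullopt;
    }
    return std::make_pair(match[1].str(), match[2].str());
}
```

Because the first capture cannot contain an equal sign, `"x=y=z=6"` can only split as `x` and `y=z=6`, which is the single-way match the comment describes.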

  • @pbholmen
    @pbholmen 1 year ago

    You haven’t shown that std::move doesn’t move. I’m pretty sure using an object after move is undefined behavior. In that case, the c++ standard makes no guarantee whatsoever on what should happen. This is true for every operation in c++.
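For context on the moved-from question raised here: for standard library types, the C++ standard specifies that a moved-from object is left in a "valid but unspecified state", so operations without preconditions (such as `clear()` or assignment) remain well defined. User-defined types set their own rules. A minimal sketch of the common reuse pattern, with the hypothetical function name `collect_lines`:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Reuses one buffer after moving from it. Only operations that have no
// preconditions are applied to the moved-from std::string.
std::vector<std::string> collect_lines() {
    std::vector<std::string> lines;

    std::string buffer = "first line";
    lines.push_back(std::move(buffer));  // buffer is now valid but unspecified

    buffer.clear();                      // no precondition: always allowed
    buffer += "second line";             // buffer has a known value again
    lines.push_back(std::move(buffer));

    return lines;
}
```

What is not safe is relying on the moved-from value itself, for example assuming the string is empty without calling `clear()` first.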

  • @StefaNoneD
    @StefaNoneD 1 year ago +7

    21:00 Sorry, but CMake is a complex beast in contrast to build and package manager systems for other languages. In Rust I invested minutes (!) to understand Cargo. For CMake I have to invest many hours and still have many problems dealing with it.

  • @Dziaji
    @Dziaji 1 year ago +1

    apparently, the elephant metaphor is about a pig and a cow. Interesting take.

  • @DaemonJax
    @DaemonJax 1 year ago

    In Myth #24: why the hell is gcc doing all that work for a literal string? It's a literal string ffs! Crazy. EDIT: Oh, because you're explicitly using the std::string type, instead of just letting it be a C string.

  • @kamilziemian995
    @kamilziemian995 1 year ago +15

    Myth nb. 1: you can learn how to use C++ well.
    You can't. You can only keep learning how many things you do wrong.

  • @ABaumstumpf
    @ABaumstumpf 1 year ago +6

    std::regex_match - yes... that thing is horrendous. it is an abomination that should NEVER have made it into any standard and we will pay for this mistake for many years.
    i did have to analyse and "fix" a bug recently:
    The problem was that a user-interaction process died and cored ... multiple times over a couple of days. Ok, let's look at the core dump with gdb. hey, a segfault? In an input-sanitisation function? But that was a very short one. Ok, let's look at the full backtrace. Huh - regex... frame 1, frame 2, frame 3, 4,5,6,7..... 20, 50,100, frame 500... oh my god how long will that continue!?!?! frame 1000, frame 2000, frame 3000 .... all the way up to >7000.... yes. FINALLY found where it was called from and can now look at the local variables.
    "if( std::regex_match( user_input, std::regex( "^.{1,}$" ) ) )"
    What was user_input? 200 characters long, just numbers.
    Yes... a user just pressed a key and a 200 character long string was enough for std::regex_match to cause a segfault.
    While trying to find who did this i used "grep" to regex-find and parse some logfiles... just a couple hundred MEGABYTES of text - no problem.
    TL;DR:
    NEVER use std::regex - it is slow to compile, slow to run, and so bad it can easily crash your program even if you did everything correctly.

    • @sub-harmonik
      @sub-harmonik 1 year ago

      I don't get it; if it's implementation specific what's stopping it from being as fast as the library mentioned in the video? some compile-time vs. runtime requirements in the standard?

    • @superscatboy
      @superscatboy 1 year ago +1

      @@sub-harmonik The CTRE library is a template metaprogramming *masterpiece*. The only caveat is that you have to know the pattern string at compile time - it then does actual dark magic to create a bespoke pattern matching state machine at compile time, which is nice and fast at runtime.
      However, the standard requires that the standard library regex be flexible enough to take dynamic pattern strings at runtime, so almost all of what makes the CTRE library fast wouldn't be applicable.
      I have no doubt that the standard regex library could be significantly improved, but unfortunately CTRE can't be used as a straight drop-in replacement, and the techniques it uses for optimisation can't really be applied either.
      Essentially if you know the pattern string at compile time (which in most cases you do) then CTRE is almost certainly a better choice than the standard library.

    • @JohnDlugosz
      @JohnDlugosz 1 year ago +1

      The fact that another regex tool works fine suggests that the problem is in the implementation, not in allowing regex to be part of the standard.
      Though there are a lot of non-optimal things in the code shown, it should only be a performance issue, and a much milder one, as in Perl5, not a stack overflow.
      In a language where regex is a central tool, like Perl5, we might suggest being sure to use the lazy modifier and pre-declaring the string so it's compiled once not on every use, or expressing it more efficiently as ("^."),
      but even in Perl in this case we can say that if a simple function exists for what you're trying to do, use that instead: you're just asking if the length>=1, or rather, that the string is not empty!
      The admonishment of "know your library" is probably not adequate since
      if (!user_input.empty())
      is elementary enough to not require extensive learning of what's in the std.
      I would suggest that this is a creeping change: the match string was originally something meaningful, and got modified more than once into its current form. The error was in simply modifying the minimal amount rather than understanding what's actually being done and changing a larger piece of code. This "WTF?" should have been caught in the code review, though.

    • @ABaumstumpf
      @ABaumstumpf 1 year ago

      @@JohnDlugosz the regex is not hard coded but was requested by a different department that has to configure our systems - this config can be changed by non-developers at runtime.
      It is the case of bad design requirements of the standard library that regex is so slow. And my current solution was to add some extra config so that the regex is the last thing that is checked after some simpler options like isEmpty, isAlphanumeric etc.
      But still "^.{1,}$" is a valid regex, and a couple hundred characters causing problems is unacceptable given that grep regex with lookahead on much larger inputs runs just fine.
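A minimal sketch of the "cheap checks first, regex last" arrangement described in the reply above. All names here (`InputRule`, `is_alphanumeric`, `validate`) are hypothetical, and the `max_length` cap is an added assumption rather than something the commenter mentions. The configured pattern is compiled on every call for simplicity; a real system would likely cache it.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical runtime-configurable rule set: simple flags plus an optional regex.
struct InputRule {
    bool require_non_empty = true;
    bool require_alphanumeric = false;
    std::size_t max_length = 256;  // assumption: cap what reaches the regex engine
    std::string pattern;           // empty means "no regex check"
};

// True if every character is a letter or digit in the "C" locale sense.
bool is_alphanumeric(const std::string& text) {
    return std::all_of(text.begin(), text.end(),
                       [](unsigned char c) { return std::isalnum(c) != 0; });
}

// Applies the cheap checks first and falls through to std::regex only if needed.
bool validate(const std::string& input, const InputRule& rule) {
    if (rule.require_non_empty && input.empty()) return false;
    if (input.size() > rule.max_length) return false;
    if (rule.require_alphanumeric && !is_alphanumeric(input)) return false;
    if (rule.pattern.empty()) return true;

    try {
        return std::regex_match(input, std::regex(rule.pattern));
    } catch (const std::regex_error&) {
        return false;  // a malformed configured pattern rejects the input
    }
}
```

With this layout, a rule that only needs "not empty" never touches the regex engine at all, which is the `!user_input.empty()` suggestion from earlier in the thread.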