Constructors Are Broken

  • Published on 20 May 2024
  • Constructors promise a safe, guaranteed way to initialize objects in C++. But what they actually provide is many subtle opportunities to read uninitialized memory, and ergonomic issues that can negatively affect performance (by needing to throw exceptions to signal failure) and memory usage (by needing to juggle class layout in order to initialize members in the right order in the constructor initializer list). Join me as we spelunk into some of these topics and learn a healthy alternative from Rust, the factory function, that might help our code be safer and better.
    Special guest appearances from struct update syntax/functional record update, non-static data member initializers/NSDMI, NonZeroU8 and niche optimizations, Option/optional, and lots of pontification about type invariants.
    Superconstructing Super Elider: quuxplusone.github.io/blog/20...
    NSA memory safety: www.nsa.gov/Press-Room/News-H...
    I use the amazing Manim library for animating these videos, and I edit them with Blender and Audacity.
    www.manim.community/
    www.blender.org/
    www.audacityteam.org/

Comments • 509

  • @Patashu
    @Patashu 3 months ago +207

    The realization 'in a factory, you can do a bunch of logic, then construct the object with all its initial values in one fell swoop, so there's never an invalid object' is really clever.

    • @feuermurmel
      @feuermurmel 3 months ago +8

      I'm not a C++ dev, but I wrote Java for about 10 years. Java provides you _some_ additional guarantees when it comes to constructing an object, but there are still easy ways to make mistakes and access uninitialized fields. I _never_ understood what the point of constructors (as seen in Java or C++) is, instead of having a way to construct an object with all field values in a single step. You can't name constructors and have to differentiate them by signatures instead, constructors are complicated to make safe (I think Swift has managed to do so), you can't return a type different from the class (e.g. wrapped in an Optional).
      It just seems like a bad idea that should be avoided in language design.

    • @petarpetrov3591
      @petarpetrov3591 2 months ago +2

      sooo smart, until the moment you realize your type is now incompatible with quite a few std containers. Default initialization DOES MATTER.

    • @TheSast
      @TheSast 2 months ago +2

      @petarpetrov3591 I agree, but not everything should implement a default initialization; it should be tied to an interface.

    • @AlfredoCorrea
      @AlfredoCorrea 1 month ago +1

      I don't get it, how is that clever? The invariants will be violated while you are doing the bunch of logic. This is just kicking the can down the road. The real solution is to stop kicking the can and accept that not all constructors need to establish invariants, especially the default constructor. Partially formed values are OK, and efficient.

    • @feuermurmel
      @feuermurmel 1 month ago

      @AlfredoCorrea I don't understand. If an object has an associated invariant and the factory method establishes that invariant _and then_ constructs the object, how is the object's invariant ever violated? Either the object exists with the invariant or it doesn't exist.
      Maybe you're looking at this from a different angle, but in my experience, having invariants associated with types/objects is really nice. Making it impossible/harder to instantiate objects with broken invariants helps a lot. I write a lot of security-critical stuff.
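The "validate first, construct last" idea discussed in this thread can be sketched in Rust as follows. This is only an illustration: the `Percent` type and its 0..=100 invariant are hypothetical and do not come from the video.

```rust
/// Hypothetical example type: a percentage that must stay within 0..=100.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Percent {
    value: u8, // invariant: value <= 100
}

impl Percent {
    /// Factory function: all validation happens before any `Percent` exists.
    pub fn new(value: u8) -> Option<Percent> {
        if value > 100 {
            return None; // invalid input: no object was ever created
        }
        // Every field is supplied in a single expression, so there is no
        // point at which a partially initialized `Percent` can be observed.
        Some(Percent { value })
    }

    /// Read access to the validated value.
    pub fn get(self) -> u8 {
        self.value
    }
}
```

With this shape, `Percent::new(42)` yields `Some(..)` and `Percent::new(150)` yields `None`; the invalid input only ever exists as a plain `u8`.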

  • @Ankh.of.Gaming
    @Ankh.of.Gaming 4 months ago +104

    This reminds me of a concept I picked up in one of @NoBoilerplate's videos, which I now repeat like a mantra when coding in Rust: "Make invalid state unrepresentable"
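One common reading of "make invalid state unrepresentable" is to replace loosely related fields with an enum. The sketch below is an assumption about what that could look like; the `Connection` type and its variants are hypothetical names chosen for illustration.

```rust
/// Hypothetical example. A struct with `connected: bool` and
/// `session_id: Option<u32>` would allow "connected without a session".
/// With an enum, that combination simply cannot be written down.
#[derive(Debug, PartialEq)]
pub enum Connection {
    Disconnected,
    Connected { session_id: u32 },
}

/// The compiler forces every caller to handle both states.
pub fn session_id(conn: &Connection) -> Option<u32> {
    match conn {
        Connection::Disconnected => None,
        Connection::Connected { session_id } => Some(*session_id),
    }
}
```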

    • @lepidoptera9337
      @lepidoptera9337 1 month ago

      You know what that does to you, right? It makes your software useless if you miss your user's actual user requirements. It's just trading one bug opportunity against another.

    • @microcolonel
      @microcolonel 1 month ago

      ...what? @lepidoptera9337

    • @onebacon_
      @onebacon_ 22 days ago +1

      @lepidoptera9337 No. It makes all invalid states unreachable, meaning that if an unmentioned use case lies within the valid states, you can be absolutely sure that it will work without changes. If it lies within the explicitly invalid states, then it will not work, just as it shouldn't.
      If a new use case lies within what was previously agreed to be "invalid", then it should not be able to exist without an explicit change.

    • @lepidoptera9337
      @lepidoptera9337 22 days ago

      @onebacon_ It usually doesn't. People use e.g. negative entries in fields that usually expect positive numbers as flags all the time. Or let's say you make an animal class and you only allow "dog, cat, canary", then only small animal vets can use it. A rural vet who treats "horse, cow, sheep, goat" is done for. All of you guys have been over-trained by architects with OCD in this belief that exercising absolute control over the user makes good software. That's total BS. What makes good software is adaptability and resilience. If somebody can compromise your Windows PC because they are allowed to put an arbitrary string into "animal type" rather than selecting from a pulldown-menu, then you didn't do your job correctly. You just showed the entire world that you are poorly educated engineers.

    • @ivensauro
      @ivensauro 19 days ago

      @lepidoptera9337 declaring some states invalid doesn't mean that the software cannot meet the requirements, just that we spend less time representing the technical things and more time on the requirements

  • @zstewart
    @zstewart 4 months ago +240

    Note that at 7:47 the same optimization (0 = None) means that the NonZeroU8::new function is actually a no-op even though it returns Option and contains a match. Using it won't cause a branch until/unless you check the result yourself, for example by unwrapping it. If you want a NonZeroU8 the effect is largely the same, but if you're passing to an API that wants Option regardless, then no extra branch is incurred. (This doesn't necessarily apply without optimizations turned on, of course.)
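The "0 = None" layout mentioned above can be observed from the sizes of the types involved. A minimal sketch using only the standard library (the `sizes` helper is a hypothetical name):

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Sizes in bytes of `u8`, `Option<u8>` and `Option<NonZeroU8>`.
pub fn sizes() -> (usize, usize, usize) {
    (
        size_of::<u8>(),
        // Every bit pattern of a u8 is a valid value, so `None` needs extra room.
        size_of::<Option<u8>>(),
        // Zero is never a valid NonZeroU8, so that bit pattern can encode `None`.
        size_of::<Option<NonZeroU8>>(),
    )
}
```

The standard library documents that `Option<NonZeroU8>` has the same size as `u8`, which is why wrapping a `u8` this way does not have to move or rewrite any bytes.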

    • @tombenham9458
      @tombenham9458 4 months ago +5

      Interesting. But what's the point of having Option in this case? Can't you just keep a u8 and check that it's not 0 when you need to? Better readability and type safety, I guess, because then you can never accidentally divide by 0 if you forgot to check (but you can never forget to check Option for None). Is that all?

    • @zstewart
      @zstewart 4 months ago +3

      @tombenham9458 the NonZero types are often used for optimizations. If your type T contains one anywhere in it, Option<T> will have the same shape as T. The other use is semantics. Sure you "could just check that it's not zero" when you use it, but that means putting those checks all over your code and more possibility to forget them. If you use this type it pushes the check to one place in the code (the factory new function) and everywhere else the invariant is guaranteed by the type system without having to put any more checks. So it's both harder to get wrong (correctness) and more performant (fewer checks needed). When working in Rust I find I pretty commonly create these sorts of restricted "Semantic Types" in places where in other languages I might just use a plain int or string.

    • @silentobserver3433
      @silentobserver3433 4 months ago

      @tombenham9458 You can also check it only once and then pass around a bare NonZeroU8. The compiler would not let you get a NonZeroU8 without checking, so in all the other places of your code you can be sure you don't need any checks. With a u8 you would have to remember at which exact points of your code it has already been checked and at which it hasn't.

    • @DBZM1k3
      @DBZM1k3 4 months ago +39

      @tombenham9458 With Option, you don't necessarily have to check for None. That can be done via the many helper functions that Option has. 'map', for instance, becomes a no-op if the value is None. There's also the 'is_some_and' function for doing some boolean check on the value, which yields false for None. Option is such a useful type.

    • @Neonvieh
      @Neonvieh 4 months ago +46

      @tombenham9458 encapsulating this in the type system means that even when you pass this object to some other library or get a value passed by another library, you will not need to explicitly check for it to be non-zero. Furthermore, the compiler cannot know if a u8 is nonzero just by looking at it. With a proper type, the compiler can know this however. Type-safety in this case means to represent possible values by your type system, meaning if you pass the correct type, it is valid by design. For a randomly picked u8 value, we do not know if this is nonzero. Thus representing a NonZeroU8 value as u8 is not type-safe.
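The `map` and `is_some_and` helpers mentioned earlier in this thread can be sketched as below. The function names `doubled` and `is_even` are hypothetical, and `Option::is_some_and` assumes a reasonably recent toolchain (it was stabilized in Rust 1.70).

```rust
use std::num::NonZeroU8;

/// Doubles the value if there is one; `map` passes `None` through untouched.
pub fn doubled(x: Option<NonZeroU8>) -> Option<u16> {
    x.map(|n| u16::from(n.get()) * 2)
}

/// True only when a value is present and even; `None` gives `false`.
pub fn is_even(x: Option<NonZeroU8>) -> bool {
    x.is_some_and(|n| n.get() % 2 == 0)
}
```

Neither function contains an explicit `match` or `if let`, yet the `None` case is still handled in both.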

  • @_fudgepop01
    @_fudgepop01 3 months ago +82

    “You can grep your files for unsafe and find where you might have made a mistake” is the single best way of explaining (to me) why the “unsafe” keyword exists and why the compiler is more strict otherwise. That made so many things just *click* in my mind - great video!!

    • @gideonunger7284
      @gideonunger7284 3 months ago +10

      the thing is that that's quite irrelevant in larger code bases. when using rust to write my game engine with vulkan for example the invariants to control for are so large and expansive that everything has to be unsafe anyway. having an api that is provably safe when your memory isn't cache coherent and is asynchronously in flight is virtually impossible.
      also people don't actually give a shit about it. every other crate i open from crates.io leaks undefined behavior into safe rust. and that one compromise is all that it takes to compromise all of your rust code's safety.

    • @shinobuoshino5066
      @shinobuoshino5066 3 months ago

      @gideonunger7284 um sweaty, just spend 3 years writing a safe wrapper for vulkan and use that instead.

    • @mnxs
      @mnxs 23 days ago

      @gideonunger7284 I just don't know enough about GPU/Vulkan programming to meaningfully comment on the first part, but I just thought, isn't the point that's (kinda) made in this video that you need to design your API very carefully, creating very small abstractions and such components, in order to be able to minimise the invariant space around unsafe code? Ie., it's a very big effort, but not necessarily impossible?
      I don't know if it's _literally_ every other crate (maybe you just have very different usage patterns than me, lol), but yeah, there is a problem of a lot of crates using unsafe while not properly analysing it. It would be great if there was a common expected standard for analysis of unsafe, such that it'd be easier to see if the effort had been made, at the very least.
      To aid in that, but I don't know if this is feasible, it would be great if the compiler had an "invariant/UB analyser" that could look into and at least recommend invariants to check for and modes of UB that might arise from a particular use of unsafe.

  • @prodkinetik
    @prodkinetik 4 months ago +73

    incredible video. editing, narration, explanation, the code itself - everything was fantastic and so concise. as a rust programmer who’s never touched c++ i still learned a lot about programming in general. please keep it up.

  • @Roibarkan
    @Roibarkan 4 months ago +62

    8:30 note that the “new” and “new_unchecked” functions need to be marked as ‘pub’ in order for them to be exposed for outside use - and that the member(s) of the structure must not be marked pub to prevent people from constructing a NonZeroU8 on their own (and from mutating the member(s)).

    • @shrootskyi815
      @shrootskyi815 3 months ago +10

      I think the `pub` keyword is omitted for the purposes of keeping the example code in the video concise and focused on what is being demonstrated. Similarly, in real code, some of the functions should ideally be marked as `const`, but that isn't relevant to the video topic. Anyway, still a useful reminder.

    • @Roibarkan
      @Roibarkan 3 months ago +4

      @shrootskyi815 thanks for your reply. The main point that I was trying to convey is that if one has a struct with invariants, they should take steps to ensure that the struct only gets created using their designated 'factory functions' and its fields won't be manipulated directly. This typically means putting the struct in a module and declaring the factory functions and other potentially mutating functions as 'pub'.
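The module layout described in this thread might look roughly like the sketch below. `NonZeroByte` is a hypothetical stand-in for the video's NonZeroU8, and `first_nonzero` is an illustrative helper, not something from the source.

```rust
mod nonzero {
    /// The struct is public, but its single field is private to this module.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct NonZeroByte(u8);

    impl NonZeroByte {
        /// The only way for outside code to obtain a `NonZeroByte`.
        pub fn new(value: u8) -> Option<NonZeroByte> {
            if value == 0 {
                None
            } else {
                Some(NonZeroByte(value))
            }
        }

        /// Read-only access; there is deliberately no public setter.
        pub fn get(self) -> u8 {
            self.0
        }
    }
}

use nonzero::NonZeroByte;

/// Code outside the module must go through the factory. Writing
/// `NonZeroByte(0)` here would be rejected because the field is private.
fn first_nonzero(values: &[u8]) -> Option<NonZeroByte> {
    values.iter().copied().find_map(NonZeroByte::new)
}
```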

  • @rainerwahnsinn3262
    @rainerwahnsinn3262 4 months ago +18

    Another observation: It's refreshing that your videos have no background audio. Pure calm voice is so much better to deliver concise factual information than noisy music. Wish that more educators would rediscover "less is more".

  • @christianchung9412
    @christianchung9412 4 months ago +27

    9:00 "first of all exceptions are incredibly expensive for what's supposed to be a very simple and low-level type"
    This is only true if exceptions are used as a regular control flow mechanism (I think because of the stack unwinding but I'm not actually familiar). If the exception is not triggered, the cost is insignificant. Everything else you said about them is true though. Might've been worth mentioning that panic! exists for some niche low level uses, and that Result can't replace exceptions in every instance, but in practice for the vast majority of high-level code it can. Really really high quality video, I am astounded by the amount of misinformation that exists on C++, especially from other young people.

    • @joestevenson5568
      @joestevenson5568 3 months ago +8

      Yes, exceptions - like much of C++'s lunacy - are actually in the language for a good reason.
      Also there's a great CPPCon talk "expect the expected" that talks about many types of error handling and their issues/strengths. Exceptions in particular provide a lot of strengths. You can do centralised error handling, transport lots of information about the error, and they're very performant for situations where the "unhappy case" is rare - as you would hope it is.

    • @MIchaelArlowe
      @MIchaelArlowe 3 months ago

      Exceptions should only be used in exceptional circumstances. Even if they were free from a performance standpoint, using them for flow control just makes a maintainability mess.

    • @hemerythrin
      @hemerythrin 2 months ago +1

      Unfortunately there's no other (sane) way to signal failure from a constructor, which is why they're mentioned in this video. And for something like NonZeroU8, having the failure case be expensive really reduces the number of places that type can be used (compared to the rust version which is zero-cost)

  • @Roibarkan
    @Roibarkan 4 months ago +47

    10:38 A potential way to “mark” create_unchecked as unsafe (although I personally think that “grep unsafe” is analogous to “grep unchecked”) is to use “the passkey idiom” as a way of keeping more control of which code might call specific member functions.

    • @shadamethyst1258
      @shadamethyst1258 4 months ago +6

      Friends over at the pony language call these passkeys "object capabilities"; you create a type such that only code you trust may receive an instance of that type, and you then use this trust to gate away features of your library

    • @LunaDragofelis
      @LunaDragofelis 3 months ago +4

      What is the pony language?

    • @wanderer7480
      @wanderer7480 2 months ago

      @LunaDragofelis Just Google it?

  • @rexcrafter1518
    @rexcrafter1518 4 months ago +61

    It just amazes me time and time again how the great choices that were made at Rust's core make it such a practical and cohesive language to work with and reason about, and have a butterfly-effect-like influence on so many parts of the language itself.
    Another great example of that is the std::mem::drop implementation; it made me chuckle when I found out about it :D

    • @kuhluhOG
      @kuhluhOG 4 months ago +3

      Isn't std::mem::drop just a function with an empty body?

    • @senzmaki4890
      @senzmaki4890 4 months ago

      My favourite part is traits. If you're working with a library that works with a specific struct, you often end up awkwardly calling a custom function that takes in the struct and spits out some value or does some crap several times across your codebase.
      You can instead just implement a trait for the struct to make your life easier and code cleaner.

    • @timurkravchenko7824
      @timurkravchenko7824 4 months ago +8

      @kuhluhOG exactly, it just consumes the value :)
      pub fn drop<T>(_x: T) {}

    • @jonnyso1
      @jonnyso1 4 months ago +6

      @senzmaki4890 The ability to slap a trait on almost anything to extend its functionality is so great.

    • @dzarko55
      @dzarko55 4 months ago +7

      Here’s another line of code that’s equivalent to “std::mem::drop(x)”: “x;”. Just the variable and a semicolon. Because everything is an expression, “x” means you’re moving x, and adding a semicolon means you aren’t returning anything. So you’re just passing x into the void, and the destructor is immediately called.
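Both claims in this thread (an empty generic function drops its argument, and so does a bare `x;` statement) can be checked with a small sketch. `Tracker` and `consume` are hypothetical names; `consume` mirrors the shape of `std::mem::drop` quoted above.

```rust
use std::cell::Cell;

/// Hypothetical helper type that records when its destructor runs.
struct Tracker<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for Tracker<'_> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

/// Same shape as `std::mem::drop`: take ownership and do nothing.
/// The argument goes out of scope at the closing brace, running its destructor.
fn consume<T>(_x: T) {}

pub fn dropped_by_consume() -> bool {
    let flag = Cell::new(false);
    let tracker = Tracker { dropped: &flag };
    consume(tracker);
    flag.get()
}

// rustc warns about this pattern ("path statement drops value"), so the
// lint is silenced here purely for demonstration purposes.
#[allow(path_statements)]
pub fn dropped_by_bare_statement() -> bool {
    let flag = Cell::new(false);
    let tracker = Tracker { dropped: &flag };
    tracker; // moves the value out; it is dropped at the end of the statement
    flag.get()
}
```

Note that the bare-statement form only has this effect for types that are not `Copy`; in real code, `drop(x)` states the intent more clearly.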

  • @anderdrache8504
    @anderdrache8504 4 months ago +136

    The only problem with Rust's style is that in-place initialization on the heap isn't guaranteed.
    The code:
    let data = Box::new([0u8; 10_000_000]);
    is supposed to create 10 megabytes of data on the heap but might overflow the stack in the process. I tested this and it actually crashes in debug mode but works in release mode, pretty gross. I think solutions to guarantee a stack overflow won't happen are being worked on but it's not trivial.
    The way constructors in C++ work, you can just allocate the space on the heap and then call the constructor on the pointer.

    • @kaga2922
      @kaga2922 4 months ago +38

      the std uses a "box" keyword that avoids this problem while preventing stable rust code from using it. very annoying

    • @_vxpm
      @_vxpm 4 months ago +1

      @kaga2922 that keyword no longer exists - even within std. Box::new is now implemented with a special attribute.

    • @pluieuwu
      @pluieuwu 4 months ago +1

      @kaga2922 actually it's now been replaced by an arcane rustc attribute called #[rustc_box]. this problem does need fixing but it's a much smaller problem than the tarpit C++ finds itself in imo - essentially we need a more ergonomic and safe equivalent of out pointers that doesn't involve using MaybeUninit, which is very easy to misuse, whereas some sort of &out reference that *must* be fully initialized by the time the function returns would be much better

    • @----__---
      @----__--- 4 months ago +37

      placement new is wip from what I know. in the meantime you can do: vec![0u8; 10000000].into_boxed_slice()

    • @_noisecode
      @_noisecode 4 months ago +83

      Absolutely true that placement new is badly needed in stable Rust. My preference would just be stronger guarantees about move elision, rather than a special attribute or `box` syntax or `&out` parameter type of thing. Since Rust controls its own calling convention, it would be amazing if `Foo::bar([0u8; 10_000_000])` could just initialize the 'prvalue' array directly wherever `Foo::bar` moves it to. In a perfect world this could be done without introducing anything like the incredible amount of complexity C++ has around value categories.
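The `vec!`-based workaround mentioned in this thread can be wrapped up as below. This is a sketch of that one workaround only, and `zeroed_buffer` is a hypothetical helper name.

```rust
/// Allocates `len` zeroed bytes directly on the heap.
/// `vec![0u8; len]` allocates heap storage up front, so no large array
/// has to exist on the stack first, unlike `Box::new([0u8; N])`, which
/// may build the array on the stack before moving it.
pub fn zeroed_buffer(len: usize) -> Box<[u8]> {
    vec![0u8; len].into_boxed_slice()
}
```

The result is a boxed slice (`Box<[u8]>`) rather than a boxed fixed-size array, so the length is tracked at run time instead of in the type.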

  • @majohime
    @majohime 3 months ago +2

    This video is so nice! It just resonates so much with the struggle I had using C++ at first (like finding out that the order of class members affects init order, or that factory functions make much more sense in general), and the constructor problems are covered so well here.

  • @bennettmagy8215
    @bennettmagy8215 4 months ago +3

    Another great video! I'm always astounded when watching your videos how often you state an idea that I say all the time to my teammates. Avoiding partially initialized objects, making constructor bodies empty, being wary of how the spaces between lines can evolve even if the current code looks good. I need to keep directing my teammates to your videos.

  • @MasterHigure
    @MasterHigure 4 months ago +9

    8:30 "Which is something you can grep your code for" is slightly complicated by the fact that the actual mistake can easily happen outside the unsafe block. We can see that in this very example if you pass a variable to new_unchecked() and code somewhere else accidentally sets that variable to 0 before that call.
    It is still a very good idea to have in the language. It just doesn't make it as easy as it might seem at first glance.

    • @anon8510
      @anon8510 4 months ago

      Doesn't Rust restrict the usage of unsafe functions to unsafe blocks only?

    • @MasterHigure
      @MasterHigure 4 months ago +4

      @anon8510 Yes, it does. But that doesn't mean that all potentially unsafe errors happen there. As in this very example:
      let x = 0;
      let y = unsafe { NonZeroU8::new_unchecked(x) };
      The actual mistake happens on the line above the unsafe block, where x was supposed to be set to 1, not in the unsafe block itself. (It is not difficult to imagine that figuring out the value of x could be much more complicated, for instance dependent on input that isn't sanitised when it should be, or set with some crazy math expression that's not quite correctly coded.) Which is to say, you can't JUST grep for and look at the unsafe blocks. You have to inspect all the surrounding code as well.

    • @anon8510
      @anon8510 4 months ago

      @MasterHigure that's a fair point

    • @----__---
      @----__--- 4 months ago +9

      The point is that you can always be sure that the breakage of the invariant happened in an unsafe block. But the thing that breaks the invariant might have been initialized outside of it. Safe/unsafe Rust does not promise anything else, and this is already a lot better than having no safe subset.
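For contrast with the `new_unchecked` example in this thread, here is a sketch of the checked route. The `checked` helper and its error message are hypothetical; the point is only that the same "x was accidentally 0" mistake surfaces as an ordinary value instead of undefined behavior.

```rust
use std::num::NonZeroU8;

/// `x` stands for a value computed somewhere else in the program.
/// If that computation was wrong and produced 0, the caller gets an `Err`
/// to handle, and no `unsafe` block is involved at all.
pub fn checked(x: u8) -> Result<NonZeroU8, &'static str> {
    NonZeroU8::new(x).ok_or("x was zero")
}
```

This does not remove the need to review code around `unsafe` blocks that do exist, which is the concern raised above; it only shows that the safe constructor keeps the mistake inside defined behavior.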

  • @Will0wAWisp
    @Will0wAWisp 4 months ago +24

    6:10 if I remember correctly 42 is what’s set because NSDMI is basically a default value, could be wrong tho lol

    • @__Brandon__
      @__Brandon__ 4 months ago +5

      Yeah, that way you can have multiple constructors that init to different values. If NSDMI took precedence, then you couldn't use the initializer list to assign per-constructor values.

  • @herzenschein
    @herzenschein 4 months ago +1

    The solution you propose at the end (using aggregate data structures to ensure type validity) for the C++ side reminded me of a CppNorth talk from last year, "Writing C++ to Be Read". It touches on the topic of constructor initialization and how aggregate initialization provides advantages for quite a few cases.

  • @nmay231
    @nmay231 4 months ago +1

    Thanks for these videos! An informed opinion, educated discussion, and soothing voice with great visuals make this a great channel. Keep doing what you're doing, man.

  • @dungusberryrocks
    @dungusberryrocks 4 months ago +335

    C++ initialization rules are a mess man

    • @David-pz4gy
      @David-pz4gy 4 months ago +30

      I'm quoting this on my CS final.

    • @oserodal2702
      @oserodal2702 4 months ago +44

      C++ classes are a mess in general.
      Like why the fuck do I have to write a custom destructor, copy-constructor, and a copy assignment operator just to be able to properly handle pointers.
      At least the only real footgun in Rust (in this context) is the `Drop` trait implementation.

    • @ben1996123
      @ben1996123 4 months ago +44

      all of c++ is a mess

    • @hampus23
      @hampus23 4 months ago +10

      No not really

    • @zstewart
      @zstewart 4 months ago +2

      @oserodal2702 and unless you're managing custom resources (raw files, raw memory allocations, etc) or want drop for your type to have side-effects, you mostly don't need to implement drop either!

  • @john.dough.
    @john.dough. 4 months ago +2

    I'm so glad this high-level content exists. Great video. Your expertise and knowledge are clear.

  • @neiljp-dev
    @neiljp-dev 4 months ago +10

    It's been quite a while since I've worked with C++, but that packing of private implementation made me think of the PImpl pattern. Not for the same application reasons, but certainly still to improve safety/stability, and also similarly with a cost from the indirection that cannot be optimized away.

    • @_noisecode
      @_noisecode 4 months ago +9

      It definitely has some similarities to Pimpl, and you could definitely set up your class architecture to get the implementation hiding benefits of Pimpl and the initialization benefits of CString::M in one swoop if you chose to. I do want to point out that in the form I showed in the video, CString::M should actually be utterly transparent to the compiler; I doubt there would be any indirection overhead at all, even in unoptimized builds.

    • @Roibarkan
      @Roibarkan 4 months ago +3

      @_noisecode I tend to agree. The main issue I see with your inner-struct suggestion is that I think it doesn't handle inheritance elegantly

  • @SuperSmashDolls
    @SuperSmashDolls 4 months ago +21

    Reading about the Superconstructing Super Elider brings up another fun thing about Rust: moves. C++ has all sorts of rules about when moves can and can't be elided, specifically because there's API surface for arbitrary types to be told when they are being moved - move constructors, specifically. In Rust, anything can move anything as many or as few times as it likes, and if you don't like that, you have to stuff it behind a smart pointer (like Pin).
    This is mainly notable because it results in one of the biggest problems with Rust/C++ interop: everything has to be behind smart pointers. If you put a C++ type on a Rust stack frame, Rust will move it around without calling the move constructor, which is hilariously unsound.

  • @bonsairobo
    @bonsairobo 4 months ago +6

    I'm not sure I would call these associated fns "factories." The term "constructor" seems to apply equally well. My understanding of factories from OOP is that they are objects whose sole responsibility is constructing other objects (especially abstract ones with multiple implementations). I think there's a key distinction in that factories hold onto configuration data while constructors are simply pure functions.

    • @cgazzz
      @cgazzz 4 months ago +2

      There is an existing design pattern more commonly referred to as "static factory method" or "static creation method" and is a useful simplification of "factory method" when factory interface or constructor arguments aren't necessary. That is essentially what is being called "factory" for short here

    • @mikkelens
      @mikkelens 4 months ago +3

      I misunderstood the idea for the first few minutes of the video because of this as well. I have only heard the factory pattern used to mean "object instance that instantiates other objects", and a (static) "factory" like Vec::new() is often just referred to as a constructor in rust contexts because that word doesn't mean anything else within the language. I am more keen on saying "c++ constructors are something else" than "rust doesn't have constructors, and instead has static factories" as a general rule unless you are actively talking to a c++ person.

    • @----__---
      @----__--- 4 months ago

      I mean yeah, but this is just bikeshedding. Arguing about the naming of such basic things is just a waste of time.

  • @user-ni4cx2hv6b
    @user-ni4cx2hv6b 2 months ago +1

    Impressive, a single vid and you got a subscriber. That didn't happen for me for a long time. I knew C++ was unsafe but this really makes it stick in. I like the m. approach though. Scary to propose it in production code but I for sure will try that in my day to day programs.

  • @greob
    @greob 3 months ago

    Great video, very interesting. Also great editing and presentation. Thanks for sharing!

  • @JakeDownsWuzHere
    @JakeDownsWuzHere 4 months ago +1

    beautiful! thanks for making this!

  • @naturesarmy7936
    @naturesarmy7936 4 months ago +13

    As someone who doesn't know C++ and hasn't learned Rust yet, this was very informative 🎉

  • @BB-hn5sq
    @BB-hn5sq 4 months ago +15

    Honestly one of the better coding talks I have seen. It is very thoughtful and your solution can be easily reasoned about

  • @kirglow4639
    @kirglow4639 4 months ago +1

    Amazingly informative video, as usual. Very well presented

  • @orterves
    @orterves 4 months ago +18

    Most languages benefit from this, particularly when incorporating the Result or Option types.
    Make invalid states unrepresentable.
    It's only a problem when the frameworks being used fail to support it effectively, and force you into the constructor approach

    • @ollydix
      @ollydix 4 months ago +1

      Indeed, and it can be annoying with libraries.

    • @oysteinsoreide4323
      @oysteinsoreide4323 4 months ago +1

      Much legacy code uses old-style C++, which forces you to use the old semantics for much of the stuff. When you work on a system that has code from the previous millennium, you know. And most of the time, it is not an option to rewrite all the code.

    • @orterves
      @orterves 4 months ago

      @oysteinsoreide4323 if it ain't broke don't fix it - there's nothing inherently wrong with a constructor based approach when it's working properly. But new code in such a legacy code base certainly should consider the more robust approach - the two styles can coexist just fine

    • @oysteinsoreide4323
      @oysteinsoreide4323 4 months ago

      @orterves Yes, in some instances, using a better invariant on objects is good. But I'm not sure I would use the factory approach. I would rather have an exception in the constructor, and ensure that the exception is handled. It will be much less code in the classes, and the validity of objects will be equally safe. The most important thing is to have good invariants; that is often lacking in the old programming style, which is much worse, and much more difficult to write yourself out of when the code is complex. There is a lot of code with no clear invariant, and in that case how things are constructed is far less important. You would need to fix the invariants first.

  • @ludoviclagouardette7020
    @ludoviclagouardette7020 3 months ago +1

    Yep, it makes a lot of sense I like that pattern. I will try it. I think I have actually already seen it several times within the standard library

  • @basilefff
    @basilefff 4 months ago +13

    Thanks for thoughtful videos, always a pleasure to watch them

  • @vitulus_
    @vitulus_ 3 months ago +8

    One advantage with C++ constructors is "emplace_back" for vectors and other in-place construction. For Rust, you typically have to hope the compiler optimises it that way (I believe it's called Return Value Optimisation). However, it's of course trivial to have it in C++ since constructors use pointers instead. I know eventually it's going to be solved, but it's taking a bit unfortunately.

  • @leonie9248
    @leonie9248 3 months ago

    As always, your videos are absolutely fantastic. I can’t get enough.

  • @davidrichey2034
    @davidrichey2034 4 months ago +1

    Great insight and explanations as always

  • @goczt
    @goczt 4 months ago +4

    10:28 "Because C++ doesn't have a builtin notion of safety"
    It does, you just dismissed it at 8:57.
    With exceptions enabled you can make an "unsafe" factory function by marking it noexcept. In case of invalid input the constructor will throw an exception, terminating your program. You (almost) don't pay for exceptions, don't have to handle them for such a scenario, and it guarantees the invariant.
    It is a bit 'hacky' and I acknowledge that it is less 'neat' than the rust way, but it's possible. And I don't think your arguments against exceptions hold much ground.
    1. Yes, fully exceptionless C++ is faster. But firstly, exceptions incur most of their cost when they're actually thrown and handled. For most use cases the overhead for exceptions being enabled is negligible. Secondly, if we're doing a feature comparison here, safe Rust is also slower (potentially also really slow and expensive) compared to unsafe Rust. I'd go as far as to say that these arguments have the same weight when comparing language performance, cancelling each other out, although it's too much work to prove it properly.
    2. "Lots of codebases aren't prepared to handle them". Well, sounds like they should start using the language like it's supposed to be used and actually learn the core safety features.
    I agree with the "invisible code path" argument though; it is the only sound one and actually a really strong one.

    • @_noisecode
      @_noisecode 4 months ago

      It would be reasonable to use the success or failure of Rust's NonZeroU8::new as actual regular control flow in your code (say you have an unknown u8 and it's not necessarily an error if it's 0, you just want to take a different code path), whereas it would not be reasonable or wise to do so with a try/catch and a throwing constructor in C++. That's the insight at the heart of my argument that exceptions are too expensive for this simple low-level task.
      The noexcept trick actually sounds more like safe Rust than unsafe: it guarantees a crash (safe) instead of forging ahead with UB (unsafe).
      As for code bases that aren't prepared to handle exceptions, sure, everyone should just git gud, but the fact that the language doesn't force you to contend with them (the invisible code path argument) means this is always going to be an uphill battle. Structuring your code to put failure in the type system with std::optional (or std::expected in the future) gets the type system on your side about requiring you to think about error cases.
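
The two styles contrasted in this exchange can be put side by side. This is a rough sketch, assuming C++17; `create`, `ThrowingNonZeroU8`, and the two `reciprocal_percent` helpers are hypothetical names chosen for illustration, and the code makes no claim about relative performance:

```cpp
#include <optional>
#include <stdexcept>

using u8 = unsigned char;

// Style 1: failure is part of the return type.
class NonZeroU8 {
    u8 value_;
    explicit NonZeroU8(u8 v) : value_(v) {}

public:
    static std::optional<NonZeroU8> create(u8 v) {
        if (v == 0) return std::nullopt;
        return NonZeroU8(v);
    }
    u8 get() const { return value_; }
};

// Style 2: failure is signalled by a throwing constructor.
class ThrowingNonZeroU8 {
    u8 value_;

public:
    explicit ThrowingNonZeroU8(u8 v) : value_(v) {
        if (v == 0) throw std::invalid_argument("value must be non-zero");
    }
    u8 get() const { return value_; }
};

// Ordinary control flow: zero is not an error, just another branch.
inline int reciprocal_percent(u8 v) {
    if (auto nz = NonZeroU8::create(v)) {
        return 100 / nz->get();
    }
    return -1;  // fallback branch for zero
}

// The same logic routed through try/catch.
inline int reciprocal_percent_throwing(u8 v) {
    try {
        return 100 / ThrowingNonZeroU8(v).get();
    } catch (const std::invalid_argument&) {
        return -1;
    }
}
```

Both helpers return the same results. The difference is that the `std::optional` version makes the failure case visible in the signature, so a caller cannot reach the value without acknowledging the empty case.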

  • @sapphyrusxyz
    @sapphyrusxyz 2 months ago +1

    I believe clangd has warnings for every mistake inside a constructor that you mentioned. It will tell you if you're reading from a member that hasn't been initialized yet, it will tell you that you aren't calling an overridden virtual function, etc. Obviously it would be better to disallow those mistakes in the language standard instead of relying on warnings, but good tooling helps a lot.

  • @wChris_
    @wChris_ 4 months ago +4

    You could get your constructors back once you aggregated all of your fields. Then the constructor could simply be this: 'CString() : m(CString::create()) {}' or this 'CString(const char* in) : m(CString::create(in)) {}'. This is actually very efficient thanks to NRVO. You also might want to replace the construct() methods with lambdas to have all the code right there.
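
A minimal sketch of what this suggestion might look like, assuming C++17. The field names (`buf`, `len`, `is_ascii`) and the ASCII check are assumptions made for illustration and may differ from the code shown in the video:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

class CString {
    // All fields live in one aggregate so they can be produced in one step.
    struct M {
        std::unique_ptr<char[]> buf;
        std::size_t len;
        bool is_ascii;
    };
    M m;

    // The factory does all the logic, then builds every field at once.
    static M create(const char* in) {
        const std::size_t len = std::strlen(in);
        auto buf = std::make_unique<char[]>(len + 1);
        std::memcpy(buf.get(), in, len + 1);  // copies the terminating '\0' too
        bool ascii = true;
        for (std::size_t i = 0; i < len; ++i) {
            if (static_cast<unsigned char>(in[i]) > 0x7F) {
                ascii = false;
                break;
            }
        }
        return M{std::move(buf), len, ascii};
    }

public:
    // The constructors only forward to the factory.
    CString() : m(create("")) {}
    explicit CString(const char* in) : m(create(in)) {}

    const char* c_str() const { return m.buf.get(); }
    std::size_t size() const { return m.len; }
    bool is_ascii() const { return m.is_ascii; }
};
```

Because `create` returns a prvalue, C++17 initializes `m` directly from it without an intermediate copy or move, so the constructor body never sees a partially initialized object.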

  • @uis246
    @uis246 4 months ago +2

    9:22 if you know x is not zero you can do `if(!x)__builtin_unreachable();` before calling constructor. This will tell compiler that x=0 is not possible.
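
A small sketch of the hint described here, assuming GCC or Clang (`__builtin_unreachable` is a compiler builtin, not standard C++). The function `divide_by_nonzero` is a hypothetical example:

```cpp
// Assumes GCC or Clang: __builtin_unreachable() is a compiler builtin,
// not part of standard C++.
inline unsigned divide_by_nonzero(unsigned n, unsigned char x) {
    if (!x) __builtin_unreachable();  // promise to the optimizer: x != 0
    return n / x;
}
```

The promise cuts both ways: if a caller ever does pass zero, the behavior is undefined, so this only moves the responsibility for the check to the call site.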

  • @basboerboom9328
    @basboerboom9328 4 months ago +7

    The curly brackets in the C++ constructor are pretty nice for doing some quick math with the passed-in arguments to create a value for another struct variable. Instead of creating an instance and then calling an init function, you can just create an instance and some basic initialization will be done.

  • @LordXelous
    @LordXelous 3 months ago +6

    I always lean towards a static factory creation, usually returning a smart pointer for my C++ creation; I can vouch for this working and being absolutely the way to go. Making my constructors private, adding helper utilities, which you can then control access to the constructors via friendship has solved a lot of headaches for several projects my end. Especially when writing code which is to be used by others, preventing willy nilly stack allocation is really rather good, and though you say "factory" and you get glared at, I absolutely agree with the sentiments in this video.

    • @almightysapling
      @almightysapling 3 months ago

      I love stuff like this. Can you give some examples where normal constructor use might be leading me to willy nilly allocations that I'm overlooking/could be avoiding?
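
The pattern described in this thread (private constructor, static factory returning a smart pointer) might look roughly like the following sketch. `Session` and its empty-name rule are hypothetical, chosen only to give the factory something to validate:

```cpp
#include <memory>
#include <string>
#include <utility>

class Session {
    std::string name_;

    // Private: outside code cannot create a Session on the stack.
    explicit Session(std::string name) : name_(std::move(name)) {}

public:
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;

    // The only way in. Returns nullptr when validation fails.
    static std::unique_ptr<Session> create(std::string name) {
        if (name.empty()) return nullptr;
        // std::make_unique cannot reach a private constructor, so use new here.
        return std::unique_ptr<Session>(new Session(std::move(name)));
    }

    const std::string& name() const { return name_; }
};
```

Every `Session` that exists has passed through `create`, so callers only ever hold either a valid heap-allocated object or a null pointer they have to check.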

  • @Codeaholic1
    @Codeaholic1 4 months ago +6

    I always thought of factories as functions that return new instances of one of a set of classes based on their arguments or other rules. The classic example is an image factory that can return a JPEG image class or a PNG image class based on the image file type. Callers need not care about the specialization and have a single method for construction.

    • @sus7801
      @sus7801 4 months ago +2

      I would call this an 'abstract factory', since an 'image' is abstract

    • @MyAmazingUsername
      @MyAmazingUsername 4 months ago +1

      @@sus7801 But... would you call it an AbstractImageFactoryFactory? 😏

    • @Codeaholic1
      @Codeaholic1 4 months ago

      See: OOP Design Patterns. You might not be wrong, but others have decided this well before you.
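
For readers unfamiliar with the image example in this thread, here is a minimal sketch of that kind of factory. All names (`Image`, `JpegImage`, `PngImage`, `open_image`, `has_suffix`) are hypothetical, and selecting by file extension is a simplification; real code would more likely inspect the file contents:

```cpp
#include <memory>
#include <string>

struct Image {
    virtual ~Image() = default;
    virtual std::string format() const = 0;
};

struct JpegImage : Image {
    std::string format() const override { return "jpeg"; }
};

struct PngImage : Image {
    std::string format() const override { return "png"; }
};

// Helper: true if s ends with suffix.
inline bool has_suffix(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// The factory picks the concrete class; callers only see Image.
inline std::unique_ptr<Image> open_image(const std::string& path) {
    if (has_suffix(path, ".jpg") || has_suffix(path, ".jpeg")) {
        return std::make_unique<JpegImage>();
    }
    if (has_suffix(path, ".png")) {
        return std::make_unique<PngImage>();
    }
    return nullptr;  // unknown file type
}
```

This is a different use of the word "factory" from the one in the video: here the point is choosing a type at runtime, whereas the video's factory functions are about validating inputs before any object exists.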

  • @mCoding
    @mCoding 4 months ago

    Great video, and I thought your perspective was very interesting! Keep at it!

    • @_noisecode
      @_noisecode 4 months ago +2

      Appreciate you watching! Big fan of your channel!

  • @oysteinsoreide4323
    @oysteinsoreide4323 4 months ago +1

    An exception in the constructor would also solve everything. But as you said, it would mandate a try-catch somewhere to avoid issues with it. optional makes it possible to solve it in modern C++.

  • @Gamesaucer
    @Gamesaucer 4 months ago +5

    Constructors can be good as an interface (i.e. you know you're instantiating something but don't know what that thing is), though that does require that all the types you handle require the same amount of arguments for their constructor. It may or may not apply to C++ in particular, but it definitely applies to higher-level languages. For more complex initialisation such as the non-zero refined type case, I think static factory functions are reasonable if you have support for nullable types. Just return a NonZeroU8? from one, and a NonZeroU8 from the other.
    I do agree with many of the points you make, I just don't think they're inherent flaws of the constructor method, just of the way constructors work in C++ specifically. For instance, you could make a guarantee that in the constructor, any field of class A that has type T will have type T? instead until you assign it a value. And if by the end there are paths where not all fields are set, that's a compiler error because you didn't instantiate your object correctly. The type checker should be able to similarly reason about how fields are accessed in methods that you call. If it's just a setter, go ahead. You can treat it as though it accepts a type that has nullable fields of which the class you defined is a refined type. So long as the type-checker determines the method can work with this implicitly defined superset of your class, you're allowed to call it. Most of the other issues come down to a combination of syntax and language semantics, but none of them are flaws with what a constructor method does at its core.

    • @Dooezzz
      @Dooezzz 4 months ago +1

      In case of generics (templates in C++) types may have different number of constructor arguments, the particular constructor used is determined when the template is instantiated, e.g. vector.emplace_back(args...); will accept any number of arguments if the handled type has a constructor that accepts that combination of arguments.

  • @MrRoboticBrain
    @MrRoboticBrain 3 months ago +1

    A few notes:
    1. I'm a bit disappointed, that you didn't go into templates and meta-programming. Especially the NonZeroU8 case is a perfect example where templates can emulate a more complex compile time type system. (even though the syntax makes you want to hurt someone)
    2. The private struct solution is a pattern many code bases already use especially for dynamically linked libraries to isolate the struct layout from the library user called "private implementation" (PIMPL) but it always annoyed me how you either need a pointer indirection or get rid of the C++ type system and resort to basic C-style OOP to make it work without it.
    3. A CS teacher once told me "any design pattern is just a work around for the deficiencies of the language of choice". This video perfectly illustrates that statement!
    4. Making unchecked_new "safer" in C++ is possible with "friend classes" but that is a giant mess in and of itself.

  • @darrennew8211
    @darrennew8211 4 months ago +15

    Very interesting video, thanks!
    Note that the first big general-purpose OOP language was very OOP, in that classes were actually object instances (whose supertype was "Class", which was also an object instance), and the only way to allocate an object was to call a factory function on the class object. So "Point.new()" [not the right syntax] was invoking the new function on the object stored in the global variable Point.
    Anyone who wants to learn more about invariants (and pre- and post-conditions) should check out Meyer's tome "Object-Oriented Software Construction", which you can get as a PDF floating around since he gave away PDF copies with his compiler. It's an interesting delve into how programming languages are designed. Like, what's the mathematical reasoning behind the various higher-level structures. Very handy concepts that have oozed into other programming languages, even if you're not writing OOP programs.
    For example, in Meyer's language, the constructor would have a precondition that the argument passed in isn't zero, and an invariant that the value inside the object isn't zero. That's part of the type signature, so you know that's the requirement. If the requirement is more complex (e.g., an array of bytes contains valid UTF-8) then a non-failing function has to be provided so you can check it. Then the constructor relies on the precondition being met, and if it isn't an exception is thrown in the caller which you cannot catch and continue on from (but you can retry, in a sense), and the top of the exception stack traceback is the caller and not the non-zero-constructor code. Just as an idea of a different way of handling it. His whole exception thing is so much cleaner than other languages.

    • @oysteinsoreide4323
      @oysteinsoreide4323 4 months ago +1

      What was the first general purpose OOP? I thought Simula was the first one. It maybe never became big though.

    • @oysteinsoreide4323
      @oysteinsoreide4323 4 months ago

      It sounds like it is more a use of metaclass programming.

    • @darrennew8211
      @darrennew8211 4 months ago

      @@oysteinsoreide4323 I would say Smalltalk, as Simula was specifically for simulation.

    • @oysteinsoreide4323
      @oysteinsoreide4323 4 months ago

      @@darrennew8211 Well, you can use Simula for general things. Yes, it has a simulation library, but it is still useful for general things. Well, "useful" is a broad term here. Simula is not much used these days; it has mostly been used in universities. And it was the inspiration for Bjarne Stroustrup, who made C++. Smalltalk is probably more popular. But I still see Simula as the first object-oriented language.

    • @oysteinsoreide4323
      @oysteinsoreide4323 4 months ago +1

      @@darrennew8211 The Simula of 67 was made for general purpose. In 62 it was mostly for simulations. The 67 version was the version I used at university back in 1993 to -95.

  • @CYXXYC
    @CYXXYC 4 months ago +24

    what about a private constructor that takes in all fields?

    • @cogwheel42
      @cogwheel42 4 months ago +3

      Was about to suggest the same. I thought that's where he was headed...

    • @_noisecode
      @_noisecode 4 months ago +6

      Well I gotta keep you on your toes, don't I? That's definitely the other valid approach, but I prefer M because it's less typing (don't have to spam out and maintain that constructor that's pure boilerplate), and you get to use designated initializers when creating it in the factory.
      If C++ had reflection and we could auto-generate the boilerplate constructor, I'd be much more attracted to that approach. One of these days....

    • @CYXXYC
      @CYXXYC 4 months ago

      @@_noisecode what about macros or cpp2 @'s

    • @anon8510
      @anon8510 4 months ago +3

      @@CYXXYC macros are just disgusting
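
The alternative raised in this thread, a private constructor that takes every field, might be sketched as follows. `Interval` and its `lo <= hi` invariant are hypothetical and stand in for whatever class is being built; C++17 is assumed for `std::optional`:

```cpp
#include <optional>

class Interval {
    int lo_;
    int hi_;

    // Private "all fields" constructor: pure boilerplate, no logic.
    Interval(int lo, int hi) : lo_(lo), hi_(hi) {}

public:
    // The factory checks the invariant (lo <= hi) before any object exists.
    static std::optional<Interval> make(int lo, int hi) {
        if (lo > hi) return std::nullopt;
        return Interval(lo, hi);
    }

    int lo() const { return lo_; }
    int hi() const { return hi_; }
    int width() const { return hi_ - lo_; }
};
```

The trade-off mentioned in the reply is visible here: the constructor has to repeat every field by hand, and adding a member means touching the declaration, the constructor signature, and the initializer list.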

  • @simonfarre4907
    @simonfarre4907 3 months ago +1

    The unsafe part is wrong actually. We get the same reasoning for NonZeroU8 - it can be 0 and undefined at all points of the program, due to the exposed unchecked API. This is identical to the exposed static function in C++.
    And as for the constructors, you're doing what a lot of people unfortunately mistake constructors in C++ for. You're handling business logic in the constructor. That is "wrong". The constructor is meant to initialize ("acquisition of memory or a resource"), not transform.
    The correct way would have been to have the definition of the constructor be CString(std::unique_ptr&& buf, u64 len, bool is_ascii) noexcept.
    This is a mistake a lot of people make. For instance, you wouldn't do this in Rust (pseudo code): let cstr = CString { ptr: alloc(ptr_cstr, strlen(ptr_cstr)), len: strlen(ptr_cstr), is_ascii: check_ascii(ptr_cstr) }. Obviously, what you would do is construct each part, and then hand it to the initialization of the object. This is how you should *always* go about doing it for classes and structs in C++ too.
    Also, use public members. Write your `Immutable` wrapper class that has a `const T& operator()() const noexcept { return data; }` and you need no getters and setters.
    The problem is people reach for a class WAY, WAY, WAY (I mean, WAAAAAAAY) too often when they really need a struct. The CString in the C++ example should be a struct, honestly. struct CString { Immutable ptr; Immutable len; Immutable is_ascii; };
    Also, emplace_back APIs take `Args...` for a reason! To be able to construct "my" CString type like so:
    vec.emplace_back(std::move(un_ptr), 42, true);
    Otherwise I absolutely agree - constructors have been a mess in C++, since so many people want to do logic in them. That is a problem this video highlights both explicitly and implicitly.
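
One possible reading of the `Immutable` wrapper proposed in this comment is sketched below, assuming C++17. The template arguments, the deleted assignment operators, and the `make_abc` helper are assumptions added for illustration; the comment itself only specifies the call operator:

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

// Read-only wrapper: the value is set once and then exposed via operator().
template <typename T>
class Immutable {
    T data;

public:
    explicit Immutable(T value) : data(std::move(value)) {}
    Immutable(const Immutable&) = default;
    Immutable(Immutable&&) = default;
    // Assumption: assignment is deleted so a member cannot be overwritten.
    Immutable& operator=(const Immutable&) = delete;
    Immutable& operator=(Immutable&&) = delete;

    const T& operator()() const noexcept { return data; }
};

static_assert(!std::is_copy_assignable_v<Immutable<int>>,
              "wrapped values cannot be reassigned");

// Plain struct with public, read-only members.
struct CString {
    Immutable<std::unique_ptr<char[]>> ptr;
    Immutable<std::size_t> len;
    Immutable<bool> is_ascii;
};

// Each part is computed first, then handed over in one initialization.
inline CString make_abc() {
    auto buf = std::make_unique<char[]>(4);  // value-initialized to zeros
    buf[0] = 'a';
    buf[1] = 'b';
    buf[2] = 'c';
    return CString{Immutable<std::unique_ptr<char[]>>(std::move(buf)),
                   Immutable<std::size_t>(3),
                   Immutable<bool>(true)};
}
```

Note that this protects the members, not the invariant: anyone can still brace-initialize a `CString` with a mismatched pointer and length, and the characters behind the pointer remain writable.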

  • @sorek__
    @sorek__ 4 months ago +2

    Awesome video, I really like how clearly you explain and compare things.
    I would love to hear more, especially a Rust tutorial for C++ developers. Some concepts in Rust are just so alien to me that I can't grasp the language properly (like how you would manage a global list of objects for some important thing in a project).

  • @discreaminant
    @discreaminant 2 months ago +2

    10:16 If Rust enforces the invariant through UB (no one can stop you from passing 0, but nothing is guaranteed if you do), couldn't we do the same in C++? Like this:
    if (x != 0) {
        NonZeroU8 res;
        res.value = x;
        return res;
    }
    // Nothing here afterwards. Not even a return statement. If 0 is passed, it's UB since we don't return.
    It actually works (everything is optimized as if x is never 0), except the memory layout is not enforced.
    We can also use std::unreachable in C++23 to cause the UB explicitly.

  • @isaaccloos1084
    @isaaccloos1084 4 months ago +6

    My favorite channel on TH-cam. Another great upload 👍🏻

  • @shenanicode
    @shenanicode 4 months ago +3

    What the hell, this is so good!
    Moving from C++ to C#, I've never been so confident about initialization as with your Rust code.
    So atomic and safe!

  • @VioletGiraffe
    @VioletGiraffe 4 months ago +3

    15:55 "There is no problem that can't be solved with additional layers of abstraction (other than having too many layers of abstraction)". All in all I felt throughout the video that the problem you are presenting a solution to is significantly exaggerated, but if it bothers you so much, I can't argue that it's not a real problem. But not for me (I have over 10 years of commercial C++ experience, FWIW).

    • @potatomaaan1757
      @potatomaaan1757 3 months ago +1

      Yeah, this problem alone probably won't cause many issues, but this is a nice example of how Rust (and some other languages) do things in a somewhat familiar, but also significantly different (and imo better) way.
      I've been writing a lot of Rust for around a year now, and it's really changed the way I write code in other languages as well. I think there are a lot of very valuable lessons to be learned from "the Rust way" of programming, this just being one of them.

    • @user-li2yv5je5e
      @user-li2yv5je5e 3 months ago

      I've run into a few cases of bad things happening during construction, but nowhere near as many as the cases of absolutely maddening things happening during destruction.

    • @VioletGiraffe
      @VioletGiraffe 3 months ago

      @@user-li2yv5je5e, could you give us an example?

  • @tsunekakou1275
    @tsunekakou1275 4 months ago +2

    I use Boost.Outcome + 3-phase initialization. Sure, C++ doesn't have an unsafe keyword, but that's good enough for me. I don't think most C++ programmers need hand-holding while writing a constructor. "Spaces" between lines of code in a constructor sound like a trivial problem, because a constructor should be as simple as possible; if someone writes 50 lines of code in a constructor, that would be smelly to me.
    Maybe I haven't written that kind of constructor, or worked in a team like yours, but I think it is fine for me for now. Very nice video. I like it.

    • @Evan490BC
      @Evan490BC 4 months ago +2

      Yes, but the thing is you might do the right thing but your colleague might not...

    • @tsunekakou1275
      @tsunekakou1275 4 months ago +3

      @@Evan490BC
      That argument can go all the way down to every little thing. Safety is nice but it has a cost, so it becomes a matter of risk assessment and trade-offs for the specific situation. All I am saying is that I have considered the risks and concluded that further guard rails, like an unsafe keyword or preventing "spaces" in constructors, are not that beneficial in my situation/code base. There are no one-size-fits-all gloves.

  • @olovjohansson4835
    @olovjohansson4835 3 months ago +1

    I was thinking of another issue, at least in C++. Even if you initialise your NonZeroU8 properly, there is no protection from setting the member "value" to zero afterwards. You would need a getter, or cast operations, since there is no such thing as read-only fields in C++.

  • @flamewingsonic
    @flamewingsonic 2 months ago

    Around 11:40: Personally, I would use a delegating constructor for CString: add a second constructor taking a pointer and length, and have the first delegate to the second passing the result of calling strlen.
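
The delegating-constructor approach described in this comment could be sketched like this. The member names `buf_` and `len_` are assumptions, and the `is_ascii` field from the video is left out to keep the example short:

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

class CString {
    std::unique_ptr<char[]> buf_;
    std::size_t len_;

    // Target constructor: the length arrives as a parameter, so no member
    // initializer has to read another member.
    CString(const char* in, std::size_t len)
        : buf_(std::make_unique<char[]>(len + 1)), len_(len) {
        std::memcpy(buf_.get(), in, len);
        buf_[len] = '\0';
    }

public:
    // Delegating constructor: strlen runs exactly once, before any member
    // is initialized.
    explicit CString(const char* in) : CString(in, std::strlen(in)) {}

    const char* c_str() const { return buf_.get(); }
    std::size_t size() const { return len_; }
};
```

Since both initializers depend only on the `len` parameter, reordering the members in the class no longer changes behavior, which sidesteps the declaration-order pitfall.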

  • @ETBCOR
    @ETBCOR 4 months ago

    Great video, as always.

  • @clement-poull
    @clement-poull 3 months ago

    Great video, especially the talk about half-initialized instances; this is something that should have been solved ages ago by the standard.
    One thing is bugging me about exceptions though. You mention that they are extremely expensive. This used to be the case, but in most modern compilers, the cost of exception handling should be basically nil unless an exception is actually thrown, in which case yes, exceptions are more costly than checking the return value (mostly due to the handler being stored in cold memory). I used to rely on exceptions as my sole error-handling mechanism. Since moving to Rust as my main programming language, when I go back to C++, I rely on exceptions less often, but not because of performance concerns. I simply want to force users of my API to reason about error handling with std::expected or std::optional (including explicitly saying "I want an exception if the value is not valid").
    I think in some cases exceptions might still be slower than checking the return value because of easier optimizations (including calling std::unreachable in the "failure" branch of the error checking), but I have no data about this.

    • @_noisecode
      @_noisecode 3 months ago +1

      Something I might not have made clear enough in the video is that it's quite often that I WANT to be able to take the failure path. It's useful to have an unknown u8 value, and use the success or failure of constructing a NonZeroU8 from it as normal control flow. Exceptions are this strange form of control flow mechanism that you hope never actually takes the unhappy path--precisely because it's so expensive. I think that's a strange tool to use when there are simpler, more type-safe, and dirt cheap alternatives.
      Here's a quick microbenchmark where exceptions are over a thousand times slower for a simple piece of control flow that takes the failure path. quick-bench.com/q/KDEPFXLc7746GdbKRbyIuPghLe0
      I stand by my claim that this is unacceptably expensive just for a fallible constructor of a low-level primitive like NonZeroU8.

    • @clement-poull
      @clement-poull 3 months ago

      @@_noisecode I did not expect it to be 1000x slower, the "common knowledge" was that it is about 20x slower than error checking. I guess I'll only use exceptions in cases where errors are truly exceptional, like IO operations. Thanks for the answer.

  • @fabricehategekimana5350
    @fabricehategekimana5350 4 months ago

    Video of golden quality, thanks

  • @gbnam8
    @gbnam8 4 months ago

    IIRC from a C++ Weekly video, constructors are just converters (casts). You can't even take the address of a constructor, much like a destructor. Hence, without the explicit keyword, the constructor is used implicitly, much like how casts are done implicitly. The (old) constructor syntax (S s = S(42);) also comes from the syntax for function-style casts (float f = float(42);), so it wasn't designed for constructors (well, the brace initialization S s{42}; or S s = S{42}; kinda makes sense though, but it's newer). Therefore, constructors in C++ began simply as a glorified custom cast, and since this has not changed as the language moves forward, constructors, which could only get some temporary patches and fixes (like the brace initialization syntax), remain broken.
    btw the pattern/paradigm in 16:42 is kinda genius ngl
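
The "constructor as implicit cast" behavior described in this comment can be checked at compile time. A minimal sketch, assuming C++17; `Meters`, `Seconds`, and `as_double` are hypothetical names:

```cpp
#include <type_traits>

// Without `explicit`, a one-argument constructor doubles as an
// implicit conversion from double.
struct Meters {
    double value;
    Meters(double v) : value(v) {}
};

// With `explicit`, the conversion has to be spelled out.
struct Seconds {
    double value;
    explicit Seconds(double v) : value(v) {}
};

inline double as_double(Meters m) { return m.value; }

static_assert(std::is_convertible_v<double, Meters>,
              "double converts to Meters implicitly");
static_assert(!std::is_convertible_v<double, Seconds>,
              "double does not convert to Seconds implicitly");
static_assert(std::is_constructible_v<Seconds, double>,
              "Seconds can still be constructed explicitly");
```

This is why `as_double(2.5)` compiles even though the function takes a `Meters`: the compiler inserts the constructor call as if it were a cast.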

  • @bloodgain
    @bloodgain 3 months ago

    You don't mention if you were aware of it, but Josh Bloch talks about this same thing (as _static factory methods_ ) in _Effective Java._ He gives a lot of the same reasons for it, even though Java has strong guarantees about not letting you operate on an uninitialized object and cleanup on creation failure.

  • @user-zj6xe6ov8z
    @user-zj6xe6ov8z 2 months ago

    Access modifiers are there to protect the programmers, it's not hardware level protected or anything. Like if I know where the instance of an object is stored in memory, I can just jump to it and change to whatever I want.

  • @jakubpavelka5031
    @jakubpavelka5031 4 months ago +2

    I think that there are good reasons to criticize C++, but in my opinion NonZeroU8 is not one of them. Example: in C++ you can easily create the same NonZeroU8 using std::optional, a private constructor, and a static factory function that returns std::optional<NonZeroU8>.
    #include <optional>
    using u8 = unsigned char;
    class NonZeroU8 {
        u8 val;
        explicit NonZeroU8(u8 x) : val(x) {}
    public:
        // Named create() because `new` is a keyword in C++.
        static std::optional<NonZeroU8> create(u8 x) {
            if (x != 0) {
                return NonZeroU8(x);
            }
            return std::nullopt;
        }
    };
    It's important to say that there's the doom of backwards compatibility that makes things pretty annoying, but the C++ standard is trying to make sensible changes to improve the safety and simplicity of the language. For example, in C++23 there is a strong push to use std::expected instead of exceptions to maintain safety with minimal overhead.

  • @WolfrostWasTaken
    @WolfrostWasTaken 3 months ago +1

    This is the best video on the subject by far

  • @jeromej.1992
    @jeromej.1992 3 months ago

    I noticed a similar issue in C# in some unit tests where two-step initialization is kinda forced upon you, and bam, what you said would happen did: stuff was accessed before it was initialized and, worse, it didn't crash. It was happily null and creating really hard-to-find bugs.

  • @yakov9ify
    @yakov9ify 4 months ago

    Used to hate writing factory code whenever I could cause it felt ugly. Though I write rust now this video gave me a new perspective and I will try to use them as much as possible in the future.

  • @natashavartanian
    @natashavartanian 3 months ago

    10/10 video thank you for your contributions 🫡

  • @lockyaw
    @lockyaw 2 months ago

    Super solid video!

  • @JUMPINGxxJEFF
    @JUMPINGxxJEFF 4 months ago

    Interesting content, thanks

  • @zygoloid
    @zygoloid 4 months ago

    Awesome video, thank you!

    • @_noisecode
      @_noisecode 4 months ago +1

      Thank YOU for watching! 🫡

  • @0xCAFEF00D
    @0xCAFEF00D 4 months ago +1

    7:45
    Edit below
    I'm very ignorant of Rust. But this confuses me beyond what I expect.
    Why are we branching if the type-level knowledge that a NonZeroU8 is invalid when it's 0 already exists? You said 0 can represent None in this type.
    Why isn't this factory just 'return x' after compilation? Why is there a branch in the compiler output?
    Whenever you check/extract the Some out of your Option it must check x == 0, unless it can figure out that x isn't 0 of course.
    Maybe it's that branch you're eliminating instead. But it's very confusing to have the factory up on screen and pointing to it while you're actually talking about eliminating the check in Option. What's described doesn't follow the types as written in the code (after the factory you'll have an Option, not a NonZeroU8). Perhaps this is a feature of Rust I don't know about.
    Edit:
    So I let this sit in my brain for a while. Now it makes perfect sense again. It's still confusing to look at, and the description in the video wasn't what I expected for this. But obviously, if I had looked at the return types at 8:09, the unsafe function doesn't return the Option type.
    What made me fail to understand was that we had just talked about how the None state can be represented by 0. I thought that was relevant. It's completely irrelevant. It's an unrelated tangent that's important (wouldn't want to make a u8 fatter), just not relevant to the rest of the video.
    What's relevant is that the unsafe function never makes an Option and guarantees (within the contract given to the caller of course) that the u8 isn't zero.

  • @igs8949
    @igs8949 4 months ago +1

    Great video as always!
    Small correction at 2:38 (and following slides): It is not valid to add a comma after Default::default() (or any value).

    • @_noisecode
      @_noisecode 4 months ago

      Thanks for the correction! You're right.

  • @kaga2922
    @kaga2922 4 months ago +12

    after learning all of c++'s initialization rules i didn't realize it was such a horrible nightmare... and for no justifiable reason. most of the complexity comes from backwards compatibility and lack of foresight.

    • @kuhluhOG
      @kuhluhOG 4 months ago +3

      tbf, the language concept of "moving" stuff (instead of just copying all the time) was added to the language after about 30 years (and still before Rust came into being; where do you think they got it from?)

    • @_noisecode
      @_noisecode 4 months ago +6

      The C++ committee is full of brilliant people doing their absolute best, but yeah backwards compatibility is the real killer. C++11 tried really hard to simplify initialization with uniform initialization, but it wasn't perfect (ahem std::initializer_list), and now we are stuck with its warts forever, and the sum total is that uniform initialization permanently added complexity instead of simplifying.

    • @__christopher__
      @__christopher__ 4 months ago +6

      When C++ was created, people just didn't have decades of experience with this sort of things. I guess technically you can call this lack of foresight, but I think no one can be expected to have that level of foresight.

    • @atijohn8135
      @atijohn8135 4 months ago

      @@__christopher__ the thing is that C++ tries to be 100% backwards compatible. The lack of foresight back then could've been justifiable if the language had later undergone some fundamental change; but it didn't, and that's why it can be a pain in the ass to use nowadays. It doesn't seem like it'll ever be any different, and at this point maybe it's better to let it slowly die off and be replaced by Rust or even Go.

    • @orbital1337
      @orbital1337 4 months ago +9

      Spoiler alert, in 20 years people will say that [xyz] is such a better system than borrow checking and how come the Rust developers had such little foresight to build such a silly language. I mean personally I think that Val's system looks nicer to me than Rust even today. In any case, making programming languages is hard. Maintaining a programming language so that it stays relevant for decades is even harder.

  • @DBZM1k3
    @DBZM1k3 4 months ago +2

    This is a great video! I'm definitely going to send it to some colleagues of mine to check out.

  • @JA-in3hw
    @JA-in3hw 3 months ago +1

    I sometimes dislike how tightly OOP-type things naturally couple lifetime into things. It's nice to use allocators (mainly arena allocators) to get a block and use a factory/C-style init function to fill in a POD struct at some aligned offset into the allocator's big block of memory. People teaching C++ will squawk that it has to be impossible to have invalid data, but sometimes you just have to accept the responsibility.
    By dodging the construction, you can dodge the destruction. Just reset the allocator (that invalidates all structs placed in it) and go again for a new frame or hot loop iteration. I think C++ gets a lot better when you stop forcing everything to tangle data, lifetime, and functions into a class. Data, functions, and memory allocation can be handled separately.

  • @microcolonel
    @microcolonel several months ago +2

    I love how the best solution to C++'s unergonomic initialization is to make a C struct. 😂

  • @eliasrodriues6614
    @eliasrodriues6614 4 months ago +4

    C++ and Haskell are the best

  • @Omnifarious0
    @Omnifarious0 2 months ago

    I think this is a really interesting idea, and a valid C++ criticism. The throwing constructor is a huge concern in a lot of places.

  • @aleksanderwasowicz8499
    @aleksanderwasowicz8499 2 months ago

    15:51 I like how you arrived at the inner struct model for safety reasons. I'm using it because it looks nice in the long run (it states the layout in one place, suggesting that the next 10+ lines don't). And it can shorten your code when you have a lot of overloaded operators.
    But "m" is just ugly; use "mem" at least. For example, I'm using "data", but I'm not rigid on this.

  • @germinolegrand
    @germinolegrand 3 months ago +2

    Your argument at 8:52 is broken by "there's a way to do it (exceptions), but I don't like it, so it doesn't exist, see, (my partial version of) C++ is broken". Well if you don't like it as it is, fine, but that's not a useful argument.
    Also at 10:35, WHY do you even make a create_unchecked function? For the sake of explaining "oh no, we can't know if it's unchecked, C++ is broken"??? If it is necessary for speed you have the EXACT same problem in Rust (sure, you can grep for it with unsafe...). You could instead have put a reinterpret_cast(u8) everywhere speed is required to do exactly that, and it would do the job you want.
    Your point at 12:30 is valid and a real problem.
    Interesting discussion overall 🙂

    • @potatomaaan1757
      @potatomaaan1757 3 months ago

      (disclaimer: i have much more experience with rust than with c++)
      Regarding the first point, he mentioned at least two arguments:
      1. Exceptions create a hidden execution path which not a lot of stuff accounts for and which is easy to overlook when writing code.
      2. Exceptions in c++ specifically kinda suck and are only used sporadically by most projects, many disabling them entirely. If you are working in such a project, you just outright don't have that option.
      Regarding the second point, the line in rust between safe and unsafe code is very clear. And for just about everything, you will get by without unsafe. So when you do end up using unsafe somewhere, you're (hopefully) much more aware of what you are doing.
      And if you do end up with a memory bug, you know that it originates in one of those unsafe blocks. Considering that most projects I've come across feature something from 0 to 0.5% unsafe code, that narrows the search space dramatically. That point is not to be overlooked, I think.

  • @Turalcar
    @Turalcar 4 months ago

    13:56 I vividly remember dealing with uninitialized vtable when I was working in Chromium

  • @capncoolio
    @capncoolio 4 months ago +2

    This all sounds like you actually want to be doing functional programming 🙂
    But yeah, this would probably serve as great context to a talk on referential transparency

  • @John-yg1cq
    @John-yg1cq 3 months ago

    One thing about Cstring class for C++:
    Isn't a valid way of solving that problem to push the responsibility of finding out the string length from the constructor to the caller, and pass it as a parameter?
    The isAscii function could be run on the input pointer too, and then you should be able to establish the invariant straight from valid arguments.
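One way to read this suggestion, as a hedged sketch: the caller supplies the length, and a factory function checks the bytes before any object exists. The class layout, the `create` name, and the non-owning pointer are assumptions made for illustration; only the `isAscii` idea comes from the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical non-owning ASCII string; the caller keeps the bytes alive.
class CString {
public:
    // The caller passes the length, so nothing here scans for a terminator.
    static std::optional<CString> create(const char* data, std::size_t len) {
        if (data == nullptr || !is_ascii(data, len)) return std::nullopt;
        return CString(data, len);  // invariant holds from the first moment
    }

    std::string_view view() const { return {data_, len_}; }

private:
    CString(const char* data, std::size_t len) : data_(data), len_(len) {}

    // Stand-in for the isAscii check mentioned above.
    static bool is_ascii(const char* data, std::size_t len) {
        for (std::size_t i = 0; i < len; ++i) {
            if (static_cast<unsigned char>(data[i]) > 0x7F) return false;
        }
        return true;
    }

    const char* data_;
    std::size_t len_;
};
```

This still trusts the caller to pass a length that matches the buffer, so the responsibility moves rather than disappears.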

  • @coarse_snad
    @coarse_snad 4 months ago +20

    I'm glad I'm not the only one disillusioned with traditional 'seemingly infallible' constructors. Factory functions are a much better way to do this.

    • @PthariensFlame
      @PthariensFlame 4 months ago +5

      While I agree in the end, if you do have to have constructors in your language (say, for backward compatibility with a previous language, or for familiarity to the intended programmer audience), you can do what Swift did and make fallible constructors possible and language-supported.

    • @coarse_snad
      @coarse_snad 4 months ago +1

      @@PthariensFlame
      Yes! That's also a good option.

    • @CorvusPrudens
      @CorvusPrudens 4 months ago +2

      @@PthariensFlame I dunno... I think I'd cry if C++ became any more convoluted.

  • @pixelprizm
    @pixelprizm 3 months ago +1

    Perfect explanation of the benefits of Rust over C++!
    One thought - in Rust, due to unsafe code, you do still have the "initialized values might be invalid" problem, but the difference is that in Rust you can always find the unsafe block, whereas the C++ equivalent is a complicated self-enforced minefield of arcane safety rules you did or didn't follow.

  • @yash1152
    @yash1152 3 months ago

    0:47 thanks for mentioning us - the neither ones too (:

  • @Musicdude14z
    @Musicdude14z 4 months ago +2

    Thanks for the video, this summarized a lot of thoughts I've been having about initialization in C++ -- particularly when it comes to simple "data + behaviors" types.
    Something I feel is missing from this analysis is the role of inheritance and polymorphism in constructor syntax + constraints (particularly multi-inheritance). I'd love to see a deeper dive on this topic that addresses construction of derived classes and how that might (or might not) blend with the factory function API. Would CRTP end up playing a big role? FWIW you could probably leave multi-inheritance with multiple non-pure abstract base classes as a side note or completely omitted.

  • @albertovelasquez9027
    @albertovelasquez9027 4 months ago

    I really love your videos

  • @AaroRissanen-pi4td
    @AaroRissanen-pi4td 3 months ago

    How does the nested structure way of holding member data play with inheritance? So far I've noticed it doesn't really.

  • @Templarfreak
    @Templarfreak 4 months ago

    8:24 this is a VERY good point. the fact that you can explicitly go out of your way to do something in an "unsafe" manner, but then be able to EASILY track down when you have done this is a VERY powerful tool you can have, that also isnt babying and distrusting of the user / developer, which is a pretty pervasive culture that i highly dislike. with this, though, you not only make safer and more robust code but you also lack this inherent distrust of the developer

  • @thesupremegod1
    @thesupremegod1 4 months ago +1

    To be fair, the problems with struct padding and the order/performance of the code are not solved, they're just hidden away, or there are compromises made which you can't control. I think if you need control of such small details, writing code that has to be correct is a decent tradeoff.
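For context on the padding point, a small sketch of how declaration order alone can change a struct's size in C++. The struct names are hypothetical, and exact sizes are implementation-defined; the figures in the comments are what common 64-bit ABIs typically produce, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Members declared in the order the constructor wants to initialize them.
struct InitOrderFirst {
    std::uint8_t flag;   // padding usually follows to align `size`
    std::uint64_t size;
    std::uint8_t kind;   // tail padding usually follows
};

// Same members, sorted largest first to minimize padding.
struct LayoutFirst {
    std::uint64_t size;
    std::uint8_t flag;
    std::uint8_t kind;
};

// Typically 24 vs 16 bytes on common 64-bit platforms (assumption, not checked).
constexpr std::size_t kInitOrderSize = sizeof(InitOrderFirst);
constexpr std::size_t kLayoutSize = sizeof(LayoutFirst);
```

In C++, members are initialized in declaration order, so choosing the second layout also fixes the order in which the members are initialized.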

  • @ShiroAisu10
    @ShiroAisu10 3 months ago

    I do this for a lot of reasons, including not needing to rely on exceptions. Except you cannot RVO at all if you return optional / variant types, which means you most likely will need a move constructor for a lot of your heavier types, and need to delete the copy constructor in order to not get bit.
    C++ constructor wrangling is soulcrushing sometimes, feels like a circus.

    • @_noisecode
      @_noisecode  3 months ago

      Optional and variant do both support in-place construction which can give you RVO if you do things correctly/carefully. The WithResultOf/“Superconstructing Super Elider” I mention near the end can help with this, I’d recommend checking that out in more depth.
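A minimal sketch of the in-place construction mentioned in this reply, assuming C++17. `Connection` and `open_connection` are hypothetical names, and the constructor is left public only because `std::in_place` needs to reach it; this does not reproduce the WithResultOf technique from the linked post.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical type that can be neither copied nor moved.
class Connection {
public:
    explicit Connection(int port) : port_(port) {}
    Connection(const Connection&) = delete;
    Connection& operator=(const Connection&) = delete;

    int port() const { return port_; }

private:
    int port_;
};
static_assert(!std::is_move_constructible_v<Connection>);

// Factory: the Connection is built directly inside the optional, and the
// optional itself is returned as a prvalue, so no copy or move is required.
std::optional<Connection> open_connection(int port) {
    if (port <= 0 || port > 65535) return std::nullopt;
    return std::optional<Connection>(std::in_place, port);
}
```

Because the type is immovable, this only compiles if nothing along the way tries to move it, which is a cheap way to confirm that no move happens.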

  • @rainerwahnsinn3262
    @rainerwahnsinn3262 4 months ago +2

    I only know a bit of Rust and no C++, but each time I'm watching you talk about C++, I appreciate how many footguns Rust avoids which I don't have to worry about. Each one is only a tiny thing to think about, but in aggregate it's the difference between painful worrying and painless comfort.

    • @shinobuoshino5066
      @shinobuoshino5066 3 months ago

      Yes, better avoid useful features and understanding how things work for the sake of soycurity or something, lmao.

  • @pratikkore7947
    @pratikkore7947 3 months ago +4

    learning c++ as a grad student I thought it was cool to know these easter eggs and irrational behaviors that could vary depending on the machine and compiler 😂 now I realise it's basically putting up with someone else's 💩

    • @_noisecode
      @_noisecode  3 months ago +4

      I relate to this. I used to take pride in knowing all the little subtle or esoteric stuff in C++ and thought it was a good use of my brain cycles. I took some time away from C++ and now that I'm using it again, I'm like... man, dealing with all this complexity is mostly just a waste of our precious time on this earth.

    • @lucdina5118
      @lucdina5118 3 months ago

      God.. I’m feeling upset 😞😞😞 I do love C++. It hurts me to hear that!!

  • @letsgetrusty
    @letsgetrusty 4 months ago +3

    Brilliant video 🦀🔥

  • @noelgomile3675
    @noelgomile3675 4 months ago

    I haven't used Rust or C++ but I'm a fan of the way Dart deals with initialization.

  • @OlxinosEtenn
    @OlxinosEtenn 3 months ago +6

    It's a neat idea, but I don't see how you can make it work in practice unfortunately:
    That static factory function returns an `std::optional<T>`, this is all well and good (I don't see what better type to return without exceptions), but you'd like to get rid of that optional wrapper once you've made sure that optional contains a proper `T` and not `std::nullopt`. That means either copy-constructing or move-constructing a new `T` from the `T` inside that optional. Copying will likely be either inefficient or even incorrect if `T` is noncopyable (e.g. if `T` is a wrapper around a file descriptor). Which means we need to be able to move-construct a `T`, so an object of type `T` can be in an empty moved-from state, that is, one of those unusable states we intended to avoid being representable by using that factory method over the constructor in the first place.
    It'd have worked with a destructive move, but we don't have that.
    Bummer.
    That means you'd be doomed to drag that extra `std::optional<T>` everywhere, which kinda defeats the point of having all objects of type `T` always being valid since they're now replaced by `std::optional<T>` (which can be invalid by definition). At best, if you pass a `T&` or `const T&`, you know it's a valid object (well, unless that reference is somehow dangling, but that would be unrecoverable anyway). Still, the caller will have to work with an `std::optional<T>` instead of a `T` and either ignore nullopt checks (i.e. "trust me, I know this optional contains something") or add extra unneeded checks everywhere (slight performance loss, less readable code).
    Still, interesting comparison.

    • @okuno54
      @okuno54 3 months ago +2

      I believe what you're meant to do is create the optional, but then soon after you need to unwrap it: if it's None, report some error, if it's Some, move the inner value out of the option. If it feels silly because you've already validated the invariant, then you could call an unchecked factory... but prefer to refactor your code so that the checked factory becomes your validation.
      Then again, it's been a loooong time since I've done any C++. I will say, I'm usually of the opinion that the performance cost of keeping the invariants is worth the code not segfaulting in production. Obviously, tight loops are an exception, but before any optimization: take measurements!
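A small sketch of the "check once at the boundary" approach discussed in this thread: the optional is inspected in one place, and everything downstream takes a `const T&`, so nothing is moved out and no moved-from state is created. `EvenNumber`, `half`, and `half_or` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical type with an invariant: the stored value is always even.
class EvenNumber {
public:
    static std::optional<EvenNumber> create(int v) {
        if (v % 2 != 0) return std::nullopt;
        return EvenNumber(v);
    }

    int get() const { return value_; }

private:
    explicit EvenNumber(int v) : value_(v) {}
    int value_;
};

// Downstream code takes a reference: no optional, no repeated check.
int half(const EvenNumber& n) { return n.get() / 2; }

// Boundary: the one place that deals with the optional.
int half_or(int raw, int fallback) {
    std::optional<EvenNumber> maybe = EvenNumber::create(raw);
    if (!maybe) return fallback;  // report the failure however fits
    return half(*maybe);          // borrow the inner value, no move needed
}
```

This does not remove the objection entirely: the `std::optional<EvenNumber>` still has to live somewhere for as long as the reference is in use.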

    • @antagonista8122
      @antagonista8122 months ago

      @@okuno54 That's the issue with the C++ community nowadays. They seem to forget that it doesn't matter how fast something doesn't work.

  • @MrWorshipMe
    @MrWorshipMe 4 months ago

    The nonzero class could have been done differently:
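    #include <optional>
    // u8 as used in the video (an 8-bit unsigned integer alias)
    class NonZeroU8 {
        std::optional<u8> _val;  // stays empty when constructed from 0
    public:
        NonZeroU8(u8 val) {
            if (val > 0)
                _val = val;
        }
    };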

  • @lMINERl
    @lMINERl 4 months ago +3

    Great video as always thanks

  • @0xCAFEF00D
    @0xCAFEF00D 4 months ago +1

    I find it weird to mention out parameters as bad practice without any regard for the underlying reasons like unexpected mutation by ignorant callers (or poorly written functions).
    I can't think of any reason that holds for a standardized language feature like C++ constructors. The constructor exists to mutate the this pointer primarily.
    The audience is left to guess for themselves what the issue is. Hope the rest of this isn't as empty.