C++ Weekly - Ep 359 - std::array's Implementation Secret (That You Can't Use!)

  • Published on 28 Sep 2024

Comments • 95

  • @gurutonic
    @gurutonic 1 year ago +23

    <cstddef> might be a more proper place to find std::size_t

  • @Nobody1707
    @Nobody1707 1 year ago +34

    What a wonderful example of why deducing this was standardized. It can't actually be used for `std::array`, mostly for compatibility concerns, but any third party container will absolutely want to deduce this instead of writing four overloads for every member function.

    • @ohwow2074
      @ohwow2074 1 year ago +14

      Actually 8 overloads. You forgot rbegin, rend, crbegin, crend

  • @friedkeenan
    @friedkeenan 1 year ago +8

    Would've been neat to touch on the specialization for zero-length arrays that is necessary without a compiler extension
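A zero-length C array member (`T data[0]`) is not valid standard C++, which is why a hand-written array template needs a separate case for N == 0. The sketch below shows one possible shape of that specialization; `simple_array` is a hypothetical name, and returning `nullptr` from `begin()`/`end()` is just one simple choice, not necessarily what any standard library does.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal array wrapper for illustration.
template <typename T, std::size_t N>
struct simple_array {
    T data[N];
    constexpr std::size_t size() const noexcept { return N; }
    constexpr T* begin() noexcept { return data; }
    constexpr T* end() noexcept { return data + N; }
};

// Partial specialization for N == 0: no array member at all,
// so no reliance on a zero-length-array compiler extension.
template <typename T>
struct simple_array<T, 0> {
    constexpr std::size_t size() const noexcept { return 0; }
    constexpr T* begin() noexcept { return nullptr; }
    constexpr T* end() noexcept { return nullptr; }
};

inline void demo_zero_length() {
    simple_array<int, 0> empty{};
    assert(empty.size() == 0);
    assert(empty.begin() == empty.end());  // empty range

    simple_array<int, 2> two{{1, 2}};
    assert(two.end() - two.begin() == 2);
}
```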

  • @nmmm2000
    @nmmm2000 1 year ago +1

    Did not know you can tuple-ize that easily.
    Thanks!

    • @simonfarre4907
      @simonfarre4907 1 year ago

      Honestly though when you want to do destructuring, the data should be plain old data (POD) at which point you don't need any of that arcane stdlib stuff (which, as far as I know, also introduces more machine code to the final binary).
      Destructuring the data directly is what I would prefer, i.e. `const auto [a, b, c] = array.data`.

    • @nmmm2000
      @nmmm2000 1 year ago

      @simonfarre4907 Not sure if this is related to my comment? `auto [a, b, c] = array.data` copies the data. In the array Jason wrote, there is no proper destructor, but it is really difficult to write one properly: you need to create a base class and extend it under a template condition, so that there is a destructor only for non-trivial types, etc. If Jason did that, the point he made would be lost.

    • @simonfarre4907
      @simonfarre4907 1 year ago

      @nmmm2000 yes you are correct, that copies the data, just like Jason did in the video. If you want references, you just say "auto& [a, b, c] = array.data" instead. But if you are not going to mutate the ints, copying them is *probably* faster than taking individual references.

    • @yurkoflisk
      @yurkoflisk 1 year ago

      @simonfarre4907 Well, even "auto [a, b, c] = ar;" would make a, b, c references, but into a copied array with a compiler-defined name (a, b, c are defined to be references, in the sense of the tuple interface, into a hidden object, namely the one you would get if [a, b, c] were replaced with that generated name). Though, of course, in simple cases optimization strips such things away and can sometimes even avoid creating an array altogether.
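The copy-versus-reference behaviour discussed in this thread can be checked with a short sketch. It uses `std::array` rather than the hand-written array from the video, and `demo_bindings` is a made-up function name.

```cpp
#include <array>
#include <cassert>

inline void demo_bindings() {
    std::array<int, 3> arr{1, 2, 3};

    // Plain `auto`: the names bind to a hidden COPY of arr.
    auto [a, b, c] = arr;
    a = 100;
    assert(arr[0] == 1);  // the original is untouched
    assert(b == 2 && c == 3);

    // `auto&`: the names bind to the elements of arr itself.
    auto& [x, y, z] = arr;
    x = 100;
    assert(arr[0] == 100);
    assert(y == 2 && z == 3);

    // `const auto&`: read-only access, no copy of the array.
    const auto& [p, q, r] = arr;
    assert(p == 100 && q == 2 && r == 3);
}
```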

  • @dekrain
    @dekrain 1 year ago +3

    It actually IS possible to make access of the array member undefined behavior. Just state in the documentation that the name of the member is unspecified and its use undefined 😉

  • @davidbell304
    @davidbell304 1 year ago +2

    It would be nice to see something like a same_for_const keyword/modifier that would just make the const version of functions for you.

    • @JonahFoley
      @JonahFoley 1 year ago +2

      Have you heard of 'deducing this'? It's a C++23 feature which does essentially this.

    • @davidbell304
      @davidbell304 1 year ago +1

      @JonahFoley Thanks. I think Jason did a video on it a while back, but I had forgotten.

    • @JonahFoley
      @JonahFoley 1 year ago

      @davidbell304 yeah, I'd check that out, th-cam.com/video/QyFVoYcaORg/w-d-xo.html this is also a great cppcon talk which covers the matter

  • @Xilefian
    @Xilefian 1 year ago +2

    6:19 I started using a single underscore `_` to define a variable that the user really should not care about. I imagine I'd probably do `_0`, `_1`, `_2` if there are multiple variables (but in that scenario I would say there's a code smell somewhere)
    I started this when I noticed CLion's parameter name hints ignored underscores, and the existence of the hint made the code look ugly in the IDE, so I went with `_` when it was incredibly obvious. Perhaps `_` in a user array implementation makes sense

    • @michaelwaters1358
      @michaelwaters1358 1 year ago +1

      this is undefined behavior, per the standard. so you are risking name conflicts with other objects / types / etc with these names. While it is unlikely, technically your program can get totally messed up / do random things / generate a minecraft world because you can break the standard library or have the standard library break your code.
      I recommend not doing that just in case, it bit me once in a program I worked on in university.

    • @danielb2392
      @danielb2392 1 year ago +7

      @michaelwaters1358 it's not UB if it's an underscore followed by a non-uppercase character (or followed by nothing), in a non-global scope. So a local or member variable called just _ is fine.

    • @michaelwaters1358
      @michaelwaters1358 1 year ago

      @danielb2392 oh I actually misread his comment, I didn't realize he meant a variable that is just an '_'. Never mind, thanks for the clarification

    • @danielb2392
      @danielb2392 1 year ago

      @michaelwaters1358 _0, _1, etc are also OK, as is _lowercase, if not at global ::scope. :-)

  • @embeddor3023
    @embeddor3023 1 year ago +2

    Wouldn't it be a good idea to start another channel called "Rust Weekly" where you basically go over the same topics as here, but for Rust? I think this would be greatly beneficial for your audience.

    • @quintrankid8045
      @quintrankid8045 1 year ago +4

      Why Rust in particular?

    • @ussgordoncaptain
      @ussgordoncaptain 1 year ago +1

      @quintrankid8045 Rust is very similar to C++ but has compile time safety checking (there are other differences but that's far and away the main reason people like rust AFAICT)

    • @embeddor3023
      @embeddor3023 1 year ago +1

      @quintrankid8045 Because it's praised as the system programming language for the next 40 years. Personally, I think it has solid foundations to be a proper replacement for C++ regarding new software and I'd like to know more about it from the perspective of a senior C++ developer like Jason Turner.

    • @oleksiistri8429
      @oleksiistri8429 1 year ago

      It should not be seen as a replacement, but as a competitor - sure.

  • @selvakumarjawahar
    @selvakumarjawahar 1 year ago +26

    Why is the array data element not private?

    • @fcolecumberri
      @fcolecumberri 1 year ago +34

      Because then you need a constructor to initialize it

    • @cppweekly
      @cppweekly 1 year ago +7

      We're relying on direct initialization, which is very important for performance and usability.
      See also: th-cam.com/video/3LsRYnRDSRA/w-d-xo.html
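The answers in this thread can be checked with type traits. The sketch below compares two hypothetical wrappers (`public_array` and `private_array` are made-up names): only the one with a public data member and no constructors is an aggregate, so only that one can be initialized directly with a braced list.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Public data member, no user-provided constructors: an aggregate.
template <typename T, std::size_t N>
struct public_array {
    T data[N];
};

// Same storage, but private: no longer an aggregate.
template <typename T, std::size_t N>
class private_array {
    T data[N];

public:
    constexpr std::size_t size() const noexcept { return N; }
};

static_assert(std::is_aggregate_v<public_array<int, 3>>);
static_assert(!std::is_aggregate_v<private_array<int, 3>>);
static_assert(std::is_aggregate_v<std::array<int, 3>>);

// Aggregate initialization works with no constructor at all.
inline constexpr public_array<int, 3> ok{{1, 2, 3}};
static_assert(ok.data[2] == 3);

// private_array<int, 3> bad{1, 2, 3};  // would not compile: no matching constructor
```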

  • @ricopin
    @ricopin 1 year ago +17

    Maybe nitpicky but writing a c array like 'S s[5]' DOES initialize the array. It always calls the default constructor of every element in the array. It just happens to be that the default constructor of int isn't doing anything.
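A small sketch of the distinction made here: declaring an array of a class type runs the default constructor for every element, while the same declaration for `int` performs no initialization at all. `Counted` and `demo_default_init` are hypothetical names used only for this illustration.

```cpp
#include <cassert>

// Hypothetical type that counts how often its default constructor runs.
struct Counted {
    static inline int constructed = 0;
    int value;
    Counted() : value(42) { ++constructed; }
};

inline void demo_default_init() {
    Counted::constructed = 0;

    Counted items[5];  // default-initialization: Counted() runs five times
    assert(Counted::constructed == 5);
    assert(items[0].value == 42 && items[4].value == 42);

    int raw[5];   // default-initialization of int does nothing: values are indeterminate
    raw[0] = 7;   // so write before reading
    assert(raw[0] == 7);

    int zeroed[5]{};  // value-initialization: every element is zero
    assert(zeroed[0] == 0 && zeroed[4] == 0);
}
```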

  • @ohwow2074
    @ohwow2074 1 year ago +29

    Just a quick reminder that `std::size_t` is defined in <cstddef> along with its signed counterpart `std::ptrdiff_t`. The <cstdint> header is only for fixed integer types like `std::int8_t`, `std::uint32_t`, etc.
    Also you should include in order to use `fmt::print` and not header.

    • @uvuvwevwevweossaswithglasses
      @uvuvwevwevweossaswithglasses 1 year ago

      Actually 🤓

    • @SK83RJOSH
      @SK83RJOSH 1 year ago

      And yet, SIZE_MAX/PTRDIFF_MAX are in cstdint - because C is cursed. :)

    • @fullfungo
      @fullfungo 1 year ago +2

      Signed counterpart of size is ssize.
      Ptrdiff is an unrelated type (but usually “the same” as ssize)

    • @ohwow2074
      @ohwow2074 1 year ago

      @fullfungo They are the counter types of each other. There's no such type as ssize in the standard. However there's a function called std::ssize() which actually returns a std::ptrdiff_t as the result. And obviously std::size() returns a std::size_t.

    • @fullfungo
      @fullfungo 1 year ago +3

      @ohwow2074 What I mean is that they are *not designed* to be compatible.
      ptrdiff_t can be smaller than size_t, so you cannot rely on it as “the signed counterpart” of size_t.

  • @TNothingFree
    @TNothingFree 1 year ago +19

    Would like to see more on structured bindings in C++ (8:47)
    Good stuff, Thanks!

    • @simonfarre4907
      @simonfarre4907 1 year ago

      Destructure the public data member directly, using `const auto [a,b,c] = array.data` - this won't introduce any function calls to the final binary either, as this will just create new names / references.
      Which is why I prefer public-by-default design. Very rarely do I find good reason to make members private.

    • @TNothingFree
      @TNothingFree 1 year ago

      @simonfarre4907 Yeah I also began designing my types with public members, it's very convenient, I also add const to mark them immutable.
      But the thing he used in the video is std::tuple_element which is a bit weird

    • @simonfarre4907
      @simonfarre4907 1 year ago +1

      @TNothingFree Yeah I agree. As far as I understand, it's basically pre-destructuring functionality (the getters etc.), meaning pre-C++17 (if I'm not mistaken), so from before destructuring was actually standard in the language. But don't quote me on that.
      Yeah, I unfortunately got taught a lot of nonsense at college back in the day (where the main language used was Java, except the courses I chose because I wanted to take them, like C++ & C for instance). It took some time to shake the habit of making almost everything private. This was quite a long time ago now though, but nonetheless, colleges and universities here seem to teach it still, which is unfortunate.

    • @rauldragu9447
      @rauldragu9447 1 year ago +2

      @TNothingFree I like public members a lot too, but not const members, I think those are an anti-pattern. You can afford to make everything public when you don't have any invariants i.e. your members are completely independent, with no binding between them. An example of an invariant is a struct named Foo with a std::vector nums and a std::size_t odd_nums_count. In this case you require setters and getters to maintain the one true value inside odd_nums_count. If you were to publicly expose those two the invariant could easily be broken. You could argue that you only need to set those at construction and so you can mark them both const but that kinda breaks both value and const semantics. If you create another class Bar which has a member Foo foo you are forced to direct initialize foo inside Bar's constructor. If you needed to acquire a resource inside the constructor and only then initialize foo (read the vector of nums from std::cin for example) you can't. With a normal value-type you could acquire a Foo temp_foo = ReatFooInput(); from somewhere and then just move it in my_foo = std::move(temp_foo); or even better my_foo = ReatFooInput(); . But instead you are forced to wrap my_foo in a type that allows deferred initialization like a std::unique_ptr (idk if you can use std::variant/optional/expected to also defer initialization).
      TLDR: When you mark a single class member as const, the whole class gains all the restrictions of a const type, even if it's not. Always try to write out the appropriate setters/getters if you need to maintain some invariant, don't fake it with const members.
      Note: Marking a class member mutable is way less bad (even if some people would tell you otherwise). If it's on a private member, then it's invisible to the class user, and if it's public, then it's at least opt-in (even tho it's bad form to mutate const variables). On the other hand if you mark a class member const, the user of that class is always reminded of it and has to constantly (get it?) work around the broken value semantics.

  • @LenPopp
    @LenPopp 1 year ago +5

    oh god listening to this while I try to write my first ever Python program was a mistake.

  • @rere4202
    @rere4202 1 year ago +6

    8:05 since constexpr is implicitly inline, size() == 0 turns into _Nm == 0 at compile time, so this is just for better readability. For someone who just jumped into the code, size() is pretty self-explanatory but _Nm requires some digging.

    • @Cornyfisch
      @Cornyfisch 1 year ago

      Yes, I don't know why they named it `_Nm`, something like `_Size` or `_Count` would be simpler perhaps.

    • @simonfarre4907
      @simonfarre4907 1 year ago

      @Cornyfisch I mean for me it's quite obvious. They chose this design because it's obstructive and meant to be as difficult to read as possible. If I'm allowed to speculate somewhat in bad faith, I would think it's because they consider themselves so superior or smart: why write code that's easy to understand for the overwhelming majority of people? No, let's just use arcane and strange abbreviations and acronyms.
      C++ has a long way to go here. When digging through source code of other newer languages, it's clear they've learned the lesson.

    • @Cornyfisch
      @Cornyfisch 1 year ago

      @simonfarre4907 Haha I can see your point.
      All these strange conventions like Hungarian notation, LPCWSTR (or whatever it's called) and weird _Ldng_snke_case__ things can seem like magical incantations at times.

    • @simonfarre4907
      @simonfarre4907 1 year ago +1

      @Cornyfisch yeah! I would be more forgiving if it was just found in stdlib stuff from the 90s - but noooo, even newer additions like ranges have to suffer from it.

    • @not_ever
      @not_ever 1 year ago

      @simonfarre4907 "Never attribute to malice that which is adequately explained by stupidity".

  • @carloslemos6919
    @carloslemos6919 1 year ago +11

    For those who want similar content on "implementing the basic components of std from scratch" and C++ basics/internals/history, the Alex Stepanov lectures in the A9 videos are a must.
    www.youtube.com/@A9Videos/playlists

  • @paulcook2320
    @paulcook2320 1 year ago +7

    Why isn't it implemented in the standard library with the _M_values declared private, if it is illegal to access?

    • @HitoPrl
      @HitoPrl 1 year ago +21

      Because then it is no longer an aggregate type, and thus you couldn't initialize it with aggregate initialization.

    • @paulcook2320
      @paulcook2320 1 year ago

      @HitoPrl Ah, I see. Thanks.

  • @TsvetanDimitrov1976
    @TsvetanDimitrov1976 1 year ago +1

    all the boilerplate required is just irritating. wish there was some shortcut for all those trivial overloads

    • @BenjaminBuch
      @BenjaminBuch 1 year ago

      With C++23 there is! 🥳
      th-cam.com/video/QyFVoYcaORg/w-d-xo.htmlm12s
      Deducing this based CRTP makes much better Mixins possible!

  • @djee02
    @djee02 1 year ago +1

    Why are the accessors constexpr? Isn't the fact that they are inline enough for the compiler?

    • @cppweekly
      @cppweekly 1 year ago +3

      They must be explicitly marked constexpr if you want to use them in a constexpr context (and we do want to!)
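To make the reply concrete: a member function defined inside the class is implicitly inline, but it is only usable in a constant expression if it is also marked `constexpr`. A minimal sketch, with `ct_array` and `runtime_at` as hypothetical names:

```cpp
#include <cstddef>

// Hypothetical wrapper for illustration.
template <typename T, std::size_t N>
struct ct_array {
    T data[N];

    // constexpr: callable at compile time as well as at run time.
    constexpr std::size_t size() const noexcept { return N; }
    constexpr const T& operator[](std::size_t i) const noexcept { return data[i]; }

    // Implicitly inline (defined in-class) but NOT constexpr.
    const T& runtime_at(std::size_t i) const noexcept { return data[i]; }
};

constexpr ct_array<int, 3> values{{10, 20, 30}};

static_assert(values.size() == 3);
static_assert(values[1] == 20);
// static_assert(values.runtime_at(1) == 20);  // error: call to non-constexpr function
```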

  • @stevesimpson5994
    @stevesimpson5994 1 year ago +1

    Can I assume the memory layout of std::array is the same as a c array? I couldn't find anything in the standard, but it would seem to make sense.

    • @Modinthalis
      @Modinthalis 1 year ago

      Yes

    • @cppweekly
      @cppweekly 1 year ago +2

      It is, as shown, simply a wrapper around a C array. So the layout will be the same as its data, which is a C array.
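A quick way to sanity-check this on a given toolchain is sketched below. The size equality holds on the mainstream standard library implementations, but treat it as an observation about those implementations rather than a quoted guarantee; what the standard does promise is that the elements are contiguous and reachable through `data()`. `demo_layout` is a made-up function name.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Expected to hold where std::array is just a wrapper around T[N] with no extra members.
static_assert(sizeof(std::array<int, 4>) == sizeof(int[4]));

inline void demo_layout() {
    std::array<int, 4> arr{1, 2, 3, 4};

    // Contiguous storage: data() can be handed to C-style APIs such as memcpy.
    int raw[4];
    std::memcpy(raw, arr.data(), sizeof(raw));
    assert(raw[0] == 1 && raw[3] == 4);

    // The first element sits at the start of the object.
    assert(static_cast<void*>(&arr) == static_cast<void*>(arr.data()));
}
```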

  • @victotronics
    @victotronics 1 year ago +2

    No constructor. So I can not "void f( array i ); f( {1,2} );" Annoying.

    • @TNothingFree
      @TNothingFree 1 year ago

      How about a reference?

    • @TheClonerx
      @TheClonerx 1 year ago +1

      But you can do `f(array { 1, 2 });`!

    • @victotronics
      @victotronics 1 year ago

      @TheClonerx Not as elegant. (What if "void f( array< array, 2 > )"?) Things with a constructor you can specify by their elements.

    • @SirRebonack
      @SirRebonack 1 year ago

      @victotronics This works. What's the problem?
      f( array { array { 1, 2 }, array { 3, 4 } } );

    • @NoNameAtAll2
      @NoNameAtAll2 1 year ago +4

      you can do f( {{1,2}} );
      still annoying, but...

  • @andreialdea6072
    @andreialdea6072 1 year ago

    The Waffle House Has Found Its New Host

  • @dwarftoad
    @dwarftoad 1 year ago

    Nice peek under the hood, should be convincing that using std::array instead of C array won't necessarily add much or any runtime or storage cost in normal usage (just a little bit of compilation time; and at() can throw but use -fno-exceptions if you are worried about that, or don't make out of bounds accesses!) but adds a better and safer interface.

    • @sqlexp
      @sqlexp 1 year ago +2

      std::array has operator[], which does not throw exception.
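The difference between the two accessors can be sketched as follows: `at()` checks the index and throws `std::out_of_range`, while `operator[]` performs no check (an out-of-range index there is undefined behavior, so it is deliberately not exercised). `at_throws` is a hypothetical helper name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Returns true if arr.at(index) throws std::out_of_range.
inline bool at_throws(const std::array<int, 3>& arr, std::size_t index) {
    try {
        (void)arr.at(index);  // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

inline void demo_at_vs_subscript() {
    const std::array<int, 3> arr{1, 2, 3};
    assert(!at_throws(arr, 2));  // last valid index
    assert(at_throws(arr, 3));   // one past the end: at() throws
    assert(arr[2] == 3);         // operator[] on a valid index: no check, no exception
}
```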

  • @violetasaravia7572
    @violetasaravia7572 1 year ago

    What compiler parameter shows the uninitialized error at 3:20? I'm not getting it :/

    • @twinkel88
      @twinkel88 1 year ago +1

      GCC is used. You have to add the flags "-Wpedantic -Wall -Wextra -Wconversion" or at least the first two (I think) to get the error

  • @fennecfox2366
    @fennecfox2366 1 year ago

    Very informative. Thank you.

  • @tauicsicsics
    @tauicsicsics 1 year ago

    What IDE/page is at 07:42 when looking at GCC code? thanks

  • @fcolecumberri
    @fcolecumberri 1 year ago +7

    Just as a small note, depending on the allocator used, std::vector can also be stack allocated.

    • @antagonista8122
      @antagonista8122 1 year ago +4

      And then it unfortunately becomes one of the worst option possible - mixed value/reference type.
      Semantics become much worse and you need to carry stack buffer alongside vector everywhere (in e.g. return statement).
      Types like static_vector/small_vector are much better replacement if you need stack-allocated/small buffer optimized vector.

    • @fcolecumberri
      @fcolecumberri 1 year ago +1

      @antagonista8122 I am not saying it is a good idea in general terms (I can think of specific cases where it is), all I am saying it can be done.

    • @simonfarre4907
      @simonfarre4907 1 year ago +1

      @antagonista8122 static_vector / small_vector solve a subset of the same things that PMR solves. Using that same logic, they become one of the worst options possible as well.
      The point about PMR is the fact that you get full control of for instance memory dealloc. Using PMR, you can "blink" (i.e. it resolves to a simple `munmap` of the memory) the collection out of memory. Poof, a no-op, essentially. As far as I know, static_vector and small_vector both run the destructors of their element types.
      But being able to blink away an entire collection, without even worrying about destructors, meaning, the destructors aren't even run at all, (obviously, the contained types will have to use the same allocator as well), can be extremely performant.
      Using PMR, you can basically build pseudo-stack allocated containers, meaning, they're not on the stack, they're on the heap, but some operations on them, quite literally behave as though they were stack allocated (referring to the blinking out of existence).

    • @scarletlettersproductions4393
      @scarletlettersproductions4393 1 year ago

      @simonfarre4907 I'm not sure where you got your info from, but objects on the stack are always destructed unless their destructor is trivial, they never just "blink out of existence".
      Also, an allocator whose destroy member function doesn't actually call the destructor of the object is a really bad idea, as it could easily cause serious issues (memory not being freed, resources not being released, etc). Also, the destroy member function was deprecated in C++ 20, so in future versions of the standard it could be removed/unused entirely.

    • @simonfarre4907
      @simonfarre4907 1 year ago

      @scarletlettersproductions4393 yes, but objects constructed on the stack can have their destructors inlined. So there is still a performance benefit there. I've seen the performance gain in GDB codebase, personally. The comparison to stack allocated objects versus heap allocated ones was just to use something as a frame of reference, although probably not the best one.
      As far as blinking out of existence, this is essentially what can be used in games to avoid having to spend unnecessary time destructing objects since you A: are in complete control of the objects' lifetimes and B: are in complete control of the memory that is being used. I suggest you watch the Bloomberg CppCon talk on Polymorphic memory allocators.
      So you are absolutely, 100% wrong about it being a bad idea. It is most certainly not a bad idea if the problem domain warrants it. It can have fantastic results on memory diffusion (and fragmentation) and performance, but it has to be used correctly. I think you probably have to rethink your understanding of allocators to really understand what can be gained here and the aforementioned cppcon talk from Bloomberg on Polymorphic memory allocators is a great place to start. There are many more PMR talks on cppcon though. Specifically, watch "CppCon 2017: John Lakos “Local ('Arena') Memory Allocators" parts 1 and 2.
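For readers who have not used PMR, here is a minimal sketch of the "vector backed by a stack buffer" idea this thread started from, using the C++17 `<memory_resource>` facilities. The buffer size and the function name `sum_with_stack_buffer` are arbitrary choices for illustration. It does not demonstrate skipping destructors; it only shows that the vector's storage can come from a local buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

inline int sum_with_stack_buffer() {
    // Raw storage that lives in this function's stack frame.
    alignas(std::max_align_t) std::array<std::byte, 1024> buffer;

    // Upstream is null_memory_resource(): if the buffer is exhausted, allocation
    // throws std::bad_alloc instead of silently falling back to the heap.
    std::pmr::monotonic_buffer_resource arena(
        buffer.data(), buffer.size(), std::pmr::null_memory_resource());

    std::pmr::vector<int> numbers(&arena);
    numbers.reserve(16);  // a single allocation carved out of the stack buffer
    for (int i = 1; i <= 10; ++i) {
        numbers.push_back(i);
    }

    int sum = 0;
    for (int n : numbers) {
        sum += n;
    }
    // The vector points into `buffer`, so it must not outlive this scope.
    return sum;
}
```

This also shows the drawback raised earlier in the thread: the vector and its buffer have to travel together, so returning `numbers` from the function would leave it pointing at dead stack memory.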

  • @sqlexp
    @sqlexp 1 year ago +1

    People have been trying to use string literals as template arguments and failed. I just use std::to_array to convert string literal to std::array and use the std::array as template argument. You have to ignore the terminating null, but I use std::make_index_sequence and some template magic to remove the null from the std::array.

    • @antagonista8122
      @antagonista8122 1 year ago

      The best option in this case is something like StringLiteral class template that contains underlying char array as public member and have constructor from c array reference.
      You can add std::string like interface, concatenate string literals at compile time etc.
      You can also add templated user defined literal e.g. _sl to construct StringLiteral object directly from "some_string"_sl.

  • @lukasz_kostka
    @lukasz_kostka 1 year ago

    HAHA I broke your code 😀

  • @trondenver5017
    @trondenver5017 1 year ago +2

    Video starts off by denigrating C arrays (fair enough) then proceeds to write the same function in 8 different incantations to satisfy the C++ compiler. Of course there are issues with C array/pointer duality, there is talk of fat pointer primitives in C23 but I don't think anyone knows for sure. It's not that painful to roll your own range for loop enabled array in C99 (with a little less syntactic sugar, if the preprocessor offends you). Love the videos JT but to me it's unfortunate so much effort is wasted coping with the accidental complexity of C++. I won't hold my breath for C weekly, but I can dream.

    • @cppweekly
      @cppweekly 1 year ago

      Thanks. I often forget which is the best header for size_t

  • @Dave_thenerd
    @Dave_thenerd 1 year ago +2

    One thing that infuriates me about std::array is that unlike std::vector "capacity" and "size" are the same value. Thus if you have situations like this:
    std::array<double, 10'000'000> arr;
    std::fill(arr.begin(), arr.end(), 0.0);
    // Imagine some type of user input like a stream from a signal of some kind here.
    arr[0] = 1.1;
    arr[1] = 45767.98;
    arr[2] = 2.33;
    //...
    arr[100'000] = 12.97;
    // We've got an array of 10 million elements but have only used 100 thousand.
    for(size_t i = 0uz; i < arr.size(); ++i)
    {
        arr[i] = std::sin(arr[i]) * std::tan(arr[i]) / 2.0;
        // 9.9 million unnecessary calculations that could fill the unused part of our array with a bunch of NaN and -NaN or result in a ton of DIVIDE_BY_ZERO exceptions being generated and thrown.
    }
    In order to fix this I usually wrap std::array in a struct like this:
    template <typename T, std::size_t N>
    struct fixed_array{
        std::array<T, N> data;
        std::size_t in_use_size = 0uz; // TODO: increment this when a new value is added, decrement when a value is removed ONLY from the end
    };
    Which usually necessitates adding operator overloads and wrapper functions for everything std::array does so that the code isn't confusing or even more infuriating to use.
    This issue makes using std::array a lot of work. I can't wait for boost's static_vector to get into the standard so we can have a stack allocated array with different variables for size and capacity.
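The wrapper described in this comment could be fleshed out roughly as follows. This is only a sketch under assumptions: the member functions (`push_back`, `pop_back`, `begin`, `end`) are guesses at the minimum needed for a range-for loop over the used part, and it is not a substitute for a full `static_vector`.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical fixed-capacity wrapper that tracks how many slots are in use.
template <typename T, std::size_t Capacity>
struct fixed_array {
    std::array<T, Capacity> data{};
    std::size_t in_use_size = 0;

    // Appends a value; returns false instead of writing past the end.
    constexpr bool push_back(const T& value) {
        if (in_use_size == Capacity) {
            return false;
        }
        data[in_use_size++] = value;
        return true;
    }

    // Removes the last used element, if any.
    constexpr void pop_back() noexcept {
        if (in_use_size > 0) {
            --in_use_size;
        }
    }

    constexpr std::size_t size() const noexcept { return in_use_size; }
    constexpr std::size_t capacity() const noexcept { return Capacity; }

    // Iteration covers only the used prefix, not the whole capacity.
    constexpr T* begin() noexcept { return data.data(); }
    constexpr T* end() noexcept { return data.data() + in_use_size; }
};

inline std::size_t demo_fixed_array() {
    fixed_array<double, 1000> samples;
    samples.push_back(1.1);
    samples.push_back(45767.98);
    samples.push_back(2.33);

    std::size_t touched = 0;
    for (double& s : samples) {  // visits 3 elements, not 1000
        s = std::sin(s) * std::tan(s) / 2.0;
        ++touched;
    }
    return touched;
}
```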

    • @Tibor0991
      @Tibor0991 1 year ago

      C++ doesn't make you pay for things you don't use. An extra value for use cases beyond the intended use case of std::array is an additional cost and throws off your expectations (that is, expecting that the size of a std::array is exactly equal to sizeof(T) * N).