I typically just assign a struct var with another. Compiler will either map out each assignment or compile it as a `memcpy` call. If the struct has pointers, then gotta do a special deep copying function.
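A minimal sketch of the plain-assignment case, assuming a hypothetical `struct person` with an embedded `char name[30]` (the field names and the `person_assign` helper are made up for illustration). Because the array lives inside the struct, `*dst = *src` copies its bytes too, and the two structs stay independent afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, modeled on the example discussed in the thread. */
struct person {
    int age;
    char name[30];   /* embedded array: its bytes live inside the struct */
    double height;
};

/* Copy by plain assignment; the compiler is free to emit member-wise
 * moves or a memcpy-style block copy, whichever it considers best. */
void person_assign(struct person *dst, const struct person *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}
```

After `person_assign(&b, &a)`, changing `b.name` leaves `a.name` untouched. That stops being true as soon as a member is a pointer.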
@@gregorymorse8423 what's the point? shallow copies always optimize to a memcpy unless faster to do it individually. just do `a = b;` and let the compiler figure it out.
@@gregorymorse8423 that makes no sense in this case when you're copying two struct vars. If we're talking an array of structs, that's a different story but the point here is just copying one struct var to another.
I have been programming in C for 20 years and I still learn new things from your videos. I especially liked the EFAULT thing with a pipe. I may never use it but it's good to know. Please don't worry about your videos being too long, I'd watch an hour long video if you made one that did a really deep dive on some topic. But mostly I love C programming and I find your videos very interesting even if I already know the subject you're discussing. I've written applications, kernel modules and even chipset drivers. Keep producing these videos!
Yeah, that tends to irritate me too. Hence, I almost always do a "sizeof(thingy) - 1" and blindly poke in the terminator anyway. Or: I do a check on length first - and on fail abort the function. I fail to see why the strncpy() behavior is useful. It's an accident waiting to happen. May be the remnants of a quick patch in the early days of C?
@@HansBezemer Strings with a maximum length and optionally shortened with a '\0' were used in the first file systems in Unix. There was a struct with 2 bytes for an inode number and 14 bytes for the file name.
@@HansBezemer - I think I read somewhere that strncpy was intended to be used when writing to fixed-length fields that _weren't_ intended to be used as strings - e.g. when constructing some data packet to be sent over the wire. In that case they probably sorta do what you want - pad out to the end with NULs, but don't write beyond the end, no expectation that the buffer is NUL-terminated.
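To make the fixed-width-field use of `strncpy` concrete, here is a small sketch. The `struct dir_entry` layout (2-byte inode number plus 14-byte name) follows the description above, but the type and function names are assumptions of mine, not historical source code.

```c
#include <assert.h>
#include <string.h>

#define NAME_FIELD 14   /* width of the name field described above */

/* Hypothetical record: the name is NUL-padded, not NUL-terminated. */
struct dir_entry {
    unsigned short inode;
    char name[NAME_FIELD];
};

/* Fill the fixed-width field. strncpy pads short names with '\0' bytes
 * and writes no terminator when the name has NAME_FIELD or more chars. */
void dir_entry_set_name(struct dir_entry *e, const char *name)
{
    assert(e != NULL && name != NULL);
    strncpy(e->name, name, sizeof e->name);
}

/* Length of the stored name, never reading past the end of the field. */
size_t dir_entry_name_len(const struct dir_entry *e)
{
    assert(e != NULL);
    const char *end = memchr(e->name, '\0', sizeof e->name);
    return end ? (size_t)(end - e->name) : sizeof e->name;
}
```

Such a field must not be handed to `strlen` or `printf("%s")`; something like `printf("%.*s", NAME_FIELD, e.name)` keeps the read inside the field. Some compilers warn about this `strncpy` pattern precisely because it can leave the field unterminated, which is the intent here.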
thank you jacob for your great content! as others mentioned, if you assign person2 = person1 and one of the struct members is a pointer you will get at this field another pointer to the same memory. very important at the end of the video.
You rarely want to copy the pointers around cause if one object is destructed then the other object pointing to one of the members of the first object will malfunction. Essentially copying with memcpy is safe only for POD structs and not for non-POD.
I was waiting the whole time for you to talk about shallow/deep copies. This is one of the reasons to overload the assignment operator, to give it deep copy behaviour.
NO! This is the sort of example that gave operator overloading in C++ a bad reputation when it was introduced. Please do not ever write an operator= that does anything other than copy every struct/class member. Maintainers of your code will hunt you down and hurt you. Okay, I'm sure someone can find an exception to this. Perhaps there's a field that holds some cached information that is better off being recalculated than copied, or perhaps for debugging purposes you keep some sort of unique ID that really does need to be unique. But the rule is that after assignment a = b, then a == b should always be true, so you'll also need to make sure your overloaded operator== does not consider the un- or mis-copied field as part of the meaning of equals for that type. In fact, don't overload operator= unless you need to, and the only time you need to is when your class/struct has some resource (like a pointer, file ID, network socket, etc.) that either cannot be copied, or which requires special care when being copied. And if that's the case, you will want to implement your own copy ctor and dtor (and perhaps the move operations also). Having control over these operations is one of the things that makes C++ unique; in particular it allows automatic resource cleanup (RAII) in ways that few other languages can match. But that control can be misused. And with all due respect (and that's not an empty cliché; I recommend a lot of your excellent videos to my students), this is an example of one of those misuses.
Pretty sure that instead of using the assignment operator, you can have a copy constructor and a move constructor, which can be invoked by std::copy and std::move, or am I wrong?
@@mr.mirror1213 Constructors and assignment operators serve two different purposes. A constructor initializes a new object, and assignment modifies an existing one. So with assignment you have some old state that needs to be destroyed. Perhaps what you're thinking is that one way to implement an assignment operator is to take the parameter by value, then use std::swap() (which uses the move operators) to do the assignment; this leaves the former state of the assigned-to object in the parameter, which will then get destroyed. This has the added benefit that operator= can be marked noexcept, assuming that anything that could throw an exception would be in the copy constructor (which will be used to pass the argument by value). BTW, some of the above was off the top of my head; I don't know that I've ever implemented operator= that way. This is the internet, so I'm sure someone will let me know if I screwed it up :-).
Your code maintainers need to stop hurting people. 😰 Yes, this is a good point. I really think *all* overloadings of the assignment operator suffer from this same issue. If you simply wanted the standard shallow-copy-each-member behavior (the default behavior) then you wouldn't overload. But, introducing any kind of "special care" assignment (deep copy, partial copy, duplicate system resource, etc), your users run a high risk of confusion and should be notified. But, yes, I agree with you that if the assignment does a partial copy, then it's a good idea to make equivalence be a partial comparison. But, even so, it's still potentially confusing.
@@JacobSorber Perhaps I should apologize for the violent imagery. It was a reference to the old saying that you should write code as if it would be maintained by a violent psychopath who knows where you live.
The Ada solution just seems simpler: besides “private” types, you can have “limited private” types, which simply cannot be used in an assignment statement. Instead, you have to implement your own explicit copying procedures, as appropriate. This is why Ada is used where software needs high levels of reliability, such as safety-critical areas. And why C++ is banned in such areas.
Hey Dr. Sorber just discovered your channel and been binging the vids, currently a freshman and our language of choice is C. Your DSA videos have been super helpful! Hopefully you would upload more embedded content as that's what I'm also doing in my free time right now using STM32 microcontrollers. Keep doing what you do!
Something that hasn't been commented on is the issue of a potential UMR (uninitialized memory read) with a memcpy in this case. With the example struct, an int (4 bytes) is followed by a char[30] (30 bytes), which is followed by a double. On most systems a double is 8-byte aligned, meaning that 6 additional padding bytes will appear before the double so that the offset to the double is 40, a multiple of 8 (on ABIs that only align doubles to 4 bytes, it is 2 padding bytes and an offset of 36). These padding bytes are technically uninitialized by element-wise assignment. When a memcpy occurs, a strict runtime code checker will detect that these bytes of automatic storage have not been initialized and will flag the UMR. To avoid the UMR, the memory containing the struct should be initialized to 0, such as by memset(&person1, 0, sizeof(person1)); If the struct is coming from dynamically allocated memory, this is still necessary unless the memory is being explicitly zeroed when allocated. Another way to avoid the UMR, if it is acceptable, is to rearrange the structure elements so that no padding bytes are needed, by placing the doubles first, then the longs, then the ints, and then the chars. I know this is a bit esoteric, but in my pre-retirement development work, a UMR - whether a benign one such as this or a "real" one - always needed to be avoided.
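A rough sketch of the padding point, assuming the `int` / `char[30]` / `double` layout described in the comment (the struct and function names are hypothetical). `offsetof` shows how many unused bytes sit before the double, and the init helper zeroes the whole object before setting members. One caveat: the C standard leaves padding values unspecified after a member store, so zeroing first is something that works with common compilers and checkers rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout taken from the comment: int, char[30], double. */
struct person {
    int age;
    char name[30];
    double height;
};

/* Number of unused bytes between the end of name and the start of height. */
size_t person_gap_before_height(void)
{
    return offsetof(struct person, height)
         - (offsetof(struct person, name) + sizeof(char[30]));
}

/* Zero every byte first (padding included), then set the members. */
void person_init(struct person *p, int age, const char *name, double height)
{
    assert(p != NULL && name != NULL);
    memset(p, 0, sizeof *p);                      /* pointer, fill value, size */
    p->age = age;
    strncpy(p->name, name, sizeof p->name - 1);   /* last byte stays '\0' */
    p->height = height;
}
```

On a typical 64-bit ABI `person_gap_before_height()` would come out as 6, but the value is platform-dependent, so treat it as something to print and inspect rather than rely on.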
Even if one was not concerned about uninitialized padding bytes, I wonder if their presence could actually make `memcpy` slower than the manual element-for-element copy in some obscure cases? The latter one only copies those bytes that are actually needed after all...
May I ask what issues UMR can cause in this case? I've never been in a situation where I had to think about uninitialized padding bytes, so I'm genuinely curious!
@@mario7501 If you had 2 instances of a structure as I described, and the int and the char[30] and the double are all set to the same values, but the uninitialized alignment bytes were not set and happened to be different, then a memcmp(&struct1,&struct2,sizeof(struct1)) might not indicate a match. Also, if you are using tools that detect UMRs, this memcmp would raise a UMR condition just like a memcpy would. I'm not sure what a structure compare would do; I'm guessing that a given compiler could optimize it to compare all the bytes like the memcmp, in which case that would be a UMR as well.
@@mario7501 I just tested this with the Microsoft Visual Studio Community 2022 compiler and was able to replicate the problem using a memcmp between two instances of a structure having a char followed by a double (so there are 7 bytes of alignment padding). Both elements of the structure were the same but the alignment bytes differed, so the memcmp did not return a match.
@@gooblymoo this is really interesting. Never would've thought this could be a problem. There are probably a lot of production systems out there that have a bug somewhere because of this! Thanks for the explanation!
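The experiment described above can be sketched roughly like this. The `struct sample` type and the helper names are assumptions for illustration: one helper deliberately fills the whole object (padding included) with a chosen byte before setting the members, and the two comparison functions contrast member-wise equality with a raw `memcmp`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A char followed by a double: most ABIs insert padding after tag. */
struct sample {
    char tag;
    double value;
};

/* Fill every byte (padding included) with `fill`, then set the members. */
void sample_make(struct sample *s, unsigned char fill, char tag, double value)
{
    assert(s != NULL);
    memset(s, fill, sizeof *s);
    s->tag = tag;
    s->value = value;
}

/* Member-wise comparison: padding bytes are never looked at. */
bool sample_equal(const struct sample *a, const struct sample *b)
{
    return a->tag == b->tag && a->value == b->value;
}

/* Byte-wise comparison: padding bytes take part in the result. */
bool sample_same_bytes(const struct sample *a, const struct sample *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

Building two samples with the same members but different fill bytes (say `0xAA` and `0x55`) will typically make `sample_equal` return true while `sample_same_bytes` returns false, though what ends up in the padding is not something the language pins down.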
In automotive SW we need to mark up all variables, constants, pointers, structs... For example, person1 and person2 in ASW would be person1_s and person2_s or sPerson1 and sPerson2, respectively, and sPerson1_t and sPerson2_t for definitions. That's the easiest way to see what you are dealing with, and even an IDE can highlight the elements. CamelCase and underscore are equally allowed.
Put me in the camp of people who learned C++ 30 years ago, went through an operator overloading phase and got over it. I simply don't do it anymore. C++ is already too clever to have glyphs with common meanings do exciting things. You want to do something clever, then it deserves a name. That said, one reason to maybe implement your own copy rather than simple assignment or memcpy is performance. Let's say that your name string member is a megabyte long but the typical name is 5 bytes. Assignment and memcpy will copy the whole struct, whereas if you write your own routine, you can copy just the bytes that are in use (strlen + 1 of them, with strcpy or memcpy) and save yourself some runtime in the common case. Note that strncpy won't help here, since it pads the rest of the destination with NULs.
It's possible direct struct copying didn't work in K&R C and manual copying is just a throwback from that. I tried it explicitly using the C89 standard and it seems to work (assuming the compiler is handling it correctly).
One thing to note is that when assigning structs that contain compiler-inserted padding, the compiler is not required to copy the values of the padding bytes per C99. This could be a problem if the struct is used for checksum calculation, which, depending on how the checksum is calculated, could cause an issue. The user can resolve this by explicitly padding the struct or simply ensuring the layout does not make the compiler insert padding bytes.
You should always copy structs through assignment, since if `memcpy()` is the best way to copy the struct, the compiler will just do that anyway, but unlike using `memcpy()` directly, you are not forcing it to use `memcpy()`, as in some cases there is a better way to do it and then the compiler has the freedom to use this better way. If in doubt, always leave choices to the compiler as compilers today make better choices than programmers do. That's because most programmers make assumptions about what is fast and what is not, and quite often those assumptions are wrong, as they never really tested their assumptions across a variety of platforms in the first place. In 9 out of 10 cases, a C compiler will generate faster code than a programmer writing whatever he thinks is fastest directly in assembly. And in the one case where the compiler won't win, the difference to hand-written code is usually negligible, so not even in that case did it really pay off.
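One way to act on the explicit-padding advice, sketched with a made-up `struct record`: the gap is declared as a named `reserved` member, and a C11 `_Static_assert` refuses to compile if the compiler still inserts hidden padding. The additive checksum is only a stand-in for whatever checksum is really in use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record with its padding spelled out as a real member. */
struct record {
    int id;
    char tag[2];
    char reserved[2];   /* explicit padding, always written as zero */
    double value;
};

/* Fails to compile if the compiler adds padding of its own. */
_Static_assert(sizeof(struct record) ==
               sizeof(int) + 2 + 2 + sizeof(double),
               "struct record contains compiler-inserted padding");

/* Simple additive checksum over every byte of the record. */
unsigned record_checksum(const struct record *r)
{
    assert(r != NULL);
    const unsigned char *bytes = (const unsigned char *)r;
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof *r; i++)
        sum += bytes[i];
    return sum;
}
```

With no hidden padding, a copy made by assignment has the same bytes as the original, so both produce the same checksum. The size equation assumes the usual 4-byte `int` and 8-byte `double`; on other ABIs the assertion would need adjusting.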
Except this is wrong. Novice comment by a bad programmer. Nowhere it was stated the memory is not overlapped. Therefore memmove is correct. And you should learn more...
@@gregorymorse8423 How often would you get memory overlap? You would have to do something pretty exotic with your pointers to do so. An ordinary struct variable declaration or malloc() just can't create overlapping structs by itself.
@@psionl0 it's nothing to do with pointers or allocation. It's the data in the structure, often with extra bytes at the end. Perhaps network data comes in with a START FRAME indicator somewhere in it, at a position unknown at first. Then you move the structure to the actual correct alignment. There aren't that many practical cases, but there are several such as this one.
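The frame-realignment scenario could look something like the sketch below. The function name and the idea of a start offset found by the caller are assumptions; the point is only that source and destination lie in the same buffer and may overlap, which is the case `memmove` is specified to handle and `memcpy` is not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Slide the bytes beginning at buf[start] down to the front of buf and
 * return how many bytes remain. The regions overlap whenever
 * start < len - start, so memmove (not memcpy) is required. */
size_t discard_prefix(unsigned char *buf, size_t len, size_t start)
{
    assert(buf != NULL && start <= len);
    memmove(buf, buf + start, len - start);
    return len - start;
}
```

For two distinct struct variables, as in the original discussion, overlap cannot happen and plain assignment or `memcpy` is fine.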
Hey Jacob, great video as always! It would be interesting to see more complex cases, let's say instead of the char array, you've got a char pointer that you allocated dynamically. I guess you're kind of stuck with explicit copying of these pointer variable members? (or C++ overloading, which is the same as writing a C function)
Yep. Once you have pointers you start running into aliasing issues and usually have to do a deeper copy (unless you don't want to duplicate your objects-which is sometimes the case). I am planning on hitting on this in the future.
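For the `char *name` variant asked about above, a deep copy might be sketched like this. `struct person_dyn` and `person_dyn_copy` are hypothetical names, and the ownership rule (the caller frees the copied name, and frees any old `dst->name` beforehand) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of the struct with a heap-allocated name. */
struct person_dyn {
    int age;
    char *name;
    double height;
};

/* Deep copy: returns 0 on success, -1 if allocation fails (dst is left
 * untouched in that case). Any previous dst->name is NOT freed here. */
int person_dyn_copy(struct person_dyn *dst, const struct person_dyn *src)
{
    assert(dst != NULL && src != NULL);
    char *name_copy = NULL;
    if (src->name != NULL) {
        size_t n = strlen(src->name) + 1;   /* include the terminator */
        name_copy = malloc(n);
        if (name_copy == NULL)
            return -1;
        memcpy(name_copy, src->name, n);
    }
    *dst = *src;             /* shallow copy of every member */
    dst->name = name_copy;   /* replace the aliased pointer */
    return 0;
}
```

After a successful call the two structs hold equal strings at different addresses, so modifying or freeing one name does not affect the other.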
The real question is: "Is the destination big enough to hold 6+1 bytes without overflowing?" A beginner may adapt this example to swallow dreaded "user input", and handing strncpy() the length of the source string won't prevent buffer overruns... A sketch in the Monty Python TV series had John Cleese interview Terry Jones (RIP) who was portraying the character: Karl Gambelpötidevanausfenspendenschlittkraßkrembaunfreidichedangldünglwarsteinvonnechthresheräpfelbangerhorowitztiklenssikgrandenochichbelltinkelbrandißgruewuldnahweltbasikküstlichimbeleisenbahnwagengutenabendbitteeinenurenburgerbratwürstschengespurtmitzwienmachtenueberhundsfootgumberaberschönendankecalbfleichmittelrathevonhalbkopf of Ulm Good luck on that one with a 30 byte buffer! 🤣
Personally, the reason I'd guess that `person2 = person1` doesn't work is because it wouldn't work like that in Python, and I assume C is gonna be harder rather than easier. haha. I kind of like the memcpy option more because I was confident it would work immediately without much experience, so it must be good for communication, at least considering C newbies like me :p
At 8:50 you mention that it's probably better to work with memcpy in that case, but I disagree, because it has the same issue. If person1 and person2 are pointers, this will still work: &person1 takes the address of the variable that's holding the pointer, ditto for person2, and sizeof(person1) will just be the size of a pointer. So everything still compiles and runs, but we did the exact same thing as `person2 = person1;`, just more verbosely and harder to debug.
The only problem with memcpy/memset is that if you aren't running on an ARM Cortex-M7 or on Windows (haven't tried on Linux), you will run into packing issues when casting data or putting it into a buffer, where sometimes it can even crash if you try writing to an odd memory address. NOTE: Cortex-M7 chips pack by default (I think), and Windows handles this in the background. I don't know how Linux handles it, and some embedded system compilers support packing through a #pragma directive.
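The pointer pitfall described above, as a short sketch with hypothetical helper names. The first function mirrors the `memcpy(&person2, &person1, sizeof(person1))` call when the variables are pointers: only the pointer value is copied. The second uses `sizeof *dst`, which is the size of the struct being pointed to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct person {
    int age;
    char name[30];
    double height;
};

/* Copies only the pointer value: afterwards *dst_ptr aliases src. */
void copy_pointer_only(struct person **dst_ptr, struct person *src)
{
    assert(dst_ptr != NULL);
    memcpy(dst_ptr, &src, sizeof src);   /* sizeof src is a pointer's size */
}

/* Copies the pointed-to struct; same effect as *dst = *src. */
void copy_pointee(struct person *dst, const struct person *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

Writing the size as `sizeof *dst` rather than `sizeof(struct person)` or `sizeof dst` keeps the call correct if the pointer's type changes later.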
I thought that by default structs weren't packed and members were word aligned. (Isn't this a compiler issue?) Either way, I don't see how memcpy() would mess this up - unless you were copying a packed struct into an unpacked struct or vice versa.
12:29 I know relying on tooling is not the best option. But most intelligent editors will highlight overloaded operators. In case of visual studio (code) the operator is written in yellow and if you hover it will tell you the overload signature like a function. You can even press ctrl + left click to jump to the definition. So I feel like clarity is a question of tooling in this case.
6:27 it's not the same though, is it? In the first example, you copy the value of the name from person1 to person2, whereas in the second example you only copy the pointer to the name, meaning that if you were to mutate the name of person1, it would also mutate the name of person2, which is not what the first example did and also not what a "copy" should do.
What I don't like about person2 = person1 is that the day someone changes char name[30] to char *name with dynamic allocation, everything will break. Maybe the best way is to write a function?
Correct me if I'm wrong. I learnt C many many moons ago and I don't think you were allowed to copy structs using the '=' operator back then. If you could, then none of my cohorts knew this either.
i think it would have been nice, when the print function would also give out the addresses of the struct members, so that you can be sure it's not just the same struct. Also I'm a C person and not a C++ person, so maybe that's why I don't like the overloaded assignment operator at the end but my thoughts are: If you create an extra function to copy your structs/classes/objects, call it what ever, you might as well just call it STRUCTNAME_copy(). I assume that you have the function call overhead anyway, so might as well make it explicit. I mean what is easier to understand: myObject A = ...; myObject B; B = A; or B = A.copy(); I think the second one is much clearer in what it does, especially when you are dealing with heap objects. Does B = A just do pointer stuff and doesn't copy actual data ? B = A.copy() is obvious in my opinion, especially when you write some documenting comments to the function.
@@sledgex9 in C it's definitely not if you have anything allocated on the heap and want to copy that. And it's also not always obvious in C++, since '=' can also mean assign can it not ?
@@timtreichel3161 I was simplifying. In C++ it does mean assign. However when you assign an object to another object a copy is done. Same for POD types (like in C). The difference is in raw pointers where only the memory address is being copied (or assigned).
0:55, 1st thing I notice, that height variable should come before the other members because it's a double, had the compiler spit out a number of alignment/endian warnings to me at one time after I updated it, they only went away after I moved the doubles to the top of the struct. **Edit:** might've been a union instead, been a long time since I had those errors after all
I faced a similar challenge in a task to exchange binary records (structs) between an existing Unix (database) program and a developing VB/Windows client... The compiler (& code base) on the former used no padding of structs; the latter (a DLL for VB... yech!) required padding to 4byte boundaries for multi-byte elements like ints and doubles. Instead of writing arcane "element transcribing" functions, as you found, trials showed that pushing the 'ragged' shorts, chars and char arrays to the bottom of the struct worked like a charm! Fortunately, 'endian-ness' wasn't a problem for this task... I imagine, even though Jacob works with memory starved microcontrollers, he didn't want to 'clutter' this video with a digression into 'element order and structure padding'.
@1:52 - It's hinting that strncpy() takes va_args. What kind of broken IDE is that? Even if it's the result of some underlying macro mapping mess, it should give the proper signature of what it maps to.
I've used VS Code all my life and I've never seen a function having the wrong signature lol. But I have as few extensions as possible, because I don't like customizing. I basically have the themes and the C/C++ extensions. However, I have seen functions having *ERROR* parameters when IntelliSense was still loading. Once it loaded, it showed the correct signature.
I had to look up strncpy() since your C example didn't take into account the terminating NUL from the source name. According to Tutorials Point, "In a case where the length of src is less than that of n, the remainder of dest will be padded with null bytes". Curious that this apparently doesn't happen in C++. One issue you didn't address is what to do if a struct contains pointers. Do we copy the pointer or do we duplicate the memory the pointer points to? (I know, the answer is "it depends"). I think that if I were to do frequent copying of structures I would create a separate function like personCopy() which would be clearer than using = or an overloaded = (especially if personCopy() is documented).
Can you pass non type defined memory as arguments to functions. Like if i have a swap function and i want it to be able to work with any datatype. Is there something you can do with memcopy or void pointers or with anything else in the c language
Read and think about how the C library function qsort() works... It sorts contiguous 'blobs' of memory. (Each 'blob' must be the same size.) If the 'blob' is a pointer, the actual data can be anything that can be compared. For example, a cmp() function can subsort records based on subordinate fields, e.g. "sort by age, name".
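In the same spirit as `qsort()`, a type-agnostic swap can take `void *` pointers plus an element size. This is just one possible sketch (the name `swap_any` is made up); it uses a heap-allocated temporary so it works for any size, and it assumes the two objects are either identical or do not overlap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Swap two objects of `size` bytes each. Returns 0 on success and -1 if
 * the temporary buffer could not be allocated (nothing is changed then). */
int swap_any(void *a, void *b, size_t size)
{
    assert(a != NULL && b != NULL);
    if (a == b || size == 0)
        return 0;
    void *tmp = malloc(size);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, a, size);
    memcpy(a, b, size);
    memcpy(b, tmp, size);
    free(tmp);
    return 0;
}
```

It is called as `swap_any(&x, &y, sizeof x)`. Nothing checks that both arguments really have the same type, which is the price of going through `void *`.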
Wait why doesn't the assignment operator work with arrays? Does trying to do that result in an error? If not, what is the result? Also, are "person1.name" and "person2.name" pointers (since memcpy needs pointers)? Even though the structs themselves weren't declared as pointers?
No, "person1.name" is not a pointer. It is a memory address. The struct element "name" is declared as an array of chars. The token "name" represents a FIXED memory address, kinda like the value of a pointer. But it is NOT a pointer (that could be reassigned to point somewhere else.) To memcpy(), the source and destination addresses are just blobs of memory, and it will happily copy as many bytes as requested from source to destination. Consider the massive C code base existing when "copying structs" became permitted syntax. void foo( char str[] ) { printf( "%s", str ); str = "barabajagel"; printf( "%s", str ); } int main( void ) { foo( "Foo" ); return 0; } Masses of code had been written using the pointer/array equivalence. Allowing '=' to act as a copy function for arrays would break more than all Y2K threatened to break.
@@rustycherkas8229 Thanks man! I just noticed that somehow my brain registered Jacob saying "requires pointers" when he actually said addresses. But thanks for clearing thing up for me!
< > means the compiler will look for the header files in the designated directory. For example, on Linux it will probably be /usr/include. " " means the compiler will look for the header files in the current directory of the program.
@@wardog697 Your answer is correct, but a bit short. The preprocessor will search one or more directories to find each named header file. When installed the "compiler package" provides standard header files in a 'known' dir. (Obviously the path to this dir depends on what compiler one installs/uses.) (There may be subdirectories, too, as in #include ) At compile time, additional paths can be nominated using the "/I" compiler flag ("-I" for GCC and Clang). Angle brackets signify "search only those directories of (stable?) header files". Double quotes signify "search current directory first, but use others if needed." I've recently seen (local) header files named to 'override/augment' standard header files... Some people are simply too clever for everyone's good.
I wouldn't say it's bad, but if you want your code to be portable outside GCC, you should be careful. Some GCC extensions work in Clang too. One i use a lot is ?: (the Elvis operator), it's useful for defaulting assignments. But nested functions can be dangerous, as they don't survive outside the scope you declare them in, so you have to be careful with function pointers to them. But if you're just prototyping, nested functions are really useful.
I have an update for this. Nested functions no longer work with a non-executable stack. So your programs will break unless you tell GCC to make your stack space executable, and will flat-out not work at all on some operating systems. But personally, needing to use nested functions in C is suggestive of a design issue anyways.
strncpy() is not the correct way to prevent buffer overrun, because it doesn't append a terminating zero in the worst case. You would have to write strncpy(dst, src, sizeof(dst)-1); dst[sizeof(dst)-1] = 0; to be sure. Or, where the library provides it, use strcpy_s() from C11's optional Annex K instead: strcpy_s(dst, sizeof(dst), src);
I remember using different techniques to copy data from one class to another in C++. One of them is to use the = operator, though you need to overload operator= for that.
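Since `strcpy_s()` is not available everywhere, a small portable helper is a common workaround. The sketch below is one hypothetical version (`copy_truncating` is not a standard function): it sizes the copy by the destination, always writes a terminator, and reports whether truncation happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, which has room for dst_size bytes. The result is
 * always NUL-terminated. Returns true if the whole string fit and false
 * if it had to be truncated. */
bool copy_truncating(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t len = strlen(src);
    size_t n = len < dst_size ? len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return len < dst_size;
}
```

A call such as `copy_truncating(person2.name, sizeof person2.name, input)` only works as written when the destination is a real array; with a pointer, `sizeof` gives the pointer size and the capacity has to be passed some other way.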
@@JacobSorber oh, I understand, I might research it a bit further. Since arrays and pointers are a bit ambiguous in C, i thought maybe it behaves the same way. Thanks!
@@GameplaySheep If the struct holds an array, then each struct saves space for the allocated array, so they are independent. If the struct holds a pointer, the reserved space is only for a pointer, and the array would be stored somewhere else, so they would share the same array. You have to be careful about the notion that "arrays are like pointers in C", it's actually a bit of a trap.
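The array-versus-pointer difference can be checked directly. In this sketch (type and function names are made up) each helper simply reports whether the `name` members of two structs refer to the same storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct with_array   { char name[30]; };  /* storage inside the struct */
struct with_pointer { char *name; };     /* storage lives elsewhere   */

/* True if both structs' name arrays start at the same address. */
bool arrays_share_storage(const struct with_array *a,
                          const struct with_array *b)
{
    assert(a != NULL && b != NULL);
    return a->name == b->name;
}

/* True if both structs' name pointers refer to the same characters. */
bool pointers_share_storage(const struct with_pointer *a,
                            const struct with_pointer *b)
{
    assert(a != NULL && b != NULL);
    return a->name == b->name;
}
```

After `a2 = a1` the array version reports separate storage, so editing `a2.name` leaves `a1.name` alone. After `p2 = p1` the pointer version reports shared storage, and a write through `p2.name` is visible through `p1.name`.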
it's funny, I saw so many lectures about c++ constructors (rule of three, rule of five, rule of six, rule of zero), but hardly anything about the assignment operator. I guess it's just not popular to use it? because we can have scoped variables? because of std::optional?
Serious question: When shifting from C to C++, you've added a function to the struct that clones an original. I wonder why you chose to not change the struct to a class and use C++'s facilities of constructors and "copy constructors"... person_c p1( "Claire", 45, 5.2 ); person_c p2( p1 ); strncpy( p2.name, "Marie", sizeof( p2.name )-1 ); // EDIT: moved -1 outside of ')'... Doh!! or even person_c p2( p1, "Marie" ); that would create Claire's twin sister... Then it's on to "p1.Print(); p2.Print()" to, again, take advantage of data hiding and code reuse.
@@JacobSorber Understood! The 'clue' I overlooked was in the video's title! 😀 I was perplexed by the breadth of the presentation, from beginners copying each element to 'advanced' operator overloading. This video may be more 'satisfying' if it were 2 minutes longer to show a complete transformation to a C++ implementation... Just my two cents... Cheers! 🙂 Please keep these coming!
Keep in mind, that in C++ structs can also have (copy/move) constructors. As mentioned in the video, in C++ structs are the same as classes with members public by default.
@@sledgex9 Thank you for your response! I'd years of 'C' experience and only "high pressure" reasons to start coding in C++ (based on cursory reading of the language doco.) Your reply has just had me experimenting with C++ structs beyond mere aggregations of (public) data members. Just wanted to say thanks. I've learned a lot! 🙂
(using C, not C++) Very good video! But, I have a question: how can I copy using "memcpy", if, instead of having "char name[30]", I have "char * name" and dynamically allocate its space at runtime? Thank you!!
you could count number of characters until you hit \0 and then you know how much to copy, but it is a standard in C to always pass * and array size as parameters into your function
On overloading "=" with a non-obvious copy facility: As the saying goes, "With great power comes great responsibility." This is why C++ needs the '±' operator... It's very like assignment ('=') but shows that the destination will be "more-or-less" like the source... 🤣
Nice, but I think a better name would be operator≈ (that's supposed to be the "approximately equal" character, U+2248, but you wouldn't know it from the font my browser is using). Or maybe that operator would be the logical operator that tests for "sort of similar but not equal". Now that we're introducing Unicode characters, how about being able to use ≤ and ≥ instead of <= and >=?
@@rdwells This is APL waiting to happen. I don't like operator overloading. With lasagna code it's already a challenge to figure out what virtual function ".GET" is doing today - let alone "Person1 | Person2" (give away: it's the "divorce" operator). I mean I like "code beautifying" and "syntactic sugar" as much as the next guy, but sometimes too much is just too much..
@@HansBezemer My reply was actually tongue in cheek, as I assumed the post I was replying to was. But on a serious note, don't throw the baby out with the bathwater. If you're writing, say, a rational number class, do you want to write result = add(mult(a,x),y); or result = a * x + y; I'd far prefer the latter. The C++ operators have specific expected meanings; if those meanings make sense for your class, then by all means take advantage of that. At the very least, you probably want to overload operator< so that your types can be used with the STL. Or, if you're using C++20, operator<=>. Overloading operator<< and operator>> to be friendlier with stream I/O is often useful as well. But, of course, everything in moderation. Overloading += to add an item to a collection (as I've seen done) is probably a bad idea.
@@rdwells That "+=" example of yours is scary... When you uncover the effect of that clever override, there should be ominous background music playing (perhaps with a thunderclap) as you feel your stomach sink and your blood turn cold... "What other gems are buried in here?...." For Jacob's "person" example, KISS would suggest factoring out a simpler function createTwin( const char *nameOfTwin ); that does what's needed. (Works in C and in C++.) This would require the parameter "Marie" be supplied, instead of presuming a generic "NONAME" will suffice... Your example ("result = add(mult(a,x),y)") actually is what the parser will (should?) generate from "result = a * x + y;"... As others have (often) commented on these "demonstration" videos, the 3rd parameter to memcpy() or strncpy(), etc. should ALWAYS be tied to the size of the destination, not to the size of the source... Finally, for any beginners reading this far: "age" (used for demonstration) is a "volatile" derived quantity... On/after a person's next birthday, the quantity will, by definition, change... Use 'date_of_birth' instead, and calculate 'age' when needed... NB: sorting by 'age' will put youngsters first; sorting by 'DoB' will put seniors ahead of those young whippersnappers...
@@rdwells Well, automagically you referred to a class where mathematical operators make sense: mathematics. Let's see what else we got: "Hello" || "World" Logical OR between strings..? "Hello" & "World" Binary AND between strings..? As a matter of fact, these are string concatenations in (Oracle) SQL and VBA. It doesn't make much sense to me. CONCAT("Hello", "World", "!") seems much clearer, hardly longer and not much more difficult to understand than "Hello" + "World" + "!". And yes, I did make functions like that - like doing a 14-bit fixed point set of arithmetic operations. Main problem: it can get quite LONG - even for relatively SMALL formulas. But - being primary a Forther that is accustomed to writing RPN, can't say it bothers me that much. In Forth, every set of numbers HAS got it's own set of operators, like +, - ,* for single length, D+, D-, D* for double length and F+, F- and F* for floats. I must say - I pretty much like it that way. It always bothered me in C you can add a short to a char - or an unsigned to a signed long. I don't like my compilers doing so-called smart things behind my back (no, I'm not a fan of Python). IMHO - minimal: if you mix 'em up and don't cast, that's an error. It breeds bad habits, so in the end, people don't even see their errors in their ways when they write "123" + 234 and expect something sensible to come out of that. If it works for others - fine. But it's not my preference.
Hi Jacob - I like your videos and they all have great content. This one is misleading, as you are using a very simple example to show that it works. Typically, the structure contains a pointer to another structure or array.
This might be a dumb question but can Jacob or someone in the comments explain why `memcpy` was fine with person2.name and person1.name, but needed the address of person2 and person1? Oh wait a string is basically a char* so essentially an address. Wow that was a rubberduck debugging moment.
Yeah, remember that any pointer type is just a 4- or 8-byte integer which happens to be the address of some type. And another quirky thing: an array variable name acts like a pointer constant. This is why you can use it as a char*. This does NOT mean arrays and pointers are the same thing. It's just the syntax to access them.
I notice that you use Person& person instead of Person &person. The & is a modifier, it modifies the variable not the type. The same counts for Person* person vs Person *person. If you write Person *personA, personB; it is clear that personA is a pointer and personB is a struct. However if you write Person* personA, personB; most programmers will assume both personA and personB are pointers. Yes, I tested it in an exam and most students gave a wrong answer. Of course it is better to avoid ambiguity and write two lines of code.
In the context of the (possibly comma separated) list of function parameters, it seems to be convention (in my experience) that C++ developers use the "reference to" style Jacob used here. There is nothing about your 2nd example ("Person personA, personB") to suggest both of these are pointers.
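The declarator point in this exchange can be checked mechanically. A small sketch, assuming a C11 compiler (for `_Generic`) and a stand-in `Person` type that is not the one from the video:

```c
#include <assert.h>

typedef struct { int age; } Person;   /* hypothetical stand-in type */

/* Returns 1 if the second name in "Person* personA, personB;" is a plain
   Person, 0 if it is a pointer. _Generic selects on the static type. */
int second_name_is_struct(void)
{
    Person* personA, personB;   /* the '*' binds to personA only */

    personB.age = 30;
    personA = &personB;
    assert(personA->age == 30);

    return _Generic(personB, Person: 1, Person *: 0);
}
```

Writing each declaration on its own line avoids the question entirely.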
I have a problem with compilers - I don't trust them. So I tend to program quite defensively. On the upside, that approach allows 10 KLOC programs to compile (almost) warning free over a wide range of C compilers - starting with K&R.
I usually solve the pointer issue by keeping a copy of the original struct to the side prior to overwriting the destination struct:
typedef struct _SUM {
int a, b;
int (*op)( struct _SUM *sum );
} SUM;
int foo( SUM *dst, const SUM *src ) {
SUM tmp = *dst; /* remember the destination's own pointer member */
*dst = *src; /* shallow-copy every member */
dst->op = tmp.op; /* restore the pointer that must not be overwritten */
return dst->op( dst );
}
Thank you, Jacob, for this interesting episode. But I cannot understand why all these examples use a person's age instead of the person's year of birth. You might get in trouble if you have filled a database with the persons' ages over several years...
Watching you struggle with the pointer really breaks my vi heart. You should try the Vim extension for your editor. You will still be able to navigate as you used to, but you'll gain all the superpowers of Vim and hopefully eventually switch completely.
All three examples are painfully wrong and demonstrate the problems with C and C++. Let's say your struct has handles to objects or pointers to B-trees. Enter STL/templates and reference counting, which is already beyond most amateur developers. I love C and C++. I write kernel drivers, so C/C++ is pretty much a given. Second, overloading is a nightmare, although it does demonstrate a good point. Let's say you had a 1MB bitmap, hash table, string, etc. It is ridiculous to copy the whole thing if only 2 bytes are valid. C++ has a lot of member functions to deal with this kind of thing, and paired with templates/generics/patterns these problems can be solved, again with added complexity, but if you are writing potentially shared code/libraries the extra effort is always worth it. Example: for the Windows I/O model you have a complex structure called an IRP. Copying that would immediately blue-screen the system, as you have threading issues, memory issues, locking issues... the list goes on. So my super simple solution is: just write a copy function for your struct (C). There are exceptions, of course. If your struct represents a 3D transform matrix, then memcpy or naked assignment is fine, i.e. if all the data in your struct is always vital to the identity of the structure and self-contained.
I typically just assign a struct var with another. Compiler will either map out each assignment or compile it as a `memcpy` call. If the struct has pointers, then gotta do a special deep copying function.
Or memmove...
@@gregorymorse8423 what's the point? shallow copies always optimize to a memcpy unless faster to do it individually. just do `a = b;` and let the compiler figure it out.
@@kevinyonan2147 you've never heard of overlapped memory regions???
@@gregorymorse8423 that makes no sense in this case when you're copying two struct vars. If we're talking an array of structs, that's a different story but the point here is just copying one struct var to another.
I wonder why the strings weren't made that way copyable in C?
I have been programming in C for 20 years and I still learn new things from your videos.
I especially liked the EFAULT thing with a pipe. I may never use it but it's good to know.
Please don't worry about your videos being too long, I'd watch an hour-long video if you made one that did a really deep dive on some topic.
But mostly I love C programming and I find your videos very interesting even if I already know the subject you're discussing. I've written applications, kernel modules and even chipset drivers.
Keep producing these videos!
Be very careful when using strncpy. If the source string length is >= size of dest buffer, the buffer will not be NUL terminated.
Very good point. Thanks.
Yeah, that tends to irritate me too. Hence, I almost always do a "sizeof(thingy) - 1" and blindly poke in the terminator anyway. Or: I do a check on length first - and on fail abort the function. I fail to see why the strncpy() behavior is useful. It's an accident waiting to happen. May be the remnants of a quick patch in the early days of C?
@@HansBezemer Strings with a maximum length and optionally shortened with a '\0' were used in the first file systems in Unix. There was a struct with 2 bytes for an inode number and 14 bytes for the file name.
@@HansBezemer - I think I read somewhere that strncpy was intended to be used when writing to fixed-length fields that _weren't_ intended to be used as strings - e.g. when constructing some data packet to be sent over the wire.
In that case they probably sorta do what you want - pad out to the end with NULs, but don't write beyond the end, no expectation that the buffer is NUL-terminated.
@@Hauketal Strings with a built-in max length would indeed make sense in this scenario.
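The two uses of strncpy() discussed in this thread can be put side by side. A rough sketch, where `FIELD_SIZE`, `fill_field` and `copy_string` are made-up names for illustration: the first helper relies on the fixed-width-field behaviour (zero padding, no guaranteed terminator), the second is the "size minus one, then poke in the terminator" habit described above.

```c
#include <assert.h>
#include <string.h>

#define FIELD_SIZE 8   /* hypothetical fixed-width field, e.g. in a record */

/* Fill a fixed-width field: zero-padded if src is short,
   NOT NUL-terminated if src has FIELD_SIZE or more characters. */
void fill_field(char field[FIELD_SIZE], const char *src)
{
    strncpy(field, src, FIELD_SIZE);
}

/* Copy src into a buffer that must stay a valid C string.
   Truncates if needed and always writes a terminator. */
void copy_string(char *dst, size_t dst_size, const char *src)
{
    assert(dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

The field produced by `fill_field` should only be read with length-bounded functions such as memcmp(), never with strlen() or printf("%s").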
thank you jacob for your great content!
as others mentioned, if you assign person2 = person1 and one of the struct members is a pointer you will get at this field another pointer to the same memory. very important at the end of the video.
You rarely want to copy the pointers around cause if one object is destructed then the other object pointing to one of the members of the first object will malfunction. Essentially copying with memcpy is safe only for POD structs and not for non-POD.
Man I've been bitten by this quite a lot. Good point!
Yeah I feel like this video is more misleading than helpful… I thought he was going to show a better way to deep copy….
There's no universal way to deep copy, you just write the routines. The more complicated the object relationships, the more messed up the routines.
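For the simplest case, one owned string, such a routine might look like the sketch below. `PersonPtr`, `person_deep_copy` and `person_free` are hypothetical names, and the struct is a variant of the video's example with `char *name` instead of a char array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *name;   /* heap-allocated, owned by this struct */
    int age;
} PersonPtr;

/* Deep copy: dst gets its own duplicate of src->name.
   Returns 0 on success, -1 if allocation fails (dst is left untouched).
   The caller is responsible for releasing any name dst held before. */
int person_deep_copy(PersonPtr *dst, const PersonPtr *src)
{
    assert(dst != NULL && src != NULL && src->name != NULL);

    size_t len = strlen(src->name) + 1;   /* include the terminator */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;

    memcpy(copy, src->name, len);
    dst->name = copy;
    dst->age = src->age;
    return 0;
}

/* Release the owned string and leave the struct in a safe state. */
void person_free(PersonPtr *p)
{
    free(p->name);
    p->name = NULL;
}
```

Every additional pointer member needs the same decision (duplicate, share, or reset), which is why these routines grow with the object graph.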
I was waiting the whole time for you to talk about shallow/deep copies. This is one of the reasons to overload the assignment operator, to give it deep copy behaviour.
NO! This is the sort of example that gave operator overloading in C++ a bad reputation when it was introduced. Please do not ever write an operator= that does anything other than copy every struct/class member. Maintainers of your code will hunt you down and hurt you.
Okay, I'm sure someone can find an exception to this. Perhaps there's a field that holds some cached information that is better off being recalculated than copied, or perhaps for debugging purposes you keep some sort of unique ID that really does need to be unique. But the rule is that after assignment a = b, then a == b should always be true, so you'll also need to make sure your overloaded operator== does not consider the un- or mis-copied field as part of the meaning of equals for that type.
In fact, don't overload operator= unless you need to, and the only time you need to is when your class/struct has some resource (like a pointer, file ID, network socket, etc.) that either cannot be copied, or which requires special care when being copied. And if that's the case, you will want to implement your own copy ctor and dtor (and perhaps the move operations also).
Having control over these operations is one of the things that makes C++ unique; in particular it allows automatic resource cleanup (RAII) in ways that few other languages can match. But that control can be misused. And with all due respect (and that's not an empty cliché; I recommend a lot of your excellent videos to my students), this is an example of one of those misuses.
Pretty sure that instead of using the assignment operator, you can have a copy constructor and a move constructor, which can be invoked by std::copy and std::move, or am I wrong?
@@mr.mirror1213 Constructors and assignment operators serve two different purposes. A constructor initializes a new object, and assignment modifies an existing one. So with assignment you have some old state that needs to be destroyed.
Perhaps what you're thinking is that one way to implement an assignment operator is to take the parameter by value, then use std::swap() (which uses the move operators) to do the assignment; this leaves the former state of the assigned-to object in the parameter, which will then get destroyed. This has the added benefit that operator= can be marked noexcept, assuming that anything that could throw an exception would be in the copy constructor (which will be used to pass the argument by value).
BTW, some of the above was off the top of my head; I don't know that I've ever implemented operator= that way. This is the internet, so I'm sure someone will let me know if I screwed it up :-).
Your code maintainers need to stop hurting people. 😰
Yes, this is a good point. I really think *all* overloadings of the assignment operator suffer from this same issue. If you simply wanted the standard shallow-copy-each-member behavior (the default behavior) then you wouldn't overload. But, introducing any kind of "special care" assignment (deep copy, partial copy, duplicate system resource, etc), your users run a high risk of confusion and should be notified. But, yes, I agree with you that if the assignment does a partial copy, then it's a good idea to make equivalence be a partial comparison. But, even so, it's still potentially confusing.
@@JacobSorber Perhaps I should apologize for the violent imagery. It was a reference to the old saying that you should write code as if it would be maintained by a violent psychopath who knows where you live.
The Ada solution just seems simpler: besides “private” types, you can have “limited private” types, which simply cannot be used in an assignment statement. Instead, you have to implement your own explicit copying procedures, as appropriate.
This is why Ada is used where software needs high levels of reliability, such as safety-critical areas. And why C++ is banned in such areas.
Hey Dr. Sorber just discovered your channel and been binging the vids, currently a freshman and our language of choice is C. Your DSA videos have been super helpful! Hopefully you would upload more embedded content as that's what I'm also doing in my free time right now using STM32 microcontrollers. Keep doing what you do!
Thanks! Glad you're enjoying the channel. I'll see what I can do. Definitely more embedded stuff in the future.
Something that hasn't been commented on is the issue of a potential UMR (uninitialized memory read) with a memcpy in this case.
With the example struct, an int (4 bytes) is followed by a char[30] (30 bytes), which is followed by a double. On most 64-bit systems a double is 8-byte aligned, meaning that 6 padding bytes appear before the double so that its offset is 40, a multiple of 8. (Where a double only needs 4-byte alignment inside a struct, as on 32-bit x86 Linux, there are 2 padding bytes and the offset is 36.)
These padding bytes are technically uninitialized by element-wise assignment.
When a memcpy occurs, a strict runtime code checker will detect these bytes of automatic storage have not been initialized and will flag the UMR.
To avoid the UMR, the memory containing the struct should be initialized to 0, such as by
memset(&person1, 0, sizeof(person1));
If the struct is coming from dynamically allocated memory, this is still necessary unless the memory is being explicitly zeroed when allocated.
Another way to avoid the UMR, if it is acceptable, is to rearrange the structure elements so that no padding bytes are needed, by placing the doubles first, then the longs, then the ints, and then the chars.
I know this is a bit esoteric, but in my pre-retirement development work, a UMR - whether a benign one such as this or a "real" one - always needed to be avoided.
Even if one was not concerned about uninitialized padding bytes, I wonder if their presence could actually make `memcpy` slower than the manual element-for-element copy in some obscure cases? The latter one only copies those bytes that are actually needed after all...
May I ask what issues UMR can cause in this case? I've never been in a situation where I had to think about uninitialized padding bytes, so I'm genuinely curious!
@@mario7501 If you had 2 instances of a structure as I described, and the int and the char30 and the double are all set to the same values, but the uninitialized alignment bytes were not set and happened to be different, then a memcmp(&struct1,&struct2,sizeof(struct1)) might not indicate a match. Also, if you are using tools that detect UMRs, this memcmp would raise a UMR condition just like a memcpy would. I'm not sure what a structure compare would do; I'm guessing that a given compiler could optimize it to compare all the bytes like the memcmp, in which case that would be a UMR as well.
@@mario7501 I just tested this with the Microsoft Community 2022 compiler and was able to replicate the problem using a memcmp between two instances of a structure having a char followed by a double (will have 7-byte alignment). Both elements of the structure were the same but the alignment bytes differed so the memcmp did not return indicating a match.
@@gooblymoo this is really interesting. Never would've thought this could be a problem. There are probably a lot of production systems out there that have a bug somewhere because of this!
Thanks for the explanation!
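One way to sidestep the padding question when comparing is to compare member by member instead of using memcmp(). A sketch, assuming the member order described in this thread (int, char[30], double); `person_equal` and `gap_before_height` are hypothetical helpers, and the actual gap size depends on the platform ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int age;
    char name[30];
    double height;
} Person;

/* Compare member by member, so padding bytes (and the unused tail of
   name after its terminator) are never inspected. */
int person_equal(const Person *a, const Person *b)
{
    assert(a != NULL && b != NULL);
    return a->age == b->age
        && strcmp(a->name, b->name) == 0
        && a->height == b->height;
}

/* Number of padding bytes between the end of name and the start of
   height. The result is ABI-dependent, so it is computed, not assumed. */
size_t gap_before_height(void)
{
    return offsetof(Person, height)
         - (offsetof(Person, name) + sizeof(((Person *)0)->name));
}
```

Zeroing a struct with memset() before filling it in usually makes memcmp() behave in practice, but the C standard leaves padding values unspecified after a member is stored, so the member-wise comparison is the more defensive choice.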
In automotive SW we need to markup all variables, constants, pointers, structs... For example, person1 and person2 in ASW would be person1_s and person2_s or sPerson1 and sPerson2, respectively and sPerson1_t and sPerson2_t for definitions. That's the easiest way to see with what you are dealing with, even IDE can highlight the elements. CamelCase and underscore are equally allowed.
Put me in the camp of people who learned c++ 30 years ago, went through an operator overloading phase and got over it. I simply don't do it anymore. C++ is already too clever to have glyphs with common meanings do exciting things. You want to do something clever, then it deserves a name.
That said, one reason to maybe implement your own copy rather than simple assignment or memcpy is performance. Let's say that your name string member is a megabyte long but the typical name is 5 bytes. Assignment and memcpy will copy the whole struct, whereas if you write your own routine, you can copy just strlen(name) + 1 bytes and save yourself some runtime in the common case. (strncpy with the full field size would not help here, since it zero-fills the rest of the destination.)
It's possible direct struct copying didn't work in K&R C and manual copying is just a throwback from that. I tried it explicitly using the C89 standard and it seems to work (assuming the compiler is handling it correctly).
Thanks a lot, I was looking for how to do this specifically and I finally found it very well explained.
One thing to note is that when assigning structs that have compiler-inserted padding, the compiler is not required to copy the values of the padding bytes, per C99. This could be a problem if the struct is used for checksum calculation, depending on how the checksum is calculated. The user can resolve this by explicitly padding the struct or by ensuring the struct layout does not result in compiler-inserted padding.
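A custom copy along those lines could look like this sketch. `BigPerson`, `BIG_NAME_SIZE` and `big_person_copy` are invented for illustration, with 1024 standing in for "very large". It uses strlen() plus memcpy() rather than strncpy(), because strncpy() zero-fills the destination up to the count it is given.

```c
#include <assert.h>
#include <string.h>

#define BIG_NAME_SIZE 1024   /* hypothetical; stands in for a huge field */

typedef struct {
    int age;
    char name[BIG_NAME_SIZE];
} BigPerson;

/* Copy only the bytes of name that are in use (terminator included),
   rather than the full array as plain assignment or memcpy would. */
void big_person_copy(BigPerson *dst, const BigPerson *src)
{
    size_t used = strlen(src->name) + 1;
    assert(used <= sizeof dst->name);

    dst->age = src->age;
    memcpy(dst->name, src->name, used);
}
```

The bytes of dst->name after the terminator keep whatever they held before, so the result should not be compared with memcmp() or checksummed as raw bytes.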
Could you explain why you can return a *this for the reference return type in the c++ example ? Many thanks !
You should always copy structs through assignment. If `memcpy()` is the best way to copy the struct, the compiler will just do that anyway, but unlike calling `memcpy()` directly, you are not forcing it to use `memcpy()`; in some cases there is a better way to do it, and then the compiler has the freedom to use that better way. If in doubt, always leave choices to the compiler, as compilers today make better choices than programmers do. That's because most programmers make assumptions about what is fast and what is not, and quite often those assumptions are wrong, as they never really tested them across a variety of platforms in the first place. In 9 out of 10 cases, a C compiler will generate faster code than a programmer writing whatever he thinks is fastest directly in assembly. And in the one case where the compiler won't win, the difference to hand-written code is usually negligible, so not even in that case did it really pay off.
Custom data types such as uint8_t wont work correctly Because of padding but byte aligning can help
memcpy(&str2, &str1, sizeof(struct_type)); - please don't throw tomatoes at me -
No tomatoes from me. Thanks for being here. 😀
Splash! 😉
Except this is wrong. Novice comment by a bad programmer. Nowhere it was stated the memory is not overlapped. Therefore memmove is correct. And you should learn more...
@@gregorymorse8423 How often would you get memory overlap? You would have to do something pretty exotic with your pointers to do so. An ordinary struct variable declaration or malloc() just can't create overlapping structs by itself.
@@psionl0 it's nothing to do with pointers or allocation. It's the data in the structure. Often with extra bytes at the end. Perhaps network data comes in, with a START FRAME indicator somewhere in it, at a position unknown at first. Then you move the structure to the actual correct alignment. There aren't that many practical cases, but there are several such as this one.
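The frame-realignment case is one where source and destination really do overlap. A minimal sketch, where the buffer layout and the name `shift_to_front` are assumptions for illustration:

```c
#include <assert.h>
#include <string.h>

/* Slide the bytes starting at `offset` down to the start of buf and
   return how many bytes were moved. Source and destination can overlap
   inside the same buffer, so memmove is required here; memcpy on
   overlapping regions is undefined behaviour. */
size_t shift_to_front(unsigned char *buf, size_t len, size_t offset)
{
    assert(buf != NULL && offset <= len);

    size_t remaining = len - offset;
    memmove(buf, buf + offset, remaining);
    return remaining;
}
```

For two distinct struct variables, as in the video, the regions cannot overlap, so plain assignment or memcpy() is fine there.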
Hey Jacob, great video as always! It would be interesting to see more complex cases, let's say instead of the char array, you've got a char pointer that you allocated dynamically. I guess you're kind of stuck with explicit copying of these pointer variable members? (or C++ overloading, which is the same as writing a C function)
Yep. Once you have pointers you start running into aliasing issues and usually have to do a deeper copy (unless you don't want to duplicate your objects-which is sometimes the case). I am planning on hitting on this in the future.
Question: why do strlen("NONAME") + 1 instead of sizeof("NONAME")?
The real question is: "Is the destination big enough to hold 6+1 bytes without overflowing?"
A beginner may adapt this example to swallow dreaded "user input", and handing strncpy() the length of the source string won't prevent buffer overruns...
A sketch in the Monty Python TV series had John Cleese interview Terry Jones (RIP) who was portraying the character:
Karl Gambelpötidevanausfenspendenschlittkraßkrembaunfreidichedangldünglwarsteinvonnechthresheräpfelbangerhorowitztiklenssikgrandenochichbelltinkelbrandißgruewuldnahweltbasikküstlichimbeleisenbahnwagengutenabendbitteeinenurenburgerbratwürstschengespurtmitzwienmachtenueberhundsfootgumberaberschönendankecalbfleichmittelrathevonhalbkopf
of Ulm
Good luck on that one with a 30 byte buffer! 🤣
personally the reason I'd guess that "person 2 = person 1" doesn't work is because it wouldn't work like that in Python, and I assume C is gonna be harder rather than easier. haha. I kind of like the memcopy option more because I was confident it would work immediately without much experience, so it must be good for communication, at least considering C newbies like me :p
at 8:50 you mention that it's probably better to work with memcpy in that case, but I disagree, because it has the same issue
If person1 and person2 are pointers, this will still work
&person1 takes the address of the variable that's holding the pointer, dito for person2, and sizeof(person1) will just be the size of a pointer, so everything still works
but we did the exact same thing as say `person2 = person1;` - just more verbosely and harder to debug in that case too
The only problem with memcpy/memset is that if you aren't running on an ARM Cortex-M7 or on Windows (haven't tried on Linux), you will run into packing issues when casting data / putting it into a buffer, where it can sometimes even crash if you try writing to an odd memory address. NOTE: Cortex-M7 chips pack by default (I think), and Windows handles this in the background. I don't know how Linux handles it, and some embedded system compilers support packing through a #pragma.
I thought that by default structs weren't packed and members were word aligned. (Isn't this a compiler issue?) Either way, I don't see how memcpy() would mess this up - unless you were copying a packed struct into an unpacked struct or vice versa.
12:29 I know relying on tooling is not the best option. But most intelligent editors will highlight overloaded operators. In case of visual studio (code) the operator is written in yellow and if you hover it will tell you the overload signature like a function. You can even press ctrl + left click to jump to the definition.
So I feel like clarity is a question of tooling in this case.
Well, approach 1 is begging to be turned into a mystruct_copy(a, b) function.
6:27 it's not the same though, is it? In the first example, you copy the value of the name from person1 to person2, whereas in the second example you only copy the pointer to the name, meaning that if you were to mutate the name of person1, it would also mutate the name of person2, which is not what the first example did and also not what a "copy" should do.
What I don't like about person2 = person1 is that the day someone changes char name[30] to char *name with dynamic allocation, everything will break. Maybe the best way is to write a function?
Correct me if I'm wrong. I learnt C many many moons ago and I dont think you were allowed to copy structs using the '=' operator back then. If you could then none of my cohorts knew this either.
i think it would have been nice, when the print function would also give out the addresses of the struct members, so that you can be sure it's not just the same struct.
Also I'm a C person and not a C++ person, so maybe that's why I don't like the overloaded assignment operator at the end but my thoughts are: If you create an extra function to copy your structs/classes/objects, call it what ever, you might as well just call it STRUCTNAME_copy(). I assume that you have the function call overhead anyway, so might as well make it explicit.
I mean what is easier to understand:
myObject A = ...;
myObject B;
B = A;
or
B = A.copy();
I think the second one is much clearer in what it does, especially when you are dealing with heap objects. Does B = A just do pointer stuff and doesn't copy actual data ? B = A.copy() is obvious in my opinion, especially when you write some documenting comments to the function.
In C++ (and C ?) the "=" operator usually means copy. Having an explicit .copy() function is probably superfluous in most cases.
@@sledgex9 in C it's definitely not if you have anything allocated on the heap and want to copy that. And it's also not always obvious in C++, since '=' can also mean assign can it not ?
@@timtreichel3161 I was simplifying. In C++ it does mean assign. However when you assign an object to another object a copy is done. Same for POD types (like in C). The difference is in raw pointers where only the memory address is being copied (or assigned).
0:55, 1st thing I notice, that height variable should come before the other members because it's a double, had the compiler spit out a number of alignment/endian warnings to me at one time after I updated it, they only went away after I moved the doubles to the top of the struct.
**Edit:** might've been a union instead, been a long time since I had those errors after all
I faced a similar challenge in a task to exchange binary records (structs) between an existing Unix (database) program and a developing VB/Windows client... The compiler (& code base) on the former used no padding of structs; the latter (a DLL for VB... yech!) required padding to 4byte boundaries for multi-byte elements like ints and doubles.
Instead of writing arcane "element transcribing" functions, as you found, trials showed that pushing the 'ragged' shorts, chars and char arrays to the bottom of the struct worked like a charm!
Fortunately, 'endian-ness' wasn't a problem for this task...
I imagine, even though Jacob works with memory starved microcontrollers, he didn't want to 'clutter' this video with a digression into 'element order and structure padding'.
struct {
int age;
char beauty;
};
Age comes before beauty, as demanded by chivalry...
@@rustycherkas8229 Not in some cases, sometimes beauty comes before age, like in architecture, or nature
That's what I was taught to do as a matter of routine 30 years ago. Watch the ordering of elements in the structure to avoid padding.
Very good to know the various approaches. What would be the best way to swap two structs?
If you have to add the whole string library only for memcpy() then maybe it's not the best solution.
In that case you use a "dumb" linker...?
Only required code from a library should end up in your binary.
@@maxaafbackname5562 headers are compiled, afaik.
@1:52 - It's hinting that strncpy() takes va_args. What kind of broken IDE is that? Even if it's the result of some underlying macro mapping mess, it should give the proper signature of what it maps to.
I've used VS Code all my life and I've never seen a function having the wrong signature lol. But I have as few extensions as possible, 'cause I don't like customizing. I basically have the themes and C/C++ extensions.
However, I have seen functions having *ERROR* parameters when IntelliSense was still loading. Once it loaded, it showed the correct signature.
I had to look up strncpy() since your C example didn't take into account the terminating NUL from the source name. According to Tutorials Point, "In a case where the length of src is less than that of n, the remainder of dest will be padded with null bytes". Curious that this apparently doesn't happen in C++.
One issue you didn't address is what to do if a struct contains pointers. Do we copy the pointer or do we duplicate the memory the pointer points to? (I know, the answer is "it depends").
I think that if I were to do frequent copying of structures I would create a separate function like personCopy() which would be clearer than using = or an overloaded = (especially if personCopy() is documented).
are you making an embedded systems course anytime soon?
people assume it doesn't work with structs because arrays' ids are treated as pointers, and struct might be too.
Can you pass memory of no defined type as an argument to a function?
Like, if I have a swap function and I want it to be able to work with any datatype.
Is there something you can do with memcpy or void pointers or anything else in the C language?
Read and think about how the C library function qsort() works...
It sorts contiguous 'blobs' of memory.
(Each 'blob' must be the same size. If the 'blob' is a pointer, the actual data can be anything that can be compared. For example, a cmp() function can subsort records based on subordinate fields. eg: "sort by age, name")
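In the same spirit as qsort(), a type-agnostic swap can take two void pointers and a size. A sketch under the assumption that both objects have the same size and do not partially overlap; `swap_bytes` is a made-up name:

```c
#include <assert.h>
#include <string.h>

/* Swap two objects of `size` bytes each. The data is moved in chunks
   through a small fixed buffer, so no allocation or VLA is needed. */
void swap_bytes(void *a, void *b, size_t size)
{
    unsigned char tmp[64];
    unsigned char *pa = a;
    unsigned char *pb = b;

    assert(a != NULL && b != NULL);
    if (a == b)
        return;   /* nothing to do, and memcpy must not self-overlap */

    while (size > 0) {
        size_t chunk = size < sizeof tmp ? size : sizeof tmp;
        memcpy(tmp, pa, chunk);
        memcpy(pa, pb, chunk);
        memcpy(pb, tmp, chunk);
        pa += chunk;
        pb += chunk;
        size -= chunk;
    }
}
```

A call such as `swap_bytes(&person1, &person2, sizeof person1)` would swap two structs. When the type is known, a plain temporary (`Person tmp = a; a = b; b = tmp;`) is simpler and lets the compiler check the types.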
Wait why doesn't the assignment operator work with arrays? Does trying to do that result in an error? If not, what is the result?
Also, are "person1.name" and "person2.name" pointers (since memcpy needs pointers)? Even though the structs themselves weren't declared as pointers?
No, "person1.name" is not a pointer. It is a memory address.
The struct element "name" is declared as an array of chars.
The token "name" represents a FIXED memory address, kinda like the value of a pointer.
But it is NOT a pointer (that could be reassigned to point somewhere else.)
To memcpy(), the source and destination addresses are just blobs of memory, and it will happily copy as many bytes as requested from source to destination.
Consider the massive C code base existing when "copying structs" became permitted syntax.
#include <stdio.h>
void foo( char str[] ) {
printf( "%s", str );
str = "barabajagel"; /* legal: the parameter is really a char *, so this re-points it */
printf( "%s", str );
}
int main( void ) {
foo( "Foo" );
return 0;
}
Masses of code had been written using the pointer array equivalence.
Allowing '=' to act as a copy function would break more than all Y2K threatened to break.
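The flip side is that an array wrapped in a struct does get copied by '='. A small sketch (the `Wrapped` type and `wrapped_copy` function are hypothetical):

```c
#include <assert.h>
#include <string.h>

typedef struct {
    char text[16];
} Wrapped;

/* Arrays cannot be assigned directly, but a struct containing one can:
   the struct assignment below copies all 16 bytes of text. */
Wrapped wrapped_copy(const Wrapped *src)
{
    assert(src != NULL);

    Wrapped dst = *src;   /* legal: struct initialisation from a struct */
    /* char a[16], b[16]; a = b;   <- this would not compile */
    return dst;
}
```

This is why `person2 = person1` copies the `name` array in the video's struct even though `person2.name = person1.name` on its own is an error.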
@@rustycherkas8229 Thanks man! I just noticed that somehow my brain registered Jacob saying "requires pointers" when he actually said addresses. But thanks for clearing thing up for me!
@@_veikkomies Glad to help 🙂
What is the difference between #include <filename> and #include "filename"?
< > means the compiler will look for the header files in the designated system directory. For example, on Linux it will probably be /usr/include
" " means the compiler will look for the header files in the current directory of the program
@@wardog697 Your answer is correct, but a bit short.
The preprocessor will search one or more directories to find each named header file.
When installed the "compiler package" provides standard header files in a 'known' dir.
(Obviously the path to this dir depends on what compiler one installs/uses.)
(There may be subdirectories, too, as in #include )
At compile time, additional paths can be nominated using the "/I" compiler flag.
Angle brackets signify "search only those directories of (stable?) header files"
Double quotes signify "search current directory first, but use others if needed."
I've recently seen (local) header files named to 'override/augment' standard header files...
Some people are simply too clever for everyone's good.
<> = System include files (Your compiler knows where to look for these)
"" = Local include files, relative to your project
hey jacob, newbie question but is it bad to use compiler specific additions to C (ex. nested functions in gcc)
I wouldn't say it's bad, but if you want your code to be portable outside GCC, you should be careful. Some GCC extensions work in Clang too. One i use a lot is ?: (the Elvis operator), it's useful for defaulting assignments. But nested functions can be dangerous, as they don't survive outside the scope you declare them in, so you have to be careful with function pointers to them.
But if you're just prototyping, nested functions are really useful.
I have an update for this. Nested functions no longer work with a non-executable stack. So your programs will break unless you tell GCC to make your stack space executable, and will flat-out not work at all on some operating systems. But personally, needing to use nested functions in C is suggestive of a design issue anyways.
@Daniel-be1xn Yes. Always try to write portable code.
strncpy() is not the correct way to prevent buffer overrun, because it doesn't append a terminating zero in the worst case. You would have to write
strncpy(dst, src, sizeof(dst)-1);
dst[sizeof(dst)-1] = 0;
to be sure.
Or use strcpy_s() instead, where the library provides it (it comes from C11's optional Annex K, not C99):
strcpy_s(dst, sizeof(dst), src);
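Since strcpy_s() is not available in every C library, a portable alternative is snprintf(), which is standard since C99 and always terminates the destination when the size is non-zero. A sketch, with `copy_bounded` as a hypothetical helper name:

```c
#include <assert.h>
#include <stdio.h>

/* Bounded string copy built on snprintf.
   Returns 1 if src fit entirely, 0 if it was truncated.
   dst is NUL-terminated in both cases. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* snprintf returns the length src would need, excluding the NUL */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

The return value lets the caller decide whether truncation is acceptable or should be treated as an error.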
I remember using different techniques to copy data from one class to another in C++. One of them is to use the = operator, though you need to overload operator= for that.
person2 = person1;
Wouldn't this leave out the actual copying of the string? Changes made to person1's name would reflect in person2, and vice-versa
This depends on whether the person's name member is an array or a pointer (to an array).
@@JacobSorber oh, I understand, I might research it a bit further. Since arrays and pointers are a bit ambiguous in C, i thought maybe it behaves the same way. Thanks!
@@GameplaySheep If the struct holds an array, then each struct saves space for the allocated array, so they are independent. If the struct holds a pointer, the reserved space is only for a pointer, and the array would be stored somewhere else, so they would share the same array. You have to be careful about the notion that "arrays are like pointers in C", it's actually a bit of a trap.
@@inakiarias7465 Thanks for the clarification, it does make sense
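To make the array-versus-pointer difference concrete, here is a small sketch. The struct layouts and function names are assumptions for illustration, not the ones from the video. Each function copies a struct with plain assignment, changes the copy's name, and reports whether the original changed too.

```c
#include <assert.h>

struct person_arr { char name[30]; int age; };   /* name stored inside the struct */
struct person_ptr { char *name;    int age; };   /* name stored somewhere else    */

/* Returns 1 if editing the copy's name also changed the original. */
int array_copy_shares_name(void)
{
    struct person_arr p1 = { "Clair", 45 };
    struct person_arr p2 = p1;        /* copies all 30 bytes of name */
    p2.name[0] = 'B';
    return p1.name[0] == 'B';
}

/* Returns 1 if editing the copy's name also changed the original. */
int pointer_copy_shares_name(void)
{
    char storage[30] = "Clair";
    struct person_ptr p1 = { storage, 45 };
    struct person_ptr p2 = p1;        /* copies only the address */
    assert(p2.name == p1.name);       /* both point at the same buffer */
    p2.name[0] = 'B';
    return p1.name[0] == 'B';
}
```

The first function returns 0 and the second returns 1, which is the "shared array" effect described above.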
it's funny, I saw so many lectures about c++ constructors (rule of three, rule of five, rule of six, rule of zero), but hardly anything about the assignment operator.
I guess it's just not popular to use it? because we can have scoped variables? because of std::optional?
0:36 me right after beginning with the program compilation
Re C++: the compiler will look at the class definition and see any overloaded operators.
Serious question:
When shifting from C to C++, you've added a function to the struct that clones an original.
I wonder why you chose to not change the struct to a class and use C++'s facilities of constructors and "copy constructors"...
person_c p1( "Clair", 45, 5.2 );
person_c p2( p1 );
strncpy( p2.name, "Marie", sizeof( p2.name )-1 ); // EDIT: moved -1 outside of ')''... Doh!!
or even
person_c p2( p1, "Marie" );
that would create Claire's twin sister...
Then it's on to "p1.Print(); p2.Print()" to, again, take advantage of data hiding and code reuse.
No good reason, other than I was talking about structs, and so I decided to stick with structs.
Thanks for pointing out the alternative.
@@JacobSorber
Understood! The 'clue' I overlooked was in the video's title! 😀
I was perplexed by the breadth of the presentation, from beginners copying each element to 'advanced' operator overloading.
This video may be more 'satisfying' if it were 2 minutes longer to show a complete transformation to a C++ implementation... Just my two cents...
Cheers! 🙂 Please keep these coming!
Keep in mind, that in C++ structs can also have (copy/move) constructors. As mentioned in the video, in C++ structs are the same as classes with members public by default.
@@sledgex9 Thank you for your response!
I'd years of 'C' experience and only "high pressure" reasons to start coding in C++ (based on cursory reading of the language doco.)
Your reply has just had me experimenting with C++ structs beyond mere aggregations of (public) data members.
Just wanted to say thanks. I've learned a lot! 🙂
(using C, not C++)
Very good video!
But I have a question: how can I copy using "memcpy" if, instead of having
"char name[30]",
I have
"char * name"
and dynamically allocate its space at runtime?
Thank you!!
You could count the number of characters until you hit \0, and then you know how much to copy, but it is standard practice in C to pass both the pointer and the array size as parameters to your function.
@@henrykkaufman1488 Thank you very much!
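A rough sketch of what that can look like for the `char * name` case. The `person` layout and the `person_copy` name are assumptions for illustration. The idea is to copy the plain members with assignment and then give the copy its own freshly allocated string, so the two structs stop sharing one buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: name is allocated at runtime. */
typedef struct {
    char  *name;
    int    age;
    double height;
} person;

/* Deep-copies *src into *dst. Returns 0 on success, -1 if malloc fails.
 * Whatever dst->name pointed to before is NOT freed here, and the caller
 * must eventually free(dst->name). */
int person_copy(person *dst, const person *src)
{
    assert(dst != NULL && src != NULL);

    char *name = NULL;
    if (src->name != NULL) {
        size_t len = strlen(src->name) + 1;   /* +1 for the '\0' */
        name = malloc(len);
        if (name == NULL)
            return -1;
        memcpy(name, src->name, len);         /* size matches the new buffer */
    }

    *dst = *src;        /* shallow copy of age, height and the old pointer */
    dst->name = name;   /* replace the shared pointer with the private copy */
    return 0;
}
```

A plain `memcpy(&p2, &p1, sizeof p1)` on this struct would only copy the pointer, which is why the extra allocation is needed.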
0th!
🏆
On overloading "=" with a non-obvious copy facility:
As the saying goes, "With great power comes great responsibility."
This is why C++ needs the '±' operator... It's very like assignment ('=') but shows that the destination will be "more-or-less" like the source... 🤣
Nice, but I think a better name would be operator≈ (that's supposed to be the "approximately equal" character, U+2248, but you wouldn't know it from the font my browser is using). Or maybe that operator would be the logical operator that tests for "sort of similar but not equal".
Now that we're introducing Unicode characters, how about being able to use ≤ and ≥ instead of <= and >=?
@@rdwells This is APL waiting to happen. I don't like operator overloading. With lasagna code it's already a challenge to figure out what virtual function ".GET" is doing today - let alone "Person1 | Person2" (give away: it's the "divorce" operator).
I mean I like "code beautifying" and "syntactic sugar" as much as the next guy, but sometimes too much is just too much..
@@HansBezemer My reply was actually tongue in cheek, as I assumed the post I was replying to was. But on a serious note, don't throw the baby out with the bathwater. If you're writing, say, a rational number class, do you want to write
result = add(mult(a,x),y);
or
result = a * x + y;
I'd far prefer the latter. The C++ operators have specific expected meanings; if those meanings make sense for your class, then by all means take advantage of that.
At the very least, you probably want to overload operator< so that your types can be used with the STL. Or, if you're using C++20, operator<=>. Overloading operator<< and operator>> to be friendlier with stream I/O is often useful as well.
But, of course, everything in moderation. Overloading += to add an item to a collection (as I've seen done) is probably a bad idea.
@@rdwells That "+=" example of yours is scary... When you uncover the effect of that clever override, there should be ominous background music playing (perhaps with a thunderclap) as you feel your stomach sink and your blood turn cold... "What other gems are buried in here?...."
For Jacob's "person" example, KISS would suggest factoring out a simpler function
createTwin( const char *nameOfTwin );
that does what's needed. (Works in C and in C++.) This would require the parameter "Marie" be supplied, instead of presuming a generic "NONAME" will suffice...
Your example ("result = add(mult(a,x),y)") actually is what the parser will (should?) generate from "result = a * x + y;"...
As others have (often) commented on these "demonstration" videos, the 3rd parameter to memcpy() or strncpy(), etc. should ALWAYS be tied to the size of the destination, not to the size of the source...
Finally, for any beginners reading this far: "age" (used for demonstration) is a "volatile" derived quantity... On/after a person's next birthday, the quantity will, by definition, change...
Use 'date_of_birth' instead, and calculate 'age' when needed... NB: sorting by 'age' will put youngsters first; sorting by 'DoB' will put seniors ahead of those young whippersnappers...
@@rdwells Well, automagically you referred to a class where mathematical operators make sense: mathematics.
Let's see what else we got:
"Hello" || "World"
Logical OR between strings..?
"Hello" & "World"
Binary AND between strings..?
As a matter of fact, these are string concatenations in (Oracle) SQL and VBA. It doesn't make much sense to me.
CONCAT("Hello", "World", "!") seems much clearer, hardly longer and not much more difficult to understand than "Hello" + "World" + "!".
And yes, I did make functions like that - like doing a 14-bit fixed point set of arithmetic operations. Main problem: it can get quite LONG - even for relatively SMALL formulas.
But - being primarily a Forther who is accustomed to writing RPN, can't say it bothers me that much.
In Forth, every set of numbers HAS got its own set of operators, like +, -, * for single length, D+, D-, D* for double length and F+, F- and F* for floats.
I must say - I pretty much like it that way. It always bothered me in C you can add a short to a char - or an unsigned to a signed long. I don't like my compilers doing so-called smart things behind my back (no, I'm not a fan of Python).
IMHO - minimal: if you mix 'em up and don't cast, that's an error. It breeds bad habits, so in the end, people don't even see their errors in their ways when they write "123" + 234 and expect something sensible to come out of that.
If it works for others - fine. But it's not my preference.
Hi Jacob - I like your videos and they all have great content. This one is misleading, as you are taking a very simple example to show that it works. Typically, the structure contains a pointer to another structure or array.
This might be a dumb question but can Jacob or someone in the comments explain why `memcpy` was fine with person2.name and person1.name, but needed the address of person2 and person1? Oh wait a string is basically a char* so essentially an address. Wow that was a rubberduck debugging moment.
Yeah, remember that on most platforms any pointer type is just a 4- or 8-byte integer which happens to be the address of some type.
And another quirky thing: an array variable name acts like a pointer constant. This is why you can use it as a char*.
This does NOT mean arrays and pointers are the same thing. It's just the syntax to access them that is shared.
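One quick way to see that arrays and pointers are different things is `sizeof`. A small sketch, with function names made up for illustration: the array knows its full size where it is declared, but once it is passed to a function only the address arrives.

```c
#include <assert.h>
#include <stddef.h>

/* Inside the callee the parameter is a real pointer, so sizeof
 * reports the pointer's size, not the array's. */
size_t size_seen_by_callee(const char *name)
{
    assert(name != NULL);     /* only an address arrived; the 30 is gone */
    return sizeof name;
}

/* Where the array is declared, sizeof reports the whole array. */
size_t size_seen_at_declaration(void)
{
    char name[30] = "Clair";
    return sizeof name;
}
```

This is also why `memcpy(person2.name, person1.name, ...)` needs no `&` while the struct variables do: the array name turns into the address of its first element when passed, and a struct variable does not.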
0:27 Jacob be like:
Only a sith deals in absolutes... I will do what I must.
Indeed. Understood me, you did.
I notice that you use Person& person instead of Person &person. The & is a modifier; it modifies the variable, not the type. The same goes for Person* person vs Person *person. If you write Person *personA, personB; it is clear that personA is a pointer and personB is a struct. However, if you write Person* personA, personB; most programmers will assume both personA and personB are pointers. Yes, I tested it in an exam, and most students gave the wrong answer. Of course it is better to avoid the ambiguity and write two lines of code.
In the context of the (possibly comma separated) list of function parameters, it seems to be convention (in my experience) that C++ developers use the "reference to" style Jacob used here.
There is nothing about your 2nd example ("Person* personA, personB") to suggest both of these are pointers.
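If you'd rather let the compiler settle it, here is a small sketch that assumes a C11 compiler (for `_Generic`). The `Person` layout, the macro and the function name are made up for illustration. It declares both variables on one line in the `Person* a, b;` style and counts how many of them actually came out as pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char name[30]; int age; } Person;   /* hypothetical layout */

/* Evaluates to 1 if x has type Person *, 0 for any other type (C11). */
#define IS_PERSON_POINTER(x) _Generic((x), Person *: 1, default: 0)

/* Declares two variables in the "Person* a, b;" style and returns
 * how many of them are pointers. */
int pointers_declared(void)
{
    Person* personA = NULL, personB = { "Clair", 45 };

    assert(sizeof personB == sizeof(Person));   /* personB is a whole struct */
    (void)personA;
    (void)personB;

    return IS_PERSON_POINTER(personA) + IS_PERSON_POINTER(personB);
}
```

It returns 1: the `*` belongs to `personA` only, wherever the space is put.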
I have a problem with compilers - I don't trust them. So I tend to program quite defensively. On the upside, that approach allows 10 KLOC programs to compile (almost) warning free over a wide range of C compilers - starting with K&R.
I usually solve the pointer issue by keeping a copy of the original struct to the side prior to overwriting the destination struct:
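typedef struct SUM { int a, b; int (*op)( struct SUM *sum ); } SUM;  /* tag is SUM: names like _SUM are reserved */
int foo( SUM *dst, const SUM *src )
{
    SUM tmp = *dst;          /* keep the destination's own function pointer */
    *dst = *src;             /* shallow copy of every member */
    dst->op = tmp.op;        /* restore the pointer the copy overwrote */
    return dst->op( dst );   /* run the preserved operation on the updated struct */
}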
Thank you, Jacob, for this interesting episode.
But I cannot understand why all these examples use a person's age instead of the person's year of birth. You might get in trouble if you have filled a database with the persons' ages over several years...
Very true, Michael. I was focusing on the copying the structs, not the utility of the data in the structs. 🤔
Beware the evil cloned string pointer, by copying the structs!
Your struggling with the pointer really breaks my vi heart. You should try the Vim extension for your editor. You will still be able to navigate as you used to, but you'll gain all the superpowers of Vim and hopefully eventually switch completely.
Eventually someone will make a better editor that's intuitive and powerful, but for now, we'll just have to be power users of (neo)vim :)
It seems less sucky with just a few thousand LOC compared to Vim's million+, but is unfortunately written in soy++.
Pls don't throw away const correctness... T& operator=(T const &other);
All three examples are painfully wrong and demonstrate the problems with C and C++. Let's say your struct has handles to objects or pointers to B-trees. Step in STL/templates and reference counting, which is already beyond most amateur developers. I love C and C++. I write kernel drivers, so C/C++ is pretty much a given. Second, overloading is a nightmare, although it does demonstrate a good point. Let's say you had a 1MB bitmap, hashtable, string, etc. It is ridiculous to copy the whole thing if only 2 bytes are valid. C++ has a lot of member functions to deal with this kind of thing, and paired with templates/generics/patterns they can be solved, again with added complexity, but if you are writing potentially shared code/libraries the extra effort is always worth it. Example: for the Windows I/O model you have a complex structure called an IRP. Copying that would immediately blue screen the system, as you have threading issues, memory issues, locking issues... the list goes on. So my super simple solution is: just write a copy function for your struct (C). There are exceptions, of course. If your struct represents a 3D transform matrix, then memcpy or naked assignment is fine, i.e. if all the data in your struct is always vital to the identity of the structure and contained within it.
Sirr, Comments are a thing in C, pretty useful when you want to make your intent clear rather than having someone read everything
True. I sometimes treat my talking as the comments, and forget to include them in the code. Do you mind sharing what you found confusing?