that's correct! Rc doesn't really help much if you intend to hang on to one reference until the program ends - you could just use regular borrows in that case - but in this example to show the strong_count function I just kept a reference in main.
I am unsure whether one should practice both safe and bad programming. At least it is safe, I suppose. Specifically, I do not understand one of these clone examples when good programming might ask the instance to remain singleton, all the way through (both literally and figuratively). You show us how to do it, and you behave as if: awesome.
I'm not sure exactly what you're asking - but if a routine (function) receives a shared reference to something, it does not get a copy (it basically is given a pointer) but it also cannot be modified.
The stack is not faster than the heap. Both are locations in main memory. True, stack values might partially live in registers, but in general, stack is no different to heap. Heap memory involves an allocator, which of course causes more overhead (internally some atomics need to be swapped and free memory has to be found). But stack and heap are both located in equally fast main memory.
I'm interested how Rc knows when data is going out of scope, or being dropped like you did. How is it aware that the memory is no longer accessible after a specific point without knowing where the objects are created in the program? How does the Rc know that there is a reference to truck_b in the main function, for example?
great question, in Rc's implementation of clone there is `self.inner().inc_strong();` which increments the strong reference counter. So it doesn't necessarily know where the references are, it just increments a counter each time one is created. Then in Rc's implementation of the Drop trait (which has a drop method that is invoked when the implementor goes out of scope) we have `self.inner().dec_strong();` then if `self.inner().strong() == 0 { /*code for cleaning up memory here */ }`
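The counting described above can be observed from the outside with `Rc::strong_count`. This is a minimal sketch; the `Truck` struct and the `observe_counts` helper are made-up names for illustration, not the video's code.

```rust
use std::rc::Rc;

// Hypothetical stand-in for the struct used in the video.
struct Truck {
    capacity: u32,
}

/// Returns the strong count at three points: after creation,
/// after one clone, and after that clone has been dropped.
fn observe_counts() -> (usize, usize, usize) {
    let truck = Rc::new(Truck { capacity: 10 });
    let at_start = Rc::strong_count(&truck);

    // Cloning the Rc increments the counter; no second Truck is created.
    let second = Rc::clone(&truck);
    assert_eq!(second.capacity, 10);
    let after_clone = Rc::strong_count(&truck);

    // Dropping a clone decrements the counter again.
    drop(second);
    let after_drop = Rc::strong_count(&truck);

    (at_start, after_clone, after_drop)
}

fn main() {
    let (at_start, after_clone, after_drop) = observe_counts();
    println!("{at_start} {after_clone} {after_drop}");
}
```

Rc never tracks where its clones live; it only sees the counter go up on clone and down on drop, and frees the value when the counter reaches zero.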
One more thing. I'm assuming that for clarity, you used the explicit Arc::clone instead of the suffixed version. You can use .clone() on an Rc/Arc and it will clone the reference instead of the data.
2:11 Why accessing the heap would be slower? It's still RAM like the stack, and can be cached by the CPU like any other memory. The only drawback of the heap is that it can suffer from fragmentation during allocation and deallocation. But it's incorrect to say it has slower access time.
Allocation and deallocation themselves are slower for the heap. Moreover, (just reading this from StackOverflow), the heap often needs to be thread-safe, meaning it cannot benefit from some of the same optimisations as the stack can.
@@spaghettiking653 yes fragmentation can make allocation slower, but memory access isn't slower, which is what the video implied. Having an object on the heap is exactly as fast as anywhere else, and fragmentation issues only occur in rare cases. We're talking literal nanoseconds slower to find free space on the heap instead of putting it on the stack. Unless we're talking about a very hot loop on performance critical software, it doesn't matter, and you shouldn't allocate in a hot loop anyway.
@@sbef Yes, fair point. What about the problems with thread safety? I really have no clue whether that's a real concern or whether it is a problem at all, as I literally read it minutes ago-what do you think/know?
Yeah I may have misspoken a bit here - stack memory is faster to allocate / deallocate than heap memory. Would patch this if I could :/ I'll pin a comment.
@@spaghettiking653 not sure how thread-safe the Rust default allocator is to be honest, but I would expect it to be pretty much lock-free even in heavily concurrent applications. It's not my area of expertise, but allocator technology has been refined over the past 3 decades.
How is cyclic data handled by Rc? As we can mutate the data we can give the value of the Rc a clone, right? Thus causing the data to never be deallocated
It isn't. You can define circular data with Rc that will never be deallocated. It's the programmer's job to handle this case correctly. This was actually at the centre of the Leakpocalypse. It was decided that, while accessing deallocated memory is `unsafe`, leaking memory isn't. You can somewhat get around this with weak references, to get circular data with deallocation, but it gets complicated pretty quickly.
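As a rough illustration of the weak-reference workaround mentioned above, here is a sketch of a parent/child pair where the back-pointer is a `Weak`. The `Node` type and the `parent_child_counts` helper are assumptions made for this example.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical tree node: strong pointers go down, weak pointers go up.
struct Node {
    // A parent keeps its children alive.
    children: RefCell<Vec<Rc<Node>>>,
    // A child refers back to its parent without keeping it alive.
    parent: RefCell<Weak<Node>>,
}

/// Returns the parent's strong count, the parent's weak count,
/// and whether the parent is gone after its only owner is dropped.
fn parent_child_counts() -> (usize, usize, bool) {
    let parent = Rc::new(Node {
        children: RefCell::new(Vec::new()),
        parent: RefCell::new(Weak::new()),
    });
    let child = Rc::new(Node {
        children: RefCell::new(Vec::new()),
        parent: RefCell::new(Weak::new()),
    });

    // Link both directions: strong downwards, weak upwards.
    *child.parent.borrow_mut() = Rc::downgrade(&parent);
    parent.children.borrow_mut().push(Rc::clone(&child));

    let strong = Rc::strong_count(&parent);
    let weak = Rc::weak_count(&parent);

    // The weak back-pointer does not keep the parent alive.
    drop(parent);
    let parent_gone = child.parent.borrow().upgrade().is_none();

    (strong, weak, parent_gone)
}

fn main() {
    println!("{:?}", parent_child_counts());
}
```

If `parent` in `Node` were an `Rc<Node>` instead of a `Weak<Node>`, the two nodes would keep each other's strong count above zero and neither would ever be freed.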
Would be great to understand ownership and the stack. "The stack is much faster than the heap" - I assume that if you pass variables by ref, the CPU knows "Hey - I am going to use this storage, so I keep it in the cache", but what happens if F1() passes ownership to F2(), which passes it to F3()... F999() - is the data still in the bottom stack frame, and is the storage still in the cache? AFAIK the size of a stack frame cannot be changed. So is it safe to always say "Stack is faster than Heap!!!"? What about some crazy ideas like allocating a huge array that acts as a "database" with a fixed huge size in the bottom-most stack frame, and then passing it through - or do I get something like "Stack frame too big"? I can't believe that using the stack is better than the heap in this case. Maybe someone has a link that explains it in depth?
Technically a stack frame can't be too big; the error that can occur is that the stack runs out of memory (a stack overflow). A stack overflow can be caused either by one mega stack frame or by a multitude of small ones. Nevertheless, the error is that the stack memory is depleted (stack size varies from platform to platform and OS to OS). The size of any individual frame doesn't matter; it's the total, whether 1 large frame or N smaller ones, going over the stack size.
He completely unnecessarily confuses ownership, lifetimes and stack vs heap for these examples. The heap is generally "farther away" in memory than the stack is. Computers have caches, often multiple levels; these are extremely fast and are filled from main memory, so the CPU will fetch data that is either close in space (spatial locality) or close in time (temporal locality) to recent accesses.
So he also confuses what is fast about the stack, because technically, operating on a large "database", as you refer to it, is also fast: its temporal and spatial locality are both good. The CPU will detect that you want to do N things to that large array of data, so if you are operating on each element in a loop, it will prefetch that heap memory as your loop executes. When this happens, the heap is _exactly_ as fast as the stack, as your large data blob is being operated on sequentially, one element after the other (just like how the stack is laid out, close in space and close in time). This is the main reason why you want data elements close to each other in memory: the CPU can "see" what you are trying to do, fetch the memory ahead of time and place some of it in the cache.
There is another benefit of the stack: cleaning up stack memory just involves moving the stack pointer back by N bytes. If all your data on the stack is "trivial", no involved destructors are run. Compare this with the heap, where some cleanup must happen to free the memory. Sometimes this involves a system call, which is much slower than a normal function call, but even without system calls there will be some overhead.
@@simonfarre4907 thanks a lot for this detailed answer. Ah yeah, I tested it out and the largest amount of data I could put on the stack on my system was about 8 MB - which is even less than the cache size of the CPU (Ubuntu 18, Ryzen). There are probably good reasons for that.
Thanks Thomas and Simon for pointing all of this out. I can definitely appreciate that "Stack vs Heap" is more nuanced than my brief portrayal of it in the video would lead you to believe.
Thanks, yeah I dug a little deeper. As far as I understand now: allocating and deallocating is faster on the stack. But for data that lives long it doesn't make a meaningful difference. I have not tried it out, but I can tell the linker to allow larger stacks. Therefore it could be possible to provoke a cache miss even on the stack? Or the OS panics if the stack exceeds the CPU cache size, because it always wants to have the whole stack at least in L2 or L3. That would be good reasoning for the default only allowing tiny stacks. If so, it might be faster in some scenarios to keep the stacks small, so the CPU has enough cache for the heap, instead of storing barely accessed data on the stack.
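Besides linker settings, Rust's standard library lets a program request a larger stack for a spawned thread through `std::thread::Builder::stack_size`. The sketch below assumes made-up sizes (an 8 MiB array on a 32 MiB stack) and a hypothetical helper name; note that the stack limit is a memory reservation and is unrelated to the CPU cache size.

```rust
use std::thread;

// Illustrative sizes, chosen for this example only.
const BUFFER_BYTES: usize = 8 * 1024 * 1024; // array placed on the worker's stack
const STACK_BYTES: usize = 32 * 1024 * 1024; // stack size requested for the worker

/// Spawns a thread with a large stack, puts a big array in its
/// stack frame, and returns the sum of the array's bytes.
fn sum_on_big_stack() -> usize {
    let handle = thread::Builder::new()
        .stack_size(STACK_BYTES)
        .spawn(|| {
            // This array lives on the worker thread's stack, not on the heap.
            let buffer = [1u8; BUFFER_BYTES];
            buffer.iter().map(|&b| b as usize).sum::<usize>()
        })
        .expect("failed to spawn thread");

    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("sum = {}", sum_on_big_stack());
}
```

With the default stack size for spawned threads (smaller than the array here on the common platforms), the same closure would be expected to overflow the stack.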
Definitely doing this at some point, given the spooky factor it would have been a good one for halloween, but unfortunately it probably won't be ready in time 🎃
This was a super helpful primer on why/when to use these types! Would love to see more content building on it. I'm trying to form some internal decision tree for how to decide how long a given piece of data should live for. Going to go see if you have any videos on that topic right now... 😁
great, really happy you got something out of the video! I don't have a video specifically on deciding how long a piece of data should live for, but "Rust Demystified" does cover lifetimes.
I understand if you’re coming from C or C++, the conceptual overhead of this stuff could make sense for you because it is largely stuff you actually already have to think about in a slightly different way. But if you have the option to use a garbage collected language, I have no idea why you’d drag along all of this conceptual baggage with you. I mean just look at the litany of peripheral specifiers that was created in this tiny example for no other reason than to appease the compiler. It’s a complete distraction from the problem you’re trying to solve.
actually interestingly, I think C/C++ knowledge doesn't help much unless you're writing `unsafe` Rust. then it might. But in `safe` Rust code, while you'll see some of the same symbols - mainly '&' - they may have a completely different meaning. as for the "why", most folks should probably stick with a garbage collected language. Rust can shine in the following situations, where it may be well suited for solving said problem: 1. Performance is valued above all else 2. The project needs to run on hardware with extremely limited resources 3. The project needs to handle a large volume of traffic while minimizing hosting costs - ie the "great problem to have" where a very small company makes a product that becomes heavily used
Thanks for the reply! I enjoy your videos. I completely agree with your list of use cases. My mention of C/C++ wasn’t necessarily that it would make learning Rust easier, but that the seemingly crufty stuff that Rust does actually is an interesting solution to problems that do arise in those languages. So the overhead of dealing with it might make sense because it’s solving real problems that you commonly deal with in those languages (and not many others). For that reason I do think it would be easier for a C/C++ dev to pick up, because they’re at least familiar with the reasoning behind the design choices. But that’s definitely up for debate.
🤔 I would understand them more intuitively if they were named more intuitively and consistently. One is a single ownership pointer, uniquely owned. One is a shared ownership pointer, implemented via reference counting. Another is the same as the previous, just with interlocked atomic increment/decrement. Names like "Box" and "Arc" though feel pulled out of a hat. A box has height, width, and depth, but there is nothing volumetric in Rust's "Box" (and loosely co-opting the concept of "boxing" from C# feels weird here).
Rc stands for reference counter and Arc stands for atomic reference counter, they are just abbreviations which is good because they are frequently used and imagine writing ReferenceCounter every time, especially when you have to wrap many things with them. For box it could be named better maybe, but there is no type that is going to be called a "box". If it is a math library it would call it cuboid, cube, rectangular prism or something else. For types that are frequently used short names are good.
Totally understand your frustration - to add to the other response, I believe "Box" and "Boxing" are terms that have histories that extend well prior to the inception of Rust, but are usually hidden from the developer by developer-facing language abstractions. I think Rust is just one of the first to actually expose the term directly to the developer.
@@codetothemoon Example dated usage: X.Leroy. Unboxed objects and polymorphic typing, 1992. The terms have been used in libraries also, at least since 2007 in Haskell and 2000 in Steel Bank Common Lisp. I suspect it could be traced back several decades more.
U made a wrong statement about heap vs stack and performance. Basically, allocating on the stack is faster, but u can't free it urself and there is less room there by design. So ideally u stack allocate stuff that is small and that ur using for a hot minute in a function. But something like a linked list should be on the heap. U absolutely can stack allocate it in a language like C, but that should not be ur go-to. Side note: stack vs heap is an abstraction over raw memory handling (assembly has a stack but it's an actual stack data structure, while the C stack is random access); they are just different allocators.
ERRATA:
1. I mention that stack memory has faster access time than heap memory. While *allocating* and *deallocating* stack memory is much faster than doing so on the heap, it seems like access time for both types of memory is usually roughly the same.
I was just thinking about this at the beginning of the video. Heap and stack are just different areas of the same system memory.
What matters here is that the stack is used to keep the "frame", i.e. all the values that are local to the current function. This is how, after a function call returns, local variables retain their values, and this is what makes recursion possible. This stack behavior is implemented by keeping a pointer to the "top" of the stack and, on each function call, moving that pointer by an amount equal to the size of the new function's stack frame. That's why the compiler needs to know the size of the stack frame, and consequently, the size of any variable local to a function. Every other object that's dynamic in nature, or recursive, will have to live outside the stack, e.g. behind a Box.
And like you just explained, deallocating on the stack is quite fast, since things aren't really "deallocated", the Stack Pointer is just moved back to where it was before the function call, while allocating and deallocating on the heap usually involves interacting with the Operating System to ask for available memory.
Great video! Keep it up!
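The point about recursive types needing a known size can be sketched with a small cons list. The `List` enum and `sum` function are illustrative names, not code from the video.

```rust
// Hypothetical cons list. Without the Box, `List` would contain itself
// directly and the compiler could not compute a finite size for it.
enum List {
    Cons(i32, Box<List>),
    Nil,
}

/// Adds up every value in the list by walking the boxed tail.
fn sum(list: &List) -> i32 {
    match list {
        List::Cons(value, rest) => value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let list = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("sum = {}", sum(&list));
    // The size is fixed because a Box is just a pointer to heap data.
    println!("size of List = {} bytes", std::mem::size_of::<List>());
}
```

Only the outermost `Cons` sits in the stack frame of `main`; every further element lives on the heap behind its Box.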
I think "stack is faster than heap" is a pretty reasonable starting point, especially for a talk that isn't going into nitty gritty details about allocators and caching. Stack memory is pretty much guaranteed to be in your fastest cache, but with heap memory a lot depends on access patterns. If you have a really hot Vec then sure, there's probably no performance difference compared to an array on the stack. But for example a Vec where each String has its own heap pointer into some random page, isn't going to perform as well.
@@oconnor663 For most programmers that aren't going down the nitty-gritty sysprog hole, the assumption that "stack is faster than heap" covers 95% of all use cases. Most of the time spent when dealing with memory is allocating and deallocating, after all.
You'd need to set another register than EBP but the type of memory is indeed exactly the same, and the cache will cover both. But there may be system calls when using the heap. "In an ideal world you'd have everything on the stack" - I disagree if that's meant in the absolute. Bear in mind the stack is limited in size, and you often cannot control what was stacked before your function is called or what will be stacked by the code called by your function. It's not appropriate for collections either because it would complicate size management and cause more memory moves (which are very power-consuming). But I think you meant it otherwise, for small objects in simple cases where this isn't a concern.
These days memories are so large that people tend to forget about those limitations and then they are surprised the first time they have to deal with embedded code. ;-)
It makes total sense, both are in RAM. The thing is the stack is contiguous so writing to it is fast because the writes are sequential, while the heap is probably fragmented, which means random writes.
Edit: without taking into account what the others have said, about frames, OS allocation, etc, everything contributes.
Sir, your Rust tutorials are cohesive, easy to follow (due to great examples) and don't go overly deep into the details. Perfect combination. Keep up the good work.
Thanks for the kind words Miguel! It's thrilling to know that these videos can make these concepts a bit more palatable.
@codetothemoon, the way you described lifetimes just clicks
Honestly, I've read about these things 3-4 times, and I more or less understand them, but it really clicks differently when someone tells you "these are the two main uses of Box: unsized things and self-referencing structs". Thank you, this is really helpful!
Nice, I'm so glad you found that perspective valuable!
Stuff on Cell and RefCell would be exactly what I'm looking for, thanks for these great videos! 😄
Nice, I've put it on the video idea list!
As far as I can see, if your implementation requires RefCell then your implementation is probably wrong. ;)
If you're coming from C++, Box is basically a std::unique_ptr, Rc is a std::shared_ptr without atomicity, and Arc is a std::shared_ptr with atomicity (the standard).
BTW, the ordering guarantees on x86 are pretty strong (ARM's are weaker), so on x86 a lot of atomic loads and stores are implemented with normal instructions like MOV
ahh nice, heard there were C++ analogs now but was never quite sure what they were!
WOW WOW WOW! Rust is my favorite programming language, and I’ve used it for all sorts of things, but I’ve never dived into smart pointers (except box) and this was super helpful!
Nice, glad you found it valuable!
Thanks!
Wow thank you so much Erlang!! Much appreciated!!
Thanks for the helpful video! It takes me a bit to catch everything on the first time around so I need repeat parts, but the clear examples and broken down explanation really help a lot.
This is sooo awesome!! I never understood the concept of Arc pointer until now, thank you so much :D
thanks for the kind words, really happy you got something out of the video!
You're a great teacher. I would love videos where you develop small programs that illustrate various language features.
Your tutorial is very clear and easy to understand. Thank you so much.
I hope you will create a video about RefCell soon.
These are extremely nice videos, thank you!
Thanks and thanks for watching Jos!
Also, one more thing about Box use cases. The first use case covers it, but it's not straightforward. Imagine a function that may return any one of several structs that implement the same trait. In this case, the concrete return type cannot be known at compile time, so we need to return a Box<dyn Trait>.
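A minimal sketch of that situation, assuming a made-up `Vehicle` trait with two implementors and a hypothetical `make_vehicle` function:

```rust
// Hypothetical trait and implementors, for illustration only.
trait Vehicle {
    fn describe(&self) -> String;
}

struct Truck;
struct Bike;

impl Vehicle for Truck {
    fn describe(&self) -> String {
        String::from("truck")
    }
}

impl Vehicle for Bike {
    fn describe(&self) -> String {
        String::from("bike")
    }
}

/// The concrete type depends on a runtime value, so the function
/// returns a boxed trait object instead of one specific struct.
fn make_vehicle(heavy: bool) -> Box<dyn Vehicle> {
    if heavy {
        Box::new(Truck)
    } else {
        Box::new(Bike)
    }
}

fn main() {
    println!("{}", make_vehicle(true).describe());
    println!("{}", make_vehicle(false).describe());
}
```

`dyn Vehicle` has no size known at compile time, which is why it has to sit behind a pointer such as Box; `impl Vehicle` as the return type would not compile here because the two branches return different concrete types.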
you're doing amazing work doing those videos! please keep going. it would be also cool to see ffi and unsafe rust
Thank you gorudonu! More on the way, and I've put FFI/unsafe on the video idea list.
I saw a lot of examples, including THE BOOK, and rust by examples, a lot of youtube videos. still didn't fully understand why how what. now i think i understood Rc finally. Thank you.
nice, this is really great to hear!
I love your videos. Thanks for taking the time to make these videos.
thanks for the kind words and thanks for watching!
Your tutorials are clean, comparatively fast and easy to understand
Thanks Namaste (amazing name btw!), glad you found it valuable!
Thanks for this video! These smart pointers are confusing. Could you also cover Cow in one of your next videos?
Seems like we have a few requests for Cow, I’ve added it to the video idea list!
@@codetothemoon thanks!
This was super informative, Rc finally clicked for me!
Thank you!
great, really happy you got something out of it!
The quality of these videos is great, 60fps is a nice touch
Thanks Gavin! Impressed you noticed the 60fps ;)
Great video, concise and well explained, just what I was looking for Rc. Please keep them coming.
Nice CJ! Glad you found it valuable - more to come!
Omg, I _love_ your intro graphic, played at 0:30. *It's short!* Who wants to sit through 5 or ten seconds of some boring intro boilerplate every time we visit that channel, like a bad modal dialog box on some Windows 95 app, drives me nuts.
thanks modolief! I'd thought about creating a little intro reel, but every time I consider it I conclude that it would hinder my mission to provide as much value as possible in as little time as possible
@@codetothemoon The channel "PBS Eons" also has a really good intro bit. They start their video, then at around 20 or 30 seconds they give their little imprint. But what I really like about it is that even though it's more than about 3 seconds it fades out quickly, and they already start talking again before the sound is done. Very artistic, yet not intrusive.
It's short, which I like, but the sound is kind of jarring.
Great video! I think what would have been simpler to explain the difference between Rc and Arc without mentioning reordering, is that the increment and decrement of the internal strong and weak counters are represented as AtomicUsize in Arc (i.e. thread-safe) and usize (i.e. non-thread-safe) in Rc.
Thanks and thanks for the feedback! Touching on ordering was probably a little confusing, to your point I probably could have just mentioned the different counter types, and that one is thread safe while the other isn't
11:02 this is what I don't like Rust for. Where did we pass truck_b ownership to the thread? I don't see any obvious code that tells me that truck_b moved to the thread. The variable of type Arc is cloned through a read-only reference, so why does it pass ownership?
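The video's exact code isn't reproduced in this thread, so the following is only a sketch of the usual pattern, with a made-up `Truck` struct and helper name. The ownership transfer comes from the `move` keyword on the closure: it moves the cloned Arc (not the original) into the new thread.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the struct used in the video.
struct Truck {
    capacity: u32,
}

/// Returns the capacity read inside the thread and the strong
/// count seen by the caller after the thread has finished.
fn capacity_seen_by_thread() -> (u32, usize) {
    let truck_b = Arc::new(Truck { capacity: 20 });

    // Clone the pointer through a shared reference; truck_b itself stays here.
    let for_thread = Arc::clone(&truck_b);

    // `move` transfers ownership of `for_thread` into the closure,
    // and the closure is handed to the new thread.
    let handle = thread::spawn(move || for_thread.capacity);
    let seen = handle.join().expect("thread panicked");

    // The thread's clone was dropped when the closure finished.
    (seen, Arc::strong_count(&truck_b))
}

fn main() {
    let (seen, count) = capacity_seen_by_thread();
    println!("capacity = {seen}, strong count = {count}");
}
```

If the closure captures the original `truck_b` with `move` instead of a clone, then `truck_b` itself is moved and can no longer be used by the spawning function afterwards.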
It should be noted that in the Rc example, you could just have written truck_b.clone() instead of Rc::clone(&truck_b)
The rust book teaches like he did, Rc::clone(&an_rc), i think the reason is just to be idiomatic. Nice to know both ways are fine.
.clone() allocates new memory on the heap while Rc::clone makes it point to the same space in memory without duplicating data, which makes a huge difference if you're into memory management.
@@andrescamilo7406 I thought it took the method name from the outermost type
I think it's just to make it explicitly clear that we're cloning a pointer, not the underlying struct. If I see foo.clone() in the wild, I'm instantly suspicious, but Rc::clone() is using the type exactly as intended.
@@andrescamilo7406 .clone() does the same thing as Rc::clone in an Rc context
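This can be checked with `Rc::ptr_eq`, which reports whether two Rc values point at the same allocation. A small sketch, with an illustrative helper name:

```rust
use std::rc::Rc;

/// Clones an Rc both ways and reports whether each clone shares the
/// original allocation, plus the resulting strong count.
fn compare_clone_styles() -> (bool, bool, usize) {
    let original = Rc::new(String::from("truck"));

    // Method syntax on an Rc resolves to Rc's own Clone implementation.
    let via_method = original.clone();
    // Fully qualified syntax calls the very same implementation.
    let via_path = Rc::clone(&original);

    // A deep copy needs an explicit dereference to reach the inner String.
    let deep_copy: String = (*original).clone();
    assert_eq!(deep_copy, "truck");

    (
        Rc::ptr_eq(&original, &via_method),
        Rc::ptr_eq(&original, &via_path),
        Rc::strong_count(&original),
    )
}

fn main() {
    println!("{:?}", compare_clone_styles());
}
```

Both spellings only bump the reference count; the `Rc::clone(&x)` form is a readability convention that makes it obvious at the call site that no data is being duplicated.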
What about the RefCell? It is mentioned in the intro but never explained what it does
I excluded it from this video to keep things concise, and I wasn't convinced it would be useful for the vast majority of folks. But several people have requested I cover it, so I may at some point. In the meantime there is coverage of it in one of the later chapters of the Rust book.
You explained so clear for these complicated concepts~Thx!
Glad it was helpful!
As always, great job dude ! Tks a lot
thanks, glad you got something out of it!
I’m so glad that I found your channel. So easy to understand now
Your content is insanely good.
thank you so much!
Amazing help! Instantly subscribed.. I've been trying to figure out Dependency Injection in Rust and had no idea Rc is what I needed.
Loved your video. There were some handy pointers in there 🥁. But absolutely would love to see a video covering RefCell
Haha! Seems like there is a lot of desire for RefCell, I've placed it high on the video idea list.
I wouldn't say stack memory is faster to access, just that the allocation and deallocation is faster. It might be a bit faster in certain conditions since it will stay in cache most of the time.
Got it! Yeah my understanding was that stack memory is more likely to be stored on the CPU cache - but maybe that's possible for the heap as well... Though I haven't actually benchmarked this, maybe I'll do that...
Ordinary variables could also be assigned by the compiler to CPU registers, which makes them as fast as they get. This doesn't happen to the heap-allocated variables.
@@codetothemoon Access is fastest when the data is "near" the recent access. Which is a part of why data oriented programming is so much faster.
but I bet the methods of memory access have changed so much that what we are taught is not what is implemented in the most recent technology
nice explanations!!! finally i understood pointers
fantastic, glad you got something out of it!
THE best tut on Box, RC and Arc!
thank you, glad you liked it!
As a C# developer my understanding is that Rc basically turns structs into classes
How so? I thought C# uses garbage collection as opposed to reference counting?
@@codetothemoon I didn't mean on the memory allocation part, more so of how reference types work in C#
Absolutely love your videos! Keep up the great work. 😍
Thanks so much for your support Fotis!
Such high quality videos. Thank you :)
thanks for watching!
Thanks a ton for creating this!
Can't wait for new rust videos.
Thanks for watching, more to come!
Great video! I finally understood smart pointers and its appropriate usecases 🎉
Thanks Ramkumar, so happy it helped you!
Literally best place to explain Box I found
nice, really happy that you found it valuable!
It was very helpful to put forward usage scenarios.
best rust tutorial online, period
thank you so much!
These videos are wonderful as someone new to the language. Thank you!
Great, that's precisely what I'm aiming for! Glad you found it valuable!
Whenever i use the GMS and put it in the soft, it holds out the note forever! please help, i am very confused
🔥
Thanks, just what I needed
glad it was helpful!
So in the RC example would the memory exist until the main function gets completed since it adds to the strong count?
that's correct! Rc doesn't really help much if you intend to hang on to one reference until the program ends - you could just use regular borrows in that case - but in this example to show the strong_count function I just kept a reference in main.
I am unsure whether one should practice both safe and bad programming. At least it is safe, I suppose. Specifically, I do not understand one of these clone examples when good programming might ask the instance to remain singleton, all the way through (both literally and figuratively). You show us how to do it, and you behave as if: awesome.
they are singletons - when we call clone on the Rc/Arc smart pointers, it's the pointer that's being cloned, not the underlying data
@@codetothemoon That you can do it is not the point.
Still fairly new to Rust. If a routine has a reference to a cloned structure, can it be changed, or does it more like get a copy?
I'm not sure exactly what you're asking - but if a routine (function) receives a shared reference to something, it does not get a copy (it basically is given a pointer) but it also cannot be modified.
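A minimal sketch of the answer above, assuming a hypothetical `Truck` struct and helper functions `describe` and `upgrade` (none of these names come from the video). A shared reference gives read access without copying; mutation needs an exclusive `&mut` reference:

```rust
// Hypothetical struct for illustration.
struct Truck {
    capacity: u32,
}

// Takes a shared reference: no Truck is copied, and the function can only read.
fn describe(truck: &Truck) -> String {
    // truck.capacity += 1; // would not compile: `truck` is behind a `&` reference
    format!("capacity {}", truck.capacity)
}

// Takes an exclusive reference: still no copy, but mutation is allowed.
fn upgrade(truck: &mut Truck) {
    truck.capacity += 1;
}

fn main() {
    let mut truck = Truck { capacity: 4 };
    println!("{}", describe(&truck));
    upgrade(&mut truck);
    println!("{}", describe(&truck));
}
```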
The stack is not faster than the heap. Both are locations in main memory. True, the stack might be partially in registers, but in general, the stack is no different from the heap. Heap memory involves an allocator, which of course causes more overhead (internally, some atomics need to be swapped and free memory has to be found). But stack and heap are both located in equally fast main memory.
I misspoke on this - thanks for pointing it out! I made a pinned comment about it.
i'm liking the quick vids
glad to hear, thanks for watching!
Thanks for your great content!!
thanks for watching!
Sir, what extension do you use to get the Run UI on the main function?
good question - I'm actually not sure. This was in VSCode a long time ago, and I haven't used VSCode in years at this point.
Just the vid I needed
nice, glad you found it valuable!
Why is "recursive without indirection" an error? (3:00 ish)
Because Rust doesn’t like it when a type directly contains itself - can lead to an infinitely sized type
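A sketch of the fix, using a cons-list shape rather than the video's own type (so `List`, `Cons`, `Nil` and `sum` are assumptions here). Putting the recursive field behind a `Box` gives it a known, pointer-sized footprint:

```rust
// Without Box this definition is rejected with error E0072
// ("recursive type `List` has infinite size"):
// enum List { Cons(i32, List), Nil }

enum List {
    // Box<List> is a pointer, so the size of Cons is known at compile time.
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

// Walks the list recursively and adds up the values.
fn sum(list: &List) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("sum = {}", sum(&list));
}
```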
Wow. Amazing content!!!
thank you!! 😎
I'm interested how Rc knows when data is going out of scope, or being dropped like you did. How is it aware that the memory is no longer accessible after a specific point without knowing where the objects are created in the program? How does the Rc know that there is a reference to truck_b in the main function, for example?
great question, in Rc's implementation of clone there is `self.inner().inc_strong();` which increments the strong reference counter. So it doesn't necessarily know where the references are, it just increments a counter each time one is created. Then in Rc's implementation of the Drop trait (which has a drop method that is invoked when the implementor goes out of scope) we have `self.inner().dec_strong();` then if `self.inner().strong() == 0 { /*code for cleaning up memory here */ }`
@@codetothemoon Ohh I see :)) Thanks very much, that makes sense!
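The counting described above can be observed from the outside with `Rc::strong_count`. This is only a sketch of the visible behaviour, not of Rc's internals; `count_history` is a made-up helper name.

```rust
use std::rc::Rc;

// Records the strong count after each step.
fn count_history() -> Vec<usize> {
    let mut history = Vec::new();

    let truck_b = Rc::new(String::from("truck b"));
    history.push(Rc::strong_count(&truck_b)); // one handle

    let second = Rc::clone(&truck_b);
    history.push(Rc::strong_count(&truck_b)); // clone incremented the counter

    {
        let _third = Rc::clone(&truck_b);
        history.push(Rc::strong_count(&truck_b));
    } // _third goes out of scope here, so its Drop decrements the counter

    history.push(Rc::strong_count(&truck_b));

    drop(second); // explicit drop decrements as well
    history.push(Rc::strong_count(&truck_b));

    history
}

fn main() {
    println!("{:?}", count_history());
}
```

When the last handle is dropped the count reaches zero and the `String` is freed; Rc never needs to know where the other handles live.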
btw mem::drop is in prelude so you can just use drop(...)
ohh nice thanks for the pointer (no pun intended) !
Very helpful thanks!
Glad you found it valuable, thanks for watching!
Very good meta informations! Thank you
Thanks and thanks for watching!
Welcome back
Thanks!
This video is great, thank you for making it.
Thanks for watching!
You got a new subscriber !
Thanks Tsiory, very happy to have you onboard!
Wow that's an excellent video!
thank you, glad you got something out of it!
What about the Cow type? Still struggle with that, even when I have the documentation open
been meaning to make a video about it! stay tuned...
One more thing. I'm assuming that for clarity, you used the explicit Arc::clone instead of the suffixed version. You can use .clone() on an Rc/Arc and it will clone the reference instead of the data.
thanks for pointing this out - I should have mentioned this in the video if I didn't!
2:11 Why would accessing the heap be slower? It's still RAM like the stack, and can be cached by the CPU like any other memory. The only drawback of the heap is that it can suffer from fragmentation during allocation and deallocation. But it's incorrect to say it has slower access time.
Allocation and deallocation themselves are slower for the heap. Moreover, (just reading this from StackOverflow), the heap often needs to be thread-safe, meaning it cannot benefit from some of the same optimisations as the stack can.
@@spaghettiking653 yes fragmentation can make allocation slower, but memory access isn't slower, which is what the video implied. Having an object on the heap is exactly as fast as anywhere else, and fragmentation issues only occur in rare cases. We're talking literal nanoseconds slower to find free space on the heap instead of putting it on the stack. Unless we're talking about a very hot loop on performance critical software, it doesn't matter, and you shouldn't allocate in a hot loop anyway.
@@sbef Yes, fair point. What about the problems with thread safety? I really have no clue whether that's a real concern or whether it is a problem at all, as I literally read it minutes ago-what do you think/know?
Yeah I may have misspoken a bit here - stack memory is faster to allocate / deallocate than heap memory. Would patch this if I could :/ I'll pin a comment.
@@spaghettiking653 not sure how thread-safe the Rust default allocator is to be honest, but I would expect it to be pretty much lock-free even in heavily concurrent applications. It's not my area of expertise, but allocator technology has been refined over the past 3 decades.
Watched a bunch of videos before this and didn't really get it at all. Now I feel like I have a pretty good idea of how to use each
Julian - that's fantastic! It thrills me to make tough concepts more palatable.
production. Thanks again!
Thank you too!
I like the pace of this video.
Thanks Thorkil, glad you liked it!
Good stuff, just came across Box today
Thanks Brandon!
This video is great, thank you
Glad you found it valuable, thanks for watching!
Finally a rust tutorial that clicks !
awesome, glad you got some value out of it!
How is cyclic data handled by Rc? As we can mutate the data we can give the value of the Rc a clone, right? Thus causing the data to never be deallocated
It isn't. You can define circular data with Rc that will never be deallocated. It's the programmer's job to handle this case correctly.
This was actually at the centre of the Leakpocalypse. It was decided that, while accessing deallocated memory is `unsafe`, leaking memory isn't.
You can somewhat get around this with weak references, to get circular data with deallocation, but it gets complicated pretty quickly.
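A minimal sketch of the weak-reference approach mentioned above, assuming a hypothetical parent/child `Node` type with helpers `new_node` and `link`. The parent owns its children through `Rc`, while the child points back with `Weak`, so the back-pointer does not keep the parent alive:

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Hypothetical tree node: strong pointers go down, weak pointers go up.
struct Node {
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node() -> Rc<Node> {
    Rc::new(Node {
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

// Connects the two nodes in both directions.
fn link(parent: &Rc<Node>, child: &Rc<Node>) {
    parent.children.borrow_mut().push(Rc::clone(child)); // strong: parent owns child
    *child.parent.borrow_mut() = Rc::downgrade(parent); // weak: no ownership
}

fn main() {
    let child = new_node();
    {
        let parent = new_node();
        link(&parent, &child);
        println!(
            "parent strong = {}, weak = {}",
            Rc::strong_count(&parent),
            Rc::weak_count(&parent)
        );
    } // the parent's strong count hits zero here despite the back-pointer

    let parent_gone = child.parent.borrow().upgrade().is_none();
    println!("parent gone: {parent_gone}");
}
```

Had the back-pointer been an `Rc` instead, the two nodes would keep each other's strong count above zero and neither would ever be freed.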
great content!
thank you!
great video!
thank you!
are Rc’s safe? How do they prevent immortal reference loops?
What is your vscode theme?
Dark+!
can i know the name of your keyboard or what kind of switches it has? it sounds great
I believe I used a Redragon K552 with blue switches on this one
@@codetothemoon nice thanks
awesome video, thanks.
thanks, glad you liked it!
If you are going to cover RefCell, you should surely also cover its siblings, Cell, UnsafeCell, Mutex and RwLock.
I have another video for all of these (except UnsafeCell) - check out “Rust Interior Mutability”
Hey man, I really like your VSCode theme, can you tell me which one are you using?
Sure it's Dark+!
@@codetothemoon Thanks! Have changed my theme.
The stack and the heap are just as fast, because they are on the same system memory. What takes time is allocation and pointer dereferencing.
yeah, now I see the stickied comment
4:10 Truck structure... struckture
lol nice!
very nice video
thank you!
Great video. What is this vscode theme?
Thanks and thanks for watching! VSCode theme is Dark+
Would be great to understand ownership and the stack. "The stack is much faster than the heap" - I assume that if you pass variables by ref, the CPU knows "Hey - I am going to use this storage, so I'll keep it in the cache", but what happens if F1() passes ownership to F2(), which passes it to F3()... F999() - is the data still in the bottom stack frame, and is that storage still in the cache?
AFAIK the size of a stack frame cannot be changed.
So is it safe to always say "Stack is faster than Heap!!!"?
What about some crazy ideas like allocating a huge fixed-size array that acts as a "database" in the bottom-most stack frame and then passing it through - or would I get something like "Stack frame too big"?
I can't believe that using the stack is better than the heap in this case.
Maybe someone has a link that explains it in depth?
Technically a stack frame can't be too big; the error that can occur is that the stack runs out of memory (a stack overflow). A stack overflow can be caused either by one mega stack frame or by a multitude of small ones. Nevertheless, the error is that the stack memory is depleted (stack size varies from platform to platform and OS to OS). The size of any individual frame doesn't matter; it's the total memory that matters, whether 1 large frame or N smaller ones go over the stack size.
He completely unnecessarily confuses ownership, lifetimes and stack vs heap in these examples.
The heap is generally "farther away" in memory than the stack. Computers have caches, often multiple levels of them, which are extremely fast and are filled by prefetching from main memory. So when you use data that is close in space (spatial locality) or close in time (temporal locality), the CPU will already have fetched that memory. So he also confuses what is fast about the stack, because technically, operating on a large "database" as you refer to it is also fast, since it has good temporal and spatial locality. The CPU will recognize that you want to do N things to that large array of data, so if you are operating on each element in a loop, the CPU will read that heap memory and prefetch the data as your loop executes. When this happens, the heap is _exactly_ as fast as the stack, because your large data blob is being operated on sequentially, one element after the other (just like how the stack is laid out, close in space and close in time).
This is the main reason why you want data elements close in memory to each other, because that will make it so that the CPU can "see" what you are trying to do and fetch the memory ahead of time and place some of it in the cache.
There is another benefit of the stack, and that is that the clean up of stack memory involves just subtracting N bytes from the stack pointer. If all your data on the stack is "trivial" no involved destructors are run, compare this with the heap, where some clean up must happen to free the memory - and sometimes this could involve a system call which is much slower than normal functions, but even without system calls, there will be some overhead.
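A small sketch of the "close in space" point, with no timing claims attached: it only checks that a heap-allocated `Vec<u64>` stores its elements back to back, exactly like a stack array does. The helper name `strides` is made up for this example.

```rust
// Returns the distance in bytes between consecutive elements of a slice.
fn strides(data: &[u64]) -> Vec<usize> {
    data.windows(2)
        .map(|pair| {
            let first = &pair[0] as *const u64 as usize;
            let second = &pair[1] as *const u64 as usize;
            second - first
        })
        .collect()
}

fn main() {
    let on_stack: [u64; 4] = [1, 2, 3, 4]; // array declared as a local variable
    let on_heap: Vec<u64> = vec![1, 2, 3, 4]; // buffer allocated on the heap

    // Both print the same strides: each element sits size_of::<u64>() bytes
    // after the previous one, which is what makes sequential access
    // prefetch-friendly in either location.
    println!("stack strides: {:?}", strides(&on_stack));
    println!("heap strides:  {:?}", strides(&on_heap));
}
```

A `Vec<Box<u64>>` or `Vec<String>` would not have this property for the pointed-to data, since each element then lives in its own allocation.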
@@simonfarre4907 thanks a lot for this detailed answer. Ah yeah, I tested it out and the largest amount of data on my system was about 8 MB, which is even less than the cache size of the CPU (Ubuntu 18, Ryzen). There are probably good reasons for that.
Thanks Thomas and Simon for pointing all of this out. I can definitely appreciate that "Stack vs Heap" is more nuanced than my brief portrayal of it in the video would lead you to believe.
Thanks, yeah I dug a little deeper. As far as I understand now: allocating and deallocating is faster on the stack. But for data that lives long it doesn't make a meaningful difference. I have not tried it out, but I can tell the linker to allow larger stacks. Therefore it could be possible to provoke a cache miss even on the stack? Or does the OS panic if the stack exceeds the CPU cache size, because it always wants to have the whole stack at least in L2 or L3? That would be a good reason for the default only allowing tiny stacks. If so, it might be faster in some scenarios to keep the stacks small, so the CPU has enough cache for the heap, instead of storing barely accessed data on the stack.
Why do you write Rc::clone() explicitly, instead of truck.clone()?
great video!!
thanks, glad you enjoyed it!
Hey please create a video about refcell and cell!
Definitely doing this at some point, given the spooky factor it would have been a good one for halloween, but unfortunately it probably won't be ready in time 🎃
This was a super helpful primer on why/when to use these types! Would love to see more content building on it.
I'm trying to form some internal decision tree for how to decide how long a given piece of data should live for. Going to go see if you have any videos on that topic right now... 😁
great, really happy you got something out of the video! I don't have a video specifically on deciding how long a piece of data should live for, but "Rust Demystified" does cover lifetimes.
Could you demonstrate or explain Yeet? Love your explanations
I had to look this up - is this what you're referring to? lol areweyeetyet.rs/
Sorry I misspelled. its Yew - gui for rust
@@noblenetdk Oh actually I already have a video about Yew - check out "Build A Rust Frontend" from earlier this year!
Was watching your Box part and was like... yep, I know those errors 😂😂😂
they are a rite of passage every Rust developer must traverse.... 😎
I understand if you’re coming from C or C++, the conceptual overhead of this stuff could make sense for you because it is largely stuff you actually already have to think about in a slightly different way.
But if you have the option to use a garbage collected language, I have no idea why you’d drag along all of this conceptual baggage with you.
I mean just look at the litany of peripheral specifiers that was created in this tiny example for no other reason than to appease the compiler. It’s a complete distraction from the problem you’re trying to solve.
actually interestingly, I think C/C++ knowledge doesn't help much unless you're writing `unsafe` Rust. then it might. But in `safe` Rust code, while you'll see some of the same symbols - mainly '&' - they may have a completely different meaning.
as for the "why", most folks should probably stick with a garbage collected language. Rust can shine in the following situations, where it may be well suited for solving said problem:
1. Performance is valued above all else
2. The project needs to run on hardware with extremely limited resources
3. The project needs to handle a large volume of traffic while minimizing hosting costs - ie the "great problem to have" where a very small company makes a product that becomes heavily used
Thanks for the reply! I enjoy your videos. I completely agree with your list of use cases.
My mention of C/C++ wasn’t necessarily that it would make learning Rust easier, but that the seemingly crufty stuff that Rust does actually is an interesting solution to problems that do arise in those languages. So the overhead of dealing with it might make sense because it’s solving real problems that you commonly deal with in those languages (and not many others).
For that reason I do think it would be easier for a C/C++ dev to pick up, because they’re at least familiar with the reasoning behind the design choices. But that’s definitely up for debate.
you are awesome!!
thank you, glad you found the video valuable!
🤔 I would understand them more intuitively if they were named more intuitively and consistently. One is a single ownership pointer, uniquely owned. One is a shared ownership pointer, implemented via reference counting. Another is the same as the previous, just with interlocked atomic increment/decrement. Names like "Box" and "Arc" though feel pulled out of a hat. A box has height, width, and depth, but there is nothing volumetric in Rust's "Box" (and loosely co-opting the concept of "boxing" from C# feels weird here).
Rc stands for Reference Counted and Arc stands for Atomically Reference Counted. They are just abbreviations, which is good because they are frequently used; imagine writing ReferenceCounted every time, especially when you have to wrap many things with them.
For box it could be named better maybe, but there is no type that is going to be called a "box". If it is a math library it would call it cuboid, cube, rectangular prism or something else. For types that are frequently used short names are good.
Totally understand your frustration - to add to the other response, I believe "Box" and "Boxing" are terms that have histories that extend well prior to the inception of Rust, but are usually hidden from the developer by developer-facing language abstractions. I think Rust is just one of the first to actually expose the term directly to the developer.
@@codetothemoon Example dated usage: X.Leroy. Unboxed objects and polymorphic typing, 1992. The terms have been used in libraries also, at least since 2007 in Haskell and 2000 in Steel Bank Common Lisp. I suspect it could be traced back several decades more.
Hmm... Interesting. Maybe there is no extra cost for accessing a variable that is stored on the heap, but rather there is a cost for allocation.
Yeah, I definitely appreciate that stack vs heap is much more nuanced than I made it out to be in this video...
U made a wrong statement about heap vs stack and performance.
Basically, allocating on the stack is faster, but u can't free it urself and there is less room there by design.
So ideally u stack allocate stuff that is small and that ur using for a hot minute in a function.
But something like a linked list should be on the heap. U absolutely can stack allocate it in a language like C, but that should not be ur go-to.
Side note: stack vs heap is an abstraction over raw memory handling (assembly has a stack but it's an actual stack data structure, while the C stack is random access); they're just different allocators.
Me (a frontend javascript webdev): fascinating!
nice, it seems like many JS frontend devs are interested in Rust!
Good video and One RefCell pls.
Thanks, will do one eventually, wishing I had done it for Halloween as I think it has the appropriate level of spookiness 🎃