Wow. Number 2 literally happened to me at work a couple months ago. Fortunately we noticed before anything horrible happened. Glad to see such a respected pillar of the community warning us about the use of unqualified enum variants, when it's often talked about as a cool feature of the language.
I would really like to widen this point into general advice to avoid the wildcard use syntax altogether, except for specific things like tests, because I believe similar issues can come up in other kinds of code. I even remember that the Book talks about it.
This is what I love about the Rust community: the amount of resources that teach us lessons about writing GOOD code. Not just tutorials, but the reasoning behind design decisions. Can't say how much I appreciate this content.
I appreciate the solution you give for tip 2. I had been doing the wildcard thing, and I perhaps foolishly disabled clippy's warning assuming that it was just a style thing. I didn't realize that making a short alias was an option, and I'll definitely start doing that because I prefer the terse and explicit form to the other forms (verbose, or implicit/dangerous).
Oh my god… I've run into that long-enum-name thing so many times, and always avoided using the wildcard because using it for a name that "should" be qualified like an enumeration variant felt wrong… …but never in all my time tinkering with Rust did I ever think to just… alias the base enum name. I feel so dumb right now 😂
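For anyone skimming the thread, here is a minimal sketch of the failure mode from tip 2 and of the alias fix discussed above. The `Signal` enum and both functions are made up for illustration, and the `#[allow(...)]` is only there to silence the warnings that would normally give the bug away.

```rust
#[derive(Clone, Copy)]
enum Signal {
    Stop,
    Caution,
    Go,
}

// Wildcard import plus a typo: `Cautoin` is not a variant, so it is treated as
// a fresh binding that matches every remaining value.
#[allow(non_snake_case, unused_variables, unreachable_patterns)]
fn describe_buggy(s: Signal) -> &'static str {
    use Signal::*;
    match s {
        Stop => "stop",
        Cautoin => "caution", // catch-all binding, not a variant
        Go => "go",           // never reached
    }
}

// Short alias: every arm stays qualified, so the same typo (`S::Cautoin`)
// would be a hard compile error instead of a silent catch-all.
fn describe(s: Signal) -> &'static str {
    use Signal as S;
    match s {
        S::Stop => "stop",
        S::Caution => "caution",
        S::Go => "go",
    }
}
```

With these definitions, `describe_buggy(Signal::Go)` ends up in the "caution" arm, while `describe(Signal::Go)` behaves as intended.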
In general, feeling stupid because "that was the obvious solution why didn't I think to check for it" typically is a sign that it's something that should have been there from the start, but for some reason wasn't. When I first switched to linux I tried very jankily to get print-screen snipping-tool like functionality (push print screen, drag rectangle over area, it's copied to clipboard) and spent hours trying to, only to find out it was literally just a shortcut option within KDE natively that I could have just searched for. It wasn't in some obscure menu, didn't need some weird workarounds, it was just there, one search bar and done. When your first instinct is to just ignore the most obvious solution because it's obviously not going to be implemented, that almost always says far more about the tool you're used to using than it does you.
Technically, there is a reason to return impl Into, though it's niche: the conversion may be relatively expensive (maybe it has to allocate a new String), and the caller may have some reason not to always call `.into()` on it.
This is a good point I didn't consider. True even if they are always gonna call `.into()`, just maybe not right away - which could be the case especially in concurrency-related applications where you're trying to timeshare effectively. Although, one may argue that a conversion expensive enough to have this concern may be better off in its own dedicated method. 🤔
when he said number five, I felt personally accused. don't worry about the fact that I've used child processes to catch segfaults generated by my rust code. it's fine i swear.
Would really like to hear your opinions on mod structure. Mine are usually set up like:
- mods
- use std/core
- use external crates
- use crate/super
- re-exports
- consts
- statics
- trait(s)
- main module struct(s)
- main struct impl
- supporting structs/functions/impls
- tests
With macros living in separate files.
I usually flip external crates and std around but I'm not super sure why. Rest seems about what I do but I have no logical explanation for anything I do in that regard, I just do what feels right. Anyone who doesn't put tests at the bottom needs to get checked though
I think this is pretty close to what I do too! Personally I don't super care about the order of my imports though. My knee-jerk instinct is to order them reverse from you, i.e. crate/super->external->std/core, but I think that's just a habit carried over from C++ where the order of #includes actually matters semantically and ordering them as local->external->std is a best practice.
Regarding #2, the simplest solution is to use your LSP to auto-fill the match arms. If the resulting code is unreadable, consider whether the enum name really needs to be that long.
A different solution is to enforce distinct variables and type constructors (enum values). That's what Haskell did. You could tag sections to do this sort of thing in Rust via #![deny(unused_variables, non_camel_case_types, non_snake_case, non_upper_case_globals)] or similar, I think. unreachable_patterns and non_exhaustive may also help.
For no. 1 I usually put results into a variable. If a function returns a result that is unused, just use `let _ = a();`, otherwise you can explicitly see the return of the variable at the end.
That's a bad idea. You're telling the compiler that whatever `a()` returns, you don't care about. If `a()` started off returning `()` and is changed to return a `Result`, now the compiler doesn't know to tell you "hey you should probably handle that".
if it returns an unused result that's okay, but maybe not preferable. if you're using that to call functions which themselves return unit, don't. in either case, if the function ever changes to return `Result`, the compiler can't warn you!
@@razielhamalakh9813 I think the video is suggesting that either is fine as long as it's an intentional choice. Sometimes you want to call something just for its side effects. `let _ = a();` is the same as `a();` from the video, just even more explicit about your intentions. Take logging as an example. Maybe your function uses IO and can fail, so it's reasonable to return a result. Do you really need to handle the failure of your logger in every function you're attempting to diagnose? Is logging critical to your application? Then handle it. If fault tolerance is the critical spec, maybe `let _ = log(result);` is the right answer. (Or Erlang.)
@@itishappy In that situation I'd rather see `log(..).ok();` tbh, though `let _ = log(..);` does the job if you're into that. But that wasn't a situation I or the video was talking about! Discarding Results explicitly if that is your intention is _fine._ Discarding other return values in a way that will make you miss behavioural changes, such as a function call becoming fallible, is a dangerous habit to form.
@@razielhamalakh9813 Apologies for my poor explanation. I intended to describe a situation where `log(something)` currently returns `()` but it's reasonable to expect it to change in the future. It seems to me an intentional choice to discard the return value may make sense there (as might being extra explicit about it). I do agree it's a dangerous habit to form.
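To make the options in this sub-thread concrete, here is a small sketch. The `log` function, its signature, and its failure condition are assumptions invented for the example, not anything from the video.

```rust
// Hypothetical logger that can fail; name and signature are assumptions.
fn log(message: &str) -> Result<usize, String> {
    if message.is_empty() {
        Err(String::from("empty message"))
    } else {
        Ok(message.len())
    }
}

fn run() {
    // A bare `log("first");` here would trigger the `unused_must_use` warning,
    // because `Result` is marked `#[must_use]`.

    // Explicit discard: accepted for any return type, so it stays silent even
    // if the signature of `log` changes later.
    let _ = log("second");

    // `.ok()` is a `Result` method, so this line would most likely stop
    // compiling if `log` ever returned something other than a `Result`.
    log("").ok();
}
```

The trade-off being argued above is visible in the comments: `let _ =` is the most permissive form, while `.ok()` keeps the discard tied to the fact that the value is a `Result`.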
Thank you so much for your videos about intermediate Rust! They really help me learn Rust. There's way too little such content out there. Looking forward to more!
Video suggestion: What *is* a slice `[T]`?
My current understanding:
- note: people often confusingly conflate the term "slice" `[T]` with "reference to slice" `&[T]`; here we keep them separate!
- slice is a composite collection type
- slice is a DST
- slice is part of an existing collection value, created by coercing the collection
- no separate value in memory
- reference to slice is a pointer to that existing collection value in memory, plus a length, etc.
- the existing collection value can be anywhere in memory, including on the stack, e.g. an array
My questions:
- you can't "create" a slice by itself on an imaginary memory island... so does a slice even "exist"?
- if you say a slice does exist, then you'd also have to say DSTs can be anywhere in memory, including on the stack, and that "only SST types can be on the stack" is wrong...
- if you say a slice doesn't exist, then a reference to a slice can't exist, just like a "car door" can't exist without the concept of a "car"...
- is that what we call an _abstract_ data type?
I really, really like these deep thoughts about this thing most (including me) usually take for granted; definitely feels like something I’d be proud to discuss on my channel (although I’m not sure I could do a better job than you did in this comment :) ). One thing that comes to mind is that [T] actually can exist in its own right, on the heap: as the pointee of something like a Box, or as the last field in a struct (which is indeed how Arc is implemented). [T] itself (not &[T]*) also implements Drop and it destroys all the contained elements, which is how e.g. Vec implements dropping of its contained T’s (it calls drop_in_place on a slice constructed from its contents). Your point still stands that you never _create_ a [T], only convert things to it. It’s a curious type indeed. And then str adds another layer of strangeness, since it’s just a UTF-8 “view” of a [u8], which is already (usually) a “view” of some other data…. *Edit: &[T] also implements Drop of course, I just mean I’m specifically talking about [T] here.
Thanks for your insight! I'm sure you'd do a much better job explaining it with your experience and deep understanding! Thanks for mentioning `Box`. It's a great example I hadn't thought of. It seems like a slice can have a separate value in memory instead of being part of an existing value. Though to create it, there still needs to be an existing collection value in memory like a `Vec`, that is copied or "taken over". This doesn't feel entirely satisfying answer to the existence question... It's like renaming an existing thing and giving it a new name. *Is* this something by itself, or *is* it the old thing called a different name? (I guess this is what happens if one is secretly born a philosopher but studies programming...) A string slice `str` also is a good example of a slice, though probably a more difficult rather than simpler one, because it adds the whole Unicode complexity. It seems to be a special case of a slice `[u8]` that under the hood "groups" some of those bytes together, and will panic if operations don't respect that grouping like indexing. It helped me to study slices before string slices, and vectors before owned strings, actually references and smart pointers before that. There's not much good learning material out there, so thank you again for your work to educate!
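A small sketch of the points in this exchange, using only the standard library. The function names are invented for the example; it shows an owned slice living on the heap behind a `Box`, a borrowed `&[T]` view of it, and the fact that a slice reference carries a length next to the pointer.

```rust
use std::mem::size_of;

// Build an owned slice on the heap. A `Vec` is created first and then
// converted, matching the "you never create a [T] directly" observation.
fn boxed_squares(n: u32) -> Box<[u32]> {
    (1..=n).map(|x| x * x).collect::<Vec<u32>>().into_boxed_slice()
}

// Borrowing gives `&[u32]`, which works for boxed slices, vectors and arrays.
fn total(values: &[u32]) -> u32 {
    values.iter().sum()
}

// A reference to a slice is "fat": pointer plus length, so it is twice the
// size of a reference to a single element.
fn slice_ref_is_fat() -> bool {
    size_of::<&[u32]>() == 2 * size_of::<&u32>()
}
```

`total(&boxed_squares(3))` borrows the heap slice, and `total(&[1, 2, 3])` borrows an array that lives on the stack; the function cannot tell the difference.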
Number 2 really caught me by surprise, I was not aware of that failure mode. I will, however, continue using the wildcard import because I don't see a scenario in my projects that this wouldn't be caught in CI. It's not only the "should have a snake case name" error that you get when doing this, you also get "unused variable ..." and as long as you deny warnings in CI, this is fine. It still is a pretty bad footgun though, I agree. Maybe there is something that can be done upstream to detect this case and create a diagnostic that explains the situation better.
@4:13 - Not if you want to avoid allocations. You're assuming the caller needs an "owned" by converting it to a String inside the function yielding another allocation for every return.
&mut Vec/&mut String etc. definitely make sense as parameter types! I was careful to repeatedly specify that it’s _immutable_ references to Vec/String/Box that never make sense.
For no. 4, I'm thinking you might want to return `impl FnOnce`, because you might want to return a closure, which doesn't have a type you can define? Edit: in a reply I realized (also did a bit of googling). The another more likely reason is because .into() takes no args, whereas call_once (in FnOnce) takes args. With Into, the caller only can do one thing (call it, and get the converted value). With FnOnce, they can pass whatever args they want.
I think that example was specific to traits like `Into`. IMO, there are lots of times where the `impl Trait` syntax is ideal. Fn traits, yes, but also Iterators. Your caller doesn't need to know that you are actually returning a `Chain>>`. They only need to know that it is an `Iterator`.
@@tylerbloom4830 I see what you're saying. In the video, I think he was specifically limiting the advice/question to traits with only one method, i.e there is only one thing to do, with Into, all you can do is take the return value and call .into() and get a T out. With something like f once, you can pass in the arguments to the method (i.e. call_once). With Into, there are no args that the caller can pass.
Yes, you would want to return it in that case because it has parameters, which the caller should be able to set. In the `Into` and `IntoIterator` examples there aren't any parameters, so the callee can already call them.
On the other hand, returning a FnOnce that takes no arguments can still have a use -- delayed evaluation of some kind. I think a case can be made for some of the other traits too -- but only if the trait implementation is very expensive for some reason. Otherwise, meh.
@@kitlith with returning an FnOnce with no args, that still lines up with what I was saying (I'm guessing you would have to pass unit to the call_once function to execute). Because the trait FnOnce has one function (call_once), which takes one arg (args, which is a tuple of arguments the function can accept, I think) the user will have to call call_once with something (whatever that something is). For Into/IntoIterator, they can only call .into()/.iter(), which takes no args. I can't think of a way that returning impl because of an expensive trait implementation would be useful, but maybe someone else in the comments might have an example.
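As a sketch of the "delayed evaluation" case mentioned in this thread: a function can return a zero-argument `impl FnOnce` so that the caller decides whether the work happens at all. Both function names here are hypothetical.

```rust
// Hypothetical helper: the closure owns `values` and only does the work
// (summing and formatting) if and when the caller invokes it.
fn deferred_report(values: Vec<u32>) -> impl FnOnce() -> String {
    move || format!("sum = {}", values.iter().sum::<u32>())
}

// The caller may skip the call entirely, in which case the string is never
// built.
fn maybe_report(verbose: bool) -> Option<String> {
    let report = deferred_report(vec![1, 2, 3]);
    if verbose {
        Some(report())
    } else {
        None
    }
}
```

Calling the closure is written `report()`; there is no need to pass a unit value explicitly.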
I never thought about people using impl Into, as I always had a rule to only use impl in return types for unnamed or hard-to-spell-out types and never in any other situation. But as you didn't explicitly say that, I am wondering if I have missed cool use cases for impl in return types?
Also great video, i just wrote like yesterday the match on long enum one and you really made me fear for my life there (and change the code immediately).
Returning impl Trait isn't _just_ for golfing hard-to-spell types (although that's a big benefit of it); it could also be for not committing to a concrete type if it might change in the future. When you return `impl Iterator`, the exact details of the iterator you return can evolve without breaking API compatibility. IMHO this still isn't a good reason to return "consuming traits" like Into though, since the `T` part isn't abstracted there. Into is (only) for getting a `T`, so you might as well just give the caller a `T` directly.
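A minimal illustration of the `impl Iterator` point, with a made-up function name:

```rust
// The caller only learns "this is an iterator of u32". The concrete type
// (currently a `Filter` over a range) can change later without breaking them.
fn evens_below(limit: u32) -> impl Iterator<Item = u32> {
    (0..limit).filter(|n| n % 2 == 0)
}
```

A caller would typically write something like `evens_below(7).collect::<Vec<u32>>()` and never name the underlying adapter type.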
@@_noisecode impl Trait should never be a part of a public API. This doesn't compile but it would if I had a concrete type: `mod other { pub fn f() -> impl Ord {} } fn g() -> impl Ord { other::f() } fn h(x: bool) -> impl Ord { match x { false => other::f(), true => g(), } }` Make a wrapper. At least until Rust gets some sort of decltype thing.
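The "make a wrapper" advice could look roughly like this. `Rank` is a made-up newtype standing in for whatever `f` really returns; once the type can be named, both match arms have the same type and `h` compiles.

```rust
mod other {
    // Hypothetical nameable wrapper used instead of an opaque `impl Ord`.
    #[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
    pub struct Rank(u32);

    pub fn f() -> Rank {
        Rank(1)
    }
}

fn g() -> other::Rank {
    other::f()
}

// Both arms now produce `other::Rank`, so the match type-checks.
fn h(x: bool) -> other::Rank {
    match x {
        false => other::f(),
        true => g(),
    }
}
```

The field of `Rank` stays private, so callers still cannot depend on what is inside it; they only gain the ability to name the type.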
I don't do Rust but I like these great quality videos, however I'm a little critical on the part about returning direct references and applying that to other languages like C++. In C++ that probably isn't the best idea because you can end up with UAF bugs because a lot of times (especially returning views from functions) the view will live longer than the actual object that allocated the memory. Is that not a problem in Rust because of the borrow checker?
Yes. Borrow checker makes it impossible to return a reference to something on the stack frame of returning function. However I think Logan's argument here is a little bit different. This argument is essentially saying that more general APIs are better. &str is more general than &String (because it doesn't tie down the implementation). So the same argument in C++ still holds. string_view is (in theory) more general than const string&. Consumer of this API can do the same things with them (modulo C-strings ABI guarantees), yet string_view doesn't tie down implementation to use string.
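On the Rust side, the "more general API" argument can be sketched as follows. The function names are invented for the example; the example uses a parameter, but the same reasoning applies to return types.

```rust
// Accepting `&str` works for every caller below. With a `&String` parameter,
// only the first call would work without building a new `String`.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn demo() -> (usize, usize, usize) {
    let owned = String::from("borrow checker says no");
    let boxed: Box<str> = "two words".into();
    (word_count(&owned), word_count("literal"), word_count(&boxed))
}
```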
These Rust videos of yours are real gems! Thanks a lot! Points 3..=5 in this one weren't really new to me, but I have never thought about 1 and 2, although they seem so obvious now :D About the enum thing: I just tried that to see if it would at least trigger some warnings, and indeed, since the variable (which was a variant before) is usually unused in the match arm, you'd get an unused variable warning. That's still too dangerous to risk, but at least a bit calming since I'd never ship anything with unresolved warnings.
I don't think I've seen "-> impl Into". Saw "-> impl Display" though which is acceptable since passing them to print-like macros can avoid extra allocation.
I'm glad to see you are still making these videos with excellent quality. Keep it up, you'll only keep growing and you'll be at 100k before you know it.
Because if you have `void foo(T*)`, you can do foo(myUniquePtr.get()), foo(mySharedPtr.get()), foo(myRawPtr), foo(&myStackObject), etc. But if you take a const unique_ptr&, it means the caller's object MUST be stored in a unique_ptr--but that unique_ptr is const, so you can't even do anything interesting like take ownership of it, reset it, or repoint it. So you've just made your API less flexible for no reason. Taking unique_ptr by value (or non-const reference) is a reasonable function signature though. That usually implies that you're taking ownership of the parameter, so you actually do need the "unique_ptr-ness" there, and since it's non-const, you can move out of it.
@@_noisecode Ah I see, that makes sense. I assume it's also cleaner to return const T* instead of std::unique_ptr when writing a getter. However, is there a reason we don't return const T& instead of const T*? I assume you intended to generalize to include a case where the return value could be nullptr.
Yeah exactly, I just wanted to match the semantics of unique_ptr (which can be null), so I used T* in the video. If I know my pointer is non-null I much prefer to pass around T& (and const T& is even better of course). :)
1. Good point!
2. IMO the warning is good enough, so I wouldn't word it that strongly.
3. Technically you might need &String if you want to access the capacity for some reason, but that probably never happens in practice.
4. IIRC I once had to return impl IntoIterator due to borrowing issues, so that might be a legit reason for it.
5. Agreed, but keep in mind that many times unsafe causes the whole module to become unsafe.
@@FaZekiller-qe3uf stockholm syndrome, i see. i personally prefer the letter "K" over "C", but that's my personal opinion. now im just waiting for KLang the rust successor.
@@aemogie now that I'm thinking more about single letter ones, I want One with ℵ (aleph) as the name Hmmm The letter looked cooler in another font. Oh well.
As far as I know, yes, there is work being done to do this automatically as a compiler optimization. I don't have a link handy. There's also a proc macro crate that does it for you: docs.rs/momo/latest/momo/
AFAIK, this falls under the polymorphization wg's authority, which is still ironing out the last bugs in their analysis of functions that don't depend on their generics at all. the next planned step is merging functions that depend only on compile-time known properties of the generic type (layout, repr, const trait items?). apparently, you can already try the fruits of their labour out with `-Zpolymorphization=on` and have (hopefully) only rarely miscomputed programs. what you're describing is far trickier analysis that I haven't read anything concrete about anywhere, but I haven't looked for that info either.
Disagree on 1:55 Why? Because this kind of thing can happen just by accidentally forgetting the enum name in front of the variants, I had multiple bugs already because of this. The "use" line isn't the problem here imho, it's that the compiler only emits a warning. I always add the setting to turn details like this into hard errors. I think this is one of the mistakes in Rust syntax, it makes pattern matching needlessly verbose and enables bugs. It's a situation where you're basically forced to add type annotations and leaving them out leads to a bug instead of a type error...
To make sure I'm understanding--are you saying that the fundamental issue here is that unreachable pattern isn't a hard error? If it were, we could fearlessly use wildcards? I agree that unreachable patterns should be made hard errors, and it's probably a mistake that they're not by default. That error alone isn't enough to prevent all bugs here though--you can also just typo a variant name and then, if you're using the variant names unqualified, you've accidentally written a catchall pattern. Though if you then make CaseStyle violations hard errors too, you probably are fully covered.
@@_noisecode Pretty much, yes. And while I agree that wildcards aren't a good choice, I think if the matching syntax was a bit different nobody would have a reason to use them for enum variants in the first place. But then again the Rust devs aren't stupid and probably had reasons for this design.
While I generally agree with you, const T& isn’t a drop-in replacement for const unique_ptr&, which is why I used T* in the video. Like T*, unique_ptr can be null, and it doesn’t ensure that the T is const, both of which are different than the semantics of const T&.
Oh wow, the unsafe_op_in_unsafe_fn lint and its implications totally flew under my radar. Apparently there was talk about making that the default in a future edition, which seems like a good idea, but apparently it didn't go anywhere. I wonder why.
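For readers who have not met the lint, a minimal sketch of what it changes. The function names are hypothetical, and the lint is enabled here with an attribute on a single function rather than crate-wide.

```rust
// With the lint denied, the body of an `unsafe fn` is no longer treated as
// one big implicit unsafe block; each unsafe operation needs `unsafe { }`.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn first_unchecked(values: &[u32]) -> u32 {
    // SAFETY: the caller promises that `values` is non-empty.
    unsafe { *values.get_unchecked(0) }
}

fn first_or_zero(values: &[u32]) -> u32 {
    if values.is_empty() {
        0
    } else {
        // SAFETY: checked above that there is at least one element.
        unsafe { first_unchecked(values) }
    }
}
```

The `unsafe` on the function signature then describes the contract for callers, while the inner block marks the one operation that actually relies on it.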
Here's my favourite complaint about unsafe: Global assembly blocks not only don't need unsafe, but can't be marked with unsafe, and the only motivation I've seen is that they didn't want to add another context where the keyword made sense. The effect is that you have to know about other keywords than unsafe to find things that can totally alter the behaviour of your programs in Rust source.
@@0LoneTech Isn't that just a misunderstanding? Declaring global assembly blocks doesn't require unsafe, calling into them does. Just like declaring a raw pointer doesn't require unsafe, but dereferencing it does. Either is perfectly safe - as long as you don't use it. And then you have to mark unsafe.
@@swapode Global assembly blocks can change behaviours without your Rust code specifically calling into them. As a proof of concept, I replaced stat64 within a hello world program using only text inside global_asm. I don't know why hello world contained a stat64 in the first place, but there we are.
Something that looks kind of like #4 is in Axum and is handled reasonably there. It's not quite the same because you generally won't be specifying an IntoResponse in your handler return types, unless what you're writing is some kind of wrapper around something else. But the interfaces you're coding to do so liberally under the covers.
So what's the reason for not putting a semicolon at 4:45? Is it because in this case, when we use the inner function trick, the intent is "we return unit here because our actual inner function returns unit"? As in, because the inner function trick is meant to be the thinnest possible wrapper around our monomorphic code?
I used Rust a long time ago, and now as Nim user this is funny to me. Nim forces you to either use or discard values (and I found that very useful to avoid a few mistakes) so 1 is irrelevant. Enums are always inferred from the type so there's no need to prefix them except when there's ambiguity, and there's no name for the default value (matching is done with case... of... and the default is else) so 2 is irrelevant. In Nim it's common to have "ref" part of the type, so 3 comes naturally as there's for example FooRef but not someref. You could always return "ref some[T]" but nobody does that. 4 and 5 are much more language dependent. Nim generics and concepts work differently than in Rust. In Nim you're allowed to use unsafe operations anywhere but they're so few and so rarely needed that in practice you can very easily find the parts that would have been marked as "unsafe" in Rust. Just search for /addr|ptr|pointer/. If you're not interfacing with C libraries and not doing some basic data type library, you shouldn't need to use unchecked memory.
4:34 What do you mean by "monomorphization overhead"? The "inner function trick"? Also, about use-ing the enum name with an alias: isn't there a syntax, e.g. like "with" in Python, that does this temporarily, over a single block of code? There should be, in my opinion. Then you could do (e.g.) with use MyLongEnumName as E { // stuff } // No longer bound to E after that This should also be a zero-cost abstraction, since it should happen only internally while the compiler is considering each block.
Each time you call a generic function with a new argument type, the compiler has to stamp out a separate copy (monomorphization) for that argument type. This can lead to large amounts of code getting generated if your generic functions are long and/or complex. You can often do better; in the example from the video, the only part of the function that needs to be generic is the call to .into(), since the rest of the function afterward operates on the String. So--you can pull that String stuff into a separate, non-generic function, that you call after you've gotten the generic part over with. That means only the call to .into() needs to be stamped out for each parameter type, giving you minimal code duplication. The rest is a single, non-generic function (albeit defined inside the generic outer function for convenience). As for the naming thing, Rust doesn't have special syntax for it, but you can just create a block and declare the alias inside it. { use MyLongEnumName as E; // stuff } // E is gone
@@_noisecode I see! I didn't know it was called that, nor that you could put a function inside another function, nor that the use statement would take a block scope :) Thanks for your detailed answer!
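Both ideas from the reply above, written out as a sketch. `shout`, `volume`, and the variants of `MyLongEnumName` are invented for the example.

```rust
// Generic shim: only this tiny outer function is duplicated per argument type.
fn shout(text: impl Into<String>) -> String {
    // Non-generic inner function: compiled once, holds the real logic.
    fn inner(mut s: String) -> String {
        s.make_ascii_uppercase();
        s.push('!');
        s
    }
    inner(text.into())
}

enum MyLongEnumName {
    Quiet,
    Loud,
}

fn volume(level: MyLongEnumName) -> u8 {
    // The alias is only visible inside this block.
    {
        use MyLongEnumName as E;
        match level {
            E::Quiet => 1,
            E::Loud => 11,
        }
    }
    // `E` is no longer in scope here.
}
```

`shout("hi")` and `shout(String::from("hi"))` each instantiate the thin outer function, but share the single compiled `inner`.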
I was thinking there might be a tiny reason to use &Vec sometimes because of the spare_capacity functions, but there are only the mut versions. I guess that's reasonable; what would you even want to do with an &MaybeUninit?
Thanks for this EXCELLENT video! Very great points to think about! Point 5 is a difficult one to form hard opinions on, but also really interesting. I tend to disagree with your opinion for the following reason: I expect the reason for NOT duplicating the unsafe keyword is that the unsafe function as a WHOLE should be considered unsafe and therefore ENTIRELY be kept as small as possible. However, my eventual opinion will also depend on whether someone can convince me that you have a case where you really need an unsafe function to be larger than the actual unsafe piece. I am curious. Maybe the conclusion will really be that we simply need some new language elements for this. Maybe there actually should be a *safe* keyword, or something like that. Really interesting.
I used to do #2, never really had issues with it. But one day I decided that it's more readable when fully qualified, and saving 30 seconds of typing wasn't enough justification to keep doing it.
2:50 You could do this, which eliminates that problem while reducing the annoyance of typing a long enum name: enum Enum { A, B, } fn main() { let val = Enum::A; type E = Enum; match val { E::A => {}, E::B => {}, } }
On #2, this can be an issue when matching on an integer against imported const values (typically in ffi-related code). I like to #![forbid(unreachable_patterns)] in all my projects, personally, and honestly believe that should be a hard error by default. On #4, one exception I find acceptable in limited circumstances is returning 'impl Drop'. If the scope in which the drop is relevant exceeds a module, then I'll probably give it a named type, but in a relatively local scope I think '-> impl Drop' can be okay for wrapping something that just needs to have some drop glue attached and nothing else.
Note that "impl Drop" and "has some drop glue attached" are entirely different things. (Well, the former implies the latter, but still..) Also, realistically, a `T: Drop` bound should never ever ever ever be written, because most people *would* misinterpret it, and it's a useless constraint anyways. Same goes thus for "impl Drop". The standard library doesn't come with any "implemented for all types" kind of dummy trait that you could use instead, but perhaps you should just quickly define and use such a trait instead.
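One way the "define a dummy trait" suggestion might look. The trait name `Opaque`, the `Bump` guard, and `bump_on_drop` are all hypothetical; the counter only exists so the effect of the drop can be observed.

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical marker trait implemented for every type, used in place of
// `impl Drop` to say "some value you should just hold on to".
trait Opaque {}
impl<T: ?Sized> Opaque for T {}

// Guard that bumps a shared counter when it is dropped.
struct Bump(Rc<Cell<u32>>);

impl Drop for Bump {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// The caller gets an unnameable guard; its only observable behaviour is
// whatever happens when it goes out of scope.
fn bump_on_drop(counter: Rc<Cell<u32>>) -> impl Opaque {
    Bump(counter)
}
```

Binding the result to a named variable such as `_guard` keeps it alive until the end of the scope, whereas `let _ = bump_on_drop(..)` would drop it immediately.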
Whenever I write code in a language that doesn’t have import/include/use aliases, I really miss it. Not just for things shown in the video but for other reasons like avoiding naming collisions etc.
im curious about the enum pattern matching issue, why didn’t rust choose to differentiate all bindings from types by making all types start with uppercase and viceversa? in haskell this issue is caught by the compiler, if you put an uppercase binding in a case, it would fail to compile complaining about an inexistent constructor…
I actually setup my Clippy so that it requires the explicit `return` statement. Yes, it's more verbose, but I find it easier to read. Single semicolon is in my opinion too easy to miss for something as important as returning.
Number 6: Always specify the type of unused variables. Example: do you see the bug below?
```
async fn call_api() -> Result { /* ... */ }
async fn my_code() {
    // we don't care about the error, so discard it
    let _ = call_api();
}
```
You fix the bug by doing this:
```
let _: Result = call_api(); // compilation error
let _: Result = call_api().await; // now it's fixed
```
The resulting code looks ugly, but it saves you from a painful, hard to find bug.
I think #2 should just be generalized to: "Never use star imports." All that star imports do is introduce name shadowing down the line. Sometimes i even want to completely disable non qualified imports altogether.
@@thegreatb3 Is anyone still using those? I remember them being all over the place a few years ago, but it's been a long time since I last saw a library exporting a prelude module.
For number 2, is there any specific reason to use `use VerboseName as V;' instead of `type V = VerboseName;', or does Rust interpret these the same way?
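As far as matching goes, both spellings appear to work on current stable Rust; here is a sketch with a made-up two-variant enum. One difference worth knowing: `use ... as` renames any item (modules, functions, traits), while `type` only introduces an alias for a type and may carry generic parameters.

```rust
enum VerboseName {
    A,
    B,
}

// Alias through a renaming import.
fn with_use(v: &VerboseName) -> u8 {
    use VerboseName as V;
    match v {
        V::A => 1,
        V::B => 2,
    }
}

// Alias through a type alias; variants are reachable through it as well.
fn with_type(v: &VerboseName) -> u8 {
    type V = VerboseName;
    match v {
        V::A => 1,
        V::B => 2,
    }
}
```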
the idiomatic ways to write discard in rust are: let _ = a(); // or _something if you want it to drop at the end of the scope! drop(a()); // explicit drop, same as _ (not _something), drop is the fn that converts anything -> (), so its convenient in match expressions for example it is NOT just a semicolon, and if the behavior you are trying to differentiate is that important, please use drop() or even return. in most cases its not as important, so then people can use semicolon or none
Please don't call `drop()` explicitly as a habit. "I will convert anything this does into a unit" is a dangerous antipattern with a potential to miss unhandled errors (or any other `must_use` types) which will leave you with weird behavioural bugs. Use a semicolon to discard a return value you don't care about (so that rustc has a chance to warn you when the signature of `a()` changes), or handle the `must_use` value properly, even if it is just calling `.ok()` on a `Result` that legitimately does not matter.
Implicit return as shown in number 1 was a design mistake in Rust. Why load all this important semantic baggage onto something as ubiquitous and automatic as a semicolon? Would it really hurt that much to just explicitly invoke a return keyword? It strikes me as extra strange when Rust otherwise prides itself on how explicit it is in everything. In my own Rust code, I always return explicitly.
I was thinking a lot about this exact case from axum while working on this video. I think it's fine to use impl IntoResponse in that case, since it's part of the core design of axum; axum is the one calling your function, and axum is specifically designed to call all the .into_response() stuff for you internally. In other words, the problem with 'infecting' call sites doesn't really apply, since axum already ships the full set of all those call sites as part of what it provides to you as a library, and it pre-pays the cost of calling .into_response() at all those call sites in order to make the library easier to use on the user's end. It would be annoying, on the flip side, if axum provided a bunch of public APIs that returned `impl IntoResponse`, since that would mean you would have to call `.into_response()` all the time in your code. Not sure how well I articulated that, but I appreciate you bringing this up since it forced me to write out some thoughts that were nebulous before. TL;DR use IntoResponse with axum because axum is specifically designed with this pattern in mind.
@@_noisecode "if axum provided a bunch of public APIs that returned `impl IntoResponse`" What I saw is at least one library returning "impl Future" which made me realize library implementors can't use async blocks if they want a reasonable API. Also, warp, for example, uses opaque types so they might as well be impl Filter (it also uses private traits to define the bounds in public API which I'm surprised is allowed).
I just saw someone do { stuff; } instead of { stuff } and that's how they discovered that () implements IntoResponse. If you can spell out the type, try.
I disagree with no. 4: you should return `impl Into` if you have that value anyway, because the `into()` might not be called. Similarly, an `impl FnOnce` might be called later or conditionally and not immediately after it was returned. Not returning `impl Into` will probably only hurt performance because an out-of-memory error is unlikely but not returning `impl FnOnce` actually limits the logic of the caller.
1. 100% agree
2. i agree that this is a problem, but i feel like your recommendation is the wrong way to solve the problem. the fact that this can even happen is a language design flaw, and the correct way to solve it would have been to use a different syntax for wildcard patterns so the problem cannot occur in the first place, but i know this is no longer possible as it would be a massive breaking change. i know it's a very silly band-aid solution, but isn't there some #[deny()] that could be applied locally at match blocks that turns naming convention violations into hard errors? at least for scenarios where giving up on Enum::* would result in massive verbosity this could be an alternative worth considering
3. i agree with the point that this is trying to make, but the "never do X" phrasing used here is overshooting the goal. there are definitely scenarios where it does make sense to use them, especially with generics like &T
4. 99% agree. in scenarios where a function performs a desired side effect but returns something you don't need, this could in theory serve as a performance optimization as it gives you the option to not perform the conversion at all. but i admit this is very niche and usually a result of insufficient separation of concerns
5. i'm not sure about this one. need to think about it for a while
Re: unsafe fn, I do like that lint, but I understand why it's an option and not the default. If you're writing an unsafe function, the expectation is that this function will contain unsafe calls (but not always!). If you're writing a library that, say, interacts with a massive external C library, requiring unsafe { ... } inside unsafe fn just adds a lot of noise inside every function. However, I do still think it's the right thing to do, because I think the goal with Rust is to reach unsafe-zero, so every line of code that is pulled out of unsafe and into safe is a win.
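For readers who have not seen the lint in action, a small sketch of what `unsafe_op_in_unsafe_fn` asks for. The function names are invented for this example, and the lint is enabled locally with an attribute; as far as I'm aware the 2024 edition turns it on as a warning by default, but treat that as something to verify for your toolchain.

```rust
/// Reads the value behind `ptr` and doubles it.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to a live `i32`.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn read_and_double(ptr: *const i32) -> i32 {
    // SAFETY: the caller promised `ptr` is valid for reads.
    let value = unsafe { *ptr };
    // Plain arithmetic stays outside the unsafe block.
    value.wrapping_mul(2)
}

// Safe wrapper (hypothetical) that upholds the contract itself.
fn double(x: i32) -> i32 {
    let ptr: *const i32 = &x;
    // SAFETY: `ptr` comes from a reference to a local that is still alive.
    unsafe { read_and_double(ptr) }
}
```

With the lint on, the `unsafe` in the signature only describes the contract for callers, while the inner block marks the one line that actually relies on it.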
I get semicolons, but how on earth do our LSPs not autocorrect and put them in on save? Drives me bananas. Only thing I like more in Go, just saying.
When would you ever want to use a wildcard? I respect that there could be a reason, such as if you were doing something funky with modules, but any pattern I can come up with would be better and more idiomatically handled without it. I've never seen someone suggest a good usage. For a language so dedicated to being exhaustive, what was the rationale for including this?
I think the best use case is typically with `use some_crate::prelude::*;`, where `prelude` is a module that the crate exposes that is specifically designed to be used with a wildcard. `use super::*` inside `mod test` can be nice too. Otherwise yeah, I think it's typically a bad idea.
@@_noisecode I agree with your statement, but I think there might be a better way. Like maybe some way to mark particular modules as safe to wildcard on.
I have an issue with point 1. Even with the semicolon omitted, you'd still have to specify the return types of a and b separately. If you really want to couple a() and b() together, you could put a() into a trait (say trait A) and have it return its associated type A::Output, then b() could return A::Output and it would always be the same as the output of a().
Why would you create a trait for a one-off thing like that? To silence the compiler? You'll probably _want_ it to yell at you if the return type of `a()` (but not `b()`) changes so that you have a chance to consider the implications rather than just blindly forwarding the result.
@@razielhamalakh9813 why not? Just make the trait private and it won't be a big burden. To clarify, I don't think binding two functions together is a good idea, but in case we really need to, the semicolon/no semicolon thing is pretty useless as you'd still have to update the return type
@@BulbaWarrior why not is because you're adding complexity under the assumption that a() and b() should always return the same type and that b() is defined in terms of a(), rather than a() being an implementation detail of b(). When the return type of a() changes, while it's possible that just the return type of b() should change, it's far more likely that it needs a logic change as well, either in the setup before it calls a(), or perhaps it needs to process the a()'s return value now. And if b() is only there to call a() and return its result, b() probably shouldn't exist in the first place.
@@razielhamalakh9813 Once again, I am not saying that it is a good approach. All I'm saying is that the semicolon/no semicolon argument stated in the video makes no sense, and I show how the relationship between return types CAN be expressed in Rust IF you actually really need it.
Call me old fashioned, but I’ve never really liked how Rust returns the last thing from a block if there’s no semicolon. In the example you give, explicit return statements make the behaviour and intent way clearer: return a(); vs a(); return;
@@_noisecode Personally (as a primarily non-rust person) I don't think having a random variable name at the end of the function is elegant at all. First of all instead of always being able to see what is being returned you now have to check "is there a semicolon or no?" (which is more error-prone). And second having a random "j" at the end of the block just looks goofy.
@@KaneYork What? The idea is that you think of the semicolon as an operator just like +, -, &&, etc. So a;b is a left-associative operator with less priority than every other operator that returns b but first executing a. When you end a block with a semicolon or have two semicolons in a row, syntactically you are inserting empty statements which evaluate to ()
You listed a number of bad design choices. The one related to the semicolon `;` is absolutely terrible, why in the hell did they do that? It is so misleading. I disagree with you about `unsafe`. If you declare a function unsafe, I don't see why you should repeat that in the code of the function.
I think that return should be explicit everywhere in Rust, since return is more visible and verbose than a semicolon. It is even better for code review, for example.
1. "Imagine if the return type of a changed to i32" is just bizarre. When would the return type of a function change like that? And if you change the return type of a function, you are changing its meaning so much that you should probably rename it and change every single call-site. 2. "Look what happens if you remove VarC". If you remove a variant from an enum, you should obviously search for all its uses first. Why would you remove a variant from an enum anyway? It's just an absurd suggestion. I don't think I have *ever* removed a variant from an enum. It's also clearly a bug in the compiler that it allows you to have a variable name in a match called 'VarC'. Haskell got it right when it made CapitalisedNames and unCapitalisedNames *syntactically* distinct. A variable *cannot* start with a capital letter in that language. It's not meant to in Rust either, so why allow it to? Bad decision.
This is what you sound like: "Why would you ever edit your code? If you have to edit it, that means you shouldn't have written it that way to begin with!" Specs change. People learn. Features get added and removed. Rustc provides a massive set of useful diagnostics specifically so that you don't have to search for every use of an enum variant or a function call. _The compiler will tell you._ All you have to do is not hinder it.
That works if your function just uses a single return statement. However, you could also do this: `let value = if some_condition { "foo" } else { "bar" };`. If you omit the return, you return the value from the block. If you use return you return from the enclosing function. Forcing a "return" at the end of a function makes it more complicated. Use "return" to indicate an early return from a function, not just as a generic return.
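A short sketch of the distinction drawn in the comment above: a block's final expression is that block's value, while `return` always exits the enclosing function. Both function names are made up for the example.

```rust
// Every `{ ... }` here is a block whose final expression (no semicolon)
// becomes the value of the block.
fn label(some_condition: bool) -> &'static str {
    let value = if some_condition { "foo" } else { "bar" };
    value // tail expression: the function's ordinary result
}

// `return` leaves the whole function, which is why it reads well as an
// "early exit" marker inside loops and nested blocks.
fn first_even(xs: &[i32]) -> Option<i32> {
    for &x in xs {
        if x % 2 == 0 {
            return Some(x);
        }
    }
    None
}
```

Writing `return "foo"` inside the `if` in `label` would not assign to `value`; it would leave `label` altogether, which is the complication the commenter is pointing at.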
Wow. Number 2 literally happened to me at work a couple months ago. Fortunately we noticed before anything horrible happened. Glad to see such a respected pillar of the community warning us about the use of unqualified enum variants, when it's often talked about as a cool feature of the language.
But didn't you get a warning? Were you building in such a way that warnings were not errors?
I would really like to widen this point to a general advice to avoid the wildcard use syntax in general, except for specific things like tests.
Because I believe similar issues can happen in more code forms. I even remember that the Book talks about it.
Would it not give a capitalization warning?
Respected pillar of the community lol. A programming youtuber that posts videos about trivialities.
@@ClearerThanMud yup, you get warned and told to prepend an underscore.
This is what I love about the Rust community, the amount of resources to teach us lessons about writing GOOD code. Not just tutorials, but reasoning behind design decisions. Can't say how much I appreciate this content.
I appreciate the solution you give for tip 2. I had been doing the wildcard thing, and I perhaps foolishly disabled clippy's warning assuming that it was just a style thing. I didn't realize that making a short alias was an option, and I'll definitely start doing that because I prefer the terse and explicit form to the other forms (verbose, or implicit/dangerous).
Meanwhile I didn't know this is possible with the "use as" keywords too. I always did it with "type S = SomeEnumName"
same here @@SaHaRaSquad
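For anyone skimming this thread, a minimal sketch of the two aliasing options mentioned above. The enum and its deliberately long name are hypothetical.

```rust
mod protocol {
    // Hypothetical enum with an inconveniently long name.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum ConnectionLifecycleState {
        Connecting,
        Established,
        Closed,
    }
}

// Option 1: rename at import time.
use protocol::ConnectionLifecycleState as S;
// Option 2: a type alias; variants are reachable through it as well.
type St = protocol::ConnectionLifecycleState;

fn describe(state: S) -> &'static str {
    // Every arm stays qualified, so a removed or misspelled variant is a
    // compile error instead of a silent catch-all binding.
    match state {
        S::Connecting => "connecting",
        S::Established => "established",
        S::Closed => "closed",
    }
}

fn is_open(state: St) -> bool {
    matches!(state, St::Connecting | St::Established)
}
```

Either spelling keeps the match arms short without bringing the bare variant names into scope.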
Oh my god… I've run into that long-enum-name thing so many times, and always avoided using the wildcard because using it for a name that "should" be qualified like an enumeration variant felt wrong…
…but never in all my time tinkering with Rust did I ever think to just… alias the base enum name. I feel so dumb right now 😂
In general, feeling stupid because "that was the obvious solution why didn't I think to check for it" typically is a sign that it's something that should have been there from the start, but for some reason wasn't. When I first switched to linux I tried very jankily to get print-screen snipping-tool like functionality (push print screen, drag rectangle over area, it's copied to clipboard) and spent hours trying to, only to find out it was literally just a shortcut option within KDE natively that I could have just searched for. It wasn't in some obscure menu, didn't need some weird workarounds, it was just there, one search bar and done.
When your first instinct is to just ignore the most obvious solution because it's obviously not going to be implemented, that almost always says far more about the tool you're used to using than it does you.
Holy. I was not aware of number 2. Thank you for saving me hours of debugging in the future ❤
Technically seen, there is a reason to return impl Into, though it's niche. If the conversion is relatively expensive (maybe it has to allocate a new String), and the caller may have some reason to not always call `.into()` on it.
This is a good point I didn't consider. True even if they are always gonna call `.into()`, just maybe not right away - which could be the case especially in concurrency-related applications where you're trying to timeshare effectively.
Although, one may argue that a conversion expensive enough to have this concern may be better off in its own dedicated method. 🤔
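A sketch of the trade-off discussed in this thread, under the assumption that the conversion is the expensive part. `Report`, `render`, `check_lazy`, and `check` are all invented names.

```rust
// Hypothetical report type: only `code` is stored until someone asks for text.
struct Report {
    code: u32,
}

impl Report {
    // A dedicated, clearly named method for the (pretend) expensive step.
    fn render(&self) -> String {
        format!("error code {}", self.code)
    }
}

impl From<Report> for String {
    fn from(report: Report) -> String {
        report.render()
    }
}

// Opaque version: the caller can do exactly one thing, call `.into()`,
// but is free to skip or postpone it.
fn check_lazy(code: u32) -> impl Into<String> {
    Report { code }
}

// Concrete version: the same laziness, but the type and method are nameable.
fn check(code: u32) -> Report {
    Report { code }
}
```

Both versions defer the allocation until the caller asks for it; the concrete one just makes the deferred step visible in the signature, which is the "dedicated method" idea from the last reply.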
when he said number five, I felt personally accused. don't worry about the fact that I've used child processes to catch segfaults generated by my rust code. it's fine i swear.
Your advice at 2:50 is so helpful. Obvious in retrospect but I guess I will have to refactor some code.
yeah that's such a good suggestion lol
I already never do point 3 and didn't even know people were doing point 4, but hot damn point 1 2 and 5 are a mind blown and a half.
Yea 1 was a lot of fun
Would really like to hear your opinions on mod structure. Mine are usually set up like:
- mods
- use std/core
- use external crates
- use crate/super
- re-exports
- consts
- statics
- trait(s)
- main module struct(s)
- main struct impl
- supporting structs/functions/impls
- tests
With macros living in separate files.
I usually flip external crates and std around but I'm not super sure why. Rest seems about what I do but I have no logical explanation for anything I do in that regard, I just do what feels right. Anyone who doesn't put tests at the bottom needs to get checked though
Hey! I do similar, but with functions above traits and structs, generally.
I think this is pretty close to what I do too! Personally I don't super care about the order of my imports though. My knee-jerk instinct is to order them reverse from you, i.e. crate/super->external->std/core, but I think that's just a habit carried over from C++ where the order of #includes actually matters semantically and ordering them as local->external->std is a best practice.
Regarding #2, the simplest solution is to use your LSP to auto-fill the match arms. If the resulting code is unreadable, consider whether the enum name really needs to be that long.
A different solution is to enforce distinct variables and type constructors (enum values). That's what Haskell did.
You could tag sections to do this sort of thing in Rust via #![deny(unused_variables, non_camel_case_types, non_snake_case, non_upper_case_globals)] or similar, I think. unreachable_patterns and non_exhaustive may also help.
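A rough sketch of the "deny locally" idea from the comment above, using three lints that I'm confident exist in rustc (`unreachable_patterns`, `unused_variables`, `non_snake_case`). The enum and function are hypothetical, and whether this set catches every variation of the problem is not something this example proves.

```rust
#[derive(Debug, Clone, Copy)]
enum Shape {
    Circle,
    Square,
}

// Lint levels can be attached to a single item instead of the whole crate.
#[deny(unreachable_patterns, unused_variables, non_snake_case)]
fn sides(shape: Shape) -> u32 {
    use Shape::*;
    match shape {
        Circle => 0,
        // If `Square` were ever removed from `Shape`, this arm would turn
        // into a catch-all binding named `Square`; with the attribute above
        // that becomes a hard error rather than a warning.
        Square => 4,
    }
}
```

The attribute form `#[deny(...)]` scopes the stricter level to this one function, so the rest of the crate keeps its normal warning behaviour.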
My strong opinion is that this is the bestest Rust channel on YouTube
Nice to see you again. People of culture 🫡
It's honestly hard to beat noboilerplate in this category, but somehow this channel reached that level.
Wow, okay. The enum one got me good. I'm gonna update my patch series tomorrow. Hats off.
For no. 1 I usually put results into a variable. If a function returns a result that is unused, just use `let _ = a();`, otherwise you can explicitly see the return of the variable at the end.
That's a bad idea. You're telling the compiler that whatever `a()` returns, you don't care about. If `a()` started off returning `()` and is changed to return a `Result`, now the compiler doesn't know to tell you "hey you should probably handle that".
if it returns an unused result that's okay, but maybe not preferable. if you're using that to call functions which themselves return unit, don't. in either case, if the function ever changes to return `Result`, the compiler can't warn you!
@@razielhamalakh9813 I think the video is suggesting that either is fine as long as it's an intentional choice.
Sometimes you want to call something just for its side effects. `let _ = a();` is the same as `a();` from the video, just even more explicit about your intentions.
Take logging as an example. Maybe your function uses IO and can fail, so it's reasonable to return a result. Do you really need to handle the failure of your logger in every function you're attempting to diagnose?
Is logging critical to your application? Then handle it.
If fault tolerance is the critical spec, maybe `let _ = log(result);` is the right answer. (Or Erlang.)
@@itishappy In that situation I'd rather see `log(..).ok();` tbh, though `let _ = log(..);` does the job if you're into that. But that wasn't a situation I or the video was talking about! Discarding Results explicitly if that is your intention is _fine._ Discarding other return values in a way that will make you miss behavioural changes, such as a function call becoming fallible, is a dangerous habit to form.
@@razielhamalakh9813 Apologies for my poor explanation. I intended to describe a situation where `log(something)` currently returns `()` but it's reasonable to expect it to change in the future. It seems to me an intentional choice to discard the return value may make sense there (as might being extra explicit about it). I do agree it's a dangerous habit to form.
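The logging scenario from this thread, sketched with a hypothetical `log` function that writes to any `std::io::Write` sink. It only illustrates how the two discard styles differ in what they keep checking.

```rust
use std::io::{self, Write};

// Hypothetical logger: writing can fail, so it returns a `Result`.
fn log(sink: &mut dyn Write, msg: &str) -> io::Result<()> {
    writeln!(sink, "{msg}")
}

fn work(sink: &mut dyn Write) -> u32 {
    // `.ok()` only type-checks while `log` returns a `Result`, so a change
    // of return type shows up as a compile error on this line.
    log(sink, "starting").ok();
    // `let _ =` accepts any type at all; it will never complain, whatever
    // `log` starts returning in the future.
    let _ = log(sink, "done");
    42
}
```

Passing a `Vec<u8>` as the sink is enough to try it out, since `Vec<u8>` implements `Write`.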
Thank you so much for your videos about intermediate Rust! They really help me learn Rust. There’s way too little such content out there. Looking forward to more!
Video suggestion: What *is* a slice `[T]`?
My current understanding:
- note: often confusingly conflates term "slice" `[T]` with "reference to slice" `&[T]`, here we keep them separate!
- slice is composite collection type
- slice is a DST
- slice is part of existing collection value, created by coercing collection
- no separate value in memory
- reference to slice is pointer to that existing collection value in memory and length, etc.
- the existing collection value can be anywhere in memory, including on stack, e.g. array
My questions:
- can’t “create” slice by itself on imaginary memory island... so does a slice even "exist"?
- if you say slice does exist, then you'd also have to say DSTs can be anywhere in memory including on the stack, and that "only SST types can be on the stack" is wrong...
- if you say slice doesn't exist, then a reference to a slice can't exist, just like a "car door" can't exist without the concept of a "car"...
- is that what we call an _abstract_ data type?
I really, really like these deep thoughts about this thing most (including me) usually take for granted; definitely feels like something I’d be proud to discuss on my channel (although I’m not sure I could do a better job than you did in this comment :) ). One thing that comes to mind is that [T] actually can exist in its own right, on the heap: as the pointee of something like a Box<[T]>, or as the last field in a struct (which is indeed how Arc<[T]> is implemented). [T] itself (not &[T]*) also implements Drop and it destroys all the contained elements, which is how e.g. Vec implements dropping of its contained T’s (it calls drop_in_place on a slice constructed from its contents). Your point still stands that you never _create_ a [T], only convert things to it. It’s a curious type indeed. And then str adds another layer of strangeness, since it’s just a UTF-8 “view” of a [u8], which is already (usually) a “view” of some other data….
*Edit: &[T] also implements Drop of course, I just mean I’m specifically talking about [T] here.
Thanks for your insight! I'm sure you'd do a much better job explaining it with your experience and deep understanding!
Thanks for mentioning `Box<[T]>`. It's a great example I hadn't thought of. It seems like a slice can have a separate value in memory instead of being part of an existing value. Though to create it, there still needs to be an existing collection value in memory like a `Vec`, that is copied or "taken over". This doesn't feel like an entirely satisfying answer to the existence question... It's like renaming an existing thing and giving it a new name. *Is* this something by itself, or *is* it the old thing called a different name? (I guess this is what happens if one is secretly born a philosopher but studies programming...)
A string slice `str` also is a good example of a slice, though probably a more difficult rather than simpler one, because it adds the whole Unicode complexity. It seems to be a special case of a slice `[u8]` that under the hood "groups" some of those bytes together, and will panic if operations don't respect that grouping like indexing. It helped me to study slices before string slices, and vectors before owned strings, actually references and smart pointers before that.
There's not much good learning material out there, so thank you again for your work to educate!
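To ground the slice discussion above, a small sketch showing the same `&[u32]` view over data in three different places, plus the fact that a slice reference is two machine words (pointer and length). All function names are invented for the example.

```rust
// Sums any contiguous run of `u32`s, wherever it lives.
fn total(xs: &[u32]) -> u32 {
    xs.iter().sum()
}

// A `[u32]` that owns its own heap allocation, reached through a `Box`.
fn boxed_squares(n: u32) -> Box<[u32]> {
    (0..n).map(|i| i * i).collect::<Vec<u32>>().into_boxed_slice()
}

// A slice reference is a "fat" pointer: address plus length.
fn slice_ref_words() -> usize {
    std::mem::size_of::<&[u32]>() / std::mem::size_of::<usize>()
}

fn demo() -> (u32, u32, u32) {
    let on_stack = [1, 2, 3]; // array in the current stack frame
    let in_vec = vec![4, 5, 6]; // heap buffer owned by a Vec
    let boxed = boxed_squares(4); // heap buffer owned by a Box<[u32]>
    (total(&on_stack), total(&in_vec[1..]), total(&boxed))
}
```

`total` never learns which of the three containers it was handed, which is one way to read the "view" idea: the slice type describes the shape of the data, and the reference carries the length that the type itself does not know.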
Awesome video! Your videos actually teach me things.
Love the little details too e.g. the tip at 4:45
I wasn’t sure I was with you on point two until you mentioned aliasing-but now I’m completely on board. Great, elegant solution to the wildcard issue
Number 2 really caught me by surprise, I was not aware of that failure mode. I will, however, continue using the wildcard import because I don't see a scenario in my projects that this wouldn't be caught in CI. It's not only the "should have a snake case name" error that you get when doing this, you also get "unused variable ..." and as long as you deny warnings in CI, this is fine.
It still is a pretty bad footgun though, I agree. Maybe there is something that can be done upstream to detect this case and create a diagnostic that explains the situation better.
@4:13 - Not if you want to avoid allocations. You're assuming the caller needs an "owned" by converting it to a String inside the function yielding another allocation for every return.
This is an exception to 3:
I wrote a specialized version of Vec::retain recently that, obviously, only works for &mut Vec inputs.
&mut Vec/&mut String etc. definitely make sense as parameter types! I was careful to repeatedly specify that it’s _immutable_ references to Vec/String/Box that never make sense.
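A minimal sketch of the distinction in this exchange: immutable parameters take the borrowed view (`&str`, `&[T]`), while a function that needs to grow or shrink the collection legitimately takes `&mut Vec<T>`. The function names are hypothetical.

```rust
// Immutable access: `&str` accepts `&String`, literals, and substrings.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

// Immutable access: `&[i32]` accepts `&Vec<i32>`, arrays, and sub-slices.
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Mutation that changes the length needs the real `Vec`, so `&mut Vec<i32>`
// is the right parameter type here.
fn keep_even(v: &mut Vec<i32>) {
    v.retain(|x| x % 2 == 0);
}

fn demo() -> (String, String, i32, Vec<i32>) {
    let owned = String::from("hi");
    let mut v = vec![1, 2, 3, 4];
    let before = sum(&v);
    keep_even(&mut v);
    (shout(&owned), shout("literal"), before, v)
}
```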
For no. 4, I'm thinking you might want to return `impl FnOnce`, because you might want to return a closure, which doesn't have a type you can define?
Edit: in a reply I realized (also did a bit of googling). Another, more likely reason is that .into() takes no args, whereas call_once (in FnOnce) takes args. With Into, the caller only can do one thing (call it, and get the converted value). With FnOnce, they can pass whatever args they want.
I think that example was specific to traits like `Into`. IMO, there are lots of times where the `impl Trait` syntax is ideal. Fn traits, yes, but also Iterators. Your caller doesn't need to know that you are actually returning a `Chain>>`. They only need to know that it is an `Iterator`.
@@tylerbloom4830 I see what you're saying. In the video, I think he was specifically limiting the advice/question to traits with only one method, i.e. there is only one thing to do: with Into, all you can do is take the return value and call .into() and get a T out. With something like FnOnce, you can pass in the arguments to the method (i.e. call_once). With Into, there are no args that the caller can pass.
Yes, you would want to return it in that case because it has parameters, which the caller should be able to set. In the `Into` and `IntoIterator` examples there aren't any parameters, so the callee can already call them.
On the other hand, returning a FnOnce that takes no arguments can still have a use -- delayed evaluation of some kind.
I think a case can be made for some of the other traits too -- but only if the trait implementation is very expensive for some reason. Otherwise, meh.
@@kitlith with returning an FnOnce with no args, that still lines up with what I was saying (I'm guessing you would have to pass unit to the call_once function to execute). Because the trait FnOnce has one function (call_once), which takes one arg (args, which is a tuple of arguments the function can accept, I think) the user will have to call call_once with something (whatever that something is). For Into/IntoIterator, they can only call .into()/.iter(), which takes no args.
I can't think of a way that returning impl because of an expensive trait implementation would be useful, but maybe someone else in the comments might have an example.
I never thought about people using impl Into, as I always had as a rule to only use impl in return types for unnamed or hard to spell out types and never ever in any other situation.
But as you didn't explicitly say that, I am wondering if I have missed cool use cases for impl in return types?
Also great video, i just wrote like yesterday the match on long enum one and you really made me fear for my life there (and change the code immediately).
Returning impl Trait isn't _just_ for golfing hard-to-spell types (although that's a big benefit of it); it could also be for not committing to a concrete type if it might change in the future. When you return `impl Iterator`, the exact details of the iterator you return can evolve without breaking API compatibility.
IMHO this still isn't a good reason to return "consuming traits" like Into though, since the `T` part isn't abstracted there. Into is (only) for getting a `T`, so you might as well just give the caller a `T` directly.
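A short sketch of the cases contrasted in this thread: `impl Iterator` and `impl FnOnce` hide a type that is awkward or impossible to name while leaving the caller real choices, whereas for a plain conversion it is simpler to return the value itself. All three functions are made-up examples.

```rust
// The concrete type is a `Filter` over a range with an unnameable closure;
// callers only need to know it yields `u32`s.
fn evens_up_to(limit: u32) -> impl Iterator<Item = u32> {
    (0..=limit).filter(|n| n % 2 == 0)
}

// Closures have no nameable type, and the caller decides when to call the
// result and with which argument.
fn make_adder(k: i32) -> impl FnOnce(i32) -> i32 {
    move |x| x + k
}

// For a conversion there is nothing left to decide, so this returns the
// `String` directly rather than `impl Into<String>`.
fn name() -> String {
    "ferris".to_owned()
}
```

The iterator's internals can change later without touching callers, which is the API-evolution benefit mentioned in the reply above.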
@@_noisecode impl Trait should never be a part of a public API. This doesn't compile but it would if I had a concrete type.
mod other { pub fn f() -> impl Ord {} }
fn g() -> impl Ord { other::f() }
fn h(x: bool) -> impl Ord {
    match x {
        false => other::f(),
        true => g(),
    }
}
Make a wrapper. At least until Rust gets some sort of decltype thing.
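One possible reading of "make a wrapper", sketched with a hypothetical newtype `Rank` that replaces the opaque return type. Because both arms of the `match` now share a nameable type, the version below compiles.

```rust
mod other {
    // Hypothetical newtype standing in for "the wrapper".
    #[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
    pub struct Rank(pub u32);

    pub fn f() -> Rank {
        Rank(1)
    }
}

fn g() -> other::Rank {
    other::f()
}

fn h(x: bool) -> other::Rank {
    // Both arms have the same concrete type, so the match type-checks.
    match x {
        false => other::f(),
        true => g(),
    }
}
```

The cost is that `Rank` is now part of the public API and has to stay stable, which is the trade the `impl Ord` version was trying to avoid.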
I don't do Rust but I like these great quality videos, however I'm a little critical on the part about returning direct references and applying that to other languages like C++. In C++ that probably isn't the best idea because you can end up with UAF bugs because a lot of times (especially returning views from functions) the view will live longer than the actual object that allocated the memory. Is that not a problem in Rust because of the borrow checker?
Yes. Borrow checker makes it impossible to return a reference to something on the stack frame of returning function. However I think Logan's argument here is a little bit different. This argument is essentially saying that more general APIs are better. &str is more general than &String (because it doesn't tie down the implementation). So the same argument in C++ still holds. string_view is (in theory) more general than const string&. Consumer of this API can do the same things with them (modulo C-strings ABI guarantees), yet string_view doesn't tie down implementation to use string.
These rust videos of yours are real gems! Thanks a lot! Points 3..=5 in this one weren't really new to me, but I have never thought about 1 and 2, although they seem so obvious now :D
About the enum thing: I just tried that just to see if that would at least trigger some warnings and indeed, since the variable (which was a variant before) is usually unused in the match arm, you'd get an unused variable warning. That's still too dangerous to risk it, but at least a bit calming since I'd never ship anything with unresolved warnings.
I don't think I've seen "-> impl Into". Saw "-> impl Display" though which is acceptable since passing them to print-like macros can avoid extra allocation.
Good points. I can see mistakes/frustrations arising from these consequences.
I'm glad to see you are still making these videos with excellent quality. Keep it up, you'll only keep growing and you'll be at 100k before you know it.
Never was so sure of time being well spent watching a video. Thanks for the awesome content, keep em coming! ❤
At 3:44, why is it better to return a raw pointer to T than a const unique_ptr to T? Great video by the way.
Because if you have `void foo(T*)`, you can do foo(myUniquePtr.get()), foo(mySharedPtr.get()), foo(myRawPtr), foo(&myStackObject), etc. But if you take a const unique_ptr&, it means the caller's object MUST be stored in a unique_ptr--but that unique_ptr is const, so you can't even do anything interesting like take ownership of it, reset it, or repoint it. So you've just made your API less flexible for no reason.
Taking unique_ptr by value (or non-const reference) is a reasonable function signature though. That usually implies that you're taking ownership of the parameter, so you actually do need the "unique_ptr-ness" there, and since it's non-const, you can move out of it.
@@_noisecode Ah I see, that makes sense. I assume it's also cleaner to return const T* instead of std::unique_ptr when writing a getter as well. However, is there a reason we don't return const T& instead of const T*? I assume you intended to generalize to include a case where the return value could be nullptr.
Yeah exactly, I just wanted to match the semantics of unique_ptr (which can be null), so I used T* in the video. If I know my pointer is non-null I much prefer to pass around T& (and const T& is even better of course). :)
1: good point!
2: IMO warning is good enough so I wouldn't word it that strongly
3: technically you might need &String if you want to access capacity for some reason but probably never happens in practice
4. IIRC I once had to return impl IntoIterator due to borrowing issues so that might be a legit reason for it.
5. Agreed but keep in mind that many times unsafe causes the whole module to become unsafe.
My number 1 strong opinion is, rust is a very cool name for a programming language
C is pretty good too. I don't think A or B work as well.
@@FaZekiller-qe3uf stockholm syndrome, i see. i personally prefer the letter "K" over "C", but that's my personal opinion. now im just waiting for KLang the rust successor.
@@aemogie now that I'm thinking more about single letter ones, I want
One with ℵ (aleph) as the name
Hmmm
The letter looked cooler in another font.
Oh well.
@@aemogie k
This was a really nice video. BTW impl Into does have its use cases, for example, check out the axum crate.
4:42 I feel like the compiler should be able to slice a function into those parts, is there any discussion about it?
As far as I know, yes, there is work being done to do this automatically as a compiler optimization. I don't have a link handy.
There's also a proc macro crate that does it for you: docs.rs/momo/latest/momo/
AFAIK, this falls under the polymorphization wg's authority, which is still ironing out the last bugs in their analysis of functions that don't depend on their generics at all. the next planned step is merging functions that depend only on compile-time known properties of the generic type (layout, repr, const trait items?).
apparently, you can already try the fruits of their labour out with `-Zpolymorphization=on` and have (hopefully) only rarely miscomputed programs.
what you're describing is far trickier analysis that I haven't read anything concrete about anywhere, but I haven't looked for that info either.
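For context, this is the hand-written form of the split the thread is talking about: a thin generic shell that converts its argument, and a non-generic inner function that holds the real body. `extension_len` is a made-up example; the pattern itself is the one the momo crate is said to automate.

```rust
use std::path::Path;

// Generic shell: one tiny copy is generated per caller type.
fn extension_len<P: AsRef<Path>>(path: P) -> usize {
    // Non-generic body: compiled once, however many `P`s are used.
    fn inner(path: &Path) -> usize {
        path.extension().map_or(0, |ext| ext.len())
    }
    inner(path.as_ref())
}
```

Callers can pass `&str`, `String`, or `PathBuf` as before; only the one-line conversion is duplicated per type.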
Disagree on 1:55
Why? Because this kind of thing can happen just by accidentally forgetting the enum name in front of the variants, I had multiple bugs already because of this. The "use" line isn't the problem here imho, it's that the compiler only emits a warning. I always add the setting to turn details like this into hard errors.
I think this is one of the mistakes in Rust syntax, it makes pattern matching needlessly verbose and enables bugs. It's a situation where you're basically forced to add type annotations and leaving them out leads to a bug instead of a type error...
To make sure I'm understanding--are you saying that the fundamental issue here is that unreachable pattern isn't a hard error? If it were, we could fearlessly use wildcards?
I agree that unreachable patterns should be made hard errors, and it's probably a mistake that they're not by default. That error alone isn't enough to prevent all bugs here though--you can also just typo a variant name and then, if you're using the variant names unqualified, you've accidentally written a catchall pattern. Though if you then make CaseStyle violations hard errors too, you probably are fully covered.
@@_noisecode Pretty much, yes. And while I agree that wildcards aren't a good choice, I think if the matching syntax was a bit different nobody would have a reason to use them for enum variants in the first place. But then again the Rust devs aren't stupid and probably had reasons for this design.
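The typo case from this exchange, as a runnable sketch. `Mode` and both functions are hypothetical, and the `#[allow(...)]` stands in for a build where the warnings are not being acted on.

```rust
#[derive(Clone, Copy)]
enum Mode {
    Fast,
    Slow,
}

// The three lints silenced here are the only signal rustc gives for the bug.
#[allow(unused_variables, non_snake_case, unreachable_patterns)]
fn cost_unqualified(mode: Mode) -> u32 {
    use Mode::*;
    match mode {
        // Typo for `Fast`: not a variant, so this is a fresh binding that
        // matches every value and makes the next arm unreachable.
        Fsat => 1,
        Slow => 10,
    }
}

fn cost_qualified(mode: Mode) -> u32 {
    match mode {
        // Writing `Mode::Fsat` here would be a hard compile error.
        Mode::Fast => 1,
        Mode::Slow => 10,
    }
}
```

`cost_unqualified(Mode::Slow)` quietly returns 1, while the qualified version cannot be written with the typo at all.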
I had never noticed these situations before; these may be part of "best practice".
I really appreciate these types of videos and definitely agree with all five.
Regarding your examples in other languages, I don't agree with changing `const unique_ptr &` to `T*`, I feel that should be `const T&` instead
While I generally agree with you, const T& isn’t a drop-in replacement for const unique_ptr&, which is why I used T* in the video. Like T*, unique_ptr can be null, and it doesn’t ensure that the T is const, both of which are different than the semantics of const T&.
#5 is the most interesting one to me. I've thought about it a bit in some of my unsafe C bindings. I'll have to try it
Oh wow, the unsafe_op_in_unsafe_fn lint and its implications totally flew under my radar. Apparently there was talk about making that the default in a future edition, which seems like a good idea, but apparently it didn't go anywhere. I wonder why.
Here's my favourite complaint about unsafe: Global assembly blocks not only don't need unsafe, but can't be marked with unsafe, and the only motivation I've seen is that they didn't want to add another context where the keyword made sense.
The effect is that you have to know about other keywords than unsafe to find things that can totally alter the behaviour of your programs in Rust source.
@@0LoneTech Isn't that just a misunderstanding? Declaring global assembly blocks doesn't require unsafe, calling into them does.
Just like declaring a raw pointer doesn't require unsafe, but dereferencing it does.
Either is perfectly safe - as long as you don't use it. And then you have to mark unsafe.
@@swapode Global assembly blocks can change behaviours without your Rust code specifically calling into them. As a proof of concept, I replaced stat64 within a hello world program using only text inside global_asm. I don't know why hello world contained a stat64 in the first place, but there we are.
Something that looks kind of like #4 is in Axum and is handled reasonably there. It's not quite the same because you generally won't be specifying an IntoResponse in your handler return types, unless what you're writing is some kind of wrapper around something else. But the interfaces you're coding to do so liberally under the covers.
So what's the reason for not putting a semicolon at 4:45? Is it because in this case, when we use the inner function trick, the intent is "we return unit here because our actual inner function returns unit"? As in, because the inner function trick is meant to be the thinnest possible wrapper around our monomorphic code?
That’s my reasoning, yup.
1:40 in addition to this there should be an option for enums to disallow use of "_ => {}" case completely.
I used Rust a long time ago, and now as Nim user this is funny to me. Nim forces you to either use or discard values (and I found that very useful to avoid a few mistakes) so 1 is irrelevant. Enums are always inferred from the type so there's no need to prefix them except when there's ambiguity, and there's no name for the default value (matching is done with case... of... and the default is else) so 2 is irrelevant. In Nim it's common to have "ref" part of the type, so 3 comes naturally as there's for example FooRef but not someref. You could always return "ref some[T]" but nobody does that.
4 and 5 are much more language dependent. Nim generics and concepts work differently than in Rust. In Nim you're allowed to use unsafe operations anywhere but they're so few and so rarely needed that in practice you can very easily find the parts that would have been marked as "unsafe" in Rust. Just search for /addr|ptr|pointer/. If you're not interfacing with C libraries and not doing some basic data type library, you shouldn't need to use unchecked memory.
this is a fantastic rust channel. keep up the great work
4:34 What do you mean by "monomorphization overhead"? The "inner function trick"?
Also, about use-ing the enum name with an alias: isn't there a syntax, e.g. like "with" in Python, that does this temporarily, over a single block of code? There should be, in my opinion. Then you could do (e.g.)
with use MyLongEnumName as E {
// stuff
}
// No longer bound to E after that
This should also be a zero-cost abstraction, since it should happen only internally while the compiler is considering each block.
Each time you call a generic function with a new argument type, the compiler has to stamp out a separate copy (monomorphization) for that argument type. This can lead to large amounts of code getting generated if your generic functions are long and/or complex. You can often do better; in the example from the video, the only part of the function that needs to be generic is the call to .into(), since the rest of the function afterward operates on the String. So--you can pull that String stuff into a separate, non-generic function, that you call after you've gotten the generic part over with. That means only the call to .into() needs to be stamped out for each parameter type, giving you minimal code duplication. The rest is a single, non-generic function (albeit defined inside the generic outer function for convenience).
As for the naming thing, Rust doesn't have special syntax for it, but you can just create a block and declare the alias inside it.
{
use MyLongEnumName as E;
// stuff
}
// E is gone
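A complete version of the block-scoped alias could look like the sketch below. The variant names and the `describe` function are placeholders, not taken from the video:

```rust
enum MyLongEnumName {
    First,
    Second,
}

fn describe(value: MyLongEnumName) -> &'static str {
    {
        // The alias exists only inside this block.
        use MyLongEnumName as E;
        match value {
            E::First => "first",
            E::Second => "second",
        }
    }
    // `E` is no longer in scope after the closing brace above.
}
```

Because `use` is an ordinary item declaration, it follows normal block scoping and needs no dedicated syntax.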
@@_noisecode I see! I didn't know it was called that, nor that you could put a function inside another function, nor that the use statement would take a block scope :) Thanks for your detailed answer!
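For anyone else new to the inner function trick, a minimal sketch follows. The `shout` function and its body are hypothetical; the video's example differs, this only shows the shape:

```rust
fn shout(s: impl Into<String>) -> String {
    // Non-generic inner function: compiled once, no matter how many
    // argument types `shout` is called with.
    fn inner(s: String) -> String {
        let mut out = s.to_uppercase();
        out.push('!');
        out
    }

    // Only this thin conversion is stamped out per argument type.
    inner(s.into())
}
```

Calling `shout("hi")` and `shout(String::from("hi"))` produces two copies of the outer wrapper but shares the single `inner`. Note that `inner` is a plain function, not a closure, so it cannot capture `s` and has to receive it as an argument.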
2:50 underhanded rust competition when
Number 2 isn't a problem if your enum variants have fields. Even if they don't, you can just do `VarA {}` and it can never become a wildcard pattern.
until your coworker sees what appears to be useless `{}` and removes them!
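A small sketch of the `VarA {}` idea, using a made-up two-variant enum. The empty braces are accepted on unit variants and force the name to be resolved as a variant or struct, so it can never silently become a binding:

```rust
enum Enum {
    VarA,
    VarB,
}

fn name(e: Enum) -> &'static str {
    use Enum::*;
    match e {
        // Brace patterns must name an existing variant (or struct).
        // If `VarA` were removed from the enum, this arm would fail
        // to compile instead of turning into a catch-all.
        VarA {} => "a",
        VarB {} => "b",
    }
}
```

As the reply above points out, the protection only lasts as long as nobody "cleans up" the braces, so a comment next to them is probably worthwhile.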
I was thinking there might be a tiny reason to use &Vec sometimes because of the spare_capacity functions, but there are only the mut versions. I guess that's reasonable; what would you even want to do with an &MaybeUninit?
Thanks for this EXCELLENT video! Very great points to think about!
Point 5 is a difficult one to form hard opinions on, but also really interesting. I tend to disagree with your opinion for the following reason:
I expect the reason for NOT duplicating the unsafe keyword is that the unsafe function as a WHOLE should be considered unsafe and therefore ENTIRELY be kept as small as possible. However, my eventual opinion will also depend on whether someone can convince me that you have a case where you really need an unsafe function to be larger than the actual unsafe piece. I am curious.
Maybe the conclusion will really be that we simply need some new language elements for this. Maybe there actually should be a *safe* keyword, or something like that. Really interesting.
i really like the semicolon one, something i hadn't thought about
I used to do #2, never really had issues with it. But one day I decided that it's more readable when fully qualified, and saving 30 seconds of typing wasn't enough justification to keep doing it.
I would set up a lint so that leaving out a semicolon is always complained about, except maybe in in-line blocks.
2:50 You could do this which eliminates that problem while reducing the annoyance of typing a long enum name.
enum Enum {
    A,
    B,
}
fn main() {
    let val = Enum::A;
    type e = Enum;
    match val {
        e::A => {},
        e::B => {},
    }
}
Didn't see the part after, I am dumb
Hey, can you do a video on Ownership and Borrowing in Rust? Would love to know your thoughts.
Fantastic video as always.
What do you wanna know about em??
@@_noisecode All the different rules that the Rust compiler enforces and the rationale behind it
He's back at it y'all.
On #2, this can be an issue when matching on an integer against imported const values (typically in ffi-related code). I like to #![forbid(unreachable_patterns)] in all my projects, personally, and honestly believe that should be a hard error by default.
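The integer-constant variant of the problem can be sketched like this. The `ffi` module and its constants are invented for the example:

```rust
mod ffi {
    pub const MODE_READ: u32 = 0;
    #[allow(dead_code)]
    pub const MODE_WRITE: u32 = 1;
}

// The `allow` keeps the sketch quiet; normally rustc warns about the
// unreachable arm and the oddly named binding.
#[allow(unreachable_patterns, non_snake_case, unused_variables)]
fn mode_name(mode: u32) -> &'static str {
    // `MODE_WRITE` was forgotten in the import list.
    use ffi::MODE_READ;
    match mode {
        MODE_READ => "read",
        // Not in scope as a constant, so this is a binding that
        // matches every remaining value.
        MODE_WRITE => "write",
        // Unreachable: the arm above already caught everything.
        _ => "unknown",
    }
}
```

In this sketch `mode_name(7)` returns "write". With `unreachable_patterns` denied or forbidden, the final `_` arm would turn the mistake into a compile error.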
On #4, one exception I find acceptable in limited circumstances is returning 'impl Drop'. If the scope in which the drop is relevant exceeds a module, then I'll probably give it a named type, but in a relatively local scope I think '-> impl Drop' can be okay for wrapping something that just needs to have some drop glue attached and nothing else.
Note that "impl Drop" and "has some drop glue attached" are entirely different things. (Well, the former implies the latter, but still..) Also, realistically, a `T: Drop` bound should never ever ever ever be written, because most people *would* misinterpret it, and it's a useless constraint anyways. Same goes thus for "impl Drop". The standard library doesn't come with any "implemented for all types" kind of dummy trait that you could use instead, but perhaps you should just quickly define and use such a trait instead.
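A rough sketch of the distinction and of the suggested dummy trait. `Holder`, `Opaque`, and `make_guard` are hypothetical names chosen for this example:

```rust
use std::mem::needs_drop;

// No `impl Drop` here, yet dropping a `Holder` must free the `String`,
// so the type still has drop glue. A `Holder: Drop` bound would NOT be
// satisfied.
#[allow(dead_code)]
struct Holder {
    name: String,
}

// Reports whether dropping a `Holder` does any work.
fn holder_has_drop_glue() -> bool {
    needs_drop::<Holder>()
}

// Hypothetical "implemented for all types" marker trait.
trait Opaque {}
impl<T> Opaque for T {}

// Callers get an opaque value they can only hold and drop.
fn make_guard() -> impl Opaque {
    Holder {
        name: String::from("temp"),
    }
}
```

Returning `impl Opaque` hides the concrete type just as `impl Drop` would, without implying anything about a hand-written `Drop` implementation.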
Whenever I write code in a language that doesn’t have import/include/use aliases, I really miss it. Not just for things shown in the video but for other reasons like avoiding naming collisions etc.
im curious about the enum pattern matching issue, why didn’t rust choose to differentiate all bindings from types by making all types start with uppercase and viceversa? in haskell this issue is caught by the compiler, if you put an uppercase binding in a case, it would fail to compile complaining about an inexistent constructor…
Avoiding shortcuts is a must in my opinion. I rather write
let res_a = a();
res_a
Very insightful! Thanks for this
I actually setup my Clippy so that it requires the explicit `return` statement. Yes, it's more verbose, but I find it easier to read. Single semicolon is in my opinion too easy to miss for something as important as returning.
Even inside closures?
@@pcfreak1992 if the closure is anything more than just one simple expression or function call, then yes.
Number 6: Always specify the type of unused variables.
Example: do you see the bug below?
```
async fn call_api() -> Result { /* ... */ }
async fn my_code() {
// we don't care about the error, so discard it
let _ = call_api();
}
```
You fix the bug by doing this:
```
let _: Result = call_api(); // compilation error
let _: Result = call_api().await; // now it's fixed
```
The resulting code looks ugly, but it saves you from a painful, hard to find bug.
Awesome protip and example. I like this one a lot.
@@_noisecode thanks. And btw I loved your video and happy to see content like this! Number 2 blew my mind.
I think #2 should just be generalized to: "Never use star imports." All that star imports do is introduce name shadowing down the line. Sometimes i even want to completely disable non qualified imports altogether.
Star imports should usually be avoided, but prelude modules are made to be star-imported. I think those are the one exception.
`#[deny(clippy::wildcard_imports)]`
@@thegreatb3 is anyone still using those? I remember them being all over the place a few years ago, but it's been a long time since I last saw a library exporting a prelude module.
Can you explain a usage of , in the last case of a match?
For number 2, is there any specific reason to use `use VerboseName as V;' instead of `type V = VerboseName;', or does Rust interpret these the same way?
The no semicolon indicating a returned value is a way too subtle way to indicate something important.
return x; #4eva
Valuable things to point out, thanks a lot!
What is monomorphisation overhead? Is that the increase in compile time caused by the compiler monomorphising code?
the idiomatic ways to write discard in rust are:
let _ = a(); // or _something if you want it to drop at the end of the scope!
drop(a()); // explicit drop, same as _ (not _something), drop is the fn that converts anything -> (), so its convenient in match expressions for example
it is NOT just a semicolon, and if the behavior you are trying to differentiate is that important, please use drop() or even return. in most cases its not as important, so then people can use semicolon or none
Please don't call `drop()` explicitly as a habit. "I will convert anything this does into a unit" is a dangerous antipattern with a potential to miss unhandled errors (or any other `must_use` types) which will leave you with weird behavioural bugs. Use a semicolon to discard a return value you don't care about (so that rustc has a chance to warn you when the signature of `a()` changes), or handle the `must_use` value properly, even if it is just calling `.ok()` on a `Result` that legitimately does not matter.
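The options discussed in this exchange, side by side. `parse_port` is a stand-in for any function that returns a `Result`:

```rust
fn parse_port(text: &str) -> Result<u16, std::num::ParseIntError> {
    text.trim().parse::<u16>()
}

fn discard_examples() {
    // Explicit discard: compiles silently whatever `parse_port` returns.
    let _ = parse_port("80");

    // Also discards silently; `drop` accepts a value of any type.
    drop(parse_port("80"));

    // Deliberately throws the error away by converting to an `Option`.
    parse_port("80").ok();

    // A bare `parse_port("80");` would trigger the `unused_must_use`
    // warning, because `Result` is marked `#[must_use]`.
}
```

The first two forms keep compiling without complaint if the function later starts returning something that must not be ignored, which is the risk the reply describes.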
Implicit return as shown in number 1 was a design mistake in Rust. Why load all this important semantic baggage onto something as ubiquitous and automatic as a semicolon? Would it really hurt that much to just explicitly invoke a return keyword? It strikes me as extra strange when Rust otherwise prides itself on how explicit it is in everything. In my own Rust code, I always return explicitly.
For 4: in axum there is returning impl IntoResponse which can help with flexibility working with HTTP responses. Is this considered a bad pattern?
I was thinking a lot about this exact case from axum while working on this video. I think it's fine to use impl IntoResponse in that case, since it's part of the core design of axum; axum is the one calling your function, and axum is specifically designed to call all the .into_response() stuff for you internally. In other words, the problem with 'infecting' call sites doesn't really apply, since axum already ships the full set of all those call sites as part of what it provides to you as a library, and it pre-pays the cost of calling .into_response() at all those call sites in order to make the library easier to use on the user's end. It would be annoying, on the flip side, if axum provided a bunch of public APIs that returned `impl IntoResponse`, since that would mean you would have to call `.into_response()` all the time in your code.
Not sure how well I articulated that, but I appreciate you bringing this up since it forced me to write out some thoughts that were nebulous before. TL;DR use IntoResponse with axum because axum is specifically designed with this pattern in mind.
@@_noisecode "if axum provided a bunch of public APIs that returned `impl IntoResponse`"
What I saw is at least one library returning "impl Future" which made me realize library implementors can't use async blocks if they want a reasonable API.
Also, warp, for example, uses opaque types so they might as well be impl Filter (it also uses private traits to define the bounds in public API which I'm surprised is allowed).
I just saw someone do { stuff; } instead of { stuff } and that's how they discovered that () implements IntoResponse. If you can spell out the type, try.
I didn't even know that method number 2 existed
I disagree with no. 4: you should return `impl Into` if you have that value anyway, because the `into()` might not be called. Similarly, an `impl FnOnce` might be called later or conditionally and not immediately after it was returned. Not returning `impl Into` will probably only hurt performance because an out-of-memory error is unlikely but not returning `impl FnOnce` actually limits the logic of the caller.
1. 100% agree
2. i agree that this is a problem, but i feel like your recommendation is the wrong way to solve the problem. the fact that this can even happen is a language design flaw, and the correct way to solve it would have been to use a different syntax for wildcard patterns so the problem cannot occur in the first place, but i know this is no longer possible as it would be a massive breaking change. i know its a very silly bandaid solution, but isnt there some #[deny()] that could be applied locally at match blocks that turns naming convention violations into hard errors? at least for scenarios where giving up on Enum::* would result in massive verbosity this could be an alternative worth considering
3. i agree with the point that this is trying to make, but the "'never do X" phrasing used here is overshooting the goal. there are definitely scenarios where it does make sense to use them, especially with generics like &T
4. 99% agree. in scenarios where a function performs a desired side effect but returns something you dont need this could in theory serve as a performance optimization as it gives you the option to not perform the conversion at all. but i admit this is very niche and usually a result of insufficient separation of concerns
5. i'm not sure about this one. need to think about it for a while
Re: unsafe fn, I do like that lint, but I understand why it's an option and not the default. If you're writing an unsafe function, the expectation is that this function will contain unsafe calls (but not always!). If you're writing a library that, say, interacts with a massive external C library, requiring unsafe { ... } inside unsafe fn just adds a lot of noise inside every function.
However, I do still think it's the right thing to do, because I think the goal with Rust is to reach unsafe-zero, so every line of code that is pulled out of unsafe and into safe is a win.
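What the lint asks for, as a small sketch. The function `read_and_double` is made up; the lint is enabled here per function with an outer attribute, though it can also be set crate-wide:

```rust
/// Reads the pointee and doubles it.
///
/// # Safety
/// `ptr` must be non-null, aligned, and point to an initialized `i32`.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn read_and_double(ptr: *const i32) -> i32 {
    // SAFETY: the caller upholds the contract documented above.
    let value = unsafe { *ptr };

    // Everything outside the block is checked as ordinary safe code.
    value.wrapping_mul(2)
}
```

The `unsafe` on the signature states an obligation for callers, while the inner block marks the one operation that actually relies on it.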
Maybe we need lints for this?
I like how these vids are to the point informative and points out how much you need to refactor your shit
A better solution for 2 is to work out all the warnings.
Great content as usual.
3:42 you bastard 😂
Oddments that are not well explained and justified will eventually drive away usage, and stimulate someone to come up with a better answer.
programmers have such skill for professional bikeshedding huh
Concise and to the point, bro. I shake your hand and sincerely thank you for such high-quality content
Awesome content
You are the king💪
4:40 I'm no rust pro but wouldn't it be better and easier to just do let s: String = s.into();
In rust, functions are different from closures and so don't close over their environment. You need to pass the value in.
I get semi colons but Htf do our lsp’s not have auto correct and put them in on save! Drives me baaaanananananans. Only thing I like more in go just saying
Size doesn't matter; semicolons do
When would you ever want to use a wildcard? I respect that there could be a reason, such as if you were doing something funky with modules, but any pattern I can come up with would be better and more idiomatically handled without it. I've never seen someone suggest a good usage. For a language so dedicated to being exhaustive, what was the rationale for including this?
I think the best use case is typically with `use some_crate::prelude::*;`, where `prelude` is a module that the crate exposes that is specifically designed to be used with a wildcard. `use super::*` inside `mod test` can be nice too. Otherwise yeah, I think it's typically a bad idea.
@@_noisecode I agree with your statement, but I think there might be a better way. Like maybe some way to mark particular modules as safe to wildcard on.
I've just spent 20 minutes fixing a bug caused by the absence of ";" 😮
Good vid
This is why I hate the "syntactic sugar" that allows returns without a return keyword and semicolon.
I have an issue with point 1. Even with the semicolon omitted, you'd still have to specify the return type of a and b separately. If you really want to couple a() and b() together, you could put a() into a trait (say trait A) and have it return its associated type A::Output, then b() could return A::Output and it would always be the same as the output of a()
Why would you create a trait for a one-off thing like that? To silence the compiler? You'll probably _want_ it to yell at you if the return type of `a()` (but not `b()`) changes so that you have a chance to consider the implications rather than just blindly forwarding the result.
@@razielhamalakh9813 why not? Just make the trait private and it won't be a big burden. To clarify, I don't think binding two functions together is a good idea, but in case we really need to, the semicolon/no semicolon thing is pretty useless as you'd still have to update the return type
@@BulbaWarrior why not is because you're adding complexity under the assumption that a() and b() should always return the same type and that b() is defined in terms of a(), rather than a() being an implementation detail of b(). When the return type of a() changes, while it's possible that just the return type of b() should change, it's far more likely that it needs a logic change as well, either in the setup before it calls a(), or perhaps it needs to process the a()'s return value now. And if b() is only there to call a() and return its result, b() probably shouldn't exist in the first place.
@@razielhamalakh9813 Once again, I am not saying that it is a good approach. All I'm saying is that the semicolon/no semicolon argument stated in the video makes no sense, and I show how the relationship between return types CAN be expressed in rust IF you actually really need it.
Based once again 😎
i wish i were good enough at rust to care about things this meaningless 💀
Call me old fashioned, but I’ve never really liked how Rust returns the last thing from a block if there’s no semicolon. In the example you give, explicit return statements make the behaviour and intent way clearer:
return a();
vs
a();
return;
Fair point! I like the "cuteness," conciseness, elegance of omitting the explicit return, but I can't argue against your reasoning here.
It's a lot better behaved than the language that inspired it, where every line gets _printed_ if it doesn't have a semicolon.
@@KaneYork Which language is that?
@@_noisecode Personally (as a primarily non-rust person) I don't think having a random variable name at the end of the function is elegant at all. First of all instead of always being able to see what is being returned you now have to check "is there a semicolon or no?" (which is more error-prone). And second having a random "j" at the end of the block just looks goofy.
@@KaneYork What? The idea is that you think of the semicolon as an operator just like +, -, &&, etc. So a;b is a left-associative operator with less priority than every other operator that returns b but first executing a. When you end a block with a semicolon or have two semicolons in a row, syntactically you are inserting empty statements which evaluate to ()
You listed a number of bad design choices. The one related to the semicolon `;` is absolutely terrible, why in the hell did they do that? It is so misleading. I disagree with you about `unsafe`. If you declare a function unsafe, I don't see why you should repeat that in the code of the function.
unsafe function doesn't even need to contain unsafe code. All it has to do is break some invariant that some unsafe block somewhere else relies on.
I think that return should be explicit in Rust everywhere, since return is more visible and verbose than a semicolon. It is even better for code review, for example.
1. "Imagine if the return type of a changed to i32" is just bizarre. When would the return type of a function change like that? And if you change the return type of a function, you are changing its meaning so much that you should probably rename it and change every single call-site.
2. "Look what happens if you remove VarC". If you remove a variant from an enum, you should obviously search for all its uses first. Why would you remove a variant from an enum anyway? It's just an absurd suggestion. I don't think I have *ever* removed a variant from an enum. It's also clearly a bug in the compiler that it allows you to have a variable name in a match called 'VarC'. Haskell got it right when it made CapitalisedNames and unCapitalisedNames *syntactically* distinct. A variable *cannot* start with a capital letter in that language. It's not meant to in Rust either, so why allow it to? Bad decision.
This is what you sound like:
"Why would you ever edit your code? If you have to edit it, that means you shouldn't have written it that way to begin with!"
Specs change. People learn. Features get added and removed. Rustc provides a massive set of useful diagnostics specifically so that you don't have to search for every use of an enum variant or a function call. _The compiler will tell you._ All you have to do is not hinder it.
cant we just use return like normal people?
That works if your function just uses a single return statement. However, you could also do this "let value = if some_condition { "foo" } else { "bar" }". If you omit the return, you return the value from the block. If you use return you return from the enclosing function.
Forcing a "return" at the end of a function makes it more complicated. Use "return" to indicate an early return from a function, not just as a generic return.
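The difference between a block's value and a function return can be sketched as follows. Both functions are hypothetical examples built around the `if` expression quoted above:

```rust
fn label(some_condition: bool) -> String {
    // Each branch is a block; its tail expression becomes the value of
    // the whole `if`. A `return` inside a branch would instead leave
    // `label` entirely.
    let value = if some_condition { "foo" } else { "bar" };
    format!("label: {value}")
}

fn first_negative(numbers: &[i32]) -> Option<i32> {
    for &n in numbers {
        if n < 0 {
            // Early exit from the whole function.
            return Some(n);
        }
    }
    // Tail expression: the normal, non-early result.
    None
}
```

Seen this way, a function body is just one more block, so its tail expression works the same as the tail of an `if` branch.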