📝Get your *FREE Rust cheat sheet* :
www.letsgetrusty.com/cheatsheet
thanks! I almost missed it in the video...
Bro, I didn't get the email from your website. Yes, the email ✉️ address is right and I checked all folders 📁.
@@character640p never give your email like that, it's not safe! Here it's obviously just a trick to collect people's emails
you forgot about the "target-cpu=native" flag, which compiles the binary for your specific CPU architecture.
Ooh this is a good one
Holy shit
Very useful for aws lambdas
Never do that in production if you use some kind of orchestration or the target PC is unknown (like when you're about to distribute your binary). Executing instructions the target CPU doesn't support (say your rustc compiled the binary with AVX-512 enabled) is UB.
You can definitely turn on AVX2 or other older ISA extensions that 99.9%+ of processors cover, but target-cpu=native enables every single feature that YOUR processor supports.
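For anyone looking for the exact syntax, a minimal sketch of how it's usually enabled (assuming a project-local .cargo/config.toml; setting RUSTFLAGS in the environment works too):

# .cargo/config.toml
[build]
rustflags = ["-C", "target-cpu=native"]   # only safe when the build machine matches the machine that runs the binary

# or as a one-off:
# RUSTFLAGS="-C target-cpu=native" cargo build --release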
sending this to all my non-binary friends/enemies
You mean your fellow cult members.
@@Reichstaubenminister wanna join the cult?
A nazi saying someone else is in a cult sure is rich.
*scared*
Explain
Mind you, optimizing for size *will* negatively impact your runtime speed. You're essentially telling the compiler "I don't care how fast it runs, I just want it to be small". So this may not be what you want.
Stripping symbols from the library also means bug reports from your users will be very, VERY meaningless to you. This may be a worthwhile tradeoff, or it may mean trading 600 KB of storage for 3 weeks of trying to figure out what exactly the issue is.
My tip for you is to compile with different flags and see if the tradeoff is worth it for you.
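If it helps, a middle-ground sketch worth trying before going all-in on size (my own starting point, not gospel; strip = "debuginfo" drops the debug info but keeps the symbol table, so panic backtraces still show function names):

[profile.release]
opt-level = 3          # keep the speed-oriented default instead of "z"
lto = "thin"           # most of the LTO size/speed win at a lower compile-time cost
strip = "debuginfo"    # smaller binary, but backtraces from users stay readable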
I wonder if there's a way to split the debug information and process it separately. When I shipped a larger C++ application, its debug hunk was over 150MB, going back 15 years to when shipping over 70MB would have been outrageous, so of course we wouldn't ship it. We had a server process the crash reports, first based on map files to annotate the stack trace, and eventually with Google Breakpad, which splits the debug hunk from the shipped software. For Visual Studio builds, which were 98% of the userbase and also most of the issues (Linux users could compile the application from source and produced bug reports that were basically conclusive), we could also just keep the PDB files on a developer PC to inspect the minidump in a full debugger.
Also, debug hunks turned out to compress REALLY well, like 10:1. So we stored like a hundred relevant ones on the server, all compressed, batched the hourly/nightly processing per crashed binary, and kept only one decompressed at a time. While you might not want to compress your executable for several reasons while it's in use, compressing your debug hunk may well make sense.
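For what it's worth, Cargo has grown a knob for exactly this; a minimal sketch (assuming a reasonably recent toolchain, and the separate-artifact format varies by platform, e.g. dSYM on macOS, split DWARF on Linux, PDB on MSVC):

[profile.release]
debug = true                 # generate full debug info for the release build
split-debuginfo = "packed"   # emit it as a separate artifact you can archive, instead of embedding it in the binary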
"Stripping symbols from the library also means bug reports from your users will be VERY meaningless for you"
Not if they give you a core dump.
A server serving JSON payloads, with output serialization, input deserialization, and binding to a port, in less than a floppy disk. Also parallel request threads.
The trippiest part is when you start applying these optimizations to WebAssembly. Using no_std for your low-level functions means you can ship functions that are less than a kilobyte alongside your JavaScript.
Be advised: "strip = true" will break wasm builds.
I was just wondering that. I've been looking into wasm recently. Would it be because functions are linked by name, so stripping symbols would break that?
Interesting
I was aware of the release mode option but the rest of them were a godsend, went from 3.2mb to 977kb. Awesome stuff man! Thanks and keep it up
Have you done any benchmarks on speed for your project? I am curious to know what the effect on runtime is if we optimize the binary for size.
Don't optimize for size (opt-level "s"/"z"), as it can make the app run slower
@@obj_obj the difference is minimal anyway
If you kept it going till now you have all the respect that I can give
I was already compiling in release mode, but by using strip and lto my binary size went from 9.3M to 347K.
Great tips. I got our binary from 4.5 MB down to 1.5 MB. 33% of the original size!
Python developers: "hmm, my Python venv is only 700MB, that's nice" meanwhile Rust developers: "Oh no, this 1.5MB dependency is so big"
1.5MB is very, very big.
Python absolutely sucking ass does not mean 1.5MB for a small dependency is acceptable.
I would just recommend `sudo strip --strip-all `. Really makes a difference and seems to be way easier than other options. Still, great video
Cargo can strip. I switched to it. There are a few more settings you can do:
[profile.release]
opt-level = "z"
codegen-units = 1
lto = true
panic = "abort"
strip = "symbols"
And if you want, you can also compress with UPX, which shrinks the application to about 300 KB. But UPX adds a few milliseconds or so to the execution time, which is usually not a big deal. I decided not to use UPX anymore. 700 KB is not bad and there is basically no benefit in making it smaller.
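In case anyone wants to try it, the invocation is roughly this (flags assume a recent UPX; --best and --lzma trade compression time for a smaller result, and myapp is a placeholder for your binary name):

upx --best --lzma target/release/myapp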
@@thingsiplay is cargo strip already in the stable release?
Why sudo?
@@sohn7767 Yes, I am using stable on Rust/Cargo.
Could you explain more about what stripping symbols from the binary means and why someone would want that?
They're needed during debugging. After you strip the binary, the debugging symbols are deleted.
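To add a bit of detail, the Cargo strip setting takes a few values; a quick sketch of what each one removes (as I understand the current behavior):

[profile.release]
# strip = "none"       # keep everything
# strip = "debuginfo"  # remove debug info, keep the symbol table (backtraces stay readable)
strip = "symbols"      # remove debug info and the symbol table (smallest, least debuggable)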
You could also take out stack unwinding and some other safety features but idk if it would be substantial.
Using UPX (an executable packer) often lets you strip off another 30% or so.
Many antivirus programs get triggered if you do that, unfortunately
@@keineahnung1919 UPX also has a feature to decompress the binary, which some/many anti-virus tools use to inspect the real binary. However, Windows Defender does not and simply flags it as suspicious.
Edit: If you're building your own server-side program it's still fine to use UPX to reduce package/image size. If you have an AV in the toolchain it might need to be configured, though.
In many situations it's a false saving. You ship the application compressed anyway (a zip archive, compressed tarball, or compressed installer), and a UPX-compressed binary cannot be compressed any further.
It's also overhead, since an uncompressed executable can be loaded efficiently. It isn't read into memory in full; instead the file is memory-mapped into the process address space and streamed in the background, so it starts executing well before the whole application has been read from disk. The runtime linker does the bare minimum fixup to load the libraries, using copy-on-write logic for the changes and allocating only a handful of pages of physical memory. When several instances of an application are running, they share the read-only pages of the executable in physical RAM. When memory pressure occurs, rarely-touched read-only pages can be dropped from physical RAM without writing them to swap, since they can be re-read on demand from their original location on disk later.
And yes, this is why you can't modify or delete an executable on Windows while it's in use. On Linux you can delete the directory entry for the file, but the file data remains on disk until every process that maps it terminates. The same mechanism used for executables is used for DLLs and shared object files as well.
A UPX-packed application, from the point of view of the operating system, is just a stub several kilobytes in size plus some opaque tail data in the file. The UPX stub reconstructs the application in newly allocated memory by reading the tail of the executable file, which increases the startup cost and means the application's pages cannot be shared between instances, and the zero-cost swap-out is impossible as well.
Setting panic = "abort" in the release profile should reduce the size as well, no?
Yes, it does. Only do that if you're 100% sure that you don't want any feedback on crashes though (i.e. if you're confident your code will never crash).
@@carlosmspk I think you've already given that up if you're using strip. And if you're debugging a crash, you should really run in debug mode anyway
I've never seen so much advertising put into a cheat sheet
This is one of your best videos.
Great video! Packed with very useful info
If you use nightly Rust, try adding the "-Z build-std=std --target your_target" flags:
cargo build -Z build-std=std --target x86_64-pc-windows-msvc --release
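A variant I've seen paired with panic = "abort" to trim the panic machinery as well (still nightly-only, so treat it as a sketch):

cargo +nightly build -Z build-std=std,panic_abort --target x86_64-pc-windows-msvc --release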
Will switching from generics (that get monomorphized) to trait objects reduce binary size?
yes, but can affect the runtime performance
It depends, but in the common case the reduction will be negligible. Also, dynamic dispatch has a runtime performance cost.
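For anyone who wants to see the difference concretely, a minimal sketch (Greet is a made-up trait just for illustration):

trait Greet { fn hello(&self) -> String; }

struct English;
impl Greet for English { fn hello(&self) -> String { "hello".into() } }

// Monomorphized: the compiler emits a separate copy of this function for
// every concrete T it is called with, which can grow the binary.
fn greet_generic<T: Greet>(g: &T) -> String { g.hello() }

// Dynamic dispatch: a single copy is emitted and calls go through a vtable,
// trading a small runtime cost for less generated code.
fn greet_dyn(g: &dyn Greet) -> String { g.hello() }

fn main() {
    println!("{} {}", greet_generic(&English), greet_dyn(&English));
}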
Still a massive problem: with a release build, just "fn main()" takes up 130 KB with MSVC; in C++ it's only 11.
Because C++ dynamically links against the system C/C++ runtime, while Rust statically links its standard library for the sake of portability.
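If you're curious what a binary actually links against dynamically, you can check; a Linux example (on Windows, a tool like dumpbin /dependents does the same job, and myapp is a placeholder for your binary name):

ldd target/release/myapp   # lists the shared libraries the binary needs at runtime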
Ya, I noticed my little program that uses clap and reqwest very quickly ballooned to 180MB (debug) lol
is the strip option the same as running the strip command on the compiled binary?
Pretty sure it is
Awesome!!! Thank you very much!!!)
Just getting into Rust APIs. Been pushing 600 MB Node.js Docker containers to prod 🙈🙈
I guess the --release flag does most of the job for me. As a web dev I'll mainly be using Rocket for backend development, and I don't see a point in compromising runtime in any way just to get smaller binaries.
Maybe it would be important if I were developing something for tiny IoT devices where storage space is very limited (I don't have experience in electronics or IoT, but I'm assuming the program needs to be stored somewhere on the device, and since that storage is limited, smaller binaries make sense there).
If your program gets very large, binary size becomes important anyway. The instruction cache is neither infinite nor very large.
Thank you so much bro. Sending virtual hugs. Worked like a charm ;-)
Can someone explain to me what the Rust str type actually is? I thought it was just a struct or a trait, but I can't find a definition in the Rust std library source code. I heard it being called a primitive, but is it really a compiler primitive like i32?
The information I've seen states that a str is basically identical to a [u8] (a slice of u8s) with the added requirement that the slice must be valid UTF-8. Rust considers it a primitive type just like other slices, but then it also considers tuples and arrays primitive types, so I feel like what they mean by primitive isn't necessarily what you mean. As I understand it, str is a compiler builtin (which is to say, a special case), which is probably why there's no definition for it. Also note that it's an unsized type, so you can only use it via a reference (hence why you usually see &str, although Box<str> or Rc<str> would be legit too).
@@dantenotavailable thanks for the great explanation. I thought the struct looked something like this:
struct str {
len:u8,
data:[u8]
}
Which is (as far as I know) valid Rust and results in an unsized type whose size is only known at runtime, once the trailing slice gets a length. I wonder why they decided against such an implementation; maybe the unsized struct feature didn't exist in 1.0?!
@@redcrafterlppa303 I couldn't say for sure, but note that a [u8] is a slice of u8 (not an array, which would be [u8; $len]), and a slice reference already carries a length value, so the len:u8 should be unnecessary. It's difficult to find where slices became defined like that, but it's definitely pre-1.0, somewhere around 0.10 or 0.11 perhaps.
@@dantenotavailable Ah yes, I didn't realize I wrote array there, probably because of the square brackets. I don't use slices very often. Thanks again for your help 😉👍
@@redcrafterlppa303 You basically described the `String` type. A `String` is a struct holding a pointer to the bytes plus the length (and capacity), while the actual character data lives on the heap. `str`, on the other hand, is just the raw UTF-8 bytes wherever they happen to live (static data, inside a `String` on the heap, etc.), and a `&str` is a fat pointer that carries the address and the length. They didn't make `String` a fixed-size inline value because a buffer that constantly grows and shrinks is much easier to manage on the heap (think array vs Vec), and they didn't make `str` a struct with its own length field because the length already travels with the reference, so storing it again in the data would just be unnecessary bloat.
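A tiny sketch of the fat-pointer point, in case it helps (sizes are expressed in machine words, so this should hold on common targets):

use std::mem::size_of;

fn main() {
    // &str is a "fat" reference: a data pointer plus a length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
    // String additionally tracks a capacity, so it is three words.
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());

    let s: &str = "hello";              // the bytes live in static read-only data
    let owned: String = s.to_string();  // this copy lives on the heap
    assert_eq!(s.len(), owned.len());
}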
Loving the channel and videos
Hey, Bogdan. Thanks for the tutorial
Why can't std be dynamically linked to librs?
Because Rust doesn't have a stable ABI, dynamically linked libraries don't play well with generic code, and most systems don't have every version of std available for arbitrary Rust programs to link to without the application packaging it anyways.
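That said, rustc can do it if you opt in; a quick sketch (the big caveat being that the matching libstd shared library from your exact toolchain must be present at runtime):

rustc -C prefer-dynamic main.rs   # links std as a shared library instead of statically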
Why isn't stuff like LTO on by default for a release build? Compile time shouldn't matter for a release.
thanks for the info :)
it's not TINY it's average 🥺🥺🥺
That's what he said.
you need to reduce the bass in the voices, it is way too high.
Does Tommy have a youtube channel?
amazing!
I tried these options but the binary just got 4kb larger
Seems like it works in reverse for you, try adding more bloat functions to the code
*laughs in golang binary sizes*
Laughing in what way? Are they bigger or smaller?
@@thingsiplay absolutely massive
@@dilawar_uchiha Ugh, bigger than Rust is quite an achievement.
@@thingsiplay In some cases I've had Go binaries hit like 35MB, once it even went to about 50MB, but usually my project hovers around 13-25MB
@@dilawar_uchiha Even Python fake-compiled to a self running/extracting binary with the Python interpreter included is smaller.
But really, file size is overrated in my opinion. There might be cases and environments where it's important, but it shouldn't be too much of a reason to use or not use a language.
But to be honest, for a modern language like Golang, it is quite strange that the file sizes are that big.
2:00 The voice recording is so compressed that it hurts no matter the volume setting...
No way! You could do that? )) Mind-blowing.
Superb!
The "where did it go" part is so ryt now
After all that, try to reduce it even more with UPX (the Ultimate Packer for eXecutables). I've never used it with Rust, but it should also work.
W
hello :)
thanks for your cheat sheet
I do not use Cargo, so it would make sense to release a video based on rustc too.
🤡
Yay you took my (amenesiaphotography) suggestion and now I see this video ❤️ this!
Just reduced my VLI tool binary by 10x, and by 21x with UPX. H*** F**! This is awesome.