You could have said "Dynamic Library", I genuinely thought the Rust Foundation messed up something big again with libraries in Cargo, like they did with the brand/logo last time.
I'm not even far into the video, and I realized he was talking about shared objects. If you link all your C code with the equivalent '.a' libraries, those programs all wind up pretty huge too.
Btw, unsafe Rust code does NOT disable the borrow checker. It allows unsafe function calls, raw pointer dereferences and stuff like that, but it does not disable the borrow checker.
What he means is that if you wrote a safe function in Rust and I wanted to use it, it would only work if I have your code in my project. You obviously can't pass Rust objects to foreign code like that through a C DLL.
Specifically, the unsafe block allows:
* Calling unsafe functions
* Dereferencing raw pointers
* Accessing a static mut
* Accessing members of a union
And I think that's all! (assuming I remember correctly 😅) Notably, however, unsafe blocks are not the only way to create unsafe Rust: implementing an unsafe trait and using the #[export_name] / #[no_mangle] attributes come to mind off the top of my head, but there are probably others.
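For the curious, a tiny sketch of what that list looks like in practice (nothing here is from the video, just plain stable Rust); note the borrow checker is still fully active inside the unsafe block:

```rust
static mut COUNTER: u32 = 0;

unsafe fn unchecked_add(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}

fn main() {
    let x = 42u32;
    let p = &x as *const u32; // creating a raw pointer is safe...

    unsafe {
        println!("{}", *p);          // ...dereferencing it requires unsafe
        let _ = unchecked_add(1, 2); // calling an unsafe fn
        COUNTER += 1;                // accessing a static mut
        let c = COUNTER;
        println!("counter = {c}");
    }
}
```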
@@SimonBuchanNz Unfortunately, you can currently produce unsafe code even without all that stuff if you abuse some known compiler bugs. Although I'd consider those not something you'd run into in real production code.
FYI, Rust *absolutely does* have support for dynamically linked libraries, including all Rust features. The problem is that the ABI is unstable across compiler versions, not that it doesn't exist.
@@falxonPSN Nope, it actually means that rustc does these checks, as opposed to cc. Also, have you _tried_ to move a C program compiled against one version of libc to another system with a different version?
No, the real problem is that generics/traits are monomorphized at compile time. So you can't do that in a dynamic lib, since the generic code isn't really "real" until it's instantiated. The dynamic lib could include a fixed set of instantiated objects, but no more than that. C++ has the same "problem": template libraries have to be compiled in.
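Roughly what that means in code: a generic function can't be exported from a dylib as-is, so in practice you export a fixed instantiation behind the C ABI instead. This is just an illustrative sketch (the function names are made up, and on the 2024 edition the attribute is spelled #[unsafe(no_mangle)]):

```rust
use std::iter::Sum;

// Generic code is monomorphized at each use site, so it can't cross a
// dynamic-library boundary as-is.
pub fn sum_generic<T: Sum<T> + Copy>(xs: &[T]) -> T {
    xs.iter().copied().sum()
}

// What a dylib can realistically export: one fixed instantiation behind a C ABI.
#[no_mangle]
pub extern "C" fn sum_i32(xs: *const i32, len: usize) -> i32 {
    // SAFETY: the caller must pass a pointer to `len` valid i32 values.
    let slice = unsafe { std::slice::from_raw_parts(xs, len) };
    sum_generic(slice)
}
```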
Can confirm. The first build takes a minute because you have to compile all the crates, but Cargo only recompiles what has changed since last time.
@@catlolis Nope, just true. During development it's almost unnoticeable unless you're doing some kind of hot reloading like in the JS world. It's a bit misleading for someone who doesn't know Rust to hear about compile times, because they assume it's slow all the time. The real problem with Rust compile times is production builds; if you have some CI to run, you're not going to have a good time.
"if i pass a mutable reference into a compiled binary, how is it possible for the borrow checker to make sure the reference is used in a way that is safe" the borrow checker would've already made sure of that when the library binary was compiled, then, the checker in your code would look at the function signature to enforce the rules at your side. if both sides were checked individually, and the interface contract (function signature) is respected, it should just works as if you checked them at the same time
@@stephenreaves3205 No it can’t. But that’d just be a compromise you’d have to make. Dynamic libraries are shipped as pure machine code with basic linking information. You couldn’t tell such a library was ‘safe’ much the same way you can’t read the source code of a proprietary DLL you load with your C++ program.
@@stephenreaves3205 I mean, the whole idea of DLLs is that you have to trust the linked library, and that's why injecting / modifying DLLs has been a way for bad actors to hack programs. Trusting that the DLL's borrow checker did a good job is functionally the same as trusting that the installed libc DLL won't blow up your PC when you use it.
@@stephenreaves3205 That's the trade-off; it's the same with foreign/precompiled C objects, you never know if the code is valid and UB-free. As C compilers don't check that as intensely as rustc does, it's not as apparent in C terms, of course. GPG-signed libraries/objects, as with package managers on Linux, could be a compromise to validate trusted sources maybe.
@@baileyharrison1030 But I think Rust has pigeonholed itself, because C++ doesn't have a safe/unsafe mode and Rust does. By their own logic, I think they'd have to treat every dll or .so as unsafe, which, at the very least, will be annoying as hell.
I don't see why the borrow-checker would not be able to work across boundaries for other libraries written in Rust. If the functions within the Rust library violate borrowing rules then the library shouldn't compile in the first place, right?
Since there is no stable specification for how these things are compiled, you would get undefined behavior between the compiler versions they were compiled with.
The borrow checker works at function scope: it needs to see the function signature and all the signatures of the functions/types used inside. So yes, it can work as long as full function signatures with full type information (mutability, lifetimes) are available. The hardest part is generics, because any compilation that involves generics is deferred until the actual use site. I suspect it would make sense to severely limit the Rust type system for this kind of ABI, because otherwise you would need a full-blown Rust compiler just to parse all the type info.
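A small example of the point about signatures carrying everything the checker needs (just an illustration, not from the video):

```rust
// Everything a caller's borrow check needs is in this signature: the returned
// reference borrows from `v` for 'a, so `v` stays borrowed while `r` is alive.
fn first<'a>(v: &'a Vec<i32>) -> &'a i32 {
    &v[0]
}

fn main() {
    let v = vec![1, 2, 3];
    let r = first(&v);
    // Mutating `v` here (e.g. v.push(4) with a mut binding) would be rejected:
    // the signature alone tells the checker `v` is still borrowed through `r`.
    println!("{r}");
}
```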
One implication of an ABI is that the binary library you are calling isn't necessarily written in the same language; that's why there are a lot of languages that use the C ABI. The task here is not only making sure two binaries can interface with each other when they are both compiled from Rust, but essentially defining what fundamental Rust typed interfaces even look like in binary form, so that the higher-order data types (like the UTF-8 strings and enums he mentioned) can be understood by both sides. And it means one or both sides may not even be compiled from Rust at all. It may be a future Rust-compatible language, like how we use the C ABI from modern languages instead of just C. And those languages may not have a borrow checker at all.
@@thisisnotok2100 What if we just required that specification for Rust libraries by passing a stable struct with the version information in the calls?
@@play005517 Okay, that makes sense. I was only thinking about things in the context of linking against shared libraries written in Rust, but you can call any extern function that implements the ABI, regardless of what language it is written in, and the borrow checker can't help you in those cases. Thanks.
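To make the "what does a Rust type even look like in binary form" point above concrete, here's a minimal sketch of the difference between the default Rust layout and a #[repr(C)] layout; the sizes printed are typical but not guaranteed:

```rust
// Default (Rust) layout: rustc is free to reorder fields, so the in-memory
// layout is not a contract another binary can rely on.
#[allow(dead_code)]
struct RustLayout {
    a: u8,
    b: u32,
    c: u8,
}

// repr(C): field order and padding follow the C rules, so the layout *is*
// something that can cross a binary boundary.
#[allow(dead_code)]
#[repr(C)]
struct CLayout {
    a: u8,
    b: u32,
    c: u8,
}

fn main() {
    println!("Rust layout: {} bytes", std::mem::size_of::<RustLayout>());
    println!("C layout:    {} bytes", std::mem::size_of::<CLayout>());
    // Typically prints 8 vs 12 on common targets, because rustc reorders fields.
}
```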
Wait. C++ has the same problem and solved it in a pretty clean way: just encode type information in the ABI data. If you disassemble C++ library objects, you will notice that symbol names aren't clean function names from code and contain a lot of encoded information (such as constness of objects, calling convention and other). This technique is called name mangling. Can't Rust do the same? And regarding templates (static polymorphism): same state in C++ as in Rust: reevaluate every time and recompile. That's one of the reasons you still need header files for compiled libraries. Some libraries in C++ are so template-heavy that they go header-only and can not be built by themselves.
"Can't Rust do the same?" Rust does the same. Same deal for templates: rust .rlib's are basically like C++ header-only libraries, except that they are more like "precompiled headers" or the new C++ module format in that as much work that can be done is done up front but all the rich type system information is there for later monomorphization. The real difference is that the .rlib format is not *stable*. That is, the rust project explicitly permits itself the flexibility to change this format between versions of the compiler, and exactly because of this it doesn't get into the ABI backcompat hell that C++ has trapped itself in. The major challenge with "crabi" is not having a format (Rust has that already), it is providing the necessary support for forward compatibility and ABI versioning, because if you are not prepared then there is no "stability without stagnation".
ABI and name mangling are two very different things. Name mangling only solves problems like multiple instances of the same generic/template function in a binary. The ABI tells you where arguments are located (register/stack), how they're ordered, the sizes and alignment of fields in structs, etc. So far, only C has a stable ABI. C++ just promises, "we won't break the ABI!".
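For reference, roughly how the two symbol styles look in Rust; the mangled name shown is illustrative (the real hash will differ), and #[no_mangle] is what opts a function out of mangling so it can be found under a plain C name:

```rust
// With the default mangled symbols, the exported name encodes the crate,
// the module path and a hash, so symbols from different crates can't collide:
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}
// exported roughly as: _ZN7mycrate3add17h3f2a9c0b1d4e5f66E   (hash made up)

// Opting out gives a plain, C-linkable symbol instead:
#[no_mangle]
pub extern "C" fn add_c(a: i32, b: i32) -> i32 {
    a + b
}
// exported as: add_c
```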
Didn't Google give a million to the Rust Foundation to improve C++ interoperability? I think many more will ask and maybe donate to the Foundation to get the devs there to focus on this, so in the near future I wouldn't be surprised to see good, or at least decent, options for improving this.
Over in WebAssembly land, they're working on something you could call an "ABI" with the Component Model, where the interface a component (=library or application) imports/exports is defined in a separate file (a .wit file), and then bindings from that file are generated for whichever language the component is written in (C, Rust, Go, ...). If other libraries or applications then want to use this component, they also generate bindings for their language from the same .wit file.
But this is just like COM, WinRT, CORBA (anyone remember that?). Just like in C++, if I want to use your fancy safe amazing Rust code in my Rust app, I have to build your source code in.
@@VivekNa Exactly. WinRT's ABI is defined in a library's companion ECMA-335 ".winmd" file. It's a slight improvement over COM's .tlb metadata file. Not elegant, but they work. I'm assuming Rust will need the same infrastructure.
@1:00 Easy solution: C++ instead of C. Use smart pointers and std::array over manual memory management and C arrays, and you wipe out 95% of memory issues.
It's totally possible to put Rust libraries in their own shared objects and link them at compile time. The problem is that the ABI isn't stable across compiler versions, which limits the usage of those libraries by a lot. The only application I can think of for this is an embedded OS environment with very limited RAM, where you can pin a project's Rust compiler to a fixed version and compile one shared object that you use across projects. This way the shared object is only loaded once into memory.
Shared libraries are one of the worst things that ever happened. I remember when we used to compile linux programs from source and you'd get a ton of errors since libraries weren't compatible. Or when a Windows app needs a specific dotnet version because they're not backwards compatible. Having an entire app package makes the most sense.
I don’t think you hate shared libraries, rather the systems that manage your dependencies. Ever since I started using Nix I realized package managers can actually be good.
@@aintnochange ???? How would standalone programs reduce the amount of dependencies you need to install? If anything it would increase the amount of dependencies you need to install because applications won't be able to re-use the existing libraries. The size of a single binary might be acceptable, but if you scale that to all the different binaries a system uses, and you want each one of them to have their own libc.so et al. that's gonna bloat the shit out of memory usage.
**unversioned** shared libraries were a mistake. declarative binary compatibility checking is one way this was resolved, along with interface version checking (if you remember the old days of OLE/COM). you make mistakes and you move forward with solutions. dll/dependency hell was an interesting time, lol
@@XxZeldaxXXxLinkxXPlan9 from Bell Labs proved that even without tree shaking (dead code removal, which all modern compilers have), things were not bloated. Plan9 does not have dynamic linking at all. Dynamic linking is a good thing, but its original use for saving memory on a 80386 with 8MB of RAM isn't valid anymore. Docker containers also prove that the memory saving argument is no longer valid. Dynamic libraries' main use is plugin functionality, and driver frontends, like Vulkan, as well as allowing call interception.
While the Rust ABI isn't stable, cargo still calls rustc for each crate separately, generating intermediate artifacts that get linked together in the end. While the ABI isn't stable between different versions of the compiler, it _does_ need to be stable across invocations of the same compiler. This reveals a very simple solution: separate library outputs by compiler rather than just having a single version for each crate. Cargo actually has hashes for intermediate artifacts as well, encoding not just the crate and the compiler version, but several other build parameters too. While these are currently local to each project, there's nothing stopping them from being global.

Would this actually help? Honestly, probably not. This is because whereas C projects specify the names of their dependencies and maybe a minimum version, Rust crates include a lock file that requests specific hashes of each dependency, and the customization of each library's features is specified by the crate author rather than a third-party packager. These, combined with distinguishing libraries by the compiler used for them, make the actual chance of two separate projects sharing the exact same configuration of the exact same dependency with the exact same compiler and the exact same compile flags very slim. It really would be no different than it is right now, just with the overhead of dynamic linking to a library only actually used once.

That hasn't stopped some people from treating each crate dependency as a separate package, but the only distro I know of where this isn't crazy is NixOS, since it already distinguishes packages by all of the factors mentioned previously and more. (In some ways, Nix acts more like a language package manager than a system package manager, despite generally being used as the latter.) This is still rather niche though, with most Rust projects either being packaged as a single unit, or sometimes with all the dependencies as a single separate package linked into the final package.
I know I'm probably missing something, but why don't they just "half-compile" the packages? The lexing, parsing, semantic checking, and everything else that doesn't depend on your code can just be done once, right? Or do they already do that?
This does happen, yes. The trouble is that it is rustc-version specific and not stable or anything. So if you compile something on your machine and then compile the same thing on the same machine again, it won't have to compile those same things again.
A similar issue can be found in C++ when using templates. If you write template code it needs to be recompiled for basically every use (you generally can't put it in the .cpp file; it needs to be in the .h/.hpp file). Working around that is possible, and since it only affects part of the C++ language I guess it's not as big an issue as in Rust, but still an issue IMO. I wonder if there are also options to pre-compile or at least pre-tokenize Rust code (this does exist for C & C++ headers, although I don't see it used much). That would likely fix some of the compile-time overhead but not really the binary size.

Should also be noted that a lot of the binary size isn't just code. There is a lot of information regarding symbols and other stuff left in the binary which can be stripped out (either via the cargo strip setting or using the strip command). Link-time optimization is also usually turned off to reduce compile times a bit, and dead code elimination is less effective because Rust uses multiple compilation units (also to reduce time by doing some work in parallel). I generally change those settings for my release target, which does add some compile time but can have a large impact on binary size (especially in an embedded program where everything is statically linked, it gets the code size pretty close to C).
Idk, in C++ you can either have header-only libraries, and/or have compilation units that precompile those templated functions for specific types (type specialization). E.g. you could have a *.hpp with your structs and function declarations, a *_impl.hpp that has the full implementation, then your library may include precompiled specialisations for float/double in a *.cpp, and if you want to do some special stuff, you include the *_impl.hpp in the using application. IMO that gives enough flexibility to use them in a library, depending on the level of complexity of the code.
This is where I think Java and C# get their generic types right. Java generics undergo what's called type erasure, which essentially treats every type argument as the same type. It's not as powerful as C++ templates, but at least it only compiles once.
@@RetroAndChill Java and C# are a different cup of coffee because they have their own runtime and have a lot more freedom to put info or custom features into their binary format. C, C++, Rust have limits because they need to compile and load as machine code. PS: In this case it's important because in machine code you need to know how to interpret and read memory, this is done by using instructions and functions based on the type. In other words type erasure is impossible in those languages without significant overhead (like putting everything into an object with a vtable, which is what java does but also optimizes for in the jvm)
@@Ether_Void Definitely part of the way Java can do what it does is because generics must take in a reference type and thus is doesn’t have to concern itself with the actual type. It basically can just treat everything as a pointer to some base type and then just static cast it when you’re about to return.
C++ modules sort of solve that issue by basically "semi-compiling" modules into a kind of C++ "bytecode" (unique to each module interface, opening a new compatibility layer between API and ABI that so far only applies to each individual C++ compiler). It eliminates the template problem while compiling down to a lower-level representation that can be more rapidly compiled to machine code. There are a lot of nice things about this, like the separation of global scope for each module, including the C preprocessor, and an even more performant incremental build system, but there are annoyances that it comes with too.
This video is evidence for my new hypothesis: once a programmer starts learning about abstract syntax trees and creating their OWN language, they start wanting to cobble together programs with assembly from multiple compiled programs.
Idk about the making one's own language bit, but certain tasks seem to be better handled by language specific libraries and frameworks and they might only be good for their single purpose. An example would be a C audio processing and GUI framework that might not have a robust cryptography suite, and it might be easier to make a lightweight, app-specific crypto library in Rust than it would with C
@@somedooby That would be a case of creating a better alternative and isolating similar code, but isolating code could also be done by sharing the source instead of creating different binaries. My guess for that hypothesis is that, with the experience one gains from creating a language, they see new benefits that were not self-evident before. Something that makes their life easier and reduces either time or amount of work. Like if their language cannot write fast low-level code, but Rust and C can.
IMHO the original K&R C was essentially a super macro assembler. A way to code in a higher level format, using a compiler for the parts that weren't that time sensitive, and in-line assembly code for those parts that were.
Not to be that guy, but the correct terminology is a high-level assembler, whereas a macro assembler lets you create super-instructions via "macros". C is neither of the two, because it has a grammar to it, whereas assembly language doesn't have a grammar.
That's not even true. C isn't a "macro assembler," it's a high level programming language. It may not feel that way by today's standards, but the definition of high level is simply that it's not architecture dependent, not that it has a billion features. C was originally created as a high level language to write Unix programs in, and eventually they ended up rewriting the entire Unix kernel in C. I don't think it was ever considered "slow" outside of like game dev. And even then, the Amiga had a C compiler way back in 1985.
Looks to me that this is just Rust being relatively new. C has had decades to try and perfect itself. I can't wait to see what they do with Rust, since it is unironically one of the best languages I've ever learned. That or someone makes a language like Rust that works around the mistakes mentioned.
Rust not having a stable ABI is not a mistake, it's forethought. Don't give promises you aren't ready to deliver on. Rust is thinking about how to deliver such a promise, but it's actually a lot of work to not screw up and you have to get it right the first time because of forward compatibility issues. (At least, you have to make sure that you leave space for v2.) C didn't do this so there are now a bunch of tricks people use to get stable ABI in the presence of some but not all changes, and of course there are a thousand ways this can fail horribly.
Even C++ has problems with ABI. GCC has in the past had to revise the C++ ABI in backward-incompatible ways. C will probably remain relevant as long as shared libraries are around.
Unlike npm, Cargo stores the source for each version of each library only once and reuses it between projects. But for Rust, the "target" compile-cache directory is way bigger anyway.
@@peterixxxZig isn't an alternative to Rust; it's an alternative to C. It doesn't give you most of Rust's safety features, so if memory safety is the reason you were interested in Rust, Zig is not better than C in that regard.
I've seen some people talking about how this is a problem from a security standpoint as well (as if we ship our code statically linked with a library which contains security vulnerabilities, then all of our programs will need to be updated individually to fix it). But this appears to be one-sided reasoning, because it can work the other way around as well. You can have a bunch of working programs that rely on the same safe package, then it becomes vulnerable, and suddenly all the programs in the system that share it are at risk as well, instead of just the ones that had updated to it (which means more chances for a potential attacker to exploit the vulnerability, as most programs in a system tend to be sparsely used).
The security standpoint is precisely the case you see as one-sided reasoning. The problem is not that a particular statically linked app becomes unsafe because it was compiled with an older version. That's not a problem; it's the normal case, as with any other app with a bug. Just recompile the app and done. The security problem you are not seeing is... imagine glibc has a bug and all the apps on the system are statically compiled. You are forced to recompile and reinstall everything. Using shared libraries, you just need to compile and reinstall glibc (and restart the apps). With one compilation you fix everything. Life for sysadmins would be very difficult without shared libraries... and security fixes would take a lot longer to roll out.
@@framegrace1 I think you just ignored everything I said and resorted to the same thing I'm saying is one-sided. I recognize the fact that if a dependency is vulnerable and you have it statically linked into all the apps, that is a big problem. I'm saying that if you have a global dependency that suddenly becomes vulnerable you also have a problem, because you don't necessarily learn that the global dependency has a vulnerability in time to just update it (and neither would the developers necessarily know; what we see more often is security vulnerabilities that existed for years and years without being noticed, for example).
@@framegrace1 Nope, I only need to recompile the programs facing the attackers, for the case of the attacker doesn’t have shell access on the system. If they have shell access then shared linking won’t help 90% of the time because LD_PRELOAD negates any patches to the system provided libraries. Right now if the attacker has local/shell access only a well built Mandatory Access Control system will help you. Also most vulnerabilities aren’t in system wide libraries like the c library or zlib, they are either in the application or one or two dependencies off.
@@diadetediotedio6918 There's a massive paradox in your reasoning. If you maintain your statically linked system up to date, you're exposed to the same vulnerabilities as a dynamically linked system. Either you're arguing that a fully out-of-date system is safer, or that a partially out-of-date system is safer (do you decide which packages to update or not at random?), I'm not too sure, but both seem nonsensical to me.
@@user-qn9ku2fl2b It is not a paradox, that's why I said this: ["But this appears to be one-sided reasoning, because it can work the other way around as well."] For me, this whole "it is safer" point is a problem in itself because it is one-sided (there are problems both with static linking and dynamic linking). And my point is not that a partially out-of-date system is safer, nor that it's a necessity. We've seen something very much related to this some time ago with the xz thing: think about the time you need to just update a distro version of xz containing the terrible vulnerability, versus the time each individual package would have taken to update their own statically linked version of the xz package if it had been statically linked, and you can see part of my point. Obviously, my original point says "one-sided" because I also recognize the other way around, where a vulnerable xz statically linked into many packages leaves you vulnerable until they update their version (or you roll back yours). Summarizing, my view here is much more "this is all nonsensical bullshit" than "you should prefer X to Z (pun intended)".
@@LowLevelTV Yes, I definitely agree with that. However, though it has a fixed initial learning "tax", it offers a significant recurring return. There are some friction points, which may be in different places compared to other languages, but I've found that the net gain in developer productivity and velocity is significantly higher. I've also noticed that the whole language model encourages me to think about my code design in a more robust way. I'm not exactly sure how to explain it, but code I've [re]written in Rust lets me dive back in much faster when fixing issues or adding new features. This is why we're transitioning our entire team to Rust for development (both cloud with Axum and device/desktop with Tauri). But I totally agree: higher entry point, but higher return (IMO). Great content, I really appreciate your videos.
Shared objects are cool for the reasons stated, but compiling everything into a monolithic binary allows your program to run without dependencies. Dependencies suck (for many reasons).
The more I discover Rust, the more I think it's one of those ideas that seem brilliant on paper, but turn out to be hellish in reality. Despite the efforts made to promote Rust, this language will never be used on a large scale because it will crash headfirst into the wall of reality.
This is a valid problem to work towards but static binaries should def still be the default, or at least the default for release builds. Packaging for different OS's is just so much more simple
4:30 Static analysis has nothing to do with why the ABI isn't stable. The borrow checker would still work with a compiled lib and whatever the Rust equivalent of .h files turns out to be. Lifetimes don't require sources; the function signature is enough. A stable ABI may never exist because it locks down the stdlib and prevents impactful changes in a way that's more restrictive than a stable API.
It's often presented as a bug, but for most of my projects it's actually a feature that everything is in one place (static). Plus, are the compile times really that bad? On my 10-year-old laptop a 100k-line project compiles in under a minute just fine. I save time on compiling by having clear and actionable errors compared to C.
I've made dynamic libraries in Rust multiple times. These days you can easily reach for crates like stabby and abi-stable. The bigger question is why you would want to do that when static linking is better in most cases.
If we use Rust in Linux for more stuff, it would be nice if some system calls and other system utilities were wrapped in a Rust library, which you can link with dynamically. If you make internal company tools, sometimes it makes sense to write them in one place, install them and have other apps dynamically link to them. That's all I can really think of.
@@PhthaloJohnson Just write* your application in C already. Stable ABI, stable API, fast, no long compile times, no billions of complex features. *If you are worried about system calls. If not, use Rust.
Rust is a young language. I am a C++ veteran and not really fluent in Rust yet, but I can see the potential. And we need something better than old, unsafe languages. If Rust gains momentum, the library problem will be solved eventually. We'll see.
Does the Rust compiler actually have to inspect the source code of a function every time the function is used? And more importantly, does it actually have to compile the function every time it is used? Otherwise I don't see how any of this stops libraries from existing in Rust once an ABI has been defined.
Maybe I'm mistaken, but I heard that C++ has a similar problem with its more advanced features, so binary libraries are at least limited there too? Maybe it's the unavoidable price of advancement? I'm not sure why compiling from source must increase binary size; I would say the contrary. At first glance it allows limiting instantiations of generic functions to those actually used, and also allows further optimization and better unused-code elimination. The other thing is that C programs depend on libc already being on the system, so a statically linked app will be bigger, but it will be more independent of libc version changes etc.
The problem is that the ABI is an OS spec target, and OS targets were built around the C ABI (data types and limitations included). What we need is a new ELF/.so, DLL, and dylib specification where the ABI can represent Rust (and type-safe) entities. Also, I wonder if WASM or LLVM IR could be used as rich bytecodes for Rust's dynamic libraries, to be compiled/cached on demand?
Feeling very smug as a C++ enthusiast hearing about Rust not having an ABI because it has more complex types and generics. But also, C++ still has slow compile times and linker errors are a pain. C++ also invented ownership and borrowing; Rust just made it mandatory.
C++ also does not have a standard ABI. And if you use templates you are more or less in Rust territory, where you have to ship your whole template sources using header files.
@@Benhoof81 The need to ship templates is a compromise, but it already gives you more than 90% of the effect in real codebases and if you can guarantee the same compiler version you can just instantiate common ones beforehand as well. Also - you don't pay for templates you don't need to instantiate unless there is a gigantic amount of them to parse in the includes. Again, if that is a problem extract their definitions to different headers or find some other convoluted way. In C++ we have 30 years of solutions for every single of those problems, it is a good thing that solutions exist, but also bad thing that we needed them. C++ is no longer a first choice for my hobby stuff anymore and neither is C if that says anything. Not sure about an ABI. I know someone broke ABI something somewhere, but on the other hand - The only time I needed to I was able to link successfully with C++ DLLs made 25 years ago with modern compiler just fine (Windows/MSVC 2019) and I didn't even have the headers for them, I just demangled the symbols. Might be less okay on Linux, but I mostly never needed to do that (things were either freshly built or they didn't exist at all or were purely statically linked). Shipping dynamically built binaries between Linux distros or even versions of one distro - yeah I tried, that one NEVER turns out well.
The best thing about this channel is he makes videos a lot shorter than the other channels. He does not use storytelling to make a simple thing turn into a 15-minute video.
I like that programs don't depend on precompiled libraries. This way, you don't have to care about which version is installed on your system by your package manager, you always have the most up-to-date version (or even an outdated one if you need it), and as everything is managed by Cargo you don't need to compile and install the library (and its dependencies) yourself if you have an exotic setup.
Maybe the next big step in static analysis (like borrow-checking) is inventing a practical way to NOT cross module boundaries. To somehow record all the relevant information about a module in its interface, which can be then saved into an _auto-generated header file,_ which then will be used for checking the program that includes it.
In short:
1. Rust does not have a stable ABI.
2. Using shared libraries would break the static safety analysis.
These things make shared libraries impossible in Rust.
Devs hate shared libraries, I get it. BUT... as a user, I do NOT want 30 executables that use 50 statically linked libraries each, which might have another 50 statically linked libraries each, packaged as an "application". That is NOT AN APPLICATION, that is a piece of storage and memory malware. Exponential memory requirements for a working system are not acceptable.
It's M×N complexity. M apps have on average N deps. Like Facebook friends which should be N^2 but is actually just number of people times the average number of friends. But I prefer sharing too.
As a user I hate shared libraries. I hate having 2 executables relying on 2 different versions of a shared library when my package manager just has 1 on the current release of my OS. I usually compile statically if I can. Makes life for everyone easier at the cost of some disk space.
Thank you for this video. It took me some effort to find any mention of the Rust ABI, and I could only guess why the ABI is not as big a thing as it is for C. It makes more sense to me now.
What I wonder is how it would be possible to maintain/extend an ABI after it's been established, such that it doesn't invalidate code compiled against older specs of the ABI.
Compiling static binaries in golang is orders of magnitudes faster than compiling rust. I don't think libraries are (solely) to blame for compile times. (Definitely for the large binaries, though)
Seems to me the idea of source maps, as found in JavaScript bundles, would kinda allow Rust features to cross binary boundaries: an ABI-to-source mapping would let the borrow checker use the source to verify the code calling into the precompiled binary, crucially, without recompiling all of the foreign code.
I found this video confusing, although I am very new to Rust. You start by talking about long compile times and the lack of libraries in Rust; however, Rust supports both "lib" and "rlib" crate types, which precompile libraries for use in creating later static binaries, which you did cover. I guess you are saying there are no shared libraries with PIC unless they only support the C calling convention, and that is the only way to communicate with any other language in-process. But what would really be the value of that? For example, Swift code that knows how to communicate with Rust instead of C? Would that Swift code participate in the borrow checker? Would strings really be handled the same way? What are the permutations of language-to-language mapping where some use GC, some use reference counting, and others use the borrow checker?
Can you have object caching like blaze/bazel? Then you only need to compile a piece of a package the first time someone on your team's code uses it in a particular way. But you don't need the lock-in that creating an ABI causes. (Please pardon my ignorance.)
Afaik it already does that: your debug build is stored as a bunch of .o files in target/debug/incremental. If it doesn't seem to need to update a .o (file unchanged, Rust version not changed) it won't recompile that .o. That's why cargo run works instantly if you didn't make any change since last time. Presumably, if you pin the Rust version, just sharing the target folder should prevent having to compile from scratch. Have not tried that myself though.
For making games that's a big issue for modding of either the game or the engine. C++ has an ABI, and high-level constructs can pass between a dynamic lib and the executable, and templates are solved just by compiling them again yourself by importing the header ^^ But at least there is a way, even if janky. In Rust there isn't really a way besides going through the C ABI.
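For completeness, a hedged sketch of that "go through the C ABI" route on the loading side, using the libloading crate as a dependency; the library path and symbol name here are invented:

```rust
use libloading::{Library, Symbol};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    unsafe {
        // Load a plugin built as a cdylib and look up a C-ABI entry point.
        let lib = Library::new("./plugin.so")?;
        let entry: Symbol<unsafe extern "C" fn() -> u32> = lib.get(b"plugin_entry")?;
        println!("plugin returned {}", entry());
    }
    Ok(())
}
```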
In web dev, when JS files get mangled by a bundler, a map file is created to recover/unmangle the script. So if the C object is not enough to represent everything in Rust, why not add a map file to allow the same object to be used simultaneously in Rust and C programs?
I think that dynamic linking is often problematic. Would static linking be a solution? Even if it makes big binaries, it allows for smaller compile times...
Sometimes I wonder if it's better to not even try being efficient... i.e., why a shared library if you can dedup at the file level? Large binaries wouldn't matter if they were structured in a way that a) identical blocks of bytes can be shared among executables and b) the binaries can be downloaded in chunks. If you have a bunch of shared libraries, how is that different from a lot of shared chunks of bytes? However, this is obviously non-trivial. It would require aligning structures inefficiently (no packing) and agreeing on a minimum chunk size (extra padding). It would require every compiled blob to align somewhat with the source code, so that a small change in code (a small character or line change) maps to some compiled artefact. Perhaps compiling each function to a blob could achieve this. Once you have already compiled a function and you reference it (by content, like sha256 or something), then you could download it and/or reference it elsewhere by content. Not shared libraries, but a huge, world-wide repository of compiled function blobs that can be referenced in code by name and reverse-mapped to corresponding content through a central index. Maintaining such a repo would be a nightmare... maybe a blockchain could handle it....
Technically a lot of the libc stuff doesn't implement the hard things; they are just wrappers around the syscalls, though they do make it more convenient to do stuff.
“When you build anything in rust, you’ve to download a 1000 crates in cargo” I wish there were 1000 crates on cargo, and when you build you get raided. lol how can a programming language have the same lingo as my favourite video game?
While the compile time is high, I bet the second time you try to compile the code with few changes it will be faster, since most of the code is already compiled, right? I know other languages (or maybe the IDEs) do that optimization. Does Rust have this feature?
When you work on your project in an IDE, it works just like that. You have a local cache of the build for all dependencies. Plus, all the advantages of the incremental build of the current project. However, when updating the version of the compiler, the entire project along with dependencies will need to be rebuilt completely. In cases where you have many projects or an entire (Linux) distribution and many target platforms, the build usually takes place not locally but somewhere in a CI pipeline/on a corresponding stand, where connecting a large cache might not be cost-effective or easy to set up. Here, everything depends on the available resources - both hardware and personnel.
Doesn't C++ have a similar problem, although not as bad? WG21 doesn't have an ABI standard for C++. This means C++ libraries have to be compiled with the same compiler, be it Clang/LLVM, GCC, MSVC, Intel, etc, etc.
In theory yes, in practice no. There is no ABI defined in the C++ standard (beyond C-compatibility). But there is the Itanium C++ ABI standard used by all compilers on all platforms for decades, and then, there is MSVC. So, there is a stable ABI, it's just a separate standard (just like C, DWARF, ELF, POSIX, SystemV, etc., are all separate standards). A standard-compliant C++ compiler can pick whatever ABI it wants to, but in practice, it's been pretty stable. Even MSVC has stabilized even though they didn't really have to considering they control their entire tool-chain (and OS). The main ABI-related issues that C++ has had problems with are standard library components (notably std::string) that cannot easily be changed internally (e.g., small-string optimization) without introducing a mismatch between the header files one is compiling against and the libraries one is linking with. That is the bigger problem with ABIs for more type-rich languages like C++ or Rust, it makes it hard to change the function signatures or internal representations of types. And having pre-built libraries for any arbitrary set of dependencies that all must have consistent versions for all their shared dependencies makes for a combinatorial problem that isn't easy to solve. This is basically what C++ modules try to solve.
It doesn't take forever to compile, where did you get that from? It takes time on the first build, but that's it, and that is not a problem. Maybe if you're on Windows, I could imagine that. On Linux, using dynamic linking and the mold linker, it's blazingly fast.
I wonder if this approach of creating an ABI layout for a type could also enable more interoperability with languages other than C, by enabling a crate to generate wrapping code for a language such as Python to be able to import the compiled Rust library.
It's perhaps a good thing to not have dynamic libraries. It can be really limiting to try to maintain backwards compatible ABI. Force the developers to recompile everything all the time and I think it will promote libraries being updated to new compiler versions faster. Rather than being forced on some old version because some library doesn't get updates anymore.
The question is, do we really need a stable ABI? The main reasons for DLLs / libc-like libraries mentioned are file size and compile time. The compile-time part only affects the initial compilation, so it's not as bad when actually developing. The file size, on the other hand, is something that in my opinion is not really an issue. In a world where a single photo is in the tens of megabytes, games are multiple hundreds of gigabytes, etc., I don't think it matters that you might have binaries with duplicated code that are a few megabytes each... Of course that's just my opinion on all of this ^^"
It doesn't seem that bad with an individual program, but if every program on your computer was statically compiled, the bloat might be noticeable, though LTO could reduce this. Checking online (since I'm on my phone, not laptop), I've seen glibc be 2, 5, and 8 megabytes in size. Multiply that by however many programs you have for a measure of bloat that is probably incorrect due to ignorance or optimizations.
@@kuhluhOG 100% agree. At the same time it's also its downfall, because if a patch has a breaking change it will (unintentionally) break a bunch of applications in the process. There was a recent libc update that did exactly something like that, IIRC. ^^
@@GeckoEidechse Well, the update you mean was INTENTIONALLY a breaking change; yes, really, the glibc devs sometimes think that's a good idea... Nonetheless, going via dynamic libs has an additional upside on some operating systems: you can allow certain system calls only from specific code. E.g. OpenBSD only allows system calls coming from its C library; you aren't allowed to make them yourself (and if you do, you get killed). That has the advantage that you can ensure certain checks are done before you get into a privileged context.
@@kuhluhOG And trivial to bypass using the dynamic linker. Never let an adversary have local access. I get the historical context but where the most severe bugs happen at has changed in the last decade. Libc and zlib aren’t the normal vectors now. Most issues are in the program or a direct dependency now not a system wide library.
On server/desktop Linux the trend is Flatpak, doing less dynamic linking, as it is a constant issue with software distribution. In NixOS dynamic linking is expressed as static linking, only yielding advantages in saving RAM. So far I feel that not having dynamic linking in Rust is not an issue. Thanks for the video! Love your channel!
@@SomeRandomPiggo It absolutely does, Snaps and Flatpaks (hell, even AppImages) can be gigantic. Same with distros doing static linking. The belief is that RAM and storage capacity, and network speeds, are good enough and will keep improving to support this, so we're willing to trade file sizes off for other benefits.
It's not like you couldn't do it, it's "just" not as simple. You would need a compile time part and a runtime part, similar to C++ shared libs and headers. However the compile time part would need to contain a lot more information than a header file. For now, it would basically have to be an intermediate representation of the full code that static checks can run on, as well as information about the binary layout of data. However, a better approach for the future would be to establish language features for explicit guarantees that can be checked at library compile time and assumed at library use time.
Isn't WASM currently attempting to solve the same issue? I heard about the WASM Component Model draft and that it basically defines an ABI across all languages by using only a simple but powerful common subset of most languages.
LLMs will be writing/checking all the code in the future so what does the language matter? I'm seriously asking. Will the AI even use code in future or is everything going to be generative? And if there is code, wouldn't an AI just write in whatever the lowest level code is in context?
If rust used pre-compiled libraries instead of source crates, you wouldn't be easily able to use a crate that does a specific thing you need on many different platforms. Also, precompiled libraries kinda exist, rustc will compile every crate into rlib first, then compile the final binary. But rlib format is compiler version specific, there's even an error for incompatible metadata
@@YandiBanyu Ah, I was wondering if I would see another Nix user. It (and its sibling Guix) is pretty much the only system package manager that I could think of where packaging Rust libraries actually seems possible. It basically comes down to reimplementing Cargo in the Nix language, which is easier said than done, though people have tried, like with crate2nix.
Why not just create 2 crates: one for the actual library that does most of the work and exposes functions through the C ABI, and a second one, a wrapper, that uses the C ABI and has wrapper functions to convert it back to Rust? You can distribute a binary of the first and compile just the wrapper when building your Rust program.
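A minimal sketch of that two-crate idea (all names are hypothetical; on the 2024 edition the extern block is written `unsafe extern "C"` and the attribute `#[unsafe(no_mangle)]`):

```rust
// Crate 1 ("core_engine", built as a cdylib): everything crosses over the C ABI.
#[no_mangle]
pub extern "C" fn engine_add(a: i32, b: i32) -> i32 {
    a + b
}
```

```rust
// Crate 2 (a thin wrapper compiled with your program): links against the
// prebuilt cdylib and gives callers a plain safe Rust function again.
#[link(name = "core_engine")]
extern "C" {
    fn engine_add(a: i32, b: i32) -> i32;
}

pub fn add(a: i32, b: i32) -> i32 {
    // SAFETY: engine_add has no preconditions beyond two valid i32 arguments.
    unsafe { engine_add(a, b) }
}
```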
I always find myself thinking about dynamic ABIs that are basically typelevel functions and generate portable struct layout specifiers, invariant checks, and suchlike.
There has to be a way to compile binaries in a way that documents its side effects so that these documented side effects may be used for compile-time checks without needing to recompile the binary. Unless some of these properties aren't guaranteed in the end product even if all components satisfy them? I also imagine it would introduce some interesting security considerations, like signing side effect documentation to prevent spoofing.
The build times can be annoying but surely recompiling especially with semantic versioning allows bug fixes to be applied that would not exist in the pre-built binary. Building rust at the point of use with regular rebuilds should be the standard right? That kinda relies on devs using semantic versioning correctly though.
If it is just about compile times, it is possible to create precompiled binaries though. Like dtolnay did with serde. Only the method has to be well documented. As far as shared objects are concerned, it should be possible once the ABI is standardized. Rust shared objects need not work like C shared objects. Additional information for static analysis can be embedded within those objects in order to help the borrow checker. It is not a trivial task for sure, but I think that it is definitely possible.
What if, when the Rust compiler compiles code that uses a library, it downloaded the source code of that library and used it to statically analyse the library while compiling your code, to ensure that everything obeys the borrow-checking rules? It wouldn't fix the compile-time issues, but it would be a good first step toward allowing Rust codebases to take advantage of libraries.
C/C++ needs multiple files to compile and work: the library itself (not necessarily dynamic; one may use static libraries too) and header files, which are used by the compiler to understand the interface of functions. For example, the compiler could use them for lifetime checks. Maybe Rust's lifetimes are too complex and depend on the definitions of functions too, in which case declarations alone are not enough for compilation. When it comes to generics, it is true that the compiler needs the definition of the function, but C++ has templates that are similar to Rust's generics. When they are used, obviously, the whole definition of the template/generic class/function is visible to the compiler and compiled into each application, etc. (Actually, in C++ a template function instantiated for the same concrete type may be compiled multiple times in one application, because there are multiple translation units and they know nothing about each other, so they don't know the compiler already created that template instantiation for that type.)
To be honest, the compile times never bothered me with rust. I don't think that it is actually such a big problem as we always read online. It is only the first compilation of a project that takes this long. Also, I LOVE that Rust builds everything into a single file. I was previously coding my hobby projects in C# and I voluntarily started building standalone binaries instead of the library-dependent ones to prevent dependency issues on the end device. I know that we now have solutions like flatpak for this (and I love flatpak!), but standalone binaries also work for the projects I did, with less hassle to deal with. And in my 5 years of using Linux, I've been there multiple times that my distro's libc updates and a bunch of programs break because they haven't been updated yet. Of course, this approach also isn't perfect, as we see with filesize and compile times. Also, from a security standpoint, a program could ship with an old, vulnerable library, and it's the developer's job instead to update it. Still, I have enough time and disk space, so this approach works better for me.
I think it is possible, yes, but with some concessions. For example, while you cannot ensure the safety of libraries in other languages, you can "kind of" ensure, at some level, the safety of Rust library code by requiring external functions to move their checks to runtime: instead of relying just on the compiler to do the safety checks, rely on runtime types that should be stable between Rust versions (or at least embed a version property so each caller can check whether the caller's and callee's versions are compatible). So:
* The library function requires a moved struct? Rust will treat it as a normal function and let the called function choose how to dispose of the value.
* The lib function requires an immutable reference to the struct? Fine, just pass the reference and trust the safety of the caller.
* The lib function requires a mutable reference to the struct? Same as before; maybe make a custom mutable-reference struct to check whether it was really released externally (or just don't, optimistic reliance can be a thing if your project and the library agree on the stable version).
* How do we agree on the version? We pass a special struct containing the information about the compiler we both used; if it is compatible (i.e. no breaking changes occurred between the versions) we let the call happen, if not it panics at runtime (or whatever else).
It appears to work, at least in theory, to solve the problem.
*** Of course this will not work for languages other than Rust, but if what you said about compile times and binary sizes is real then you could probably use these tricks to make it work.
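A toy sketch of that version-handshake idea; the struct, the hard-coded version numbers, and the "identical versions only" rule are all made up for illustration:

```rust
// Hypothetical handshake: both sides embed the toolchain version they were
// built with (hard-coded here for illustration) and refuse calls on mismatch.
#[repr(C)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct AbiVersion {
    pub major: u16,
    pub minor: u16,
    pub patch: u16,
}

pub const BUILT_WITH: AbiVersion = AbiVersion { major: 1, minor: 99, patch: 0 };

pub fn compatible(theirs: AbiVersion) -> bool {
    // Toy rule: only an identical toolchain version is trusted.
    theirs == BUILT_WITH
}

fn main() {
    assert!(compatible(AbiVersion { major: 1, minor: 99, patch: 0 }));
    assert!(!compatible(AbiVersion { major: 1, minor: 98, patch: 0 })); // would refuse/panic instead
}
```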
A major feature of Rust compared to other languages is that it in fact does *not* use runtime types; all of that is *checked at compile time*. This is an important thing here.
@@Sandromatic This is not strictly true; Rust does runtime checking in types like Rc or RefCell (which implements a runtime borrow-checking mechanism), you just don't need to use it everywhere because the compiler can do it as well in the many cases where checking this statically is possible. So it would not be a "break of feature" to do what I'm saying here.
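A quick demonstration of that runtime borrow checking with RefCell (plain std, nothing exotic):

```rust
use std::cell::RefCell;

fn main() {
    let cell = RefCell::new(5);

    let first = cell.borrow_mut();
    // The compiler doesn't reject a second mutable borrow here; RefCell
    // enforces the exclusive-access rule at runtime instead.
    assert!(cell.try_borrow_mut().is_err());

    drop(first);
    assert!(cell.try_borrow_mut().is_ok());
}
```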
What? Rust syntax is not messy. I find it much tidier than C++ although that can be down to personal preferences. I don't know how it could be objectively described as messier than C++ though.
That's questionable. Rust is absolutely unreadable unless you have thoroughly learned it, or you only look at trivial examples. C++ is not as bad, although if you use all of the more advanced features that crept up over years, it can be also weird to read for sure. I'd say, tough call here, although if you stick to a simpler subset of C++, it's much more readable than any Rust. That said, C++ has grown into a monster for sure. But we don't need a competition of unreadable programming languages.
@@joseoncrack Oh come now. I could just turn that around with equal validity: "C++ is absolutely unreadable unless you have thoroughly learned it, or you only look at trivial examples. Rust is not as bad," No really, this must be a familiarity problem. Pretty much all of my Rust applications are only using a subset of the language, as was the case during my C++ career. Most of it as readable as anything. Even my Python head colleagues understand it. What I began to loath about C++ is that it adds more and more features and complexity without ever fixing the robustness problem I have hatred about it since forever.
Rust's static analysis requires having all of the semantic and structural parts of the program available so it can ensure that everything is consistent. It should be possible to do this using an intermediate form built from the type information and abstract syntax trees, thus not requiring the full source for the things that are now in crates. Over time, this representation can be made more compact. This probably rules out compatibility between the output of different compilers, and even different versions of the same compiler, but if the pre-parsed crate code is stored with the crate, an incompatible compiler can simply compile the source and ignore the shortcut.

Binary libraries are a different issue. There must be a way to define a common representation of types exported by a library such that all programs that depend on the library will agree on its structure in memory. Don't allow generic parameters to library functions.

The problems described in this video are only a subset of the reasons I can't use the language in my embedded project. The biggest hurdle is that Rust is unavailable. Now for my obligatory rant: due to the NSA having become brainwashed into believing C cannot be trusted under any circumstances, I am forced to reinvent the wheel and write a full network stack including TLS, HTTPS, and a PTP client in Ada, which is the only approved language available for the MicroBlaze soft microcontroller. This new rule alone has added about 4 million dollars to the cost of developing this product. In the past 10 years, no piece of C code I have released has had any sort of buffer overflow or memory leak. This is provable. There's no dynamic allocation; all structures and arrays that are not on the stack are statically allocated. The code is efficient, deterministic, and robust. It takes a lot of discipline to write this sort of code in any language, Rust included.
Does Rust at least tree-shake so it compiles only the code in crates that is actually used? It would be really nuts if it compiled a lot of stuff that isn't even used. At that point an interpreter might be a nicer development experience by far. And from what a JIT actually compiles, it could even have a leg up on tree shaking, perhaps.
What changes would you make if you switched from ELF to PE/COFF or Mach-O? Do these other binary formats offer features for accessing the semi-random data that the Linux ABI doesn't? Also, if you're making a single binary, why can't Windows compile each library individually and, instead of merging everything into one binary, place each one into an NTFS alternate data stream? This should also work on the Mac by using forks in Mach-O, I think, since ADS exists partly for cross-compatibility with the Mac. Bringing ADS to Linux is controversial due to the 255-byte limit for extended attributes.
The reason nobody uses alternate streams is that they're NTFS only, if you move the file to a non-NTFS drive (like a FAT32 usb stick, or a NAS running linux) then you just lose all streams except the main one.
I have an idea for lib management that would not only fix the lib installation issue, but also help with using libs in the first place. C programs could compile against clib files: a clib file would be bytecode containing the pre-compiled library plus the header info. When you consume the library, you use #include "foo.clib" and you're done. The compiler would then compile the lib to an .so file and do the usual compile-and-link process we all use today.
I'm a noob, but bringing everything into itself to compile and check sounds great to me, letting the compiler gain full insight and control over what is and can be done :) Also it sounds like it would remove all forms of external dependency, since it's all in the binary, everything.
So rust is currently ideal for use cases with "small projects" that don't rely on a bunch of external code where this is a non-issue? Like embedded etc. And less ideal for massive projects, especially if you want to be able to compile in separate parts.
If you have a function prototype and its address, you can call functions by pointer just like in C++, hehe, but you need unsafe mem::transmute to reinterpret the pointer as that prototype. I experimented with low-level calls to "external" functions some time ago, and in Rust it works better than in C++, even without asm macros lol
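Roughly what that looks like; this is just a sketch (the `add` function and the address round-trip are made up for illustration, in reality the address would come from a loader or some FFI boundary):

```rust
use std::mem;

extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // Get a raw address; pretend it came from dlsym/GetProcAddress or similar.
    let fn_ptr: extern "C" fn(i32, i32) -> i32 = add;
    let addr = fn_ptr as usize;

    // Reinterpret the raw address as a typed function pointer. Only sound if
    // the signature and calling convention really match the code at `addr`.
    let f: extern "C" fn(i32, i32) -> i32 = unsafe { mem::transmute(addr) };
    assert_eq!(f(2, 3), 5);
}
```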
i’ve thought about this a bit before, i use nim not rust, but it would be nice if the safer, easier, higher level types in both languages and many others could be more easily connected. it would reduce the world’s dependence on c/c++, which i think is a good thing. and before you say nim doesn’t count here because it compiles thru c (and others), there is a third party llvm backend which could someday become the default, and have no dependence on c
while we wait for rust libraries, go learn to code in C at lowlevel.academy :)
Propaganda!!!!!! Rust is the way to enlightenment.
There's actually a talk that the Rust Foundation uploaded on one project to create an ABI (Zenoh):
th-cam.com/video/qkh8Fs2c4mY/w-d-xo.html
I'm pretty sure you can dynamically link pre-compiled rust libraries. The bevy game engine recommends use of it to substantially reduce compile times.
@@c4explosivesinyourcartrunk817 Nah
C and C++ kinda works
Please, can you give us feedback about the Odin language?
You could have said "Dynamic Library", i genuinely thought Rust Foundation messed up something big again about Libraries in Cargo like they did about the brand / logo last time.
I had the same confusion initially.
I'm not even far into the video, and I realized he was talking about shared objects. If you link all your C code with the equivalent '.a' libraries, those programs all wind up pretty huge too.
this is intentional to "maximize engagement"
Seems entirely deliberate.
Time to unsub and click "Don't recommend channel"
ELF ABI isn’t actually a term. The ELF Format supports multiple ABIs, for example my gcc compiler uses the System V ABI
good catch, ty
@@LowLevelTV you are a scholar and a gentleman, sir. Though we disagree, I am impressed by your reaction to a mistake.
@@tiranito2834 Would you rather have him try to gaslight the guy into thinking he’s wrong and cause a huge fight?
@ike1000 Back in the 1990s, we called that "being an adult".
If the bar were any lower, four-year-olds would have run up quite a tab.
@@kayakMike1000 it's a normal reaction.
Btw, unsafe Rust code does NOT disable borrow checker. It allows unsafe function calls, raw pointer dereference and stuff like that, but does not disable borrow checker
What he means is that if you wrote a safe function in Rust and I wanted to use it, it will only work if I have your code in my project.
You obviously cant pass rust objects to foreign code like that in a C DLL
Specifically, the unsafe block allows:
* Calling unsafe functions
* Dereferencing raw pointers
* Accessing a static mut
* Accessing members of a union
And I think that's all! (assuming I remember correctly 😅)
Notably, however, unsafe blocks are not the only way to create unsafe Rust: implementing an unsafe trait, using the #[export_name] / #[no_mangle] attributes off the top of my head, but there's probably others.
@@SimonBuchanNz unfortunately currently you can produce unsafe code even without all that stuff if you abuse some of known compiler bugs.
Although I consider them as not something you can face in the real production code
@@arjentix while that's more accurately "unsafe rustc" than "unsafe Rust", it's a bit academic without a spec. They're finally working on one though!
true, but you can go around the borrow checker with raw pointer dereference.
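For illustration, a tiny (still sound) sketch of what "going around" it looks like; raw pointers simply aren't tracked:

```rust
fn main() {
    let mut x = 1;
    let p1 = &mut x as *mut i32;
    let p2 = p1; // two aliasing "mutable" pointers: raw pointers aren't borrow-checked
    unsafe {
        *p1 += 1;
        *p2 += 1;
    }
    assert_eq!(x, 3); // keeping this sound is now entirely on you
}
```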
FYI, Rust *absolutely does* have support for dynamically linked libraries, including all Rust features. The problem is that the ABI is unstable across compiler versions, not that it doesn't exist.
So, in other words, not ready for primetime. Effectively that means it's not there.
@@falxonPSN No? It means that the ABI is unstable.
@@falxonPSNnope, it actually means that rustc actually does things as opposed to cc. Also, have you _tried_ to move a C program compiled against a version of libc to another system with a different version?
No, the real real problem is that traits are monoporphized at compile-time. So you can't do that can't do that in a dynamic lib, since traits aren't really "real". The dynamic lib could include a fixed set of instantiated objects, but no more than that. C++ has the same "problem". Templates libraries have to be compiled in.
@@lucass8119 but C++ libraries still exist anyway.
- Wait ?? It's all C/C++ ????
- Always has been *error: no matching function for call to ‘typename std::enable_if::type>::type>::value’ to ‘int’*.
I am NOT reading all that
@@c4explosivesinyourcartrunk817 Too many people said that to me the past hour💀
error: Segmentation Fault
what in the holy name of Christ is that monstrosity?
I've been having a lovely time with C not having to see those horrifying STL errors.
It's important to point out that compilation is only long on clean builds. Incremental builds are okay-ish, during development.
AHAHAHAHAHAHAHAHA COPE
can confirm.
the first build takes a minute because you have to compile the crates,
but cargo only recompiles as far as what has changed since last time.
@@catlolisNope, just true, during development its almost unnoticible unless you're doing some kind o hot reloading like in the JS world. Its a bit misleading for someone who doesn't know Rust when hearing about compile times because they assume its slow all the time. The real problem with Rust compile times is when compiling to production, if you have some CI to run you're not going to have a good time.
DUDE rust compile time is slow af even on a 3070 spec out laptop @@jonnyso1
@@jonnyso1 recompile speeds are also not appropriate, it’s 2024. It shouldn’t take 10-15 seconds to recompile my program
"if i pass a mutable reference into a compiled binary, how is it possible for the borrow checker to make sure the reference is used in a way that is safe"
the borrow checker would've already made sure of that when the library binary was compiled, then, the checker in your code would look at the function signature to enforce the rules at your side. if both sides were checked individually, and the interface contract (function signature) is respected, it should just works as if you checked them at the same time
Can my borrow checker assume that the borrow checker used for the library is valid and not otherwise compromised?
@@stephenreaves3205 No it can’t. But that’d just be a compromise you’d have to make.
Dynamic libraries are shipped as pure machine code with basic linking information. You couldn’t tell such a library was ‘safe’ much the same way you can’t read the source code of a proprietary DLL you load with your C++ program.
@@stephenreaves3205 I mean the whole idea of DLLs is that you have to trust the linked library, and that's why injecting / modifying DLLs has been a way for bad actors to hack programs. Trusting that the DLL's borrow checker did good job is functionally the same as trusting that the installed libc DLL won't blow up your PC when you use it
@@stephenreaves3205 That's the trade-off; it's the same with foreign/precompiled C objects: you never know if the code is valid and UB-free. Since C compilers don't check that as intensely as rustc does, it's just less apparent on the C side, of course.
GPG-signed libraries/objects, as with package managers on Linux, could maybe be a compromise for validating trusted sources.
@@baileyharrison1030 but I think Rust has pigeonholed itself, because C++ doesn't have a safe/unsafe split and Rust does. By their own logic, I think they'd have to treat every dll or so as unsafe, which, at the very least, will be annoying as hell
I don't see why the borrow-checker would not be able to work across boundaries for other libraries written in Rust. If the functions within the Rust library violate borrowing rules then the library shouldn't compile in the first place, right?
Since there is no stable specification for how these things are compiled, you would get undefined behavior between binaries compiled with different compiler versions.
The borrow checker works at function scope; it needs to see the function signature and all the signatures of the functions/types used inside. So yes, it can work as long as full function signatures with full type information (mutability, lifetimes) are available. The hardest part is generics, because any compilation that involves generics is deferred until the actual use site. I suspect it would make sense to severely limit the Rust type system for this kind of ABI, because otherwise you would need a full-blown Rust compiler to parse all the type info.
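A small sketch of that idea (my example, not from the video): nothing below depends on the body of `longest`, only on its signature.

```rust
// Everything the caller's borrow checker needs is in this signature:
// the result borrows from both inputs for the lifetime 'a.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() > b.len() { a } else { b }
}

fn main() {
    let s1 = String::from("a fairly long string");
    let result;
    {
        let s2 = String::from("short");
        result = longest(&s1, &s2);
        println!("{result}"); // fine: both borrows are still alive here
    }
    // println!("{result}"); // rejected from the signature alone:
    //                       // `result` may borrow from `s2`, which is gone
}
```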
One implication of an ABI is that the binary library you are calling isn't necessarily written in the same language; that's why so many languages use the C ABI. The task here is not only to make sure two binaries can interface with each other when both are compiled from Rust, but essentially to define what the fundamental Rust-typed interfaces even look like in binary form, so that the higher-order data types (like the UTF-8 strings and enums he mentioned) can be understood by both sides. And it means one or both sides may not even be compiled from Rust at all; it may be a future Rust-compatible language, just like how we use the C ABI from modern languages, not just from C. And those languages may not have a borrow checker at all.
@@thisisnotok2100
What if we just required that specification for Rust libraries, by passing a stable struct with the version information in the calls?
@@play005517Okay, that makes sense. I was only thinking about things in the context of linking against shared libraries written in Rust, but you can call any extern function that implements the ABI, regardless of what language it is written in, and the borrow-checker can't help you in those cases. Thanks.
Wait. C++ has the same problem and solved it in a pretty clean way: just encode type information in the ABI data. If you disassemble C++ library objects, you will notice that symbol names aren't clean function names from code and contain a lot of encoded information (such as constness of objects, calling convention and other). This technique is called name mangling. Can't Rust do the same?
And regarding templates (static polymorphism): it's the same situation in C++ as in Rust: re-evaluate and recompile on every use. That's one of the reasons you still need header files for compiled libraries. Some libraries in C++ are so template-heavy that they go header-only and cannot be built by themselves.
"Can't Rust do the same?" Rust does the same. Same deal for templates: rust .rlib's are basically like C++ header-only libraries, except that they are more like "precompiled headers" or the new C++ module format in that as much work that can be done is done up front but all the rich type system information is there for later monomorphization.
The real difference is that the .rlib format is not *stable*. That is, the rust project explicitly permits itself the flexibility to change this format between versions of the compiler, and exactly because of this it doesn't get into the ABI backcompat hell that C++ has trapped itself in. The major challenge with "crabi" is not having a format (Rust has that already), it is providing the necessary support for forward compatibility and ABI versioning, because if you are not prepared then there is no "stability without stagnation".
ABI and name mangling are two very different things. Name mangling only solves problems like multiple instances of the same generic/template function in a binary. The ABI tells you where arguments are located (registers/stack), how they're ordered, and the sizes and alignment of fields in structs, etc. So far, only C has a stable ABI. C++ just promises, "we won't break the ABI!".
As an embedded Linux integrator (Yocto), I can say this makes all our applications 10MB+ in size, which will become a problem very soon with our 256MB of RAM.
Great video
Didn't Google give a million to the Rust Foundation to improve C++ interoperability? I think many more will ask and maybe donate to the Foundation to make the devs there focus on this, so in the near future I wouldn't be surprised to see good, or somewhat good, options for improving this
some problems cannot be fixed no matter how much money you throw at it. Just ask or look at the US government
I think they're making their own Carbon language
@@SharunKumar They are but did recently give a donation to the rust foundation to increase interoperability
That money was probably spent writing and enforcing their COC and witch hunting people using the "Rust trademark"
@@LeviShawando 💀thats crazy
Over in WebAssembly land, they're working on something you could call an "ABI" with the Component Model, where the interface a component (=library or application) imports/exports is defined in a separate file (a .wit file), and then bindings from that file are generated for whichever language the component is written in (C, Rust, Go, ...). If other libraries or applications then want to use this component, they also generate bindings for their language from the same .wit file.
But this is just like COM, WinRT, CORBA (anyone remember that?)
Just like in C++ if I want to use your fancy safe amazing Rust code in my Rust app, I have to build your source code in
@@VivekNa Exactly. WinRT's ABI is defined in a library's companion ECMA-335 ".winmd" file. It's a slight improvement over COM's .tlb metadata file. Not elegant, but they work. I'm assuming Rust will need the same infrastructure.
@1:00 easy solution: c++ instead of c. use smart pointers and std::array over manual memory management and c-arrays, and you wipe out 95% of memory issues.
Smart pointers like shared_ptr in C++ have a runtime cost, but Rust's borrow checking only has a compile-time cost.
Yknow if there was a universal easy solution, we probably wouldn’t have this discussion
It is totally possible to put Rust libraries in their own shared objects and link them at compile time. The problem is that the ABI isn't stable across compiler versions, which limits the usage of those libraries by a lot.
The only application I can think of for this is an embedded OS environment with very limited RAM, where you can pin the Rust compiler of a project to a fixed version and compile one shared object that you use across projects. This way the shared object is only loaded into memory once.
He literally says that at 3:10
Shared libraries are one of the worst things that ever happened. I remember when we used to compile linux programs from source and you'd get a ton of errors since libraries weren't compatible. Or when a Windows app needs a specific dotnet version because they're not backwards compatible. Having an entire app package makes the most sense.
I don’t think you hate shared libraries, rather the systems that manage your dependencies. Ever since I started using Nix I realized package managers can actually be good.
@@aintnochange
???? How would standalone programs reduce the amount of dependencies you need to install? If anything it would increase the amount of dependencies you need to install because applications won't be able to re-use the existing libraries.
The size of a single binary might be acceptable, but if you scale that to all the different binaries a system uses, and you want each one of them to have their own libc.so et al. that's gonna bloat the shit out of memory usage.
@@aintnochangeEverything adds up. It would also lead to quicker compile times and dependency installation that you complained about
**unversioned** shared libraries were a mistake. declarative binary compatibility checking is one way this was resolved, along with interface version checking (if you remember the old days of OLE/COM). you make mistakes and you move forward with solutions. dll/dependency hell was an interesting time, lol
@@XxZeldaxXXxLinkxXPlan9 from Bell Labs proved that even without tree shaking (dead code removal, which all modern compilers have), things were not bloated. Plan9 does not have dynamic linking at all. Dynamic linking is a good thing, but its original use for saving memory on a 80386 with 8MB of RAM isn't valid anymore. Docker containers also prove that the memory saving argument is no longer valid. Dynamic libraries' main use is plugin functionality, and driver frontends, like Vulkan, as well as allowing call interception.
The long compile times are actually not only compiling but also linking. Loved this video, good work!
While the Rust ABI isn't stable, cargo still calls rustc for each crate separately, generating intermediate artifacts that get linked together in the end. While the ABI isn't stable between different versions of the compiler, it _does_ need to be stable across invocations of the same compiler. This reveals a very simple solution: separate library outputs by compiler rather than just having a single version for each crate. Cargo actually has hashes for intermediate artifacts as well, encoding not just the crate and the compiler version, but several other build parameters as well. While these are currently local to each project, there's nothing stopping them from being global.
Would this actually help? Honestly, probably not. This is because whereas C projects specify the names of their dependencies and maybe a minimum version, Rust crates include a lock file that requests specific hashes of each dependency, and customization of the libraries' features is specified by the crate author as well, rather than by a third-party packager. These, combined with distinguishing libraries by the compiler used for them, make the actual chance of two separate projects sharing the exact same configuration of the exact same dependency with the exact same compiler and the exact same compile flags very slim. It really would be no different than it is right now, just with the overhead of dynamic linking to a library only actually used once.
That hasn't stopped some people from treating each crate dependency as a separate package, but the only distro I know of where this isn't crazy is NixOS, since it already distinguishes packages by all of the factors mentioned previously and more. (In some ways, Nix acts more like a language package manager rather than a system package manager, despite generally being used as the latter.) This is still rather niche though, with most rust projects either being packaged as a single unit, or sometimes with all the dependencies as a single separate package linked into the final package.
ik I'm probably missing something
but why didn't they just "half-compile" the packages?
the lexing, parsing, semantic checks, and everything else where your own code doesn't matter can just be done once, right?
or did they already do that?
This does happen, yes. The trouble is that it's rustc-version specific and not stable or anything. So if you compile something on your machine and then compile the same thing on the same machine again, it won't have to compile those same things again.
@@Sandromatic Source code is also a lot smaller than any intermediate state tree as well.
@@Sandromatic That's insane!!!
A similar issue can be found in C++ when using templates. If you write template code it needs to be recompiled for basically every use (you generally can't put it into the .cpp file; it needs to be in the .h/.hpp file).
Working around that is possible, and since it's only a part of the C++ language I guess it's not as big of an issue as in Rust, but it's still an issue IMO.
I wonder if there are also options to pre-compile or at least pre-tokenize Rust code (this does exist for C & C++ headers although I don't see it used much). That would likely fix some of the compile time overhead but not really the binary size.
It should also be noted that a lot of the binary size isn't just code. There is a lot of information regarding symbols and other stuff left in the binary which can be stripped out (either via the cargo strip setting or using the strip command). Link-time optimization is also usually turned off to reduce compile times a bit, and dead-code elimination is less effective due to Rust using multiple compilation units (also to reduce time by doing some work in parallel). I generally change those settings for my release target, which does add some compile time but can have a large impact on binary size (especially in an embedded program where everything is statically linked; it gets the code size pretty close to C).
Idk in c++ you can either have header-only libraries, and/or have compilation units that precompile those templated functions for specific types (type specialization)
E.g. you could have a *.hpp with your structs and function declarations, a *_impl.hpp that has the full implementation, then your library may include precompiled specialisations for float/double in a *.cpp, and if you want to do some special stuff, you include the *_impl.hpp in the using application.
IMO gives enough flexibility to use them in a library depending on the level of complexity of the code.
This is where I think Java and C# get their generic types right. Java generics undergo what’s called type-erasure which essentially treats the type argument as the same type. It’s not as powerful as C++ templates but at least it only compiles once.
@@RetroAndChill Java and C# are a different cup of coffee because they have their own runtime and have a lot more freedom to put info or custom features into their binary format.
C, C++, Rust have limits because they need to compile and load as machine code.
PS: In this case it's important because in machine code you need to know how to interpret and read memory, this is done by using instructions and functions based on the type. In other words type erasure is impossible in those languages without significant overhead (like putting everything into an object with a vtable, which is what java does but also optimizes for in the jvm)
@@Ether_Void Definitely part of the way Java can do what it does is because generics must take in a reference type and thus is doesn’t have to concern itself with the actual type. It basically can just treat everything as a pointer to some base type and then just static cast it when you’re about to return.
C++ modules are sort of solving that issue by basically "semi-compiling" modules into a kind of C++ "bytecode" (unique to each module interface, opening a new compatibility layer between API and ABI that so far only applies to each individual C++ compiler); however, it eliminates the template problem while compiling to a nearer-to-low-level form that can be more rapidly compiled down to machine code. There are a lot of nice things about this, like the separation of global scope for each module (including the C preprocessor) and an even more performant incremental build system, but it comes with annoyances too.
This video is evidence for my new hypothesis: once a programmer starts learning about abstract syntax trees, and creating their OWN language, they start wanting to cobble together programs with assembly from multiple compiled programs.
🤔 Interesting hypothesis. Is the main motive for that behavior a wish to optimize compile time or something else?
Idk about the making one's own language bit, but certain tasks seem to be better handled by language specific libraries and frameworks and they might only be good for their single purpose. An example would be a C audio processing and GUI framework that might not have a robust cryptography suite, and it might be easier to make a lightweight, app-specific crypto library in Rust than it would with C
Yeap. That's my case.
@@somedooby That would be a case of creating a better alternative and isolating similar code, but isolating code could also be done by sharing source, instead of creating different binaries. My guess for that hypothesis is that, with the experience one gains from creating a language, he sees some new benefits that were not self-evident before. Something that makes his life easier and reduces either the time or the amount of work. Like if his language cannot write fast low-level code, but Rust and C can do it.
IMHO the original K&R C was essentially a super macro assembler. A way to code in a higher level format, using a compiler for the parts that weren't that time sensitive, and in-line assembly code for those parts that were.
not to be that guy but the correct terminology is a high level assembler whereas a macro assembler lets you create super-instructions via "macros". C is neither of two things because it has a grammar to it whereas assembly language doesn't have a grammar.
Of course it does. It's just not a recursive grammar.
That's not even true. C isn't a "macro assembler," it's a high level programming language. It may not feel that way by today's standards, but the definition of high level is simply that it's not architecture dependent, not that it has a billion features. C was originally created as a high level language to write Unix programs in, and eventually they ended up rewriting the entire Unix kernel in C. I don't think it was ever considered "slow" outside of like game dev. And even then, the Amiga had a C compiler way back in 1985.
@@minneelyyyy C with inline assembly isn't a high level language, I think that's what they were getting at
@@seeibe and inline assembly isn't even standard C... They didn't use it in the early days.
Looks to me that this is just Rust being relatively new. C has had decades to try and perfect itself.
I can't wait to see what they do with Rust, since it is unironically one of the best languages I've ever learned.
That or someone makes a language like Rust that works around the mistakes mentioned.
Rust not having a stable ABI is not a mistake, it's forethought. Don't give promises you aren't ready to deliver on. Rust is thinking about how to deliver such a promise, but it's actually a lot of work to not screw up and you have to get it right the first time because of forward compatibility issues. (At least, you have to make sure that you leave space for v2.) C didn't do this so there are now a bunch of tricks people use to get stable ABI in the presence of some but not all changes, and of course there are a thousand ways this can fail horribly.
Even C++ has problems with ABI. GCC has in the past had to revise the C++ ABI in backward-incompatible ways. C will probably remain relevant as long as shared libraries are around.
very interesting topic and well explained too! thank you!
You could say that binary interface is a bit.
Rusty.
; )
WHY ARE YOU BOOING ME I AM RIGHT?!
we let bro cook for a reason, keep doing what you do
so, in the end, Rust is just like Nodejs and its lovecraftian-sized node_modules/?
🚀
Yep. The builds on my last rustproject needed 32GB of RAM and 90GB of 'stuff' on disk to complete with tests.
It's ridiculous.
Now I'm learning zig.
Unlike npm, cargo stores the source for each version of each library only once and reuses it between projects. But for Rust, the "target" compile-cache directory is way bigger anyway.
@@peterixxxZig isn't an alternative to Rust; it's an alternative to C. It doesn't give you most of Rust's safety features, so if memory safety is the reason you were interested in Rust, Zig is not better than C in that regard.
I've seen some people talking about how this is a problem from a security standpoint as well (as in, if we ship our code statically linked with a library which contains security vulnerabilities, then all of our programs will need to be updated individually to fix it). But this appears to be one-sided reasoning, because it can work the other way around as well. You can have a bunch of working programs that rely on the same safe package; then it becomes vulnerable, and suddenly all the programs in the system that share it are at risk as well, instead of just the ones that had it updated (which means more chances for a potential attacker to exploit the vulnerability, as most programs in a system tend to be sparsely used).
The security standpoint is precisely the case you see as one-sided reasoning. The problem is not that a particular statically linked app becomes unsafe because it was compiled with an older version. That's not a problem, it's the normal case, as with any other app with a bug. Just recompile the app and done.
The security problem you are not seeing is... imagine glibc has a bug, and all the apps on the system are statically compiled. You are forced to recompile and reinstall everything.
Using shared libraries, you just need to compile and reinstall glibc... (and restart the apps). With one compilation you fix everything.
Life for sysadmins would be very difficult without shared libraries... and security fixes would take a lot longer to roll out.
@@framegrace1
I think you just ignored everything I said and resorted to the same thing I'm saying is one-sided. I recognize the fact that if a dependency is vulnerable and you have it statically linked into all the apps, that is a big problem. I'm saying that if you have a global dependency that suddenly becomes vulnerable, you also have a problem, because you don't necessarily learn that the global dependency has a vulnerability in time to just update it (and neither would the developers necessarily know; what we see more often is security vulnerabilities that have existed for years and years without being noticed, for example).
@@framegrace1 Nope, I only need to recompile the programs facing the attackers, in the case where the attacker doesn't have shell access on the system. If they have shell access, then shared linking won't help 90% of the time, because LD_PRELOAD negates any patches to the system-provided libraries. Right now, if the attacker has local/shell access, only a well-built Mandatory Access Control system will help you.
Also most vulnerabilities aren’t in system wide libraries like the c library or zlib, they are either in the application or one or two dependencies off.
@@diadetediotedio6918 There's a massive paradox in your reasoning. If you maintain your statically linked system up to date, you're exposed to the same vulnerabilities as a dynamically linked system.
Either you're arguing that a fully out-of-date system is safer, or that a partially out-of-date system is safer (do you decide which packages to update or not at random?), I'm not too sure, but both seem nonsensical to me.
@@user-qn9ku2fl2b
It is not a paradox, that's why I said this here:
["But this appears to be a one-sided reasoning, because it can work the other way around as well."]
For me, this whole "it is safer" point is a problem on itself because it is one-sided (it has problems both with statically linked and dynamic linked).
And my point is not that a partially out-of-date system is safer; that's neither my claim nor a necessity. We've seen something very relatable to this some time ago with the xz thing: think about the time you need to just update a distro version of xz containing the terrible vulnerability, versus the time each individual package would have taken to update its own statically linked version of the xz package if it had been statically linked, and you can see part of my point. Obviously, my original point says "one-sided" because I also recognize the other way around, where the vulnerable xz was linked statically into many packages and then you are vulnerable until they update their version (or you roll back yours).
Summarizing, my vision here is much more "this is all nonsensical bullshit" than "you should prefer X to Z (pun intended)".
I'd love to know what you think is messy about the syntax. If you've got any specifics, that'd be awesome.
Not necessarily messy, just more complex for new programmers. Way higher learning curve
@@LowLevelTV Yes, I definitely agree with that.
However, though it has a fixed initial learning "tax", it offers a significant recurring return.
There are some friction points, which may be in different places compared to other languages, but I've found that the net gain in developer productivity and velocity is significantly higher.
I've also noticed that the whole language model encourages me to think about my code design in a more robust way. I'm not exactly sure how to explain it, but code I've [re]written in Rust enables me to dive back in much faster for fixing issues or adding new features.
This is why we're transitioning our entire team to Rust for development (both cloud with Axum and device/desktop with Tauri).
But I totally agree, higher entry point, but higher return (IMO).
Great content, I really appreciate your videos.
Shared objects are cool for the reasons stated, but compiling everything into a monolithic binary allows your program to run without dependencies. Dependencies suck (for many reasons).
And in the perfect world you can do both. With C you can choose between static or dynamic linking, with rust you don't have the luxury yet
it also makes the compilation process last forever and bloats the binaries
It's really useful to know that Rust is currently only practical for statically compiled binaries. Somehow, I hadn't heard anyone mention that before.
The more I discover Rust, the more I think it's one of those ideas that seem brilliant on paper, but turn out to be hellish in reality. Despite the efforts made to promote Rust, this language will never be used on a large scale because it will crash headfirst into the wall of reality.
This is a valid problem to work towards but static binaries should def still be the default, or at least the default for release builds. Packaging for different OS's is just so much more simple
yeah that will finally solve the linux dependency hell.
4:30 static analysis has nothing to do with why the ABI isn't stable. The borrow checker would still work with a compiled lib and whatever the equivalent of .h files will be in Rust. Lifetimes don't require sources; the function signature is enough.
The ABI may never exist because it locks down the stdlib and prevents impactful changes in a way that's more restrictive than a stable API.
Please add url to the pull request I want to read it 🙏
It's often presented as a bug, but for most of my projects it's actually a feature that everything is in one place (static). Plus, are the compile times really that bad? On my 10-year-old laptop a 100k project compiles in under a minute just fine. I save the time spent compiling by having clear and actionable errors compared to C.
I've made dynamic libraries in Rust multiple times. These days you can easily reach for crates like stabby and abi-stable. The bigger question is why you would want to do that when static linking is better in most cases.
If we use Rust in Linux for more stuff, it would be nice if some system calls and other system utilities were wrapped in a Rust library, which you can link with dynamically.
If you make internal company tools, sometimes it makes sense to write them in one place, install them and have other apps dynamically link to them.
That's all I can really think of.
@@PhthaloJohnson Just write* your application in C already. Stable ABI, stable API, fast, no long compile times, no billions of complex features.
*If you are worried about system calls. If not, use Rust.
Rust is a young language. I am a C++ veteran and not really fluent in Rust yet, but I can see the potential. And we need something better than old, unsafe languages. If Rust gains momentum, the library problem will be solved eventually. We'll see.
Does the Rust compiler actually have to inspect the source code of a function every time the function is used? And more importantly, does it actually have to compile the function every time it is used? Otherwise I don't see how any of this stops libraries from existing in Rust once an ABI has been defined.
Maybe I'm misled, but I heard that C++ has a similar problem with its more advanced features, so binary libraries are at least limited?
Maybe it is the unavoidable price of advancement?
I'm not sure why compiling from source must increase binary size; I would say the contrary:
at first glance it allows limiting instantiations of generic functions to those actually used, and it also allows further optimizations and better unused-code elimination.
The other thing is that C programs depend on libc already being on the system, so the size will be bigger, but the app will be more independent of libc version changes etc.
The problem is that ABI is an OS spec target, and OS targets were built with C ABI (data types and limitations included).
What we need is a new ELF/.so dll and dylib specification where the ABI can represent Rust (and type-safe) entities.
Also, I wonder if WASM or LLVM IR could be used as rich bytecodes for Rust's dynamic libraries to be compiled/cached on demand?
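Until something like that exists, the only layout you can actually pin down across such a boundary today is the C one, e.g. (my sketch, not from the video):

```rust
// #[repr(C)] gives the struct a defined, C-compatible layout, so it can cross a
// dynamic-library boundary today; the default Rust layout carries no such
// guarantee between compiler versions.
#[repr(C)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

// Exported with an unmangled name (edition 2024 spells this #[unsafe(no_mangle)]).
#[no_mangle]
pub extern "C" fn point_length(p: Point) -> f64 {
    (p.x * p.x + p.y * p.y).sqrt()
}
```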
Feeling very smug as a C++ enthusiast hearing about rust not having an abi because it has more complex types and generics.
But also, C++ still has slow compile times and linker errors are pain
C++ also invented ownership and borrowing, Rust just made it mandatory
rather have everything be mandatory and simple.
C++ also does not have an ABI. And if you use templates you are more or less in Rust territory, where you have to ship your whole template sources using header files.
@@Benhoof81 The need to ship templates is a compromise, but it already gives you more than 90% of the effect in real codebases and if you can guarantee the same compiler version you can just instantiate common ones beforehand as well. Also - you don't pay for templates you don't need to instantiate unless there is a gigantic amount of them to parse in the includes. Again, if that is a problem extract their definitions to different headers or find some other convoluted way. In C++ we have 30 years of solutions for every single of those problems, it is a good thing that solutions exist, but also bad thing that we needed them. C++ is no longer a first choice for my hobby stuff anymore and neither is C if that says anything.
Not sure about an ABI. I know someone broke ABI something somewhere, but on the other hand - The only time I needed to I was able to link successfully with C++ DLLs made 25 years ago with modern compiler just fine (Windows/MSVC 2019) and I didn't even have the headers for them, I just demangled the symbols. Might be less okay on Linux, but I mostly never needed to do that (things were either freshly built or they didn't exist at all or were purely statically linked). Shipping dynamically built binaries between Linux distros or even versions of one distro - yeah I tried, that one NEVER turns out well.
The best thing about this channel is he makes videos a lot shorter than the other channels. He does not use storytelling to make a simple thing turn into a 15-minute video.
Maybe I'm naive, but couldn't you use LLVM to build a Rust compiler that ultimately generates an ELF output?
the rust compiler is LLVM based
@@somenameidk5278 So, why are libraries not possible in Rust? The archive (.a) file format should be generatable.
@@drfrancintosh he meant dynamically linked libraries that use a Rust ABI (the title is a bit misleading)
@@somenameidk5278 thank you for that clarification… That makes more sense
I like that programs do not depend on precompiled libraries. This way, you don't have to care about which version is installed on your system by your package manager, you always have the most up-to-date version (or even an outdated one if you need it), and as everything is managed by cargo you don't need to compile and install the library (and its dependencies) yourself if you have an exotic setup.
And this is bad for large applications. Imagine rebuilding 1 million lines of code.
Maybe the next big step in static analysis (like borrow-checking) is inventing a practical way to NOT cross module boundaries. To somehow record all the relevant information about a module in its interface, which can be then saved into an _auto-generated header file,_ which then will be used for checking the program that includes it.
In short:
1. rust does not have a stable ABI
2. Using shared libraries will break the static safety analysis.
These things make shared libraries impossible in rust.
And thus rust is useless for what it tries to do
Devs hate shared libraries, I get it. BUT... As a user, I will NOT want 30 executables that use 50 statically linked libraries each, which might each have another 50 statically linked libraries, packaged as an "application". That is NOT AN APPLICATION, that is a piece of storage and memory malware. Exponential memory requirements for a working system are not acceptable.
It's M×N complexity. M apps have on average N deps. Like Facebook friends which should be N^2 but is actually just number of people times the average number of friends. But I prefer sharing too.
As a user I hate shared libraries.
I hate having 2 executables relying on 2 different versions of a shared library when my package manager just has 1 on the current release of my OS.
I usually compile statically if I can. Makes life for everyone easier at the cost of some disk space.
Thank you for this video. It took me some effort to find any mention of the Rust ABI, and I was only able to guess why the ABI is not such a big thing as in the case of C. It makes more sense to me now.
I'm doing my graduate project on cybersecurity thanks to you!
Reading, "For instance, passing a string" as "For example, passing an instance" amuses me. 😄 6:31
What I wonder is how would it be possible to maintain/extend an ABI after being established such that it doesn't invalidate code compiled with older specs of the ABI.
Did you figure it out yet? There is a reason we've been using the C ABI by default for like 50 years
Compiling static binaries in golang is orders of magnitudes faster than compiling rust. I don't think libraries are (solely) to blame for compile times.
(Definitely for the large binaries, though)
Unsafe doesn't disable the borrow checker. Even Rust raw pointers are annotated with whether they're mutable or immutable.
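Quick sketch of that point (my example): the usual borrow rules still apply around an unsafe operation; `unsafe` only unlocks the extra operations listed elsewhere in this thread.

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    let first = unsafe { v.get_unchecked(0) }; // an actual unsafe operation
    // v.push(4); // still a compile error: cannot borrow `v` as mutable
    //            // while `first` is borrowed, `unsafe` or not
    println!("{first}");
}
```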
Seems to me the idea of source maps, as found in JavaScript bundles, would kinda allow the Rust features to cross binary boundaries: an ABI-to-source mapping would let the borrow checker use the source to verify the code calling into the precompiled binary, crucially, without recompiling all of the foreign code.
Neat idea
I found this video confusing, although I am very new to Rust. You start by talking about long compile times and the lack of libraries in Rust; however, Rust supports both "lib" and "rlib" crate types, which precompile libraries for use in creating static binaries later, which you did cover. I guess you are saying there are no shared libraries with PIC unless they support only the C calling convention, and that is the only way to communicate with any other language in-process. But what would really be the value of that? For example, Swift code that knows how to communicate with Rust instead of C? Would that Swift code participate in the borrow checker? Would strings really be handled the same way? What are the permutations of language-to-language mapping where some use GC, some use smart counters, and others use the borrow checker?
Can you have object caching like blaze/bazel? Then you only need to compile a piece of a package the first time someone on your team's code uses it in a particular way. But you don't need the lock-in that creating an ABI causes. (Please pardon my ignorance.)
Afaik it already does that, your debug build is stored as a bunch of .o files in target/debug/incremental
If it doesn't seem to need to update the .o (file unchanged, rust version not changed) it won't recompile the .o
That's why cargo run works instantly if you didn't make any change since last time.
Presumably, if you enforce the rust version just sharing the target folder should prevent having to compile from scratch. Have not tried that myself though.
for making games that's a big issue for modding of either the game or the engine. C++ has an ABI, and high-level constructs can pass between a dynamic lib and the executable.
and templates are solved just by compiling them again yourself through importing the header ^^
but at least there is a way, even if janky. and in Rust there isn't really a way besides going through the C ABI
Yeah, for game dev that's a huge no-no
where do you get the 70% memory-related figure from?
In web dev, when JS files get mangled by a bundler, a map file is created to recover/unmangle the script.
So if the C object is not enough to represent everything in Rust, why not add a map file to allow the same object to be used simultaneously in Rust and C programs?
"Sometimes the compiler gets mad at you" - understatement of the century
I think that dynamic linking is often problematic. Would static linking be a solution? Even if it makes big libraries, it allows to have smaller compile times...
Sometimes I wonder if it's better to not even try being efficient... ie, why a shared library if you can dedup at file level. Large binaries wouldn't matter if they were structured in a way that a) identical blocks of bytes can be shared among executables b) the binaries can be downloaded in chunks. If you have a bunch of shared libraries, how is that different from a lot of shared chunks of bytes?
However, this is obviously non-trivial. It would require: aligning structures inefficiently (no packing), and agreeing on a minimum chunk size (extra padding).
It would require every compiled blob to align somewhat to the source code, so that a minor change in code (as in a small character or line change) maps to some compiled artefact. Perhaps compiling each function to a blob could achieve this. Once you have already compiled a function, and you reference it (by content, like sha256 or something), then you could download it and/or reference it elsewhere by content.
Not shared libraries, but a huge, world wide repository of compiled function blobs that can be referenced in code by name, and reverse mapped to corresponding content through an central index.
Maintaining such a repo would be a nightmare... maybe a blockchain could handle it....
technically a lot of the libc stuff doesn't implement the hard things; it's just wrappers around the syscalls, though they do make it more convenient to do stuff.
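For example, something like this (a sketch that assumes the `libc` crate as a dependency; `write` here is essentially a thin shim over the write(2) syscall):

```rust
fn main() {
    let msg = b"written through the libc wrapper\n";
    // Safety: fd 1 (stdout) is open and `msg` is a valid buffer of `msg.len()` bytes.
    let written = unsafe { libc::write(1, msg.as_ptr() as *const libc::c_void, msg.len()) };
    assert_eq!(written, msg.len() as isize);
}
```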
“When you build anything in rust, you’ve to download a 1000 crates in cargo” I wish there were 1000 crates on cargo, and when you build you get raided. lol how can a programming language have the same lingo as my favourite video game?
What is the difference between an ABI like crabi and the existing dylib option in rustc?
When you mentioned generics, what did you mean by saying statically dispatched?
while the compile time is high, i bet the second time you try to compile the code with little changes, it will be faster, since most of the code is already compiled, right? i know other languages (or maybe the IDEs) do that optimization. does rust have this feature?
When you work on your project in an IDE, it works just like that. You have a local cache of the build for all dependencies. Plus, all the advantages of the incremental build of the current project. However, when updating the version of the compiler, the entire project along with dependencies will need to be rebuilt completely. In cases where you have many projects or an entire (Linux) distribution and many target platforms, the build usually takes place not locally but somewhere in a CI pipeline/on a corresponding stand, where connecting a large cache might not be cost-effective or easy to set up. Here, everything depends on the available resources - both hardware and personnel.
Doesn't C++ have a similar problem, although not as bad? WG21 doesn't have an ABI standard for C++. This means C++ libraries have to be compiled with the same compiler, be it Clang/LLVM, GCC, MSVC, Intel, etc, etc.
In theory yes, in practice no. There is no ABI defined in the C++ standard (beyond C-compatibility). But there is the Itanium C++ ABI standard used by all compilers on all platforms for decades, and then, there is MSVC. So, there is a stable ABI, it's just a separate standard (just like C, DWARF, ELF, POSIX, SystemV, etc., are all separate standards). A standard-compliant C++ compiler can pick whatever ABI it wants to, but in practice, it's been pretty stable. Even MSVC has stabilized even though they didn't really have to considering they control their entire tool-chain (and OS). The main ABI-related issues that C++ has had problems with are standard library components (notably std::string) that cannot easily be changed internally (e.g., small-string optimization) without introducing a mismatch between the header files one is compiling against and the libraries one is linking with. That is the bigger problem with ABIs for more type-rich languages like C++ or Rust, it makes it hard to change the function signatures or internal representations of types. And having pre-built libraries for any arbitrary set of dependencies that all must have consistent versions for all their shared dependencies makes for a combinatorial problem that isn't easy to solve. This is basically what C++ modules try to solve.
@@mike200017 Interesting. I didn't know about just about all of that. Also weird about std::string.
It doesn't take forever to compile; where did you get that from? It takes time on the first build, but that's it, and that is not a problem. Maybe if you're on Windows, I could imagine that. On Linux, using dynamic linking and the mold linker, it's blazingly fast.
I wonder if this approach of creating an ABI layout for a type could also enable more interoperability with languages other than C, by enabling a crate to generate wrapping code for a language such as Python to be able to import the compiled Rust library.
It's perhaps a good thing to not have dynamic libraries. It can be really limiting to try to maintain backwards compatible ABI. Force the developers to recompile everything all the time and I think it will promote libraries being updated to new compiler versions faster. Rather than being forced on some old version because some library doesn't get updates anymore.
Could (should) Cargo ship the Rust IR for its packages to reduce local compile times?
The question is, do we really need a stable ABI?
The main reasons for DLLs / libc-like libraries mentioned are file size and compile time.
The compile-time part only affects the initial compilation, so it's not as bad when actually developing.
The file size, on the other hand, is something that in my opinion is not really an issue. In a world where a single photo is in the tens of megabytes, games are multiple hundreds of gigabytes, etc., I don't think it matters that you might have binaries with duplicated code that are a few megabytes each...
Of course that's just my opinion on all of this ^^"
there is another reason for dynamically linked libraries: it's easier to patch one system-wide library than many programs individually
It doesn't seem that bad with an individual program, but if every program on your computer was statically compiled, the bloat might be noticeable, though LTO could reduce this.
Checking online (since I'm on my phone, not laptop), I've seen glibc be 2, 5, and 8 megabytes in size. Multiply that by however many programs you have for a measure of bloat that is probably incorrect due to ignorance or optimizations.
@@kuhluhOG 100% agree. At the same time it's also its downfall cause if a patch has a breaking change that will (unintentionally) break a bunch of applications in the process. There's was a recent libc update that did exactly something like that IIRC. ^^
@@GeckoEidechse well, the update you mean was INTENTIONALLY a breaking change, yes, really, the glibc devs sometimes think that's a good idea...
nonetheless, going via dynamic libs has an additional upside on some operating systems: you can allow certain system calls only from specific code; e.g. OpenBSD only allows system calls coming from its C library, you aren't allowed to do them yourself (and if you do, your process gets killed)
that has the advantage that you can ensure certain checks are being done before you get into a privileged context
@@kuhluhOG And trivial to bypass using the dynamic linker. Never let an adversary have local access.
I get the historical context but where the most severe bugs happen at has changed in the last decade. Libc and zlib aren’t the normal vectors now. Most issues are in the program or a direct dependency now not a system wide library.
On server/desktop Linux the trend is Flatpak, doing less dynamic linking, as it is a constant issue with software distribution. On NixOS, dynamic linking is expressed like static linking, only yielding advantages in saving RAM. So far I feel that not having dynamic linking in Rust is not an issue. Thanks for the video! Love your channel!
Won't that lead to massive binaries though?
@@SomeRandomPiggo It absolutely does, Snaps and Flatpaks (hell, even AppImages) can be gigantic. Same with distros doing static linking.
The belief is that RAM and storage capacity, and network speeds, are good enough and will keep improving to support this, so we're willing to trade file sizes off for other benefits.
@@kyonblack7901 Hardware improving just for software to get worse at the same (or faster) of a rate is just sad tbh
@@SomeRandomPiggo True, I'm all in for leaner software with few dependencies, but the tech world is moving in the opposite direction.
@@SomeRandomPiggo Ngl I'd happily accept that if it means my system and programs are rock solid stable.
It's not like you couldn't do it, it's "just" not as simple. You would need a compile time part and a runtime part, similar to C++ shared libs and headers. However the compile time part would need to contain a lot more information than a header file. For now, it would basically have to be an intermediate representation of the full code that static checks can run on, as well as information about the binary layout of data. However, a better approach for the future would be to establish language features for explicit guarantees that can be checked at library compile time and assumed at library use time.
Isn't WASM currently attempting to solve the same issue? I heard about the WASM Components Draft and that it basically defined an ABI across all languages by using only a simple but powerful common subset of most languages
LLMs will be writing/checking all the code in the future so what does the language matter? I'm seriously asking. Will the AI even use code in future or is everything going to be generative? And if there is code, wouldn't an AI just write in whatever the lowest level code is in context?
If Rust used pre-compiled libraries instead of source crates, you wouldn't easily be able to use a crate that does a specific thing you need on many different platforms. Also, precompiled libraries kinda exist: rustc will compile every crate into an rlib first, then compile the final binary. But the rlib format is compiler-version specific; there's even an error for incompatible metadata
It's the shared library problem that exists and I think is being worked on by NixOS/Nix package manager.
@@YandiBanyu Ah, I was wondering if I would see another Nix user. It (and its sibling Guix) is pretty much the only system package manager that I could think of where packaging Rust libraries actually seems possible. It basically comes down to reimplementing Cargo in the Nix language, which is easier said than done, though people have tried, like with crate2nix.
why not just create 2 crates: one for the actual library that does most of the work and exposes functions through the C ABI, and a second that's a wrapper which uses the C ABI and has wrapper functions to convert things back to Rust? You can distribute a binary of the first and compile just the wrapper when building your Rust program (rough sketch below).
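A minimal sketch of that two-crate split, with invented crate-internal names; in practice the two halves would live in separate crates, but they're shown in one file here so the snippet compiles on its own:

```rust
// --- part 1: the "core" crate, built once and shipped as a cdylib/staticlib ---
#[repr(C)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

#[no_mangle]
pub extern "C" fn point_length(p: *const Point) -> f64 {
    // Inside the library, normal Rust rules apply; only the boundary is unchecked.
    let p = unsafe { &*p };
    (p.x * p.x + p.y * p.y).sqrt()
}

// --- part 2: the source-only wrapper crate, compiled together with your program ---
// It hides the raw pointer behind an ordinary safe Rust signature.
pub fn length(p: &Point) -> f64 {
    // In the real split this call would go through an `extern "C"` block
    // and therefore need an `unsafe` block around it.
    point_length(p as *const Point)
}

fn main() {
    let p = Point { x: 3.0, y: 4.0 };
    println!("{}", length(&p)); // 5
}
```

The catch, as other comments point out, is that anything involving generics or lifetimes has to be flattened into concrete, C-representable signatures at that boundary.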
I always find myself thinking about dynamic ABIs that are basically type-level functions, generating portable struct layout specifiers, invariant checks, and suchlike.
There has to be a way to compile binaries so that their side effects are documented, and those documented side effects can then be used for compile-time checks without needing to recompile the binary. Unless some of these properties aren't guaranteed in the end product even if all components satisfy them? I also imagine it would introduce some interesting security considerations, like signing the side-effect documentation to prevent spoofing.
The build times can be annoying, but surely recompiling, especially with semantic versioning, lets bug fixes be picked up that a pre-built binary would never see. Building Rust at the point of use, with regular rebuilds, should be the standard, right? That kinda relies on devs using semantic versioning correctly though.
If it is just about compile times, it is possible to create precompiled binaries though. Like dtolnay did with serde. Only the method has to be well documented.
As far as shared objects are concerned, it should be possible once the ABI is standardized. Rust shared objects need not work like C shared objects. Additional information for static analysis can be embedded within those objects in order to help the borrow checker. It is not a trivial task for sure, but I think that it is definitely possible.
What if, when the Rust compiler compiles code that uses a library, it downloaded the source code of that library and used it to statically analyse the library alongside its own code, ensuring everything obeys the borrow-checking rules? It wouldn't fix the compile-time issues, but it would be a good first step toward letting Rust codebases take advantage of libraries.
C/C++ needs multiple files to compile and work: the library itself (not necessarily dynamic, one may use static libraries too) and header files, which the compiler uses to understand the interface of the functions. The compiler could, for example, use that interface for lifetime checks. Maybe Rust's lifetimes are too complex and depend on the function definitions too, in which case declarations alone aren't enough for compilation.
When it comes to generics, it's true that the compiler needs the function definition, but C++ has templates, which are similar to Rust's generics. When they are used, the whole definition of the template/generic class or function is obviously visible to the compiler and compiled into each application. (In fact, a C++ template function instantiated with the same type parameter may be compiled multiple times within one application, because there are multiple translation units that know nothing about each other, so they don't know the compiler already created that template instantiation for the concrete type.)
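To illustrate the monomorphization point with a small, self-contained Rust sketch (the function here is just an invented example):

```rust
// The compiler stamps out a separate copy of this function for every concrete
// type it is used with, so the caller needs the full definition, not just a
// declaration in a "header".
fn largest<T: PartialOrd>(items: &[T]) -> Option<&T> {
    let mut best = items.first()?;
    for item in items {
        if item > best {
            best = item;
        }
    }
    Some(best)
}

fn main() {
    // Two distinct machine-code instantiations are generated here:
    // largest::<i32> and largest::<&str>.
    println!("{:?}", largest(&[3, 7, 2]));
    println!("{:?}", largest(&["a", "c", "b"]));
}
```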
You only need the header files if you're compiling, though. For deployment only the library is required.
There is a very hacky thing called stabby that does some fascinating stuff with the Rust type system to enforce a representation on structs and enums.
To be honest, the compile times never bothered me with Rust. I don't think it's actually as big a problem as we always read online; it's only the first compilation of a project that takes that long. Also, I LOVE that Rust builds everything into a single file. I was previously coding my hobby projects in C# and I voluntarily started building standalone binaries instead of library-dependent ones to prevent dependency issues on the end device. I know we now have solutions like Flatpak for this (and I love Flatpak!), but standalone binaries also worked for the projects I did, with less hassle to deal with. And in my 5 years of using Linux, I've seen it multiple times: my distro's libc updates and a bunch of programs break because they haven't been updated yet.
Of course, this approach also isn't perfect, as we see with filesize and compile times. Also, from a security standpoint, a program could ship with an old, vulnerable library, and it's the developer's job instead to update it. Still, I have enough time and disk space, so this approach works better for me.
I think it is possible, yes, but with some concessions.
For example, while you cannot ensure the safety of other languages' libraries, you can "kind of" ensure, at some level, the safety of library Rust code by moving the checks to runtime: instead of relying just on the compiler to do the safety checks, rely on runtime types that should be stable between Rust versions (or at least embed a version property so each caller can check whether caller and callee have compatible versions). So:
* The library function requires a moved struct? Rust will treat it as a normal function and let the called function decide how to dispose of the value
* The lib function requires an immutable reference to the struct? Fine, just pass the reference and trust the caller's safety
* The lib function requires a mutable reference to the struct? Same as before; maybe make a custom mutable-reference struct to check that it was really released externally (or don't, optimistic reliance can be a thing if your project and the library agree on a stable version)
* How do we agree on the version? We pass a special struct containing information about the compiler we both used; if it's compatible (i.e. no breaking changes occurred between the versions) we let the call happen, if not it panics at runtime (or fails in some other way). There's a rough sketch of this handshake below the list.
In theory, at least, this appears to solve the problem.
*** Of course this will not work for languages other than Rust, but if what you said about compile times and binary sizes is real, then you could probably use these tricks to make it work.
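A very rough sketch of that version handshake, with invented names and an invented notion of "compatible"; this is not how rustc actually exposes any ABI information:

```rust
#[repr(C)]
#[derive(Clone, Copy, PartialEq, Debug)]
pub struct AbiVersion {
    pub major: u32,
    pub minor: u32,
}

// Each side would bake in the version it was built against.
pub const MY_ABI: AbiVersion = AbiVersion { major: 1, minor: 72 };

pub fn check_compatible(theirs: AbiVersion) {
    // "Compatible" here just means the same major version; a real scheme
    // would need a much stronger definition.
    if theirs.major != MY_ABI.major {
        panic!("ABI mismatch: library built for {:?}, program built for {:?}", theirs, MY_ABI);
    }
}

fn main() {
    // The library would hand us its version at load time; faked here.
    let lib_version = AbiVersion { major: 1, minor: 70 };
    check_compatible(lib_version);
    println!("versions compatible, OK to call into the library");
}
```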
A major feature of Rust compared to other languages is that it in fact does *not* use runtime types; all of that is *checked at compile time*. That's an important distinction here.
@@Sandromatic
This is not strictly true. Rust has runtime checking in types like Rc or RefCell (which implements a runtime borrow-checking mechanism); you just don't need to use it everywhere, because the compiler can do the same job in the many cases where the check is possible statically.
So it wouldn't be a "break of feature" to do what I'm suggesting here.
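For anyone unfamiliar, a tiny example of the runtime borrow checking that RefCell already does (standard library behaviour, nothing invented here):

```rust
use std::cell::RefCell;

fn main() {
    let value = RefCell::new(5);

    {
        let _shared = value.borrow();
        // A second shared borrow is fine at runtime.
        let _also_shared = value.borrow();
    } // both guards are dropped here

    let mut exclusive = value.borrow_mut();
    *exclusive += 1;

    // Uncommenting the next line compiles, but panics at runtime because the
    // value is already mutably borrowed: the check has moved from compile
    // time to run time.
    // let _bad = value.borrow();
}
```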
What? Rust syntax is not messy. I find it much tidier than C++ although that can be down to personal preferences. I don't know how it could be objectively described as messier than C++ though.
That's questionable. Rust is absolutely unreadable unless you have thoroughly learned it, or you only look at trivial examples. C++ is not as bad, although if you use all of the more advanced features that crept up over years, it can be also weird to read for sure. I'd say, tough call here, although if you stick to a simpler subset of C++, it's much more readable than any Rust. That said, C++ has grown into a monster for sure. But we don't need a competition of unreadable programming languages.
@@joseoncrack Oh come now. I could just turn that around with equal validity: "C++ is absolutely unreadable unless you have thoroughly learned it, or you only look at trivial examples. Rust is not as bad,"
No really, this must be a familiarity problem. Pretty much all of my Rust applications use only a subset of the language, as was the case during my C++ career. Most of it is as readable as anything. Even my Python-focused colleagues understand it.
What I began to loathe about C++ is that it keeps adding more features and complexity without ever fixing the robustness problems I've hated about it since forever.
Rust's static analysis requires having all of the semantic and structural parts of the program available so it can ensure that everything is consistent. It should be possible to do this using an intermediate form built from the type information and abstract syntax trees, thus not requiring the full source for the things that are now in crates. Over time, this representation can be made more compact. This probably rules out compatibility between the output of different compilers, and even different versions of the same compiler, but if the pre-parsed crate code is stored with the crate, an incompatible compiler can simply compile the source and ignore the shortcut.
Binary libraries are a different issue. There must be a way to define a common representation of the types exported by a library such that all programs that depend on the library agree on their structure in memory. Don't allow generic parameters to library functions.
The problems described in this video are only a subset of the reasons I can't use the language in my embedded project. The biggest hurdle is that Rust is unavailable.
Now for my obligatory rant:
Due to the NSA having become brainwashed into believing C cannot be trusted under any circumstances, I am forced to reinvent the wheel and write a full network stack including TLS, HTTPS, and PTP client in Ada, which is the only approved language available for the MicroBlaze soft microcontroller. This new rule alone has added about 4 million dollars to the cost of developing this product.
In the past 10 years, no piece of C code I have released has had any sort of buffer overflow or memory leak. This is provable. There's no dynamic allocation, all structures and arrays that are not on the stack are statically allocated. The code is efficient, deterministic, and robust. It takes a lot of discipline to write this sort of code in any language, Rust included.
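For what it's worth, the "no dynamic allocation, everything statically sized" style translates fairly directly to Rust too; a minimal sketch (my own invented example, not the commenter's code):

```rust
// Fixed-capacity buffer with no heap allocation: the storage lives wherever
// the struct lives (stack or a static), and "full" is an explicit error
// instead of a reallocation.
struct PacketBuf {
    data: [u8; 1500],
    len: usize,
}

impl PacketBuf {
    const fn new() -> Self {
        PacketBuf { data: [0; 1500], len: 0 }
    }

    fn push(&mut self, byte: u8) -> Result<(), ()> {
        if self.len == self.data.len() {
            return Err(()); // buffer full: the caller decides what to do
        }
        self.data[self.len] = byte;
        self.len += 1;
        Ok(())
    }
}

fn main() {
    let mut buf = PacketBuf::new();
    assert!(buf.push(0x42).is_ok());
    println!("buffered {} byte(s)", buf.len);
}
```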
Does Rust at least tree-shake so that it compiles only the code in crates that is actually used? It would be really nuts if it compiled a lot of stuff that's never used. At that point an interpreter might be a nicer development experience by far. And a JIT, which only compiles what actually runs, could perhaps have a leg up on tree shaking.
What changes would you make if you switched from ELF to PE/COFF or Mach-O? Do those other binary formats offer features for accessing the semi-random data that the Linux ABI doesn't?
Also, if you're making a single binary on Windows, why couldn't it compile each piece individually and, instead of merging everything into one binary, place each piece into an NTFS alternate data stream? This should also work in the Mach-O world using forks, I think, since ADS has something to do with cross-compatibility with the Mac.
Bringing ADS to Linux is controversial due to the 255-byte limit for extended attributes.
The reason nobody uses alternate streams is that they're NTFS-only: if you move the file to a non-NTFS drive (like a FAT32 USB stick, or a NAS running Linux), you just lose all streams except the main one.
Rust library = memory safe wrapper on C library
I have an idea for lib management that would not only fix the lib installation issue, but also help with using libs in the first place.
C programs could compile against "clib" files: bytecode containing the pre-compiled library plus the header info. When you consume the library, you just #include "foo.clib" and you're done. The compiler would then compile the lib to an .so file and do the usual compile-and-link process we all use today.
I'm a noob, but bringing everything into itself to compile and check sounds great to me, letting the compiler gain full insight and control over what is and can be done :)
Also it sounds like it would remove all forms of external dependency, since it's all in the binary, everything.
So Rust is currently ideal for "small projects" that don't rely on a bunch of external code, where this is a non-issue? Like embedded, etc.
And less ideal for massive projects, especially if you want to be able to compile in separate parts.
Nice video! However last time I checked C does not define an ABI?
if you have a function prototype and its address, you can call functions by pointer, just like in C++ hehe
but you need unsafe mem::transmute to reinterpret the pointer as that prototype. I experimented with low-level calls to "external" functions some time ago, and in Rust it works better than in C++, even without asm macros lol
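A small sketch of what that looks like, using a local function to stand in for an address obtained from a loader or symbol lookup (the function itself is invented for illustration):

```rust
use std::mem;

// Stand-in for some code we only know by address.
extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // Pretend this usize arrived from dlsym/GetProcAddress or similar.
    let addr: usize = add as usize;

    // Reinterpret the address as a typed function pointer. The prototype must
    // match exactly (arguments, return type, calling convention), or calling
    // it is undefined behaviour.
    let f: extern "C" fn(i32, i32) -> i32 = unsafe { mem::transmute(addr) };

    println!("{}", f(2, 3)); // 5
}
```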
i've thought about this a bit before. I use Nim, not Rust, but it would be nice if the safer, easier, higher-level types in both languages (and many others) could be more easily connected. It would reduce the world's dependence on C/C++, which I think is a good thing. And before you say Nim doesn't count here because it compiles through C (among other backends): there is a third-party LLVM backend which could someday become the default and have no dependence on C.