40:22 Vec::new simply doesn't allocate. Until you push to it, it just holds a dangling (non-null, but pointing at no allocation) pointer. That's really helpful, as that way it can be used in Default impls without having to do any allocations.
I found that helpful when using `std::mem::take` to get around a borrow checker issue without a needless allocation. The current borrow checker might not need a workaround for whatever it was that I was doing.
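A small sketch of both points (the type and field names are just for illustration):

```rust
#[derive(Default)]
struct Buffers {
    data: Vec<u8>, // Default uses Vec::new(): no allocation until the first push
}

// mem::take swaps in the allocation-free default and hands back the old value,
// which sidesteps the borrow checker without a needless allocation
fn drain(buffers: &mut Buffers) -> Vec<u8> {
    std::mem::take(&mut buffers.data)
}

fn main() {
    let mut b = Buffers::default();
    b.data.extend_from_slice(b"hello");
    let taken = drain(&mut b);
    assert!(b.data.is_empty());
    assert_eq!(taken, b"hello".to_vec());
}
```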
Contrived use case of the recursive example. In production Rust you'd use a for loop and allocate the Vec once outside and not destroy it for each iteration.
That is fair, but the point is that someone in the AI space newly learning Rust will need some level of understanding of these things, not to mention there possibly being other cases where the lack of tail call optimization can lead to performance issues. It's true that you still need know-how to create performant Mojo, but someone new to Mojo is less likely to fall into these obscure and minor pitfalls sprinkled throughout the journey of learning something new.
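For reference, the production-style version described above (allocation hoisted out of the loop) might look something like this minimal sketch, with sizes and names assumed:

```rust
fn main() {
    let iterations = 1_000_000;
    // allocate the Vec once, outside the loop...
    let mut buf: Vec<u8> = Vec::with_capacity(42);
    for i in 0..iterations {
        buf.clear(); // ...and reuse the same allocation every iteration
        buf.push((i % 256) as u8);
        // do the real work with buf here
    }
}
```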
45:51 Doing SIMD manually in rust really isn't fun, but often you really don't need to do it anyway. If you structure your code correctly (i.e. give it the right alignments and sizes and such), LLVM is *really* good at optimizing code for your platform. So instead of doing SIMD manually you structure your code *as if* you want to use SIMD and let the compiler do the rest.
@@stevenhe3462 It's more about how "press" (or a dude/dudette with a blog) makes things up. One Primeagen is now multiple "Rust experts". This will get repeated more and more, and next thing you know people believe all kinds of stuff said by "Rust experts", very few of them actually hauling their arse to verify what was said, whether it's actually true, and for which values of "win" it is true. At some iteration, someone will seriously ascribe this quote to Klabnik.
I don't like that "owned" thing in Mojo, because the caller of the function may not be aware that a copy of the string is being made! That's why I like that in Rust you have to explicitly use foo.clone() at the call site, making it immediately obvious to whoever reads that code for the first time what is going on.
They said that it transfers ownership but reverts to a clone under the hood if you try to modify it, unless you use the caret in the function call, in which case it strictly transfers ownership (at least that's what I understood). So it seems like a quality-of-life thing, and it's up for debate whether implicit or explicit is better here.
It's always explicit when a copy isn't being made because you have to use the transfer (^) operator to move ownership. It is less clear when a copy is made vs when it's an immutable borrow, you'd have to check the original function definition
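In Rust the call-site difference looks like this (a toy sketch; the names are just for illustration):

```rust
fn consume(s: String) {
    println!("{s}");
}

fn main() {
    let foo = String::from("hello");
    consume(foo.clone()); // the copy is explicit and visible at the call site
    consume(foo);         // ownership moves here; foo is unusable afterwards
    // dbg!(foo);         // error[E0382]: borrow of moved value: `foo`
}
```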
31:17 The reason why rust uses drop flags is actually really interesting: if a value gets dropped conditionally (for example when one branch has a drop(value) and another does not), rust still has to keep track of whether the value was dropped or not at the end of the scope. The solution to remove this overhead would be, rather than keeping track of whether the value was dropped, to always drop the value at the earliest point in time that it is no longer used, i.e. to insert a drop in the other branch that does *not* call drop(value) as well. (As a partially deinitialized value can't be used anyway, that would be totally doable.) The big downside would be that that would make the deinitialization of values slightly non-deterministic, which is also the reason this method was decided against. (Though sometimes I think it would have been nice if rust had just decided to always do the deinitialization as soon as possible, but there might also be other considerations like code size.)
@@nii-san5485 Not necessarily. As long as you don't drop/move the guard (for example by giving away ownership to another function), there is not really any reason for rust to drop the guard early. But *if* you really want to always just insert drops into the program as soon as a value is no longer used (I would call that non-lexical drops), you could just have an explicit drop at the end of the guard to make sure that you explicitly use it and it doesn't get dropped early. If you want to know more about this whole topic, there is a blog article on faultlore called "Destroy All Values: Designing Deinitialization in Programming Languages" going in depth on this topic.
@remrevo3944 I never considered how conditional drops worked under the hood, thx for linking that article. But I'm saying that with your suggestion of dropping after the last use, a guard would always be dropped instantly unless an explicit drop was added or guards were special-cased to drop at the end of the lexical scope.
Which you did consider, now that I'm reading back. I still like "lexical" drop as it's a bit more intuitive, e.g. the value just "goes out of scope", which most programmers will already have a feel for 😁
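A minimal sketch of the kind of conditional drop being discussed, which does force a drop flag (the runtime condition here is just an arbitrary example):

```rust
fn main() {
    let v = vec![1, 2, 3];
    if std::env::args().count() > 1 {
        drop(v); // v is dropped on this path only...
    }
    // ...so rustc keeps a hidden boolean "drop flag" on the stack and
    // checks it here to decide whether v still needs to be dropped
}
```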
I like Mojician. Also, not surprising that Chris Lattner is iterating on the intermediate language design from LLVM seeing as he was a core (founding?) dev on it for a long time.
There is an additional difference between a semaphore and a mutex: a semaphore can be signaled by anyone who has access to it, while a mutex can only be released by the thread that locked it. A mutex is more of a monitor than a semaphore.
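In Rust terms, the release-by-owner half of that is enforced by the guard type; a minimal sketch:

```rust
use std::sync::Mutex;

fn main() {
    let counter = Mutex::new(0);
    {
        let mut guard = counter.lock().unwrap(); // this scope now holds the lock
        *guard += 1;
    } // guard dropped here: only the code that locked can unlock
    println!("{}", counter.lock().unwrap());
}
```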
That TCO example they did is so bad. Rust does have the issue of not always being able to do TCO, but that example gets optimized anyway, because stuff is never used or black_boxed. To demonstrate the issue you would need a function that counts sheep down starting at n, prints a message for each sheep counted, and returns the total number of sheep that have been counted. This function does not get TCO, because the addition operation is performed after the recursive call and its result is used, so the compiler does not automatically optimize it.
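A sketch of the sheep counter described above (names assumed); the pending `1 + ...` keeps the call out of tail position:

```rust
fn count_sheep(n: u64) -> u64 {
    if n == 0 {
        return 0;
    }
    println!("counted sheep #{n}");
    // the addition runs *after* the recursive call returns, so this is
    // not a tail call and the compiler won't turn it into a loop
    1 + count_sheep(n - 1)
}

fn main() {
    println!("total: {}", count_sheep(5));
}
```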
For those that don't know, C# can compile to native code just like Go. In terms of speed, C#, Java and Go are very similar for many tasks. For pure synthetic benchmarks it's C# > Go > Java. You have greater memory control in C# than in the competitors.
Umm no. It's not that optimized compared to Go. You can't use a lot of libraries because they aren't AOT compatible. Like bro, I love C# but I'm not afraid of using another language.
@@peanutcelery neither am I, but C# can give you full memory control while Go just can't. As time passes many libs will support AOT. Minimal APIs already do it in .NET 8.
C# is a failure as a language; it barely floats, and only because of business adoption and the Microsoft sales department. You are kinda right about Go > Java if you ignore most things that make Java interesting. And that comparison of Go and Java is also kinda pre-junior level of understanding. Did you know that you can actually tweak both Go and Java for your exact application? The Discord team actually wrote a guide about optimizing Go garbage collection, and you also have CGo to go beyond that. The idea of thinking about C# as something more than just Office work and games for Windows is something that only C# developers can have.
@@d1namis Your take about C# is very misleading. I don't know which type of field you work in, but C# is everywhere except ML. Yes that's right, everywhere. There are many places in the world where there is more demand for C# devs than Java. That should tell you something. Go, while nimble, "easy" and easy to grok, has major problems especially around FFI and FIPS compliance. It would be a fool's mistake to write any high performance application that needs interop with libs already written in C or C++ in Go. C# was made with this in mind! JNI is a terrible mess, to which Java added JNA. C#'s threading model is just like Rust's, which aligns with C. Have you ever written software that needs to be audited by the government? Well guess which lang is easiest to pass? All because MS has first party support for SO many things that make the std in Go laughable. You rarely need third-party libs. I'm telling you this as I have written software which runs on Air Force One. Web dev is not all dev. Anyone who thinks C# is "dying" has been living under a rock for the past decade. I'm also aware of C#'s shortcomings too.
27:00 i believe odin does that with their built-in vector types and matrices: able to get the power out of simd but have it implicitly built in. Zig is also getting them, but Odin is technically 1.0 already.
@@PRIMARYATIAS Copy on write is a convenience to the programmer that has runtime performance implications. It is not a bad tradeoff for many uses, but does impact ultimate performance.
I'd love to see Prime react to an interview Chris Lattner did on how Mojo works and what's happening behind the scenes. I don't think most people realize that Mojo is simply using Python as the syntactical glue for a completely different set of backend processes. To use a car analogy: if Python is a Toyota Camry, its syntax is the paint job and decaling on the outside of the car. Mojo is a Formula 1 car that uses the same paint color as Python's Toyota Camry, but it also has cool racing stripes and decals for the superset features/syntax. When I hear people talk about Mojo it's as if they think the language is still a Python Toyota Camry with some aftermarket mods to make it faster, which leads them to think it can't possibly be as fast as something like Rust, which to them is a racecar. Nope, Mojo is an actual Formula 1 car with a familiar Python paint job. That's about all they have in common. Which clears up why there's so much appeal. Mojo is basically telling Python programmers you just need to learn a few new concepts and some additional syntax and you'll be able to drive this Formula 1 car. What's even better is that it'll feel almost as easy to drive as your Toyota Camry, which you can still drive whenever you want.
He's saying tail call optimization isn't possible because of the scope-level deferred memory management. Meaning for that specific use, tail call optimization isn't possible for Rust. Whereas for Mojo, given they are not deferring cleanup to the end of each call (end of scope), it easily facilitates tail call optimization. In other words, Mojo will provide tail call optimization for generalized use whereas Rust only provides it in subsets where the scope does not require memory allocation cleanup at scope destruction. This is also why the "drop" didn't fix the issue: it's not changing the deferred scope destruction. In Rust, "drop" flags the allocation for scope destruction. Effectively changing nothing.
1. Rust is definitely doing TCO in the example he showed, since there's no way that the program could create 1 billion stack frames without hitting the limit. 2. I don't see any reason that allocating heap memory in a recursive function would make TCO impossible in rust, since converting a TCO-able function into a loop is literally just declaring the argument as a normal variable and putting the function body into a while loop, and the compiler can just put the clean-up step before jumping to the next iteration.
@@OnFireByte I agree, I would very much like to better understand what's going on here. It could be the author is falsely attributing TCO failure to some underlying semantic implementation detail within Rust. I guess a review of the generated code is the only way to know for sure what's actually going on there.
3. Drop flags only show up when you have a drop that can't be statically determined (e.g. if a variable is only dropped when a runtime condition is true). There are not going to be any drop flags compiled into that code, and explicitly adding an unconditional drop before the recursive call _should_ cause the implicit drop at the end of scope to be omitted.
@@GrantGryczan My understanding is the drop always adds a bit flag and nothing more. The drop is then evaluated at end of scope. Which means in this case, the drop remains regardless of the drop flag. Resulting in no change as the drop is already deferred from scope destruction. In other words, the drop is saying I want you to do what you're already planning on doing.
@@justanothercomment416 As I said in my last comment, that is incorrect; the drop flag is only needed in very few cases. Take a look at the official nomicon documentation on drop flags. It explains this well and is very short and easy to understand.
One thing that doesn't seem to get mentioned at all - mojo is proprietary and not open source? That alone means it's a non-starter for so many projects and use cases that I highly doubt it will reach any kind of critical mass as a language to replace and / or supplement Python / C++ / Rust in the ML space (or any other space for that matter).
Oh, yeah that's a deal breaker imo. But considering that most (afaik) AI statisticians don't care too much about what lies under the hood (their code is abysmal at times), they might not really care about whether it's open source, but that also means they might not care about using MOJO either.
I think they promised to open source it later. I'm not holding my breath. Also, if it compiles to an executable, that's a bad thing for AI. In python I can take the code of llama and hack around it. When I download LLM models I always review the implementation (as most of them are (a) copies of llama, (b) small); if models were delivered as exes it would be terrible: I run linux, many AI researchers run linux, but many LLM users run windows, so who knows which executable should be uploaded to HF, and an .exe can't be reviewed as easily as python code.
@@XxZeldaxXXxLinkxX they might not care, but the ones funding them will, and the fact that they won't have SWEs who do care by their side to help them also matters a lot.
I work with AI, I use both Python and Rust. I don't know Mojo (yet). This debate irritated me quite a bit - good debate, but I mildly disagree 🙂 We don't use python for its speed! Python is a good language to "configure" frameworks like Keras, Torch, Tensorflow or Scikit - that are implemented in c++. Rust is a great replacement for that c++, not for python. Will Mojo be that c++ replacement? I have doubts. Can you trust a language rooted in Python-like prototyping to write hardcore numerical libraries? I would need some more convincing. When somebody says that it is 50% faster than Rust, that does not elicit trust - it just creates hype. On the other hand, to replace Python, Mojo would need to have library support comparable to python - why would you use it otherwise? Again - we don't use python for its speed... Funny enough - Rust's speed or safety may as well not be the main reason to use it. I have started to rewrite some of my python code in Rust not to gain speed, but mainly for the excellent type system and secondarily for its ability to compile to wasm.
agree on the type system, and add traits and pattern matching for me (I know, Python >3.11 has it too, but it feels like it was an afterthought). I like Rust approach to writing software more simply because of these language design choices (plus testing and examples). In addition, I get amazing speed and memory safety, which I welcome.
10:35 actually got me there prime, damn, "ruby is not a skill issue, it's just slow". Phew xD, really messed with my head, got me in the first half, had to recheck the video for the graph xD
If adding 15% learning to get 100x performance were an irresistible value proposition for migrating from python, then everyone would be writing Nim or Julia.
Yeah, I'm quite lost in this Mojo vs Rust discussion. Which use cases are we talking about, which developers? Say we take the claim that Mojo has hardware-level performance seriously. Should BLAS and TensorFlow be reimplemented in Mojo? In that case, I don't think the familiarity would be a strong selling point. If it's on the Python side of things, then most of the runtime is spent inside libraries anyway, so what kind of performance gain are we talking about here: instead of 7154 seconds, it will take 7127 (if we are generous)?
So the problem with tail-call optimization in this instance is that they added an extra semicolon, that's it. That's a feature that's in C as well: tail-call optimization only happens when you're returning the final expression. Also, the reason Vec::new() is faster is that the allocation gets optimized away.
Why do you harp on Arc over and over? It is a way for multiple threads/tasks to safely share the same resource; it is not specific to Rust, other languages have that too.
It's because it's a tool that effectively steamrolls over the borrow checker. Yeah, there are legitimate uses, but you can just use it as a "fuck it, just take the damn variable". Using it introduces overhead and a performance reduction.
@@XxZeldaxXXxLinkxX You use it when you need to use it, i.e. if you want to access the same resource from multiple threads/tasks. Yes, you can misuse it, but in most technologies there are tools that can be misused. My comment reflected Primeagen's constant harping on Arc as if that were Rust's way of doing most data flow/access, which it is not.
@@maniacZesci he's not harping on Rust, he's harping on the people that do that (use it as a crutch). Like harping on the people that use "as any" in typescript. Just memeing, pretty much.
If you compile targeting a native CPU, rust will typically auto-generate SIMD code for you, which you can see on compiler explorer with quite simple code. It becomes more fiddly if you want something that is more platform independent, or if you have dynamic input sizes, which always means you get a couple of items left over at the end of the array (the remainder from array_size / simd_block_size); then you need to write painful hand-cranked stuff. But if you know what platform you are running on and compile for it, you get most of the benefit without writing specialist code, just as MOJO does.
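A sketch of that pattern (compiled with something like `RUSTFLAGS="-C target-cpu=native" cargo build --release`): the fixed-size chunks give LLVM a clean loop to vectorize, and the leftover items are handled as scalars:

```rust
fn sum(data: &[f32]) -> f32 {
    let mut chunks = data.chunks_exact(8);
    let mut acc = [0.0f32; 8];
    for chunk in &mut chunks {
        for i in 0..8 {
            acc[i] += chunk[i]; // fixed-size, branch-free body: easy to auto-vectorize
        }
    }
    // the array_size % 8 remainder mentioned above, done the boring scalar way
    acc.iter().sum::<f32>() + chunks.remainder().iter().sum::<f32>()
}

fn main() {
    let data: Vec<f32> = (0..1003).map(|i| i as f32).collect();
    println!("{}", sum(&data));
}
```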
I don't understand why anyone would learn Mojo when you have Nim. I mean, it's the same idea: python-like syntax but compiled. It's even faster than Rust in many benchmarks, and the ecosystem is more mature.
47:30 I think "Would you switch to typescript if introducing this new syntax would allow it to run 100x faster than javascript?" would be even closer analogy. And even I would stop writing vanilla JavaScript if TypeScript were actually faster.
28:57 What he might have omitted there is that RAII does not necessarily mean heap allocation. Basically, if you are not using new, the memory for the object will be allocated on the stack. And allocating on the stack is one instruction, freeing is one instruction, no matter how many objects are allocated in the function. So this is far better than GC (if you forget that C# allows putting structs on the stack). On the other hand, yeah, malloc/free can feel slower than a GC in many cases.
Correct, C++ is garbage collected, kinda... There are these things called smart pointers that use the constructor/destructor paradigm to automatically delete when they go out of scope.
Rust also has that, it's Rc/Arc. Definitely not a traditional GC like the GC-tier languages have (mark-and-sweep collectors that need to stop the world), but yeah, you could say that.
@@OnFireByte I'm not saying rust doesn't have it, I'm just saying calling C++ a manual memory language is wrong if you follow best practices (which is to not use raw pointers unless you don't transfer ownership).
That's a lie. If C++ had GC it would be possible to make the equivalent of python's `class GraphNode: linked: List["GraphNode"]`. Impossible in C++: you need to manually come up with an explicit strategy for who owns what in a graph and clean up memory. * You can't use unique_ptrs because many nodes can link to the same node. * You can't use shared_ptr because graphs have cycles, which means if you have A->B<->C and A goes out of scope, B and C survive. * It's impossible to use weak_ref because somebody needs to hold the non-weak ref. So you need to manually make the graph class handle ownership, because C++ has no GC. You don't need to do any of that with a GC. "My program doesn't leak memory, kinda" doesn't count when it ~kinda does.
@@AM-yk5yd it's just how you define GC; many people consider reference counting GC because they define GC as just a system that automatically and safely deallocates memory at runtime. But yeah, RC isn't GC if you want to say that a GC needs to be able to deallocate all data that isn't reachable from the root set (tracing GC). It's just definitions anyway.
If you are worried about speed, the language is probably the last place you should be looking. Especially as a web developer. If you are that concerned, you aren't going to be swapping out languages as a fashion statement, especially when 50-year-old languages will do the job and have been doing the job for those with those concerns. I will never understand some developers. I sometimes think that they really would be more comfortable in a congregation than pretending to be an engineer.
@@sacredgeometry Yes, Prime does webdev. Mojo is exciting for me though, as a physicist, because I have a lot of gripes with the current tooling at our disposal. Numpy, Numba etc. are all excellent, but I believe that Python is not the right tool for high performance scientific software. That has always been C/C++ and of course, Fortran. So when I'm asked to build complex models in Python from scratch (because that's what the community is accustomed to), it's a pain to make it as performant as those compiled languages. That's why I started looking towards Julia and intend to use it as my primary language for my own scientific development until Mojo becomes widely available/open-source. And when it does, we'll see if it is indeed better or not. But if it goes the MATLAB proprietary way, then Julia is our best bet.
If you are worried about raw performance/latency you ARE limited to high performance languages like C/C++/Rust/... If you are programming tight real-time control loops or even a game engine, you just can't afford running a garbage collector (Java) or a slow interpreter (Python). Python is awesome, but if I can I will use the C backend of a library, as it can be 100x faster (protobuf is a good example).
@@robstamm60 Absolutely. Time- and performance-critical software exists, but as I said: the people writing game engines aren't constantly hunting for new languages. Almost all of the embedded developers I know think the overhead/abstraction of C++ is too much and that C is perfectly well suited to their jobs. They aren't looking to replace 30+ years of experience every few months to hop on the new hype train.
I honestly LOVE that Mojo programmers are called Mojicians. It just sounds cool. A programming magician. Honestly sounds better than Rustaceans and Pythonistas.
GC requires indirect access. Direct allocation/deallocation can cause fragmentation. Rust tends to have larger contiguous structs than copy-on-write memory management. Explicit memory management can run with far smaller memory usage.
One more important thing to realise: if you know python and have learned Rust, you are closer to learning Mojo, because Mojo is also introducing features from Rust like ownership and borrowing etc. Adding such features will have a skill-issue impact on Python developers interested in learning Mojo, because at least you need to learn those concepts before using them.
Forget about the AI buzzword bingo, but if Mojo becomes a general purpose language which can be compiled and still interact with the Python ecosystem (even if the library calls have to be interpreted and GC'd, obviously), it would still be a win for me! Yes, maybe their claims about performance are false, but if it is good enough, at least as fast as Go, and supports all normal Python features (even if, for example, structs are typed while Python classes are untyped but can still have things like inheritance), it would still be the optimal language, maybe not for AI developers, but for the average Web/Backend/Enterprise developer.
About vec![0; 42]: it actually memsets the first 42 elements. So it allocates and sets the length, meaning the vector is already full and the very first push can reallocate. with_capacity only allocates, so as long as you push fewer elements than the capacity, you're guaranteed to not allocate.
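A quick way to see the difference, using the lengths/capacities std reports:

```rust
fn main() {
    let a = vec![0u8; 42];                   // allocates *and* memsets
    let b: Vec<u8> = Vec::with_capacity(42); // allocates only
    println!("a: len={} cap={}", a.len(), a.capacity()); // a: len=42 cap=42
    println!("b: len={} cap={}", b.len(), b.capacity()); // b: len=0 cap=42
    // a is already full, so its very next push would reallocate;
    // b can take 42 pushes before it ever reallocates
}
```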
"Future proof for 50 years" sounds like a dumb prediction, that includes a ramp up and ramp down of usage like we've seen with C, and by the time those 50 years are reached (if that even happens) a new and better language will have been developed. There is nothing lost by learning the language, especially if you already use python, but it's a hyperbolic statement.
the point prime made at @19:24 about the ownership being orthogonal to the type is actually quite good. I wish rust did this the other way round. It seems they fell into a trap trying to make references similar to C++ references. They could have required you to say things like `ref` and `owned/copy/clone`, and also removed the idea of implicit copies and required you to always .clone() something.
1. You can just quote the command: `hyperfine "node src/index.js"`. 2. The point about dropping, I think, is that the drop for the Vec is put after the recursive call, so it's preventing the TCO?
To prove the TCO example, you could write a for loop that allocates the vector the same number of times. I mean, if the idea of TCO and TCE is to make recursive algorithms work as iterative ones, then this should be a fair example of the advantages of having that optimization. My understanding is that since stack variables are eagerly destructed, every time you stop using a variable the stack pointer decrements, so when you get to the end of the function, your next stack frame starts where the old one was. This improves locality and you can work exclusively in cache, making the mojo version significantly faster; you're playing with registers at that point.
If we start to account for skill issues, then Java can be as fast as Rust/C++ or even faster (after warmup), because with enough skill you can write garbage-free code and do manual memory allocations/deallocations. And the part that can make it faster is JIT optimizations, which can be done for the current specific use case, like loop unrolling or operation reordering, which C++ or Rust simply cannot do, because they don't know how the code they produce will be used every time you run the program.
If Mojo can have Pydantic data structs with validation, HTTP libs for serving and posting, database connectors, and a kafka connector or something in addition to the AI stuff in the standard library, it could potentially be THE lang for AI-powered web.
One thing I don't like about rust is that it is not predictable what the compiler does in many cases in terms of optimization. I frequently find some weird not-optimized-away case here or there in the user forum. But I guess this improves over time.
Mojo is proprietary and not python. Codon has the same issue. The skill issue with effective SIMD programming is not a syntax issue. If a programmer has the intelligence to program with SIMD, GPU, lifetimes, and manual memory management effectively, they can certainly overcome superficial syntax differences such as indentation vs curly braces. When thinking of memory management techniques for AI, borrow checking seems like a generally bad fit since it is often paired with the general heap allocator. Arena (bump) allocation probably makes more sense for performance. Languages like Zig/Odin/Jai have better deterministic memory management control. Rust is not flexible when it comes to manual memory management, although it is certainly just a few "unsafe" blocks away from hacking something together.
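For illustration, a minimal bump-arena sketch (not the API of any particular crate):

```rust
// a minimal bump arena: allocation is a pointer bump, freeing is one reset
struct Arena {
    buf: Vec<u8>,
    used: usize,
}

impl Arena {
    fn with_capacity(cap: usize) -> Self {
        Arena { buf: vec![0; cap], used: 0 }
    }

    fn alloc(&mut self, n: usize) -> Option<&mut [u8]> {
        if self.used + n > self.buf.len() {
            return None; // out of arena space
        }
        let start = self.used;
        self.used += n;
        Some(&mut self.buf[start..start + n])
    }

    fn reset(&mut self) {
        self.used = 0; // "free" everything at once, e.g. between batches
    }
}

fn main() {
    let mut arena = Arena::with_capacity(1024);
    let slice = arena.alloc(128).unwrap();
    slice[0] = 42;
    arena.reset();
}
```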
If safety were the only concern then they wouldn't have been trying to replace C for decades. It's a little bit more complicated than that. The best joke about that is that the essence of computing is about separation of Church and state.
I'm excited about when mojo's research will push other languages to rethink their foundational types, like rust did to c++. It's not like Mojo has access to a supercomputer; at the end of the day all languages have to compile to the same asm, and Mojo's "research" and their new ideas will benefit all languages in the future. I think mojo is going to enjoy being new for some time, but later C++ will catch up to it by embracing only the good experiments that mojo showed are worth implementing.
For profit company with a bunch of cliché hype-generating moves and "future plans"? Stroking the ego of industry influencers without really saying anything? They're just trying to attract investors, lol.
@@evergreen- They shouldn't lean on misleading advertising if they want to remain credible. It makes my vaporware detector go off with big red flashing lights.
@@experimentalcyborg you wanna tell me that Java was advertised honestly? Or JavaScript (ecmascript)? What about "blazingly fast"(C) Rust? No, Modular are upselling their Mojo language to get people to try it. Eventually, Mojo 1.0 will be released, people will use it, and we will see what it's actually worth.
RAII was something that I fell in love with in the early 2000's, but I've now grown more skeptical of it. It tends to mix the responsibility of ownership with other responsibilities / operations. A simple example would be sockets, do you want a wrapper that handles the lifetime of and operations on a socket, or do you want them separate, and then take advantage of select/poll/etc. I'm more and more leaning towards more explicit lifetimes and using "views" for operations. A string should be a view, a stringbuilder is the explicit lifetime handler.
The "people writing Python aren't gonna move to rust if mojo becomes a thing" isn't true I think (saying that as one of the people in that domain that actually writes rust right now). Sometimes the problem with python isn't speed but correctness - there's definitely been insitances where I couldn't be confident in the python code doing the right thing; that I haven't missed some edge cases etc., and from what I heard mojo does hardly improve on python in that domain. Mojo may take some use away from rust but it can't replace it - even in the ML / AI domain
What leads to this correctness? Obviously not memory management (cause python has automatic memory management), so it isn't rust's ownership and borrowing. Is it simply the existence of strong typing? Mojo has strong typing if wanted (and it is often required for high performance mojo code). Is it the more ML features of rust (its powerful enum type and pattern matching)? Genuinely curious what you think leads to this gain in correctness.
@@brendanhansknecht4650 Rust has inherited a lot of ML-isms (as in SML, not AI); basically stuff like algebraic data types, Hindley-Milner types, optionals etc. allow you to encode a lot of extra information and guard rails into the type system. Mojo can't have this because it would break compatibility with python-esque stuff on a fundamental level.
@@brendanhansknecht4650 I think it's that it's generally very explicit and ekes out edge cases - and that it's strongly and *statically* typed, yes; and that it has quite an expressive typesystem. I'm not going to accidentally put a "regular" unsigned into a place where a nonzero one is required, for example; I can make algorithms that fail with nans take a floating point type that doesn't have NaNs, can use sum types where they're a good fit, ... Python is of course also strongly typed, but the dynamicism takes away a lot. Regarding the memory management: if you get into writing more optimized python you actually start to care about memory management even in Python. I feel like there's not really a lot - if anything - gained here with python over rust.
@@SVVV97 cool. So those are the same reasons why I would say I prefer rust over python. That said, the people who write python are a gigantic market, and most of them aren't in the same boat. I think for most people who write python, Mojo is much more interesting. Assuming mojo is complete, it would give them:
1. Instant performance gains without changing their code at all.
2. A way to add strong static types. On top of that, adding types increases the performance even more.
3. The python people I interact with don't understand the benefits that come from ML. They have never used a nice sum type, so they don't know what they are missing in rust and other ML-descendant languages. That said, I do hope that mojo adds good ML-style types and pattern matching to python. I would be super happy if they just copied the rust enum type or similar.
4. Assuming modular as a company is successful, it also gives them access to state-of-the-art machine learning tooling.
All of this with only incrementally changing their python code. I think for most people I know that program in python, that is a way bigger sell than rust. Rust isn't something they are considering learning. It is just something they hope someone else learns to make them nicer libraries. Anyway, all this is really just to point out the target market of mojo, which is quite large (cause the python ecosystem is huge); I think it only lightly overlaps with the rust market. Aside: I don't fully understand Mojo's memory model, but it has ownership, borrowing, and no GC. That said, if I understand correctly, it will have to fall back on reference counting more often than rust.
@@brendanhansknecht4650 "1. Instant performance gains without changing their code at all" Doesn't seem to be that true, at least not unqualifiedly true. There might be some cases where that happens, particularly where python's design leads to things being excruciatingly slow (e.g. loops) but all the examples they have of mojo going blazingly fast (TM) are using the new syntax. "2. A way to add strong static types. On top of that, adding types increases the performance even more." That seems to be built upon python's type annotations, which is understandable, but those are kind of a bad fit for python in general due to their strongly nominal nature in a language that's structural to an extreme. Getting those type annotations right is often non-trivial for this reason, and I don't see what mojo is doing to improve on that. They should've gone with something like C++ concepts or Rust traits instead, that is, syntactic and semantic constraints on types rather than explicitly named types, in most cases. "3. (...) They have never used a nice sum type." Related to the above, seems like a bad fit for such a structural-heavy language. "cause the python ecosystem is huge" I think it remains to be seen how much of an advantage that really is in the end. I suspect people will find that python's dynamic features will make moving to Mojo harder than might've been anticipated from the sales pitch.
The Vec with capacity allocates the vector, whereas the new Vec is removed by the compiler because it's never used. I do not know Rust, but from a general compiler viewpoint this would be logical. Rust might even "zero" out the memory allocated to the with-capacity Vec. Mojo seems to just wait to allocate until a value is pushed to the vector, meaning it never allocates any memory for the vector in the given example.
honestly, I would say 70% of python developers don't want to use or learn another language and don't have the skills to code in C++, C, Rust, C# or Java.
If there is one person qualified to make a new language, it's Chris Lattner. These discussions are better approached with a very fresh perspective + very open mind. That's the only way to properly evaluate them and truly understand the essence behind the point that the other side is making.
@@serena_m_ which in my humble opinion is worse. Sorry, but having two different ways to write functions will be so confusing. They also have two types of objects, the standard class and structs. I think this is messy and will make things more difficult for people coming from python.
I do actually like Mojo's philosophy, particularly when it comes to ML. The ability to load Python's modules is also quite a strong selling point. I'm going to give it a try for a project I'm working on.
At 17:25, did the article confuse move semantics and copy semantics? Rust moves by default to avoid copying the string. If it were a copy, the original foo would still be available for dbg!(foo) and there wouldn't be a compiler error. Primeagen should point this out.
I don't know why all the talk about this language focuses on perpetuating the lie "It's Python but fast." It's probably fast, but it's not Python. It's the skeleton of Rust with the skin of a certain snake...
Frankly, the whole who-is-faster debate: don't give a flying fuck. As someone who was probably going to be programming in python or MatLab for her entire career, I just see Mojo 🔥 as an absolute win. And that's really what their pitch should be: "hey python devs, ready for a language that is written the same way as the one you already use, is 8x faster with exactly the same code, and could do more once you learn the arcane runes?"
mojo is going to end up like julia, where it's mostly a meme and you only end up getting fast code if you spend a bunch of time fussing around trying to wrangle the runtime to do what you want. no such thing as a free lunch
also, the focus on tail call optimization as a selling point is kinda meme-worthy in and of itself. nobody who's serious about performance is using recursion and relying on TCO to begin with, and if they are, it's because what they're doing couldn't be translated to a for loop without extra memory anyway
@@yevgeniygrechka6431 I agree. Right now Mojo is doing it right: a slower REPL for dev, and static compilation for running the code. No idea why the Julia devs bet on a purely dynamic language with a JIT. Sure, it gives you nice features, but it makes it worse than Python for small tasks, which are most of the tasks you do.
~40:00 Maybe rust in the current iteration of the compiler in release mode can detect that the vec is not used and DCE it out of existence. A more proper test would be to do something with the vec. Honestly, at that point it'd be necessary to look at the assembly. (I'm not a fan of their copies; it seems it can create a lot of headaches if some things get copied deeply and some not, but hopefully they thought of it and there will be no auto_ptr 2.0, just with every copyable type.)
42:52 my guess is that, because they delete objects as soon as they aren't in use, and because the vec's never in use, they never allocate until you write code that uses it.
As a mainframe coder I used Java/several other packages/assembler/COBOL (note it can talk to Rust code)/etc./C/C++ (overloaded, virtual memory problems)... Java to talk to Rust, Go to manage HTML nonsense, etc.
Whenever you consider performance, you also have to consider the optimizations you never had time to do - and the slow "good enough" implementations that you don't have time to improve. This is why I advocate for Python and am super excited about Mojo. Mojo will be 10x faster than Rust/Go (in individual real-world applications) if it improves the speed of iteration, the number of developers who can contribute, and the readability of the code.
From the official Mojo manual: "Mojo uses a third approach called “ownership” that relies on a collection of rules that programmers must follow when passing values. The rules ensure there is only one “owner” for each chunk of memory at a time, and that the memory is deallocated accordingly. In this way, Mojo automatically allocates and deallocates heap memory for you, but it does so in a way that’s deterministic and safe from errors such as use-after-free, double-free and memory leaks. Plus, it does so with a very low performance overhead." So it's much closer to Rust than Java or J# or JS.
Two words: Time Dilation. There's always a sense of being relative. Use what works. The time variants between Rust and Mojo are going to be too close and you'll not really lose unless maybe if you are targeting a process that has an advantage. Mojo will probably target ML solutions and solve them with simple solutions. Whatever you use the other can be it's good looking sibling.
JUNIOR MOJO DEVELOPER REQUIRED. Must have 15 years MOJO development experience. Apply within.
I had this same experience: someone on LinkedIn said he has 20 years of exp in ReactJS.
@@mac.ignacio JS makes sense, but not React LOL.
🤣@@mac.ignacio
has to pay minimum wage for EXPOSURE.
Dude, I've got 40 years of experience in Basic. That counts!
The article: *quotes Prime's own points back at him*
Prime: "I totally agree"
Chadagen
Greenhairgen
You have to love yourself before you can love others
nailed it
Needs to add "I stand by my words" to his vocabulary. But for now he agrees with himself.
Having done some research about the speed claim, decompiling both the rust and mojo binaries, I saw that mojo optimized out everything, making the resulting binary call main and return without any work; this was proven by making a program with only a main function and running it through the same benchmark test, which led to the exact same results. Rust does the same optimization in any version 1.75.0 or newer, and its execution is so fast that it cannot be properly measured. In older rust versions, the compiler would create the entire recursion, and the content inside the loop looks something like: allocate 336 bytes, deallocate 336 bytes, check branch condition, call recursion function; this is slower than mojo because it is actually doing work by having to call the allocator multiple times and having to manage a branch; also note that the compiler did do TCO, shown by the deallocation happening before the branch condition check.
This is a case where the benchmark was cherry-picked to something rust did not optimize at compile time, while also lying about rust not doing TCO, as well as using non-idiomatic rust (as shown in the video, `vec![0; size]` is preferred over `Vec::with_capacity(size)`, and results in a faster execution time than mojo).
If they cannot properly explain why their language has the faster execution in the benchmark, having to assume things about how rust works to make something believable, then I do not see how any claims they make will be trusted in the foreseeable future. I am willing to change my mind on the matter if the article writer is able to explain themselves about this discrepancy, but for now, here's the truth about those benchmarks.
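For context, a reconstruction of the kind of benchmark under discussion; the shape and names (e.g. the unused `_stuff` vector) are assumed from the video:

```rust
// rough shape of the contested benchmark: a recursive function whose only
// "work" is a heap allocation that is never read
fn recurse(n: u64) {
    let _stuff: Vec<u8> = Vec::with_capacity(42);
    if n == 0 {
        return;
    }
    recurse(n - 1) // tail call; newer rustc deletes the whole thing as dead code
}

fn main() {
    recurse(1_000_000_000);
}
```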
type shit
thanks for the explanation, I was wondering if some compile-time shenanigans were happening
And all of that to justify an L
@@deiminator2 No engineer worth their time in dolla bills should reach for recursion where they don't need the full stack trace, because even in Mojo you cannot rely on the optimizer in all cases; that's an NP-complete problem, not happening. (A side note: a surprising number of optimizations are NP-complete problems, we've always just optimized for the most common case.)
Mojo is targeting the AI Python dev who does not think about these things, so they optimized for their target use case, it's really just that simple. Rust targets syst... well everything really. Unfortunately, neither Rust nor Python nor Mojo is going to edge C++ within the -real- technical AI space. (it's a joke, calm down, I'm a theoretician too).
Also, another side note, but MLIR (IR optimized for the tensor/AI use case) can be used by Rust as well; it'd be no harder than the GCC back end. I'd be concerned about the creator getting his ego all up in him and gating MLIR optimizations behind Mojo, but I *really* don't think they will. This article is probably just marketing towards those same AI Python devs who don't understand LLVM vs MLIR or SIMD in the first place.
Expected
9:20 C# is in an even weirder spot, it's between tiers 1, 2 and 3.
It can compile to native code, it can run without a garbage collector, and it can run in IL mode. It's insane.
Yeah C# is hard to classify b/c there's a project for seemingly everything. .NET can't do X...or can it?
a c# fanboy detected 😁
@@Mr.BinarySniper ? haven't written c# in 3 years, just following its progress.
No need for assumptions. It's still shit since Microsoft can't get their asses together to make a fucking UI system that will not get deprecated within a week by Microsoft itself.
Iirc it has two entries in the first tier - one which you described + Unity C#-to-C++ so it truly is a magical language 😂
@@plaintext7288 i'm talking about pure C#,
not even unity or anything like that, since .NET 8 has AOT support, which unity effectively achieves with their IL2CPP.
Green hair = rust expert
Green is what color your white sink will turn if there is Rust in your water ! 🤔 🤔 I think you on to something
Actually patina expert
Cyan hair 😫
Pretty much. It's just called Rust dev hair.
@@mattmmilli8287That’s verdigris, not rust.
I'm totally learning Mojo to be able to call myself a Mojito
lol, but i do love Mojician much better.
26:13 you typically see a chip-wide downclock when running AVX instructions on a lot of chips. You also have overhead on loading the input/fetching the results in many cases.
In my experiments I typically see a 40X improvement, not a 64X improvement, but it's constantly creeping towards that 64 number with each new architecture.
IIRC, the biggest problems with AVX512 appeared when you mixed them with scalar (or less vectorized) instructions but it could've gotten better since then.
That's not true anymore. It happened on a range of Intel CPUs only, but not on newer ones. As Turalcar says, it was the mix of AVX and scalar, because AVX-512 running at those clocks was much faster than scalar at normal clocks. It was also for AVX-512 only, but AVX-512 has been expanded massively since then, thanks to Intel again 😂
I think it was Cloudflare that posted in their blog, btw. They were doing AVX-512 encryption at the proxies, which was slowing down all the other things the proxy does, basically all routing and HTTP.
9:25 What is kind of missing in this pyramid are insane languages like Haskell, which explicitly don't surface things like execution order and other details to the user and are therefore able to do aggressive optimizations.
Haskell can often get within ~5% of C performance, which is kind of insane for a high-level language.
With only the downside that you're writing Haskell 😢
sometimes beats C in a few benchmarks...
@@funprogif something beats C in a benchmark, the C code is poorly written
Looking into some of the internals for Haskell is kind of crazy. The more I learn about it the less I'm surprised that it beats C in some cases. There are definitely workloads not well suited for Haskell, but its semantics allow for some very aggressive optimization that C cannot allow.
Highly optimized Haskell looks funny, but yeah the amount of good work that went into optimization side of things in GHC is crazy.
As an AI researcher using python all the time... my compute-heavy workloads are already running on C++ under the hood. dataset.map isn't the same as a for loop... and if I use a for loop or two --- it takes a few seconds once or twice. Sure thing, that is good enough.
The developers who integrate a C library below python via cffi are the real heroes. That stuff is tough - and I am also working on something like that.
True, I figure maybe they want to win performance over bindings overhead? Because most AI code in Python is just calling C / CUDA libs anyway
This is true. The reason I never found Python to be slow for most tasks is because all the libraries which do common heavy lifting tasks are written in some lower level language by folks who are experts in the domain, so the code ends up running faster than I would be able to do it myself in a lower level language anyway.
But you may find the need for a custom task that runs quickly. Then you may want to pick a simple / fast language like Mojo or GO
That's what mojo is all about. Making the world under the hood pythonic by building around the new compiler.
@@AggressivesnowmaN For today, Cython 3.1 is very fast for extension libraries, and Nim also has great Python interoperability and speed very close to C. Mojo will take a while.
I’m keen to see what a really capable Mojo dev vs a really capable Rust dev can build in a fixed time window and the performance of the 2 solutions. Hell, throw a C++ and Zig dev in there too.
Effort is the biggest constraint in my work.
TCO does not "unroll it into a loop"; it "reuses the stack frame". When you call a non-inlined function you allocate a stack frame (by incrementing the stack pointer), which holds space for the arguments, the return value, and possibly a little other bookkeeping. TCO reuses all the values in the stack frame, including the return address, and as a side effect never has to grow the stack. It can only be used under very exacting conditions (everything the function does must be expressible through the return value and/or argument variables, and the stack frame must be identical in size). Its real benefit is not so much performance (though it does give some); it's that it prevents a stack overflow (which happens when your stack pointer overflows the preallocated stack space of your application).
In this case, the vec initialization creates memory on the heap. If the next call were to overwrite the contents of that variable, the underlying vector on the heap would be leaked. Mojo is greedy: since _stuff is never used, it is deleted at its "last use", so nothing heap-allocated is live across the tail call, whereas in Rust it is. Rust can use TCO, just not with heap-allocated variables.
Granted, I would try an explicit drop in there after the vec init and see what happens.
The observation here would be that TCO was effective in Mojo; TCO was not effective in Rust, but Rust is smart about stack sizes and had enough room to avoid a stack overflow; and JavaScript did not have TCO at all.
Somewhat off topic, but a big thing here is that Python also doesn't have TCO (at least not out of the box), so dynamic programming is better to do in Mojo than Python for essentially just some keywords added.
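For the curious, a minimal sketch of the experiment suggested a few comments up (an explicit drop right after the vec init); the shape and size of the benchmark are assumed from the discussion, not taken from the article:

```rust
// Hypothetical reconstruction of the benchmark shape discussed above;
// the 336-byte figure and the names are illustrative assumptions.
fn recurse(n: u64) {
    if n == 0 {
        return;
    }
    let stuff: Vec<u8> = Vec::with_capacity(336); // heap allocation per call
    drop(stuff); // explicit drop, so nothing is live across the tail call
    recurse(n - 1) // final expression, no semicolon: the call is in tail position
}

fn main() {
    recurse(1_000_000);
}
```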
Mojo programmers should clearly be called Jojos
Any compiler errors return KONO DIO DA
@@Jackovasaur I think he meant Mojo Jojo from the Powerpuff Girls, but that still works
if they make an IDE, they should call it Dojo
if you can't find a job with it, you will be a hobo
If they make a database in Mojo, they can call it Gojo.
This article is pure marketing. These guys should have just taken the L and walked away. They make a lot of false arguments, and they cherry-picked that final benchmark with something that was completely simulated. In the Mojo example, the compiler actually just calls the main function and then returns, because no work is being done in the program. The same happens in the Rust variant when you use the vector macro instead of the with_capacity call. Rust can use the same back end as Mojo, and it can also be optimized for SIMD. This idea that somehow a Python dev is going to have zero friction learning Mojo and also get better performance than Rust is absurd.
With the straight-up false statements that they made in this article, I'm not going to believe anything these people write in the future.
That level of skill issue when running a benchmark is indicative of one of two things: 1. incompetence when evaluating the performance of optimized code, or 2. dishonesty.
Both make me incredibly skeptical they have the capacity to deliver on their claims.
40:22 Vec::new simply doesn't allocate. Unless you push to it, it never touches the allocator. That's really helpful, as that way it can be used in Default impls without doing any allocations.
I found that helpful when using `std::mem::take` to get around a borrow checker issue without a needless allocation. The current borrow checker might not need a workaround for whatever it was that I was doing.
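A tiny sketch of both points (Vec::new allocating nothing, and std::mem::take as the borrow-checker workaround); the surrounding code is invented for illustration:

```rust
use std::mem;

fn main() {
    // Vec::new never allocates; capacity stays 0 until the first push.
    let v: Vec<u32> = Vec::new();
    assert_eq!(v.capacity(), 0);

    // mem::take moves the value out and leaves Default::default()
    // behind (an empty, allocation-free Vec), which is handy when the
    // borrow checker won't let you move out from behind a &mut.
    let mut slot = vec![1, 2, 3];
    let taken = mem::take(&mut slot);
    assert_eq!(taken, vec![1, 2, 3]);
    assert!(slot.is_empty());
}
```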
Contrived use case of the recursive example. In production Rust you'd use a for loop and allocate the Vec once outside and not destroy it for each iteration.
That is fair, but the point is that someone in the AI space newly learning Rust will need some level of understanding of these things, not to mention there may be other cases where the lack of tail call optimization leads to performance issues. It's true that you still need know-how to write performant Mojo, but someone new to Mojo is less likely to fall into these obscure and minor pitfalls sprinkled throughout the journey of learning something new.
45:51 Doing SIMD manually in Rust really isn't fun, but often you don't need to do it anyway. If you structure your code correctly (i.e. with the right alignments and sizes and such), LLVM is *really* good at optimizing code for your platform.
So instead of doing SIMD manually, you structure your code *as if* you want to use SIMD and let the compiler do the rest.
And writing SIMD friendly code in Rust is super easy, e.g. chunks/chunks_exact.
Prime is called "Rust Experts" now.
Totally disagree. Can't even read a simple missing generic argument error message Lol.
@@stevenhe3462 It's more about how "press" (or a dude/dudette with a blog) makes things up. One Primeagen is now multiple "Rust experts". This will get repeated more and more, and next thing you know people believe all kinds of stuff said by "Rust experts", with very few of them actually hauling their arse to verify what was said, whether it's actually true, and for which values of "win" it is true.
At some iteration, someone will seriously ascribe this quote to Klabnik.
@@stevenhe3462 Now he switches to Go lol xD I call him a language hopper (similar to a distro hopper)
Rust definitely has TCO in that example; there's no way it can create ~1 billion stack frames without hitting the stack limit if it didn't unroll the recursion into a loop
I don't like that "owned" thing in Mojo, because the caller of the function may not be aware that a copy of the string is being made! That's why I like that in Rust you have to explicitly call foo.clone() at the call site, making it immediately obvious to whoever reads that code for the first time what is going on.
They said that it transfers ownership but reverts to a clone under the hood if you try to modify it, unless you use the caret in the function call, in which case it strictly transfers ownership (at least that's what I understood).
So it seems like a quality-of-life thing, and it's up for debate whether implicit or explicit is better here
@@XxZeldaxXXxLinkxX this
It's always explicit when a copy is *not* being made, because you have to use the transfer (^) operator to move ownership. It is less clear when a copy is made vs. when it's an immutable borrow; you'd have to check the original function definition
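For contrast, a minimal Rust sketch of the explicit-clone point above (the function and names are invented for illustration):

```rust
fn consume(s: String) {
    println!("{s}");
}

fn main() {
    let foo = String::from("hello");
    consume(foo); // ownership moves; `foo` is unusable after this call
    // dbg!(foo); // would be a compile error: value moved

    let bar = String::from("world");
    consume(bar.clone()); // the copy is explicit and visible at the call site
    println!("{bar}"); // still usable
}
```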
31:17 The reason why Rust uses drop flags is actually really interesting:
If a value gets dropped conditionally (for example, when one branch has a drop(value) and another does not), Rust still has to keep track of whether the value was dropped by the end of the scope.
The solution to remove this overhead would be, rather than keeping track of whether the value was dropped, to always drop the value at the earliest point at which it is no longer used; i.e., insert a drop in the other branch, the one that does *not* call drop(value). (Since a partially deinitialized value can't be used anyway, that would be totally doable.)
The big downside is that this would make the deinitialization of values slightly non-deterministic, which is the reason this method was decided against.
(Though sometimes I think it would have been nice if Rust had just decided to always do the deinitialization as soon as possible, but there might also have been other considerations like code size.)
you would also have to special case guard patterns (e.g. MutexGuard)
@@nii-san5485 Not necessarily. As long as you don't drop/move the guard (for example by giving away ownership to another function), there is not really any reason for rust to drop the guard early.
But *if* you really wanted to always insert drops into the program as soon as a value is no longer used (I would call that non-lexical drops), you could just put an explicit drop at the end of the guard's intended scope, to make sure it's explicitly used and doesn't get dropped early.
If you want to know more about this whole topic, there is a blog article on faultlore called "Destroy All Values: Designing Deinitialization in Programming Languages" going in depth on this topic.
@remrevo3944 I never considered how conditional drops worked under the hood; thanks for linking that article. But I'm saying that with your suggestion of dropping after the last use, a guard would always be dropped instantly, unless an explicit drop was added or guards were special-cased to drop at the end of the lexical scope.
Which you did consider, now that I'm reading back. I still like "lexical" drop as it's a bit more intuitive, e.g. the value just "goes out of scope", which most programmers will already have a feel for 😁
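A minimal sketch of the conditional-drop case that actually needs a drop flag (the Noisy type is made up for illustration):

```rust
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

fn main() {
    // A runtime-dependent condition the compiler can't see through.
    let condition = std::env::args().count() > 1;
    let value = Noisy("value");
    if condition {
        drop(value); // dropped early in this branch only...
    }
    // ...so the compiler keeps a runtime drop flag to know whether
    // `value` still needs dropping when it goes out of scope here.
}
```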
I like Mojician. Also, not surprising that Chris Lattner is iterating on the intermediate language design from LLVM seeing as he was a core (founding?) dev on it for a long time.
Didn't Chris invent/create the Swift programming language?
@@vectoralphaSec Yep. It's hard to imagine anyone on the planet who is more qualified for doing this than him.
There is an additional difference between a semaphore and a mutex. A semaphore can be signaled by anyone who has access to it, while a mutex can only be released by the thread that locked it. A mutex is more of a monitor than a semaphore.
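In Rust terms (std has a Mutex but no general-purpose semaphore; tokio::sync::Semaphore is the usual async option), a minimal sketch of the mutex side of that distinction:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // The lock is tied to this guard and is released when the
                // guard drops, on the same thread that acquired it
                // (MutexGuard is not Send).
                let mut n = counter.lock().unwrap();
                *n += 1;
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("{}", *counter.lock().unwrap());
}
```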
That TCO example they did is so bad. Rust does have the issue of not being able to guarantee TCO, but that example just happens to get it, because stuff is never used or black_box'd. To demonstrate this, you would need a function that counts sheep down starting at n, prints a message for each sheep counted, and returns the total number of sheep counted. That function does not get TCO, because the addition is performed after the recursive call returns and its result is used, so the compiler does not automatically optimize it.
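A sketch of that sheep-counting function as described (the names are made up):

```rust
// The `+ 1` happens after the recursive call returns, so the call is
// not in tail position and cannot be tail-call optimized.
fn count_sheep(n: u64) -> u64 {
    if n == 0 {
        return 0;
    }
    println!("counting sheep #{n}");
    count_sheep(n - 1) + 1 // work after the call => no TCO
}

fn main() {
    println!("total: {}", count_sheep(5));
}
```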
I'm sorry but Mojician is hilariously awesome
Hell yeah, that's what I'm saying. Mojicians just sounds cool, at least to me. I like it better than Rustaceans and Pythonistas.
For those that don't know, C# can compile to native code just like Go. In terms of speed, C#, Java and Go are very similar for many tasks. For pure synthetic benchmarks it's C# > Go > Java. You have greater memory control in C# than in the competitors.
C# made colored functions
Umm no. It's not that optimized compared to Go. You can't use a lot of libraries because they aren't AOT-compatible. Like bro, I love C# but I'm not afraid of using another language.
@@peanutcelery Neither am I, but C# can give you full memory control while Go just can't. As time passes, many libs will support AOT. Minimal APIs already do it in .NET 8
C# is a failure as a language; it barely floats, and only because of business adoption and the Microsoft sales department. You are kinda right about Go > Java if you ignore most of the things that make Java interesting, and comparing Go and Java like that is also a pre-junior level of understanding. Did you know that you can actually tweak both Go and Java for your exact application? The Discord team actually wrote a guide about optimizing Go garbage collection, and you also have cgo to go beyond that. The idea that C# is something more than Office work and games for Windows is something only C# developers can have.
@@d1namis Your take on C# is very misleading. I don't know which field you work in, but C# is everywhere except ML. Yes, that's right, everywhere. There are many places in the world where there is more demand for C# devs than Java devs. That should tell you something. Go, while nimble, "easy", and easy to grok, has major problems, especially around FFI and FIPS compliance.
It would be a fool's mistake to write any high-performance application that needs interop with libs already written in C or C++ in Go. C# was made with this in mind! JNI is a terrible mess, which is why JNA exists for Java. C#'s threading model is just like Rust's, which aligns with C.
Have you ever written software that needs to be audited by the government? Well, guess which lang is easiest to pass? All because MS has first-party support for SO many things, which makes the std in Go laughable. You rarely need third-party libs. I'm telling you this as someone who has written software that runs on Air Force One.
Web dev is not all dev. Anyone who thinks C# is "dying" has been living under a rock for the past decade. I'm also aware of C#'s shortcomings too
I just checked the article and the benchmarks actually got updated
27:00
I believe Odin does that with its built-in vector types and matrices: able to get the power out of SIMD but have it implicitly built in. Zig is getting them as well, but Odin is technically 1.0 already.
It sounds like Mojo's copy-on-write is like Swift's. What Rust offers is contiguous memory layouts, which gain from cache hits.
IIRC Matlab and Octave (its open source counterpart) also do copy on write.
@@PRIMARYATIAS Copy on write is a convenience to the programmer that has runtime performance implications. It is not a bad tradeoff for many uses, but does impact ultimate performance.
I'd love to see Prime react to an interview Chris Lattner did on how Mojo works and what's happening behind the scenes. I don't think most people realize that Mojo is simply using Python as the syntactical glue for a completely different set of backend processes. To use a car analogy: if Python is a Toyota Camry, its syntax is the paint job and decaling on the outside of the car. Mojo is a Formula 1 car that uses the same paint color as Python's Toyota Camry, but it also has cool racing stripes and decals for the superset features/syntax. When I hear people talk about Mojo, it's as if they think the language is still a Python Toyota Camry with some aftermarket mods to make it faster, which leads them to think it can't possibly be as fast as something like Rust, which to them is a racecar. Nope, Mojo is an actual Formula 1 car with a familiar Python paint job. That's about all they have in common, which clears up why there's so much appeal. Mojo is basically telling Python programmers: you just need to learn a few new concepts and some additional syntax, and you'll be able to drive this Formula 1 car. What's even better is that it'll feel almost as easy to drive as your Toyota Camry, which you can still drive whenever you want.
mojo sucks for a few reasons
1. closed source
2. auth needed to use it
3. terrible setup on linux
if this is true it's DOA
@@madsen4617 DOA ?
He always reminds me of one of the voice actors in Elder Scrolls, especially when he speaks the way he does in the first second of this video
Bro I’m never forgetting what a mutex or semaphore is after that godly explanation. 13:00
@@harikrishnanb7273 13:00 :D
That "explain it to me like I am 4 because I am too dumb to be 5" had me crying
Modular pinned Primeagen who then passed Modular by reference.
He's saying tail call optimization isn't possible because of the scope-level deferred memory management. Meaning for that specific use, tail call optimization isn't possible in Rust. Whereas for Mojo, since they are not deferring cleanup to the end of each call's scope, it easily facilitates tail call optimization. In other words, Mojo provides tail call optimization for generalized use, whereas Rust only provides it in subsets where the scope does not require memory allocation cleanup at scope destruction.
This is also why the "drop" didn't fix the issue: it doesn't change the deferred scope destruction. In Rust, "drop" flags the allocation for scope destruction, effectively changing nothing.
1. Rust is definitely doing TCO in the example he showed, since there's no way the program could create ~1 billion stack frames without hitting the limit.
2. I don't see any reason why allocating heap memory in a recursive function would make TCO impossible in Rust, since converting a TCO-able function into a loop is literally just declaring the arguments as normal variables and putting the function body into a while loop, and the compiler can just put the cleanup step before jumping to the next iteration.
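A sketch of that manual conversion (the 336-byte allocation mirrors the benchmark shape discussed elsewhere in the thread; the names are made up):

```rust
// The recursive argument becomes a mutable local, the body becomes a
// loop, and the per-iteration cleanup (freeing the Vec) happens before
// the "jump" to the next iteration, exactly as TCO would arrange it.
fn recurse_as_loop(mut n: u64) {
    while n > 0 {
        let stuff: Vec<u8> = Vec::with_capacity(336); // per-iteration allocation
        drop(stuff); // cleanup before the next iteration
        n -= 1;
    }
}

fn main() {
    recurse_as_loop(1_000_000_000);
}
```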
@@OnFireByte I agree, I would very much like to better understand what's going on here. It could be the author is falsely attributing TCO failure to some underlying semantic implementation detail within Rust.
I guess a review of the generated code is the only way to know for sure what's actually going on there.
3. Drop flags only show up when you have a drop that can't be statically determined (e.g. if a variable is only dropped when a runtime condition is true). There are not going to be any drop flags compiled into that code, and explicitly adding an unconditional drop before the recursive call _should_ cause the implicit drop at the end of scope to be omitted.
@@GrantGryczan My understanding is that the drop always adds a bit flag and nothing more. The drop is then evaluated at the end of scope, which means in this case the drop remains regardless of the drop flag, resulting in no change, as the drop is already deferred to scope destruction.
In other words, the drop is saying: I want you to do what you're already planning on doing.
@@justanothercomment416 As I said in my last comment, that is incorrect; the drop flag is only needed in very few cases. Take a look at the official nomicon documentation on drop flags. It explains this well and is very short and easy to understand.
One thing that doesn't seem to get mentioned at all: Mojo is proprietary and not open source? That alone means it's a non-starter for so many projects and use cases that I highly doubt it will reach any kind of critical mass as a language to replace and/or supplement Python / C++ / Rust in the ML space (or any other space, for that matter).
Completely agree
Oh, yeah that's a deal breaker imo. But considering that most (afaik) Ai statisticians don't care too much about what lies under the hood (their code is abysmal at times), they might not really care about whether it's open source, but that also means they might not care about using MOJO either
I think they promised to open source it later.
I'm not holding my breath.
Also, if it compiles to an executable, that's a bad thing for AI. In Python I can take the code of llama and hack around with it.
When I download LLM models I always review the implementation (as most of them are (a) copies of llama, (b) small). If models were delivered as an exe it would be terrible: I run Linux, many AI researchers run Linux, but many LLM users run Windows, so who knows which executable should be uploaded to HF, and an .exe can't be reviewed as easily as Python code.
@@XxZeldaxXXxLinkxX They won't care, but the people developing the libraries they're relying on will care.
@@XxZeldaxXXxLinkxX they might not care but the one funding them will care
The fact that they won't have SWEs by their side to help them, since those do care, also matters a lot
I work with AI, I use both Python and Rust. I don't know Mojo (yet). This debate irritated me quite a bit - good debate, but I mildly disagree 🙂
We don't use Python for its speed! Python is a good language to "configure" frameworks like Keras, Torch, TensorFlow or Scikit, which are implemented in C++. Rust is a great replacement for that C++, not for Python. Will Mojo be that C++ replacement? I have doubts. Can you trust a language rooted in Python-like prototyping to write hardcore numerical libraries? I would need some more convincing. When somebody says it is 50% faster than Rust, that does not elicit trust; it just creates hype. On the other hand, to replace Python, Mojo would need library support comparable to Python's; why would you use it otherwise? Again, we don't use Python for its speed...
Funny enough, Rust's speed or safety may well not be the main reason to use it. I have started to rewrite some of my Python code in Rust not to gain speed, but mainly for the excellent type system, and secondarily for its ability to compile to wasm.
Agree on the type system, and add traits and pattern matching for me (I know, Python >= 3.10 has match too, but it feels like it was an afterthought). I like Rust's approach to writing software more, simply because of these language design choices (plus testing and examples). In addition, I get amazing speed and memory safety, which I welcome.
@@orestdubay6508 cry libtard, Rust is superior 🦀🦀🦀🦀🦀
and then there is the "eye hovering over the pyramid" category: Hardware Description Languages, like Verilog and VHDL.
10:35 actually got me there, Prime. Damn, "ruby is not a skill issue, it's just slow"
Phew xD, really messed with my head. Got me in the first half; had to recheck the video for the graph xD
If adding 15% learning to get 100x performance were an irresistible value proposition for migrating from Python, then everyone would already be writing Nim or Julia
Yeah, I'm quite lost in this Mojo vs Rust discussion. Which use cases are we talking about, which developers? Say we take seriously the claim that Mojo has hardware-level performance. Should BLAS and TensorFlow be reimplemented in Mojo? In that case, I don't think familiarity would be a strong selling point. If it's on the Python side of things, then most of the runtime is spent inside libraries anyway, so what kind of performance gain are we talking about here: instead of 7154 seconds, it will take 7127 (if we are generous)?
So the problem with tail-call optimization in this instance is that they added an extra semicolon; that's it. It works the same in C: tail-call optimization only happens when you're returning the final expression.
Also, the reason Vec::new() is faster is that the allocation gets optimized away.
I'm all about the idiot-matic... I write code, come back 6 months later and think: "which idiot wrote this? ...oh"
Then you rewrite it better, come back in 6 months, think "which idiot wrote this", and rewrite it back to how you had it the first time.
That’s me after 48 hours
Haha okay found my peer group in here
🤣👌
Why do you harp on Arc over and over? It is a way for multiple threads/tasks to safely share the same resource; it is not specific to Rust, other languages have that too.
It's because it's a tool that effectively steamrolls over the borrow checker. Yeah, there are legitimate uses, but you can just use it as a "fuck it, just take the damn variable". Using it introduces overhead and reduces performance
@@XxZeldaxXXxLinkxX You use it when you need to, i.e. if you want to access the same resource from multiple threads/tasks.
Yes, you can misuse it, but most technologies have tools that can be misused. My comment was about Primeagen's constant harping on Arc as if that were Rust's way of doing most data flow/access, which it is not.
@@maniacZesci he's not harping on Rust, he's harping on the people that do that (as a crutch). Like harping on the people that use "as any" in TypeScript. Just memeing, pretty much
@@XxZeldaxXXxLinkxX fair enough I might have missed that, not a big fan of reaction videos so I don't follow his channel closely.
If you compile targeting a native CPU, Rust will typically auto-generate SIMD code for you, which you can see on Compiler Explorer with quite simple code. It becomes more fiddly if you want something more platform-independent, or if you have dynamic input sizes, which always mean you get a couple of items left at the end of the array (the remainder from array_size / simd_block_size); then you need to write painful hand-cranked stuff. But if you know what platform you are running on and compile for it, you get most of the benefit without writing specialist code, just as Mojo does.
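A minimal sketch of that style of SIMD-friendly code, handling the remainder explicitly (the function is invented for illustration; try building with RUSTFLAGS="-C target-cpu=native" and inspecting the assembly):

```rust
// Fixed-size chunks give LLVM straight-line inner loops it can
// auto-vectorize; the remainder loop handles the leftover items.
fn sum_squares(data: &[i32]) -> i32 {
    let chunks = data.chunks_exact(8);
    let remainder = chunks.remainder();
    let mut acc = 0i32;
    for chunk in chunks {
        for &x in chunk {
            acc = acc.wrapping_add(x.wrapping_mul(x));
        }
    }
    for &x in remainder {
        acc = acc.wrapping_add(x.wrapping_mul(x));
    }
    acc
}

fn main() {
    let data: Vec<i32> = (0..100).collect();
    println!("{}", sum_squares(&data));
}
```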
I don't understand why anyone would learn Mojo when you have Nim. I mean, it's the same idea: Python-like syntax but compiled. It's even faster than Rust in many benchmarks, and the ecosystem is more mature.
47:30 I think "Would you switch to TypeScript if introducing this new syntax allowed it to run 100x faster than JavaScript?" would be an even closer analogy. And even I would stop writing vanilla JavaScript if TypeScript were actually faster.
28:57 What he might have omitted there is that RAII does not necessarily mean heap allocation. Basically, if you are not using new, the memory for the object will be allocated on the stack. And allocating on the stack is one instruction, freeing is one instruction, no matter how many objects are allocated in the function. So this is far better than GC (if you forget that C# allows putting structs on the stack). On the other hand, yeah, malloc/free can be slower than a GC in many cases.
I have no idea what's going on in this video, but I find it fascinating
Correct, C++ is garbage collected, kinda... There are these things called smart pointers that use the constructor/destructor paradigm to automatically delete on scope exit
Rust also has that; it's Rc/Arc.
Definitely not a traditional GC like in the GC-tier languages (mark-and-sweep collectors that need to stop the world), but yeah, you could say that
@@OnFireByte I'm not saying Rust doesn't have it; I'm just saying calling C++ a manual-memory language is wrong if you follow best practices (which is to not use raw pointers unless you don't transfer ownership).
That's a lie.
If C++ had GC, it would be possible to make the equivalent of Python's
class GraphNode:
    linked: List["GraphNode"]
Impossible in C++. You need to manually come up with an explicit strategy for who owns what in a graph and clean up the memory.
* You can't use unique_ptr, because many nodes can link to the same node.
* You can't use shared_ptr, because graphs have cycles; if you have A->B,C and A goes out of scope, B and C survive.
* It's impossible to use only weak_ptr, because somebody needs to hold a non-weak ref.
So you need to manually make a graph class to handle ownership, because C++ has no GC.
You don't need to do any of that with a GC. "My program doesn't leak memory, kinda" doesn't count when it kinda does.
@@AM-yk5yd No it's not; they're called smart pointers.
@@AM-yk5yd It's just how you define GC. Many people consider reference counting GC, because they define GC as any system that automatically and safely deallocates memory at runtime. But yeah, RC isn't GC if you say a GC needs to be able to deallocate everything that isn't reachable from the root set (tracing GC). It's just definitions anyway
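The graph-cycle point translates directly to Rust's reference counting; a minimal sketch (this GraphNode is invented to mirror the Python example above):

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct GraphNode {
    linked: RefCell<Vec<Rc<GraphNode>>>,
}

impl Drop for GraphNode {
    fn drop(&mut self) {
        println!("node dropped"); // never printed for the cycle below
    }
}

fn main() {
    let a = Rc::new(GraphNode { linked: RefCell::new(Vec::new()) });
    let b = Rc::new(GraphNode { linked: RefCell::new(Vec::new()) });
    a.linked.borrow_mut().push(Rc::clone(&b));
    b.linked.borrow_mut().push(Rc::clone(&a)); // cycle: a <-> b
    // Both handles go out of scope here, but the cycle keeps both strong
    // counts above zero, so neither Drop runs. A tracing GC would collect
    // this; plain reference counting (Rc/shared_ptr) cannot.
}
```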
If you are worried about speed, the language is probably the last place you should be looking, especially as a web developer. If you are that concerned, you aren't going to be swapping out languages as a fashion statement, especially when 50-year-old languages will do the job and have been doing the job for those with those concerns.
I will never understand some developers. I sometimes think that they really would be more comfortable in a congregation than pretending to be an engineer.
Not all do webdev. Mojo is not for webdev. It's for AI and scientific programming
@@kinomonogatari Oh no I know. I mean that seems to be primarily what the Primeagen does ... isn't it? If I am wrong then ignore it.
@@sacredgeometry Yes, Prime does webdev. Mojo is exciting for me though, as a physicist, because I have a lot of gripes with the current tooling at our disposal. Numpy, Numba etc. are all excellent, but I believe that Python is not the right tool for high-performance scientific software. That has always been C/C++ and, of course, Fortran. So when I'm asked to build complex models in Python from scratch (because that's what the community is accustomed to), it's a pain to make them as performant as those compiled languages. That's why I started looking towards Julia and intend to use it as my primary language for my own scientific development until Mojo becomes widely available/open source. And when it does, we'll see if it is indeed better or not. But if it goes the proprietary MATLAB way, then Julia is our best bet.
If you are worried about raw performance/latency, you ARE limited to high-performance languages like C/C++/Rust/... If you are programming tight real-time control loops or even a game engine, you just can't afford to run a garbage collector (Java) or a slow interpreter (Python). Python is awesome, but if I can I will use the C backend of a library, as it can be 100x faster (protobuf is a good example)
@@robstamm60 Absolutely. Time- and performance-critical software exists, but as I said: the people writing game engines aren't constantly hunting for new languages.
Almost all of the embedded developers I know think the overhead/abstraction of C++ is too much and that C is perfectly well suited to their jobs.
They aren't looking to replace 30+ years of experience every few months to hop on the new hype train.
I honestly LOVE that Mojo programmers are called Mojicians. It just sounds cool. A programming magician. Honestly sounds better than Rustaceans and Pythonistas.
GC requires indirect access. Direct allocation/deallocation can cause fragmentation. Rust tends to have larger contiguous structs than copy-on-write memory management. Explicit memory management can run with far smaller memory usage.
One more important thing to realize: if you know Python and have learned Rust, you are closer to learning Mojo, because Mojo is also introducing features from Rust like ownership and borrowing. Adding such features will have a skill-issue impact on Python developers interested in learning Mojo, because at the least you need to learn those concepts before using them.
To be honest though, Ruby also comes with a JIT compiler (a recent new feature) so it can be sped up if needed.
Forget about the AI buzzword bingo: if Mojo becomes a general-purpose language that can be compiled and still interact with the Python ecosystem (even if the library calls have to be interpreted and GC'd, obviously), it would still be a win for me! Yes, maybe their claims about performance are false, but if it is good enough, at least as fast as Go, and supports all normal Python features (even if, for example, structs are typed while Python classes are untyped but can still have things like inheritance), it would still be the optimal language, maybe not for AI developers, but for the average web/backend/enterprise developer.
Has he already covered the Julia language with the community?
About vec![0; 42]: it actually allocates and memsets 42 elements, so length equals capacity and the very next push will reallocate. with_capacity only allocates, so as long as you push no more than the capacity, you're guaranteed not to reallocate.
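A quick sketch of that difference (the assertions reflect documented Vec behavior):

```rust
fn main() {
    // vec![0; 42]: allocated AND zero-initialized; len == 42 already,
    // so the very next push has to grow the buffer.
    let mut filled = vec![0u8; 42];
    assert_eq!(filled.len(), 42);
    filled.push(0); // reallocates once len exceeds capacity

    // Vec::with_capacity(42): allocated but empty; pushes up to the
    // reserved capacity are guaranteed not to reallocate.
    let mut reserved: Vec<u8> = Vec::with_capacity(42);
    assert_eq!(reserved.len(), 0);
    assert!(reserved.capacity() >= 42);
    for i in 0..42 {
        reserved.push(i); // stays within the reserved capacity
    }
}
```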
"Future proof for 50 years" sounds like a dumb prediction, that includes a ramp up and ramp down of usage like we've seen with C, and by the time those 50 years are reached (if that even happens) a new and better language will have been developed.
There is nothing lost by learning the language, especially if you already use python, but it's a hyperbolic statement.
The point Prime made at @19:24 about ownership being orthogonal to the type is actually quite good. I wish Rust did this the other way around.
It seems they fell into a trap trying to make references similar to C++ references.
They could have required you to say things like `ref` and `owned/copy/clone`,
and also removed the idea of implicit copies, requiring you to always .clone() something.
You should write an LSP in C++
1. you can just quote the command `hyperfine "node src/index.js"`
2. I think the point about dropping is that the drop for the Vec is placed after the recursive call, so it's preventing the TCO?
What was he saying at 20:35 - 20:43? Sounded like a lot of annoying beeps. Is he a robot? Love the content!
They've actually changed the part of the article about tail call optimization.
It might be worth rereading that part.
great we needed one more language, can't wait for the next one
To prove the TCO example, you could write a for loop that allocates the vector the same number of times. I mean, if the idea of TCO and TCE is to make recursive algorithms work like iterative ones, then this should be a fair example of the advantages of having that optimization.
My understanding is that since stack variables are eagerly destructed, every time you stop using a variable the stack pointer decrements, so when you get to the end of the function, your next stack frame starts where the old one was. This improves locality and you can work exclusively in cache, making the Mojo version significantly faster; you're playing with registers at that point.
If we start to account for skill issues, then Java can be as fast as Rust/C++ or even faster (after warmup), because with enough skill you can write garbage-free code and make manual memory allocations/deallocations. And the part that can make it faster is JIT optimizations, which can be done for the current specific use case, like loop unrolling or operation reordering, which C++ or Rust simply cannot do, because they don't know how the code they produce will be used each time you run the program.
If Mojo can have Pydantic-style data structs with validation, HTTP libs for serving and posting, database connectors, and a Kafka connector or something, in addition to the AI stuff in the standard library, it could potentially be THE lang for AI-powered web
I almost ended up with an Arc<Mutex<HashMap>> but you saved me from that mistake a few days ago lol.
One thing I don't like about Rust is that it's not predictable what the compiler does in many cases in terms of optimization. I frequently find some weird not-optimized-away case here or there in the user forum. But I guess this improves over time.
Mojo is proprietary and not Python. Codon has the same issue. The skill issue with effective SIMD programming is not a syntax issue. If a programmer has the intelligence to program with SIMD, GPUs, lifetimes, and manual memory management effectively, they can certainly overcome superficial syntax differences such as indentation vs curly braces.
When thinking of memory management techniques for AI, borrow checking seems like a generally bad fit, since it is often paired with the general heap allocator. Arena (bump) allocation probably makes more sense for performance. Languages like Zig/Odin/Jai have better deterministic memory management control. Rust is not flexible when it comes to manual memory management, although it is certainly just a few "unsafe" blocks away from hacking something together.
If safety were the only concern, then people wouldn't have been trying to replace C for decades. It's a little bit more complicated than that. The best joke about it is that the essence of computing is the separation of Church and state.
Modular makes Mojo; of course they will say whatever makes Mojo look more relevant
I'm excited about Mojo's research affecting other languages and making them rethink their foundational types, like Rust did to C++. It's not like Mojo has access to a supercomputer; at the end of the day all languages have to compile to the same asm, and Mojo's "research" and their new ideas will benefit all languages in the future.
I think Mojo is going to enjoy being new for some time, but later C++ will catch up to it by embracing only the good experiments that Mojo showed are worth implementing.
For profit company with a bunch of cliché hype-generating moves and "future plans"? Stroking the ego of industry influencers without really saying anything? They're just trying to attract investors, lol.
Very credible people, though. Plus, AI is THE industry that will grow in the next decade
@@evergreen- They shouldn't lean on misleading advertising if they want to remain credible. It makes my vaporware detector go off with big red flashing lights.
@@experimentalcyborg You wanna tell me that Java was advertised honestly? Or JavaScript (ECMAScript)? What about "blazingly fast"(C) Rust? No, Modular is upselling their Mojo language to get people to try it.
Eventually, Mojo 1.0 will be released, people will use it and we will see what it’s actually worth.
RAII was something I fell in love with in the early 2000s, but I've now grown more skeptical of it. It tends to mix the responsibility of ownership with other responsibilities/operations. A simple example would be sockets: do you want a wrapper that handles both the lifetime of and the operations on a socket, or do you want them separate, so you can take advantage of select/poll/etc.? I'm leaning more and more towards explicit lifetimes and using "views" for operations. A string should be a view; a stringbuilder is the explicit lifetime handler.
I think it works similarly in Crystal. They have both StringBuilder and String, and a String can never be changed unless copied.
The "people writing Python aren't gonna move to rust if mojo becomes a thing" isn't true I think (saying that as one of the people in that domain that actually writes rust right now). Sometimes the problem with python isn't speed but correctness - there's definitely been insitances where I couldn't be confident in the python code doing the right thing; that I haven't missed some edge cases etc., and from what I heard mojo does hardly improve on python in that domain. Mojo may take some use away from rust but it can't replace it - even in the ML / AI domain
What leads to this correctness? Obviously not memory management (since Python has automatic memory management), so it isn't Rust's ownership and borrowing. Is it simply the existence of strong typing? Mojo has strong typing if wanted (and it is often required for high-performance Mojo code). Is it the more ML-ish features of Rust (its powerful enum type and pattern matching)?
Genuinely curious what you think leads to this gain in correctness.
@@brendanhansknecht4650 Rust has inherited a lot of ML-isms (as in SML, not AI), basically stuff like algebraic data types, Hindley-Milner types, optionals, etc. that allow you to encode a lot of extra information and guard rails into the type system. Mojo can't have this because it would break compatibility with Python-esque stuff on a fundamental level.
@@brendanhansknecht4650 I think it's that it's generally very explicit and ekes out edge cases, and that it's strongly and *statically* typed, yes; and that it has quite an expressive type system. I'm not going to accidentally put a "regular" unsigned into a place where a nonzero one is required, for example; I can make algorithms that fail with NaNs take a floating-point type that doesn't have NaNs, can use sum types where they're a good fit, ... Python is of course also strongly typed, but the dynamicism takes away a lot.
Regarding the memory management: if you get into writing more optimized Python, you actually start to care about memory management even in Python. I feel like there's not really a lot, if anything, gained here with Python over Rust.
@@SVVV97 Cool. So those are the same reasons why I'd say I prefer Rust over Python.
That said, people who write Python are a gigantic market. Most of them aren't in the same boat. I think for most people who write Python, Mojo is much more interesting. Assuming Mojo is complete, it would give them:
1. Instant performance gains without changing their code at all
2. A way to add strong static types. On top of that, adding types increases the performance even more.
3. To the Python people I interact with: they don't understand the benefits that come from ML. They have never used a nice sum type, so they don't know what they are missing in Rust and other ML-descendant languages. That said, I do hope that Mojo adds good ML-style types and pattern matching to Python. I would be super happy if they just copied the Rust enum type or similar.
4. Assuming Modular as a company is successful, it also gives them access to state-of-the-art machine learning tooling
All of this while only incrementally changing their Python code. I think for most people I know who program in Python, that is a way bigger sell than Rust. Rust isn't something they are considering learning; it is just something they hope someone else learns so that they get nicer libraries.
Anyway, all this really just to point out the target market of Mojo, which is quite large (because the Python ecosystem is huge). I think it only lightly overlaps with the Rust market.
Aside: I don't fully understand Mojo's memory model, but it has ownership, borrowing, and no GC. That said, if I understand correctly, it will have to fall back on reference counting more often than Rust.
@@brendanhansknecht4650 "1. Instant performance gains without changing their code at all"
Doesn't seem to be that true, at least not unqualifiedly true. There might be some cases where that happens, particularly where Python's design makes things excruciatingly slow (e.g. loops), but all the examples they have of Mojo going blazingly fast (TM) use the new syntax.
"2. A way to add strong static types. On top of that, adding types increases the performance even more."
That seems to be built upon python's type annotations, which is understandable, but those are kind of a bad fit for python in general due to their strongly nominal nature in a language that's structural to an extreme. Getting those type annotations right is often non-trivial for this reason, and I don't see what mojo is doing to improve on that. They should've gone with something like C++ concepts or Rust traits instead, that is, syntactic and semantic constraints on types rather than explicitly named types, in most cases.
"3. (...) They have never used a nice sum type."
Related to the above, seems like a bad fit for such a structural-heavy language.
"cause the python ecosystem is huge"
I think it remains to be seen how much of an advantage that really is in the end. I suspect people will find that python's dynamic features will make moving to Mojo harder than might've been anticipated from the sales pitch.
There's a "tailcall" crate which adds an attribute to functions.
I think Primeagen misread that as "Explain that like I'm 5 years into a Computer Science program"
The Vec::with_capacity allocates the vector, whereas the Vec::new one is removed by the compiler because it's never used. I do not know Rust, but from a general compiler viewpoint this would be logical. Rust might even "zero" out the memory allocated by with_capacity.
Mojo seems to just wait to allocate until a value is pushed to the vector, meaning it never allocates any memory for the vector in the given example.
Honestly, I would say 70% of Python developers don't want to use or learn another language and don't have the skills to code in C++, C, Rust, C#, or Java.
If there is one person qualified to make a new language, it's Chris Lattner.
These discussions are better approached with a very fresh perspective and a very open mind. That's the only way to properly evaluate them and truly understand the essence of the point the other side is making
respect to mojo for using "fn" instead of "def"
It actually has both: def remains the same as regular Python, while fn gets the new Mojo semantics & optimizations
@@serena_m_ Which, in my humble opinion, is worse. Sorry, but having two different ways to write functions will be so confusing. They also have two types of objects, the standard class and structs. I think this is messy and will make things more difficult for people coming from Python.
fn is bad because it's the nth element of the f sequence. Are we paying by the character now?
I do actually like Mojo's philosophy, particularly when it comes to ML. The ability to load Python's modules is also quite a strong selling point. I'm going to give it a try for a project I'm working on.
Will not use a programming language that requires my email address to install
At 17:25, did the article confuse move semantics and copy semantics? Rust moves by default to avoid copying the string. In the case of a copy, the original foo would still be available for dbg!(foo) and there wouldn't be a compiler error. Primeagen should point this out.
I don't know why all the talk about this language focuses on perpetuating the lie.
"It's Python but fast."
It's probably fast, but it's not Python.
It's the skeleton of Rust with the skin of a certain snake...
9:12 And *_Odin_* !!! And *_Ada_* !!!
How dare you forget about them, @ThePrimeTimeagen ? 😄
He keeps forgetting about GingerBill. 😆
I have no clue what any of this means but I can't stop watching.
Frankly, the whole who-is-faster debate: don't give a flying fuck. As someone who was probably going to be programming in Python or MATLAB for her entire career, I just see Mojo 🔥 as an absolute win.
And that's really what their pitch should be: "hey Python devs, ready for a language that is written the same way as the one you already use, is 8x faster with exactly the same code, and can do more once you learn the arcane runes?"
Mojo is going to end up like Julia, where it's mostly a meme and you only get fast code if you spend a bunch of time fussing around trying to wrangle the runtime to do what you want
no such thing as free lunch
also the focus on tail call optimization as a selling point is kinda meme-worthy in and of itself
nobody who’s serious about performance is using recursion and relying on TCO to begin with, and if they are it’s because what they’re doing couldn’t be translated to a for loop without extra memory anyway
Julia would have been ok if it could properly statically compile.
@@yevgeniygrechka6431 I agree. Right now Mojo is doing it right: a slower REPL for dev, and static compilation for running the code. No idea why the Julia devs bet on a purely dynamic language with a JIT. Sure, it gives you nice features, but it makes it worse than Python for small tasks, which are most of the tasks you do.
Hopefully less buggy though. Also Mojo is kind of already winning by having 0-indexed arrays.
The article has been updated; one more reason to trust Modular: they react to community feedback and own up to their mistakes.
Mojician > Rustacean
~40:00 Maybe Rust, in the current iteration of the compiler in release mode, can detect that the vec is not used and DCE it out of existence. A more proper test would be to do something with the vec. Honestly, at that point it'd be necessary to look at the assembly.
(I'm not a fan of their copies; it seems like it could create a lot of headaches if some things are copied deeply and some are not, but hopefully they thought of it and there will be no auto_ptr 2.0, but for every copyable type.)
42:52 My guess is that, because they delete objects as soon as they aren't in use, and because the vec's never in use, they never allocate until you write code that uses it.
As a mainframe coder I used Java, several other packages, assembler, COBOL (note it can talk to Rust code), etc., C, C++ (overloaded, virtual memory problems)...
Java to talk to Rust, Go to manage HTML nonsense, etc., ...
Whenever you consider performance, you also have to consider the optimizations you never had time to do - and the slow "good enough" implementations that you don't have time to improve.
This is why I advocate for Python and am super excited about Mojo. Mojo will be 10x faster than Rust/Go (in individual real-world applications) if it improves the speed of iteration, the number of developers who can contribute, and the readability of the code.
I think the main thing keeping me away from Rust right now is the community calling themselves Rustaceans.
Btw, did anyone notice that the background replacement treated his gray hair as background and let the backdrop show through, probably because of the hoodie?
From the official Mojo manual: "Mojo uses a third approach called “ownership” that relies on a collection of rules that programmers must follow when passing values. The rules ensure there is only one “owner” for each chunk of memory at a time, and that the memory is deallocated accordingly. In this way, Mojo automatically allocates and deallocates heap memory for you, but it does so in a way that’s deterministic and safe from errors such as use-after-free, double-free and memory leaks. Plus, it does so with a very low performance overhead."
So it's much closer to Rust than Java or J# or JS.
Two words: time dilation. There's always a sense of relativity. Use what works. The time differences between Rust and Mojo are going to be too close, and you won't really lose, except maybe if you are targeting a workload where one has an advantage. Mojo will probably target ML problems and solve them with simple solutions. Whichever you use, the other can be its good-looking sibling.
46:15 Prime agrees with Prime