Phenomenal video! I really like your way of explaining things. It strikes a perfect balance between clear and fast-paced, not assuming a ton of background knowledge, but not insulting the viewer's intelligence. Very well made.
Looking forward to seeing these points being elaborated. I have a hard time visualizing all the optimization benefits mentioned. Thanks for making this
Excited for the next vid!
Content by Alexis is always great :)
I think a lot of eager evaluation supporters will not find code motion optimizations compelling because a complexity-improving optimization should be in the hands of the programmer rather than the compiler, according to them.
Even if that view is accepted, there are still nice assembly level optimizations that require reordering things in order to get parallel execution or good speculative execution or memory prefetch, though.
I don't see how eager evaluation doesn't also get those benefits. It's not that laziness allows you to reorder execution, it's that it forces side effects out of computation. Having a separation of side effects makes it trivial to reorder execution. To me, laziness is making the compiler's job easier, not mine.
@SquishAndFlick That's not what laziness is. As for your last point, the more the compiler can do and assume, the less you have to do. So yes, it does make your job easier, because instead of doing the computer's job and micromanaging everything (like needing to put `take` before `map` to be performant) you can actually focus on solving problems and being an engineer.
Haskell's laziness _is_ letting the programmer choose. If I want to do (Data.Set.fromList . map (read :: String->Int) . words $ "8 3 6"), I'm not forced to make multiple different singly linked lists. If I did want one, I could build it with deepseq.
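To make that concrete, here's a minimal sketch of both versions (assuming the containers and deepseq packages): the lazy pipeline never needs whole intermediate lists, and deepseq forces one if I really want it.

```haskell
import qualified Data.Set as Set
import Control.DeepSeq (deepseq)

main :: IO ()
main = do
  -- Lazy pipeline: the intermediate lists from `words` and `map read` are
  -- consumed as they are produced; nothing forces whole lists into memory.
  print (Set.fromList . map (read :: String -> Int) . words $ "8 3 6")

  -- If I really do want the intermediate list, I can force it in full first:
  let xs = map (read :: String -> Int) . words $ "8 3 6"
  xs `deepseq` print (Set.fromList xs)
```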
Functional programming videos by Alexis lets gooo
Hey, great video! Excited for the rest of the series
I am looking forward to see where this is going. Keep the videos coming! :)
This makes me think it would be interesting to have a programming language where you can write multiple statements on one line separated by semicolons, and in that case their execution order is unconstrained, but when statements are on separate lines the execution order is constrained to be top-down. Or something similar, like ending statements with colons to mark their execution order as unconstrained.
Great video! I'm looking forward to the rest of the series!
Beautifully explained and presented. Thank you ✌️
Haskell *and* prolog. Oh my!
I’ve been waiting for this since you alluded to it in the last video
Excellent video! the technical theory of haskell is super fascinating and you explain things super well!
Good stuff, thanks! I was also curious about the semantics of a non-lazy Haskell.
In many cases the biggest performance gain comes from parallel execution, for those programs which can take advantage of it, and parallelism arises naturally out of eager evaluation.
The dual of "lazy" is not "strict" but "eager". "Strict" and "non-strict" are properties of semantics [affecting the result of the program]; "lazy" and "eager" are properties of the evaluation process [affecting the run-time resource consumption of the program].
This is only partially true. Yes, “eager” does refer to the operational aspect of evaluation, and yes, “lazy” is the opposite of that term in that context. For example, we might talk about “lazy initialization”, in which case we’d be using the term in that way. But “lazy” does *also* refer to a type of semantics, and laziness is a specific type of nonstrict semantics. Haskell is not just nonstrict, it is specifically lazy, and I will discuss this to some extent in Part 2 and even further in Part 3.
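To make the distinction a bit more concrete, here is a tiny sketch (using Debug.Trace only as an impure hack to make evaluation observable): non-strictness says the first call can return 1 even though its argument is bottom, and laziness (call-by-need) additionally says a shared thunk is evaluated at most once.

```haskell
import Debug.Trace (trace)

main :: IO ()
main = do
  -- Non-strict semantics: the argument is never needed, so this prints 1
  -- rather than diverging on `undefined`.
  print (const (1 :: Int) undefined)

  -- Lazy (call-by-need) rather than merely non-strict (call-by-name):
  -- `x` is a shared thunk, so "evaluated!" is printed only once even though
  -- `x` is used twice.
  let x = trace "evaluated!" (2 + 2 :: Int)
  print (x + x)
```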
From what I understand of your description, "laziness" refers to delaying evaluation but doesn't change the order relative to other expressions. Therefore it's only an optimization and doesn't change semantics.
But that's only true because Haskell is pure. If the arguments of a function call contain side effects, e.g. f (print "hello") (print " ") where f x y = print "world", then the evaluation order of arguments and function calls, and how many times a parameter is used, all affect semantics.
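For contrast, here is roughly what that example looks like in actual Haskell (a minimal sketch): the "effects" are just IO values, and since f never runs them, only "world" is printed regardless of evaluation order.

```haskell
-- The arguments are ordinary IO values; f simply ignores them, so the
-- actions they describe are never executed.
f :: IO () -> IO () -> IO ()
f x y = print "world"

main :: IO ()
main = f (print "hello") (print " ")
```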
I leave my original comment as it was, for history, but I stand corrected by @tweag. My mistake was not knowing that 'lazy' is a semantics too. I should have known.
It's likely I've misunderstood, but wouldn't an eager-by-default language that tracks side effects also be able to reorder things in the same way?
Excellent, thank you, this was exceptionally informative
From your argument it seems like, if the compiler knows a function is pure, it can reorder the code to make it faster.
But where is laziness in this situation?
The laziness is in not performing paths that aren't required. E.g. with «let fact x = product [1..x] in map fst $ map (\x -> (x+1, fact x)) xs», there's no need to build tuples or evaluate fact. You may know that as dead code elimination, but note that it's eliminating entire data structures.
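A runnable version of that snippet (nothing beyond the Prelude), where the factorials are never actually computed because nothing ever demands them:

```haskell
fact :: Integer -> Integer
fact n = product [1 .. n]

main :: IO ()
main = do
  let xs = [1 .. 100000] :: [Integer]
      -- Each tuple's second component is only a thunk for `fact x`; since we
      -- only ever look at `fst`, those thunks are never forced and the
      -- factorials are never computed.
      ys = map fst $ map (\x -> (x + 1, fact x)) xs
  print (sum ys)
```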
I was a longtime Haskell user before switching to Rust about 6 years ago. I look forward to your arguments, but I'm extremely skeptical.
I've never missed infinite structures. Honestly, the only real benefit I miss from laziness is forcing Haskell's type system to be awesome.
Are you using infinite iterators instead?
@@ngruhn I don't use infinite iterators as I haven't encountered a need for them (maybe someday). I generally find laziness to be a solution in search of a problem.
I never had a Haskell job so I never got to use it on “real world” problems. But for Advent of Code problems I find infinite data structures quite useful. You can often decouple the construction of a search space from searching through it. Usually you'd have to do both at the same time.
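A minimal sketch of that pattern (the predicate is made up): the search space is an infinite list, and laziness means only the prefix up to the first answer is ever constructed.

```haskell
-- Infinite search space, built independently of how it will be searched.
candidates :: [Int]
candidates = [1 ..]

-- Some predicate we happen to be searching for.
isAnswer :: Int -> Bool
isAnswer n = n * n > 2024

main :: IO ()
main = print (head (filter isAnswer candidates))  -- prints 45
```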
Rust is a systems language and isn't FP. The two aren't comparable at all. He didn't even mention infinite structures afaik, but they're extremely useful lol. Like half of math is infinite structures. Also not needing to make sure you micromanage the compiler and tell it to put take before map for performance... and not needing to differentiate between data and codata. Sounds like you did nothing with Haskell at all.
@@SquishAndFlick"I find laziness to be a solution to a problem" more like you don't understand cs or software engineering and anything above your level you deem useless while simultaneously doing little to actually elevate yourself and learn why laziness is good. There's maybe 1000 papers on it out there but here we have the best programmer in the world who knows everything while apparently not being able to even come up with a benefit of laziness 😂. It's hilarious too because C++ devs say rust is a solution to a problem. None of you people belong in this industry
GREAT STUFF
I’m really looking forward to this series!
In the example, something you could do to tell gcc that reordering is okay is to mark `bar` with `__attribute__((pure))`. If you view this attribute as describing the “type” of bar (pure has type Int -> Int, impure has type Int -> IO Int), then wouldn’t you get similar reordering benefits?
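The Haskell side of that analogy, as a minimal sketch (barPure/barImpure are made-up names): calls at the first type can be freely reordered or shared, while calls at the second are sequenced by IO.

```haskell
-- The pure reading of `bar`: GHC may reorder, share, or drop calls freely.
barPure :: Int -> Int
barPure x = x * x

-- The effectful reading: the observable output pins down the ordering.
barImpure :: Int -> IO Int
barImpure x = do
  putStrLn ("bar called with " ++ show x)
  pure (x * x)

main :: IO ()
main = do
  let a = barPure 1 + barPure 2   -- no prescribed order between these calls
  b <- barImpure 1                -- this line's output...
  c <- barImpure 2                -- ...always precedes this line's output
  print (a + b + c)
```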
What font are you using?
You are mixing up the differences of pure vs impure with those of lazy vs eager (or you are just explaining it in a confusing way). The flexibility of code reordering you talk about is mostly due to purity: a pure eager language can reorder expressions too. IMO built-in laziness (deferred evaluation of function arguments) is mainly useful for algorithms written in a (possibly infinite) depth-first recursive style where the 'infinite' recursion terminates quickly in practice. The programmer also has to have some heuristics (not usually formally expressed) about the actual termination behavior of the algorithms used, to avoid accidental infinite/non-terminating recursion. IMO this is only occasionally useful, so a primarily pure, eager language with support for lazy data structures gives a clearer understanding of the algorithmic structure of the program/library than a 'laziness everywhere' language.
The lazy language may look more 'mathematically clean', which is appealing to math nerds, but the hidden complexity can easily bite you.
No he's not; you just don't understand what he's saying and are babbling a bunch of false information that he's basically telling you is wrong, or at least short-sighted. And by "appealing to math nerds" you mean people who actually understand CS and software engineering and like to solve problems instead of moving memory around and doing the job of the compiler? Having to micromanage everything is not engineering; that's nerd stuff for people on the spectrum. That's why low-level developers who use C tend to be special.
Looking forward to part 2! One minor nitpick, though: you claim that the compiler can reorder calls to functions without violating the semantics of the program if they are pure. This is not 100% true - they also need to be total. However, I have the feeling that that's part of where you are going in the next video :).
Is that true? Swapping the order of evaluation of two pure expressions, exactly one of which evaluates to bottom, gives an overall evaluation to bottom in both cases, with no side effects to reveal which happened. (They are operationally different of course, but give the same semantic outcome.)
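A tiny sketch of that point (good/bad are made-up names):

```haskell
good :: Int
good = 42

bad :: Int
bad = error "bottom"

main :: IO ()
main =
  -- (+) may force its two pure operands in either order; since exactly one of
  -- them is bottom, the result is bottom either way, and no side effect
  -- reveals which operand was forced first.
  print (good + bad)
```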
@jeffclites743 You're right. I wasn't clear enough in my original comment (which I should have been, given that I was being a bit pedantic). The claim in the video (or at least how I interpret it) is that if we have two functions, and we know that one is pure, then we can reorder them. This is what I was trying to say was false. It is in fact only true if either the other function is also pure (the case you provide), or one is both pure and total, in which case the other is allowed to have side effects.
@@ConnorRedfern-g9w Ah yes that makes sense. In a way, nontermination is an effect, or at least it's something observable.
On the other hand, a function can be pure and total but take arbitrarily long to complete, so in practice you can't tell the difference between nonterminating and excessively long-running. I think that to be really precise with all of this, you need to precisely define what counts as observable and what you are allowed to "care" about. So you could decide that a nonterminating program is broken and that it doesn't matter when it hits an infinite loop relative to other observable effects, or you could decide that it does matter. (I could imagine an argument for either approach.)
@ConnorRedfern-g9w Being a pure function requires being total. In fact, being a function at all does, but without totality you don't get referential transparency. You also need determinism. No idea how you people got to this level of understanding and still don't know what a pure function is and think it just means side-effect free.
@AndreiGeorgescu-j9p Yeah, not sure how I missed this... I think my brain was in "Haskell mode", which conflates "pure" with "does not have a monadic return value" rather than "as in set theory" (yeah yeah, this is bad, I know).
On your first point, doesn't being referentially transparent mean that a function could be replaced with its return value and not change the semantics of the program? In the case of nontermination I would say that this is either poorly defined or you'd need to put in a "bottom" value (from domain theory). In this sense, a nonterminating function is referentially transparent (you can in fact replace it with bottom). There could be some more formal definition of referential transparency I don't know about that proves me wrong, though.
Nice material, although it would be good to show the assembly generated during compilation with different optimization flags. Also, Alexis could speak slower :)
lazy is as lazy does >>=
finally, i can be lazy...
Great explanation. However, I do think that if Haskell is to remain consistent in its philosophy, that is, strict and safe, it would be better to have sequential execution order as the default. To enable optimizations that disregard the order of execution, one should reach for some explicit construct/built-in types instead.
The suspense is unbearable ;)