Discussion: Iteration

Jonathan Blow

มุมมอง 15 550

เพิ่มลงใน
- เพลย์ลิสต์ของฉัน
- ดูภายหลัง
แชร์

แชร์

ฝัง

ขนาดวิดีโอ:

แสดงแผงควบคุมโปรแกรมเล่น

เล่นอัตโนมัติ

เล่นใหม่

เผยแพร่เมื่อ 26 ส.ค. 2024

ความคิดเห็น • 48

@Xavier-cd6fx 6 ปีที่แล้ว ⁺²
There are some parts of programs that may require more capability to be written in an abstract way because performances are less critical and the modularity (for fast coding) become more important like for the GUI parts or some other things like that. But as you told that you want to do a language that address perfectly the 85% of what is critical in game dev, maybe the iterators aren't really necessary. After that obviously if there is no penalty it can be interesting for some part of programs.
I saw a little of the conversation with Evan Teran, so I want to share an experience.
I used to use stl, Qt, boost (geometry only),... but before leaving my last job I rewrote a 3D engine that was programmed in OOP (with inheritance, RAI,... like OGRE I think), I broke that completely. The performance win was pretty impressive especially on mobile (> x3), putting all renderables objects in separated arrays depending on their types allowed to do the same amount of operations much faster (sorting,...). The resulting wasn't only faster (which was the main goal), but also more concise (1/3 of code maybe), some weird sorting issues solved and I was able to give the flame to a coworker pretty fast too. Sadly I let some other areas of improvement, data structures aren't still optimal and some computations aren't parallelized as it could be.
I learned from this that for this kind of things data oriented code style is much more comprehensive than OOP, because before this rewrite I spent days to try to fix some weird bugs I finally addressed easily (minutes).
So now I am lesser interested in language like D, and more by data oriented programming (and so jai), because I realize that all applications that I work on should take care much more about performances, not only at algorithmic complexity level but also at the hardware one.
PS: As user I have the impression that I see more and more slow/crappy applications, and I tend to ask me if there is too much programmers that don't care enough or if the issue come from tools including the programming languages.
@EvanTeran 8 ปีที่แล้ว ⁺⁴
Your points about the c++ algorithms library are well taken. But an aspect that I feel you glossed over is that they all have **very** specific algorithmic complexity requirements. So while these algorithms don't care very much about the underlying data structures, you know exactly what the performance of them will be (well they can be better than required I suppose).
In many of the cases, the algorithms are specified to literally perform with the best performance known for that operation. As in, there is no known implementation that has a lower algorithm complexity. So the notion is "abstraction leading to poor performance" while true for many things, isn't really the case with these algorithms. (I do welcome counter examples if you disagree...).
In fact, the algorithms library offers implementer an opportunity to do better than what you could do using more conventional language constructs. For a trivial example, if you were to write code to swap some variables, you'd probably have a temporary, and 3 copies and you're done. Not too bad, but your CPU may have features to do even better. A library implementation could choose to have a specialized version of std::swap such that if you are swapping two word sized integers, and the compiler is able to ensure they will be in registers, then it may choose to implement a single xchg instruction.
This is possible because by having the minor abstraction of std::swap, we are able to express to the compiler itself that we aren't just moving data around, but that we specifically intend to swap 2 values. As a bonus, it's shorter to write and is more self documenting.
@EvanTeran 8 ปีที่แล้ว
As a bonus, en.cppreference.com/w/ has much more accurate documentation that cplusplus.com :-)
@jblow888 8 ปีที่แล้ว ⁺⁴
I don't know what the performance of them will be, because I don't care, because I don't ever touch that stuff at all.
As for all the rhetoric you are giving me about how increased abstraction can make your code run faster ... it is all nutty, or at the very best some kind of wishful thinking; this kind of approach has not ever worked in actual reality. When people in my industry want swap to go extra-fast, they specify exactly what happens using the features of that particular CPU, etc. This is always how you optimize -- by specifying exactly what happens. This is necessary because you know more about what your program is supposed to be doing than the compiler does.
@EvanTeran 8 ปีที่แล้ว ⁺³
> I don't know what the performance of them will be, because I don't care, because I don't ever touch that stuff at all.
Well, the point is that if a developer does use them, it isn't some black box that has wildly unpredictable performance. That would make it unusable. The point is that you CAN use it and if you do, you will know the exact performance it will have on an algorithmic level. If you choose not to use something, that's not a fault of the thing in itself. I doubt that you'd be able to find any algorithm in there that could implemented more efficiently (if you can, like I said, examples are welcome).
> As for all the rhetoric you are giving me about how increased abstraction can make your code run faster ... it is all nutty, or at the very best some kind of wishful thinking; this kind
> of approach has not ever worked in actual reality. When people in my industry want swap to go extra-fast, they specify exactly what happens using the features of that particular
> CPU, etc. This is always how you optimize -- by specifying exactly what happens. This is necessary because you know more about what your program is supposed to be doing
> than the compiler does.
It's not rhetoric or wishful thinking, It's expressiveness and frankly experience. I never said the abstractions can "make your code run faster", I said that they allow you to express intent, allowing the library to choose a more specialized implementation.
You are right that the compiler can't know *exactly* what you want. That's also my point :-). By breaking down common algorithms into small reusable functions, we are able to express to the library intent. That intent is a means to specify nearly "exactly what happens" without having to go quite as deep as assembly. Because efficiency of productivity matters too. If we can have that and still have the same exact algorithmic complexity, that's a win overall.
There exists a narrow middle ground between "so abstract it can do everything... but slowly" and "so exact and specific it does one thing as efficiently as feasible in one particular circumstance". That's the area that the algorithms library aims for.
But I think you may be missing why the algorithms library exists, or more specifically what it is trying to achieve. The algorithms are largely simple (2-4 lines long on average), and have rigorously defined "best solutions". The goal of the algorithms library is "these are **solved problems**", let the developer focus on other things besides implementing this algorithm for the millionth time!
I mean, suppose for some reason you need to reverse a list, which is more clear?
std::reverse(begin(l), end(l));
// i'd prefer std::reverse(l), but nothing's perfect
or
// reverse the list
auto first = begin(l);
auto last = end(l);
while ((first != last) && (first != --last)) {
T temp = *first;
*first = *last;
*last = temp;
++first;
}
No one is going to wonder what the first one does, it doesn't even need a comment. And they have EXACTLY the same performance, probably literally the same machine code. The reality is, both implementations are O(n), so you simply CAN'T do any better, you must visit each element once, it's the best you can do. So why should anyone have to implement this manually themselves?
@jblow888 8 ปีที่แล้ว ⁺⁴
I am not going to continue this argument, because I have actual programming to do. But I will say that if you are making statements like "both are O(n), so you CAN'T do any better", then sorry, I am not going to trust you to optimize anything. What academics call 'constant factors' are in reality very important, and serious programmers spend a large amount of their optimization time reducing constant factors.
@EvanTeran 8 ปีที่แล้ว ⁺⁹
Considering that I used the words "algorithmic complexity" several times in what I said, I feel that it is abundantly clear that when I said "both are O(n), so you CAN'T do any better", I was speaking in terms of ... algorithmic complexity. If you feel that you can reverse a list in less than O(n), demonstrate it, and I will support your Nobel Prize in computer science. So, yes, I am aware of constant factors, of course they matter, but if things like reversing a list is where the code spends most of its time, then that a sign of bad architecture more than anything else.
I have provided examples to demonstrate my case, so It's REALLY easy to prove me wrong. Just pick an algorithm found in that header and provide an implementation that beats it for a non-trivial, non-contrived data set. Considering that the vast majority of these algorithms are very simple, it should be very simple to do so if you are right... But honestly, by your own admission, you have no knowledge of the implementation details of these algorithms since you never use them and "don't care" about how they perform. Seems a bit presumptuous to make any assertions about the efficiency they may or may not have.
Also, you keep using phrases like "serious programmers", which to me seems to imply that you believe what others do (perhaps me) is not "serious" programming. If that is what you are trying to imply, it won't help your argument. I can't speak for others who share my opinions, but I've been writing compilers, emulators, operating systems and other highly performance sensitive software for over 15 years. Most would consider that "serious" as well.
@NeXtdra42 8 ปีที่แล้ว ⁺¹
c# has a yield keyword, but it works very different from what you described.
Yield only works on functions returning an Iterator type and the compiler then creates the iterator implementation for you. This would make it possible to still use the complex long form with 3+ functions when you want, but also let people write the really short syntax (for b: buckets for b.elements yield it or something like that) when the control isn't necessary
@blenderpanzi 8 ปีที่แล้ว ⁺²
That combined is 1:1 Python's itertools.chain: docs.python.org/2/library/itertools.html#itertools.chain
The implementation of that function is 4 LOC:
def chain(*iterables):
for it in iterables:
for element in it:
yield element
@yyny0 3 ปีที่แล้ว
`yield` statements are very hard to implement in compiled languages (let alone efficiently), although if they *were* to be properly implemented, then they could easily provide a practical replacement for what Jonathan is proposing here.
@blenderpanzi 3 ปีที่แล้ว
@@yyny0 I guess generator functions need to be translated into state machines when compiled. While I imagine that's not too simple, there are languages that do that. C# has it, and I think Rust has it as experimental. But yes, seems to be the exception in (even JIT) compiled statically typed languages.
@RobinsonFarrar 8 ปีที่แล้ว ⁺¹
Before this talk I was thinking about more control flow options aside from break, continue, and return. Goto is the most flexible but therefore also problematic. But it'd be nice if you could mark blocks of code that could be broken out of by name, or their depth, or something of the like.
So
for my_array {
print(it);
if (it == 5) break;
}
with
for bucket : array.all_buckets {
for * bucket.data {
if !bucket.ocupied[it_index] continue;
visit(it);
}
}
becomes a named block that you can break out of, in a way that retains the meaning of return, or you replace return with something else that breaks out of the parent's context as well
for_nest_5 :: {
for bucket : array.all_buckets {
for * bucket.data {
if !bucket.ocupied[it_index] continue;
print(it);
if (it == 5) break for_nest_5;
}
}
}
@RobinsonFarrar 8 ปีที่แล้ว
and I see that you already mentioned that for loops are labeled
@AinurEru 8 ปีที่แล้ว
Watched a few related talks lately, about co-routines and iteration:
1. await 2.0: Stackless Resumable Functions
2. Awaiting for the ranges
3. From Iterators to Ranges
C# does code-rewrite and state-machine stuff, which both have their issues
stackless-co-rutines look extremely-promising as super-low-overhead solution for generators and generator-based iteration - that simplifies stuff at the 'language-level' in ways that would otherwise be impossible.
I recommend checking-out these talks for some ideas :)
@jamiegx 8 ปีที่แล้ว ⁺³
Might be worth exploring what you would need to implement something like this purely with your metaprogramming features.
@jblow888 8 ปีที่แล้ว
+James Gray (james4k) You already could just do it... well, there might be some minor inconveniences.
@blenderpanzi 8 ปีที่แล้ว ⁺¹
It feels to me the visit thing is kinda like yield in Ruby behaves, but with inlining of the body. Ruby passes closures for each loop and executes that "function object" at the yield statement. So it does not do the fiber/state machine transformation Python and C# do.
@NikkyAI 8 ปีที่แล้ว ⁺¹
wouldnt maybe viewing 'for x { ... }' as closure or capture help?
and then passing a that to the iterator (or visitor?) , which is then inlining it and you have the same result
you would have to make sure that break is expanded to always refer to the current/correct one if its not defined
i could have mixed up things here but that seems to me (non professional programmer) as something that might just work
@DeGuerre 8 ปีที่แล้ว ⁺¹
Just so we're clear, I was definitely corroborating your point on the STL algorithms library. Actually, I just noticed that sort is in there too, so I guess I've used that more than I've used binary search. See, that's how much I care about the STL algorithms library. I didn't even realise that this was where std::sort lived.
You're spot on that it's almost always one of those "I just need to get this done now"-type situations. I've used std::sort a couple of dozen times and a binary search algorithm maybe five times total over the course of 15 years. That's it. Everything else in there is either more conveniently expressed with a for-loop, or so absurdly special-purpose that it's almost certainly a mistake to throw an allegedly general-purpose algorithm at it.
@blenderpanzi 8 ปีที่แล้ว ⁺¹
I suspect that lambdas will be used now for a lot of things was meant to be used. Not sure. I never used either (I'm not a professional C++ developer anyway).
@blenderpanzi 8 ปีที่แล้ว ⁺¹
Oh, wait, I used std::sort(). That's in .
@AinurEru 8 ปีที่แล้ว
An BTW, Python already has almost exactly the kind of thing you're talking about wanting to have:
You simply implement a method called __iter__ in your class, which you could than implement as a generator using 'yield', and then instances of this class automatically-become usable with the built-in for-loop...
For example, if you run this code (even in a default python-interpreter console - try it out):
class Bucket(object):
def __init__(self, data, index_x, index_y):
self.data = data
self.index_x = index_x
self.index_y = index_y
class Bucket_Array(object):
def __init__(self, buckets):
self.buckets = buckets
def __iter__(self):
for bucket in self.buckets:
if bucket.data:
yield bucket
buckets = [Bucket(True, 1, 1), Bucket(False, 2, 2), Bucket(True,3,3)]
bucket_array = Bucket_Array(buckets)
for bucket in bucket_array:
print bucket.data, bucket.index_x, bucket.index_y
You'll get the following output:
True, 1, 1
True, 3, 3
The iteration with the for-loop dynamically created a generator, and did a next() on it, which ran it until the yield-statement, which caused it to yield only buckets with a truthy '.data' attribute in the 'for loop' iteration
@clankill3r 8 ปีที่แล้ว ⁺¹
But you can't remove.
@nameguy101 7 ปีที่แล้ว
Python's ___syntax___ for feature-specific functions is awful. It looks like an implementation-defined macro.
@nameguy101 7 ปีที่แล้ว
Since $ seems to mean "Know what I am at compile time", perhaps you could write (array: $.. [] $T) to mean that the length is known.
Then you could apply that to other vararg procedures similarly to the $$ feature (e.g. now that you know the length, you can safely stack-allocate something that used to be on the heap), and the compiler would tell you if you try to use bar: [] Any; foo(..bar); to spread arguments at runtime.
@jmricketts 8 ปีที่แล้ว
If visit(it) is just going to be a placeholder, why not use notes instead to make it clear that this is not a standard function? You could also use them to define how 'remove' and 'break' are implemented.
#iterator :: (array: *$T/Bucket_Array) {
for bucket: array.all_buckets {
for * bucket.data {
if !bucket.occupied[it_index] continue;
#loop_body_goes_here
}
}
#remove {
bucket_array_remove(array, it);
}
#break {
break bucket;
// Or maybe you set a flag or something.
}
}
There are still problems with this method though. What if the user uses a variable name defined in the iterator?
for bucket_array {
bucket := 0;
do_something(it);
bucket = bucket + 1;
}
Or what if the user wants to rename the iterator variable?
for bucket_item: bucket_array {
do_something(bucket_item);
}
@josbar 8 ปีที่แล้ว
Hi Jonathan, have you ever considered implementing some sort of "Unscoped procedures" in your language?
It would be a very powerful (and potentially dangerous) feature that could help you to achieve transparent generators mimicking the unrolled for-loops semantics at all levels. And also, it would be useful in so many other contexts, for instance error handling. The syntax of such a mechanism should be very explicit at caller-side in my opinion.
@jblow888 8 ปีที่แล้ว
You mean procedures with dynamic scope? Or what?
I don't find the idea of dynamic scope very useful, but I am playing around with macro-like ways of composing procedures, very vaguely.
@josbar 8 ปีที่แล้ว
No, not dynamic scope at all.
The idea is indeed very similar to a macro, because it would be just having a way to call a procedure where statements such as return, break or continue inside that procedure can affect the caller flow directly. So, using examples from your video, when you pass a 'visit' to 'iterate' and you call it from within the for-loop as an 'unscoped' procedure, if that 'visit' call breaks in its top lexical scope, that break would affect the for-loop at caller-side.
The main difference with a macro is that, as I believe, a macro would happen in a different compiler phase, and could not count with the same features the procedures already have. Also, I'm thinking that in order to make things more explicit at caller-side, it would be a help if we can declare at caller side what kind of control sentences are we delegating to the callee. So, on 'iterate' could be good idea to call 'visit(it);' with "visit#unscoped(it);" or even "visit#unscoped[return,break,continue](it);".
I just made the name "unscoped" procedures because it is like inlining a procedure at caller side without adding the lexical scope that should have any regular inlined procedure, probably is not a good name for such a thing.
@jankow3987 8 ปีที่แล้ว ⁺²
Did you rewrote the Compiler in Jai itself? I miss the old Compiler programming streams in C++, hope there is going to be something new in future :-)
@kaustubhken 5 ปีที่แล้ว
Jan Kow no he did not you can check his twitch stream channel is naysayer88
@Manquia 8 ปีที่แล้ว
Hey Jonathan,
Maybe a possible solution is to add a MetaType to the language which at compile time is expanded in a smart macro-ish way.
I am generally not a fan of macros because they aren't very flexible and do NOT think they should have a place in Jai. I do think however a way to have a strongly typed expandable code may be an option.
This would be in some ways an expansion of your template system and assuming your put it into the generated strings file shouldn't be hard to debug when types become messy.
@theb1rd 8 ปีที่แล้ว
The only functions I ever use from are swap, min, and max.
@shavais33 4 ปีที่แล้ว
In several places I've worked, I've encountered a lot of pressure to use things like stuff from where ever humanly possible (or even "inhumanly", sometimes, it seemed to me) because they want you to "stick with the BKMs." (Best Known Methods.) It's been extremely annoying to me. /facepalm annoying, at times. If you've managed to avoid that pressure, I would call you kind of sheltered and blessed. A few times I've worked with people who actually took the time to examine what I did and think about it critically and realize that it was both better for the purpose and easier to grok than their BKM. A very few times, they were actually willing to let my work stand despite the need to do what everyone else is already familiar with.
In Python and C# they have "yield" and "yield return" statements which facilitate the convenient implementation of iterators and generators. I wonder if implementing such a statement in Jai could done in a way that isn't too complex and would still be fast and type safe. Yield return is certainly type safe in C#. But it might be pretty complicated behind the scenes.
Oh later on the yield construct is mentioned. In Python and C#, when you call a generator function, it returns an object which you then use the "MoveNext" function of and then examine the "Current" value of the object. The MoveNext function returns true on success or false if there is no next. But the C# "foreach" statement is designed to work with "IEnumerators" so that calling MoveNext and breaking (exiting the loop) when it returns false and marshalling Current into what we're calling "it" is all done for you.
So there really won't be any basic "for (init, comparison, increment) { statements }" construct? Sure, the same thing can be accomplished with other types of loops, but it's awfully commonly encountered and it's a little more succinct than the same thing done with while. I suppose it's sometimes abused..
@skepticmoderate5790 4 ปีที่แล้ว
Rust has yield now. I'm sure there's is fine.
@clankill3r 8 ปีที่แล้ว
Before I disturb jonathan, could someone tell me if this is considered syntax or not; I have an idea of how chaining functions could be implemented.
@nameguy101 8 ปีที่แล้ว
Well it depends on how you are defining implementation, doesn't it?
@skepticmoderate5790 4 ปีที่แล้ว
He should just have |> (or some equivalent infix operator) like in F#.
@walter0bz 7 ปีที่แล้ว
I liked rust when it had it's original do notation , making the internal iteration style look very natural .. do something.foeach |x|{....} i like internal iteration because it's more natural to parallelise
@iceX33 8 ปีที่แล้ว
I am only on 30:00 but I want to share my thoughts already because I have to do something else and I might forget what I wanted to say now :)
As far as I know, Scala wins the contest of most „comprehensive“ for statements in programming languages.
It does:
- simple iteration
- guards
- midstream variable binding
- and yield, where yield is used not in a sense of a co-routine but more as a map(_for_ is an expression in Scala and not a statement).
(www.scala-lang.org/old/node/111)
Scala Iterators (docs.scala-lang.org/overviews/collections/iterators.html)
As far as I know, internally Scala uses map, flatMap and filter, methods defined on the data structure. This is also why people suggest to write performance critical loops as _while_ statements (at least in 2012, when I was looking into it)
Scala and Swift has an interesting syntactic sugar which also might be interesting for this topic. If last parameter of procedure is a procedure it can be written outside of the parenthesis:
myFor(list){
print($0);
}
this looks very similar to the standard for loop, and helps to design nicely readable APIs.
To the algorithm part, as was mentioned here LINQ is the culmination of this concept. It makes complex easy to write, which IMHO is not a good thing in performance critical applications.
I can recommend following talk, which outlines some performance relevant topics about LINQ and other things:
www.infoq.com/presentations/data-structure-lists
The Linq optimiser from the talk is basically meta programming that can be done with Jai. The question is would we like to have some syntactic sugar and should it be as crazy as the one in Scala :)
Btw. "midstream variable binding" is an elegant way of solving the nested _for_ loops, which were mentioned in the video.
@helmesjo 8 ปีที่แล้ว ⁺¹
Really enjoy your videos, but please, get a proper mic! :D

ต่อไป

เล่นอัตโนมัติ