Among all the CUDA videos I've watched, this one made the most sense to me.
true
This is a very good video explanation of GPU computation.
It's like an impossible amount of computing power! Beautiful beast!
Amazing lecture. Helped me a loooooot for my final exam. Thank u soooo much. ❤️❤️❤️
Amazing info! Love the way the data flow and execution are explained!
best cuda explanation ever
Great lecture, thanks for sharing! Thanks also for the interesting piece of history on how the concept of a "bug" came to be.
Really enjoyed watching the vid. I've been learning computer architecture with nand2tetris and Digital Design and Computer Architecture by David Harris and Sarah Harris, and I'm so happy to be able to understand the concepts he was talking about in this vid. Anyway, thank you for the excellent, beginner-friendly content.
Great Lecture! Very helpful!
Excellent introduction! Thanks!
Great tutorial. Thank you!
Hi Tom, at 16:36, on line 19, you should change "float(i);" to "(float) i;". I'm assuming you're trying to cast the integer value to a floating-point data type.
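A note on the cast question: assuming the code in the video is compiled as CUDA C++ with nvcc, both spellings should be accepted and should give the same result, since `float(i)` is a C++ functional-style cast and `(float) i` is the C-style cast. The function names in this minimal sketch are hypothetical and only there to compare the two forms.

```cuda
// Hypothetical helpers (not from the video) comparing the two cast spellings.
// Both convert an int to a float; in C++ (and so in CUDA C++) they are equivalent.

// Functional-style cast, as written in the video.
__host__ __device__ inline float functional_cast(int i) {
    return float(i);
}

// C-style cast, as suggested in the comment above.
__host__ __device__ inline float c_style_cast(int i) {
    return (float) i;
}
```

In plain C only the second form would compile, which may be where the suggestion comes from.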
Cheers mate! Always love a good programming lecture. :)
15:20 Single Instruction Multiple Threads
Why did you need to use "float f" at time index 30:00? Why didn't you combine everything into one line: "d_out[idx] = d_in[threadIdx.x] * d_in[threadIdx.x]"? Is there a penalty for reading the thread index multiple times, or did you do it just for clarity and to explain how the code works?
Very neat! Thank you!
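For anyone wondering the same thing, here is a minimal sketch of both versions side by side. The kernel with the local variable follows what the question quotes from the video; the kernel names and the host helper `run_square` are hypothetical and only there to make the comparison runnable. As far as I know, reading `threadIdx.x` is cheap, and an optimizing compiler will most likely load `d_in[idx]` into a register once in either version, so the local variable is probably there for readability rather than speed.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Version with a local variable, as quoted in the question.
__global__ void square_with_local(float *d_out, const float *d_in) {
    int idx = threadIdx.x;
    float f = d_in[idx];   // one explicit load into a local
    d_out[idx] = f * f;
}

// One-line version proposed in the question.
__global__ void square_one_line(float *d_out, const float *d_in) {
    d_out[threadIdx.x] = d_in[threadIdx.x] * d_in[threadIdx.x];
}

// Hypothetical host helper: runs one of the two kernels on h_in and returns
// the result. Assumes 1 <= h_in.size() <= 1024 so a single block is enough.
std::vector<float> run_square(bool one_line, const std::vector<float> &h_in) {
    const int n = static_cast<int>(h_in.size());
    const size_t bytes = n * sizeof(float);
    std::vector<float> h_out(n);

    float *d_in = nullptr;
    float *d_out = nullptr;
    cudaMalloc((void **)&d_in, bytes);
    cudaMalloc((void **)&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    if (one_line) {
        square_one_line<<<1, n>>>(d_out, d_in);
    } else {
        square_with_local<<<1, n>>>(d_out, d_in);
    }

    // This copy waits for the kernel to finish before reading the result.
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

Calling `run_square(true, data)` and `run_square(false, data)` on the same input should return identical vectors.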
Could you have squared the d_in array in place? So: d_in[idx] = d_in[idx] * d_in[idx]
Can you tell me what threads are? I'm new to the GPU world 😁
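In-place squaring should work here, because each thread reads and writes only its own element, so no thread depends on a value another thread overwrites. A minimal sketch under that assumption follows; `square_in_place` and `run_square_in_place` are hypothetical names, not code from the video.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread squares its own element and writes it back to the same slot.
__global__ void square_in_place(float *d_data) {
    int idx = threadIdx.x;
    d_data[idx] = d_data[idx] * d_data[idx];
}

// Hypothetical host helper. Assumes 1 <= h_data.size() <= 1024 (one block).
std::vector<float> run_square_in_place(std::vector<float> h_data) {
    const int n = static_cast<int>(h_data.size());
    const size_t bytes = n * sizeof(float);

    float *d_data = nullptr;
    cudaMalloc((void **)&d_data, bytes);
    cudaMemcpy(d_data, h_data.data(), bytes, cudaMemcpyHostToDevice);

    square_in_place<<<1, n>>>(d_data);

    // Copy the squared values back over the host copy of the input.
    cudaMemcpy(h_data.data(), d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return h_data;
}
```

The trade-off is that the original input is gone from device memory afterwards, and only one device buffer needs to be allocated.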
nice boy
Thank you so much for the video! Quite helpful. Appreciate it :D
How do you ensure that the thread ID does not go out of bounds of the array? I could have 1000 threads, right? But only 60 elements in the array to square.
You pass the array size as the number of threads when you launch the kernel, e.g. square<<<1, arraySize>>>, which ensures only 64 threads are created.
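Matching the thread count to the array size works for the example in the video. When the launch may create more threads than there are elements (the 1000 threads for 60 elements case in the question), a common pattern is to pass the element count to the kernel and have each thread check its index first. This is a minimal sketch of that pattern; `square_guarded`, `run_square_guarded`, and the parameter `n` are hypothetical and not from the video.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Threads whose global index is past the end of the array do nothing.
__global__ void square_guarded(float *d_out, const float *d_in, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        d_out[idx] = d_in[idx] * d_in[idx];
    }
}

// Hypothetical host helper. Assumes h_in is non-empty and
// 1 <= threads_per_block <= 1024.
std::vector<float> run_square_guarded(const std::vector<float> &h_in,
                                      int threads_per_block) {
    const int n = static_cast<int>(h_in.size());
    const size_t bytes = n * sizeof(float);
    std::vector<float> h_out(n);

    float *d_in = nullptr;
    float *d_out = nullptr;
    cudaMalloc((void **)&d_in, bytes);
    cudaMalloc((void **)&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    // Round up so every element is covered; the last block may have spare threads.
    const int blocks = (n + threads_per_block - 1) / threads_per_block;
    square_guarded<<<blocks, threads_per_block>>>(d_out, d_in, n);

    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

With 60 elements and `threads_per_block` set to 1000, one block of 1000 threads is launched and the 940 extra threads fail the `idx < n` check, so they never touch memory.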
Amazing !!
You could add timestamps
Great explanation! Thy
*thx not thy
Great tutorial! Thank you so much!