Among all the CUDA videos I've watched, this one made the most sense to me.
true
Amazing info! Love the way the data flow and execution is explained!
Cheers mate! Always love a good programming lecture. :)
Great lecture, thanks for sharing! Thanks also for the interesting piece of history on how the concept of a "bug" came to be.
Great tutorial. Thank you !
It's like an impossible amount of computing power! Beautiful beast!
Amazing lecture. Helped me a loooooot for my final exam. Thank u soooo much. ❤️❤️❤️
Great tutorial! Thank you so much!
Great Lecture! Very helpful!
Thank you so much for the video! Quite helpful. Appreciate it :D
Very neat! Thank you!
Excellent introduction! Thanks!
Amazing !!
Hi Tom, at 16:36, on line 19, you should change "float(i);" to "(float) i;". I'm assuming you're trying to cast the integer value to a floating-point type.
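For what it's worth, CUDA source built with nvcc is compiled as C++, where the function-style cast float(i) is valid and, for an int, means the same as the C-style (float) i. So the line at 16:36 may not need changing (this assumes line 19 really is the int-to-float conversion described above; in plain C only the (float) i form compiles). A minimal sketch with made-up helper names:

```cuda
#include <cuda_runtime.h>

// Hypothetical helpers, not from the lecture: three spellings of the
// same int -> float conversion. All are valid in CUDA C++.
__host__ __device__ inline float cast_function_style(int i) { return float(i); }
__host__ __device__ inline float cast_c_style(int i)        { return (float) i; }
__host__ __device__ inline float cast_static(int i)         { return static_cast<float>(i); }
```

All three helpers return the same value for the same input, so the choice between them is a matter of style.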
Why did you need "float f" at 30:00? Why didn't you combine everything into one line: "d_out[idx] = d_in[threadIdx.x] * d_in[threadIdx.x]"? Is there a penalty for reading the thread index multiple times, or did you do it just for clarity and to explain how the code works?
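On the "float f" question: threadIdx.x is a built-in variable and reading it more than once is perfectly valid, and compilers typically keep such values in registers either way. Whether the temporary was there purely for clarity is only a guess. The sketch below (kernel and helper names are made up, and a single block is assumed) puts both versions side by side; they should produce identical results.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Version with a temporary, in the style the question describes.
__global__ void square_with_temp(float *d_out, const float *d_in) {
    int idx = threadIdx.x;
    float f = d_in[idx];
    d_out[idx] = f * f;
}

// One-line version that reads threadIdx.x several times.
__global__ void square_one_line(float *d_out, const float *d_in) {
    d_out[threadIdx.x] = d_in[threadIdx.x] * d_in[threadIdx.x];
}

// Hypothetical host helper: runs one of the two kernels on a single block.
// Assumes h_in.size() fits in one block (at most 1024 threads on current GPUs).
// Error checking is omitted to keep the sketch short.
std::vector<float> run_square(const std::vector<float> &h_in, bool use_temp) {
    std::vector<float> h_out(h_in.size());
    if (h_in.empty()) return h_out;
    size_t bytes = h_in.size() * sizeof(float);
    int n = static_cast<int>(h_in.size());

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    if (use_temp) {
        square_with_temp<<<1, n>>>(d_out, d_in);
    } else {
        square_one_line<<<1, n>>>(d_out, d_in);
    }

    // cudaMemcpy waits for the kernel to finish before copying back.
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```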
How do you ensure that the thread ID does not go out of bounds of the array? I could have 1000 threads, right, but only 60 elements in the array to square.
You pass the array size as the thread count when launching the kernel, e.g. square<<<1, arraySize>>>, which ensures only 64 threads are created.
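Matching the launch size to the array is one half of the answer. A common complementary pattern is to pass the element count to the kernel and have each thread check its index, so that launching more threads than elements is harmless. A sketch, assuming a hypothetical kernel name square_guarded and an extra parameter n that the lecture's kernel may not have:

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread squares at most one element; threads whose index is
// past the end of the array do nothing.
__global__ void square_guarded(float *d_out, const float *d_in, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        float f = d_in[idx];
        d_out[idx] = f * f;
    }
}

// Hypothetical host helper. Error checking is omitted for brevity.
std::vector<float> square_on_gpu(const std::vector<float> &h_in,
                                 int threads_per_block = 128) {
    std::vector<float> h_out(h_in.size());
    if (h_in.empty()) return h_out;
    int n = static_cast<int>(h_in.size());
    size_t bytes = h_in.size() * sizeof(float);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    // Round up so every element gets a thread; the surplus threads in
    // the last block are filtered out by the idx < n check.
    int blocks = (n + threads_per_block - 1) / threads_per_block;
    square_guarded<<<blocks, threads_per_block>>>(d_out, d_in, n);

    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}
```

With 60 elements and 128 threads per block, one block is launched and the 68 surplus threads simply skip the write.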
Could you have squared the d_in array in place? So d_in[idx] = d_in[idx] * d_in[idx]
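Squaring in place should be fine here: each thread reads and writes only its own element, so no thread depends on a value that another thread may already have overwritten. The trade-off is that the original input is gone afterwards. A minimal sketch (square_in_place and the host helper are illustrative names, and a single block is assumed):

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread overwrites its own element with its square.
__global__ void square_in_place(float *d_data) {
    int idx = threadIdx.x;
    d_data[idx] = d_data[idx] * d_data[idx];
}

// Hypothetical host helper: one device buffer serves as input and output.
// Assumes h_data.size() fits in one block (at most 1024 threads on current GPUs).
// Error checking is omitted for brevity.
std::vector<float> square_in_place_on_gpu(std::vector<float> h_data) {
    if (h_data.empty()) return h_data;
    int n = static_cast<int>(h_data.size());
    size_t bytes = h_data.size() * sizeof(float);

    float *d_data = nullptr;
    cudaMalloc(&d_data, bytes);
    cudaMemcpy(d_data, h_data.data(), bytes, cudaMemcpyHostToDevice);

    square_in_place<<<1, n>>>(d_data);

    cudaMemcpy(h_data.data(), d_data, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return h_data;
}
```

This also saves one device allocation compared with separate d_in and d_out buffers.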
nice boy
Can you tell me what threads are? I'm new to the GPU world 😁
15:20 Single Instruction Multiple Threads
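To make "thread" concrete: a kernel is one function that the GPU runs many times in parallel, and each running copy is a thread that can find its own position through the built-in blockIdx, blockDim and threadIdx variables. A small sketch with made-up names, in which every thread records its own global index:

```cuda
#include <cuda_runtime.h>
#include <vector>

// Every thread runs this same function; only the built-in index
// variables differ from one thread to the next.
__global__ void record_thread_ids(int *d_ids) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    d_ids[idx] = idx;
}

// Hypothetical host helper: launches blocks * threads_per_block threads
// and returns what each one wrote. Error checking is omitted for brevity.
std::vector<int> collect_thread_ids(int blocks, int threads_per_block) {
    int total = blocks * threads_per_block;
    std::vector<int> h_ids(total);
    if (total <= 0) return std::vector<int>();
    size_t bytes = h_ids.size() * sizeof(int);

    int *d_ids = nullptr;
    cudaMalloc(&d_ids, bytes);

    record_thread_ids<<<blocks, threads_per_block>>>(d_ids);

    cudaMemcpy(h_ids.data(), d_ids, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_ids);
    return h_ids;
}
```

Calling collect_thread_ids(2, 4) launches 8 threads in 2 blocks of 4 and returns the values 0 through 7, one written by each thread.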
You could add timestamps
Great explanation! Thy
*thx not thy