Thank you so much for responding to my request for a CUDA programming video. I have donated 0.1 BTC to your account as a way to thank you. My professor has spent so many hours trying to explain CUDA and none of my classmates really understood. I just cannot believe that you do all this for free, and that is why my classmates and I have decided to collect some funds to donate to you.
Thanks for all that you do and please keep going.
Thank you for the donation, it really means a lot !
@AhmadBazzi No, thank you!
Wow amazing
You just opened my eyes to parallel programming. Thanks for the quick overview.
Too hard to find high-quality content like this these days. Thank you so much
That was very well explained. I have only taken one course, and you made it clearer than my professor or fellow students ever did.
12:36 This guy is a God !
very nice
So beautiful
Thank you so much. Probably the best introduction to CUDA with Python. The example you use, while very basic, touches on usage of blocks, which is usually omitted in other introduction-level tutorials. Great stuff! Hope you return with some more videos. I have subscribed!
Excellent
this was such an excellent video
Just did my research and this guy is at one of the most prestigious universities in the world ! No wonder why his lectures come up neat !
I have been looking into GPU programming using Numba and Python for a while; this seems to be the best tutorial I was able to find so far... thank you
As a data scientist with 2+ years of experience, I ALWAYS learn something new from your content! Please Nich, never stop doing these things, and also, never lose the smile on your face, even if you are having bugs!!
Hey this is super useful! I elected High Performance Computing and Microprocessors and Embedded Systems modules for my degree, and this channel has become my go-to guide.
Thank you so much for this series! It's so clear and easy to follow
Love the channel Nicholas, have recently graduated from an NLP Master's degree and seeing you explain stuff in a simpler way and your coding challenges is really helping me connect with the material I've learned! Keep it up and I'll keep watching!
Ahmad, thanks for taking the time to create these videos. It is unfortunate that people view your videos and then feel inspired to complain about a free gift. Folks could just keep it moving or add helpful insights.
This is the best introduction to CUDA I've seen, thanks a lot !
wanted to comment that the information in this presentation is very well structured and the flow is excellent.
Fantastic tutorials on CUDA. You deserve more followers.
You saved me, I had to read the PointNet2 implementation for my BCS thesis. This made the job much easier!
Hey Ahmad, I love watching your videos because of the way you tell the story. Great graphics mate. Love the reference to rocket man too... lol keep up the good work.
This was by far one of the most enlightening videos you have put up on your channel. Thanks and keep up the good work!!
LOL. Loved the graphic at 6:23! Brought tears to my eyes.
holy shit, i was looking into this to speed up my mandelbrot-zooms and they are what you use as an example! This is a dream come true!
Perfect video! It was so revealing for me to understand how it works. Thank you! I am a new subscriber to your channel. Regards from Buenos Aires, Argentina
This was oddly intense. Great job Nicholas! Even though you ran out of time, this video is still a win to me. 😉
what a passionate tutorial! I wish you were my professor for my parallel programming course. Well done!
If your lectures were a neural network, they’d have zero overfitting: always accurate and efficient!
And that's what I call a great tutorial. Thank you, sir. I hope you make more tutorials.
Woah congrats @Ally 🎊 🎉 glad you’re enjoying the challenges, plenty more to come!!
I feel like Cuda has been demystified. Very glad I found your series.
Oh Ahmad, your tutorials are incredible and inspiring...
Very well explained. The best CUDA explanation I have come across up till now 😊😊. Keep up the spirits sir.👍👍
Too hard to find high-quality content like this these days. ⚡
Wow It is really awesome! It is much better than a tutorial from university! Thanks!
Great video, I like this kind of video where you code some AI task against the clock, teach us the concepts, and show us the reality of implementing it 👏
the essence of Deep learning in a few lines of code... awesome
You are a lifesaver @Spencer, will do it next time i'm on the streaming rig!
Ohh, yes, thank you, and the documentation on the NVIDIA site about CUDA is highly professionally written. Thank you.
Thank you so very much. This is the exact kind of material I was looking for on this very specific subject. Kudos.
Your teaching is so engaging, I almost forgot my code is stuck in an infinite loop!
Awesome video!! It's pretty cool to see such theoretical concepts coded and explained like this. Keep going Nich!!
OHHHH MANNN, I thought about doing that but I was debating whether I'd hit the 15 minute deadline already. Good suggestion @Julian!
Thanks for the video, I found the first half and the wrap up really excellent.
this is extremely helpful. you did an amazing job explaining the foundations
This is very helpful. Most people don't realize the overheads and code refactoring necessary to take advantage of GPUs. I am going to refactor a simple MNIST training program I have which currently uses only NumPy, to see if I can get meaningful improvements in training time.
Thanks for making all these topics very approachable!
I was needing this!!! Thanks a lot, Sir!!!!
Excellent example of vector addition using a for loop and using CUDA
Thank you so much for this video. It has helped me massively to prepare for my computer science exam.
I have no idea what kind of videos i am watching ... but i sure will learn
I'm doing an internship in a research lab and I'll have to program some kernels to implement BLAS primitives, this video really helps :)
It's very informative and a good intro to CUDA programming. Thanks very much!
This was a great video to me, I have very limited C++ experience and was looking for an explanation of CUDA. Another video like this could easily have been 70-80% over my head. This one was only about 15% whoosh. And now I actually find C++ interesting again!
Thank you for this great introduction to numba and more specifically numba+cuda.
This is amazing! Thank you for taking effort to make it!
Thank you very much for this tutorial. I would love to have the code available, because typing it in myself from the video is a bit hard, especially with the autocomplete on all the time. Keep up the good work.
The video was very helpful for me. Many thanks to the author for developing his audience with interesting and useful content
Can't wait to see Juan's better tutorial that he's definitely going to release :') lmao. Great video, Ahmad.
Love your videos. Please don't stop!
Amazing! I'm learning so much watching you code. Thank you for sharing.
An insanely underrated series!!!
Once you initialized lr to 0.0, I knew you were going to forget to change it lol. Love the challenges tho, keep doing them, I think it would be cool to see how you implement a neural network from scratch
Well just built a new rig with a 980ti and a 4790k so I'm gonna put that to test. Thank you for your wonderful explanation :D
Ayyyy, so glad you like it @Patrick. For the last two weeks I've just been making videos on stuff I find hard or want to get my head around. I figure it's not just me staring there at some of these concepts like huh?!? Thanks for checking it out!!
I like how you did the website for documenting the video notes for reference later
Hey, thanks for the explanation! Very well done 👍 I am downloading CUDA 💪
PS. I was really so moved by your stock price episode. Thank you so sosososo much.
This was really good. Thanks for posting this!
Love your videos bro! Time to put down that redbull though lol just kidding happy holidays!
You and Corey Schafer are my best professors
i need to say this: you are the gamechanger here!!
Interesting, but two remarks:
Example 1: on my setup (3080 Ti, i7-8700K, running in WSL2 with Ubuntu 22.04), vector multiplication actually runs *faster* on the CPU (if you either use the vectorized formulation in MultiplyMyVectors with target "cpu" or simply a*b instead of the unnecessary for loop in the CPU code). IMO that is mostly due to the overhead of copying the data to GPU memory.
Example 2: to get a fair comparison, you should also use the JIT for FillArrayWithouGPU, decorating with @jit(target_backend="cpu"). Then GPU array filling is still faster, but only by a factor of 2.
Excellent explanation, keep going with this content man ;)
Great explanation! Fascinatingly clear
Thanks for the video, subscribed! A suggestion : this small change to your code would demonstrate a real-world gradient descent solution for linear regression with noisy data. E.g. :
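The snippet that followed this suggestion is not preserved here. As a rough illustration of the idea only, and not the commenter's actual code, gradient descent for linear regression on noisy data might be sketched as below. The function name `fit_line`, the ground-truth values, the noise level, and the hyperparameters are all assumptions made for the sketch.

```python
import random


def fit_line(xs, ys, lr=0.05, epochs=1000):
    """Fit y = w*x + b by batch gradient descent on mean squared error.

    Hypothetical helper for illustration; not taken from the video.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals of the current line against the data
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(error^2) with respect to w and b
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient; lr must be non-zero for anything to move
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic noisy data around an assumed ground-truth line y = 2x - 1
random.seed(0)
true_w, true_b = 2.0, -1.0
xs = [i / 10 for i in range(50)]
ys = [true_w * x + true_b + random.gauss(0.0, 0.3) for x in xs]

w, b = fit_line(xs, ys)
print(f"estimated w={w:.3f}, b={b:.3f} (assumed truth: w={true_w}, b={true_b})")
```

Because of the noise, the fitted values should land near the assumed ground truth without matching it exactly, which is what makes this closer to a real-world regression than a noise-free toy example.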
Ahmad, sorry for bothering you. The problem was that I had not installed the CUDA Toolkit. Seriously, I hate people who don't watch the full video closely and ask stupid questions... and now I'm one of them :D. Thanks a lot for this tutorial. In 2 months I will try to write my own GPU operator for my program; it would be interesting to see if it will be faster than the CPU. (Btw, using normal Visual Studio Code in a Python 3.10 env on Win 11, so far so good, although I have some code output delay problem when using OpenCV for some strange reason.)
This is really helpful for my computing. Thank you.
You are bloody watching a master at work xD
Ahmad, great video. You have a great way of explaining things and help a lot of people. IMO a lot of the criticism you get, such as here, is unfounded. By the way, I do not see any video postings by Juan??? I am trying to get my YouTube channel started and hope that in 10 years' time I will be 1/10th as good as Ahmad 👍.
Many thanks for the lucid explanation.
opened my eyes to parallel programming
Great talk, thank you ! Well structured and clear.
It's a great programming video, sir. Hoping the best for you
Sir, make more detailed sessions on CUDA, your explanation is great
HEYYYYY PHIL!! Long time no see, thanks a mil!!
Your videos are awesome , thanks a lot for this quality content :)
Absolutely lovely visuals!!!
Yes, you could do this by hand, which would be a great challenge in distributed computing. Another option is to use a framework/platform like AWS SageMaker to do distributed k-means. Most organizations will do this.
Awesome! learning never stops.
It is effectively a very easy approach to harness the power of CUDA in simple Python scripts.
Would love to see a video on a few CUDA programming challenges
YESSSS, right?! Glad you liked it Miguel!
It works on both AMD and NVIDIA. If you have CUDA code, you can convert it to HIP with their automated tool; there is very little CUDA-specific code that can't just be translated over.
Very nice tutorial. I really liked it. It's brief, to the point and very clear. Thanks. Could you please make a video for the same example but in Linux?
Thank you so much sir, you are an amazing human being !
Very clear, I loved it !
This reminds me a lot of the computer tutorial tapes from the 90s
glad to see you take it as a feedback and not as a hate comment
Damn you are such a great teacher dude.
So stoked you liked it 🙏
The Knowledge of Ahmad knows no bounds.