An Introduction to GPU Programming with CUDA

  • Published Dec 7, 2024

Comments • 373

  • @itsr4yd946
    @itsr4yd946 5 years ago +93

    Typing 1<<20

    • @1kounter
      @1kounter 5 years ago +1

      can you explain please? I'm not too familiar with c++

    • @jeffreylebowski4927
      @jeffreylebowski4927 5 years ago +7

      @@1kounter 1<<20 shifts 1 left by 20 bits, i.e. 2^20 = 1,048,576, which is roughly a million.

    • @arcsine
      @arcsine 5 years ago +8

      people who do this have 99% chance of using the ternary operator whenever possible

    • @pcrizz
      @pcrizz 4 years ago +2

      @@jeffreylebowski4927 So it's not exactly a million elements? This is misleading.

    • @jeffreylebowski4927
      @jeffreylebowski4927 4 years ago +2

      @@pcrizz Yeah i agree, there was no need to make things more complicated and then not even be precise about it. - Its bad to do this as a teacher.

  • @simetry6477
    @simetry6477 7 years ago +9

    That was very well explained. I have only taken one course, and you made it clearer than my professor or fellow students ever did.

  • @emmanuelezenwere
    @emmanuelezenwere 5 years ago +24

    For a newbie in the field of parallel computing and GPU-accelerated computing like myself, thanks for taking the time to lay a strong foundation on how parallel computing works. It helped me correct a wrong assumption I had about GPUs.

  • @akuku5555
    @akuku5555 7 years ago +30

    "cuda" in Polish means miracles (we pronounce it a little differently though), so it's kind of funny, because a GPU seems like a little miracle

    • @TheRojo387
      @TheRojo387 3 years ago +1

      "Tsoo-dah", no doubt!

  • @adityakamireddypalli3453
    @adityakamireddypalli3453 7 years ago +3

    This was by far one of the most enlightening videos you have put up on your channel. Thanks and keep up the good work!!

  • @northeastmtb1575
    @northeastmtb1575 4 years ago +6

    Thanks for actually showing the speed increase with more threads, haven't seen anyone do this.

  • @samuelh5
    @samuelh5 5 years ago +2

    This was a great video to me, I have very limited C++ experience and was looking for an explanation of CUDA. Another video like this could easily have been 70-80% over my head. This one was only about 15% whoosh. And now I actually find C++ interesting again!

  • @ppcservices5175
    @ppcservices5175 7 years ago +3

    LOL. Loved the graphic at 6:23! Brought tears to my eyes.

  • @freediugh416
    @freediugh416 7 years ago

    OK, those animations plus your relevant narration are by far the best combination for learning. Loved it!

  • @jony7779
    @jony7779 5 years ago +25

    3:08 Actually, the nth Fibonacci number can be found in O(log n) using matrix exponentiation.

    • @christophostertag4669
      @christophostertag4669 4 years ago

      Yes. I wrote an article about it.
      towardsdatascience.com/fibonacci-linear-recurrences-and-eigendecomposition-in-python-54a909990c7c?source=friends_link&sk=6dbc6bcaf2b9551e108da037963aea33

    • @christophostertag4669
      @christophostertag4669 4 years ago +1

      And you actually do not need matrices, only the diagonal values from eigendecomposition.

    • @udaykiranreddysadula1065
      @udaykiranreddysadula1065 3 years ago

      It can be found in O(1). As far as I remember the formula is derived using LDU decomposition or Diagonalising a matrix, for matrix exponentiation.

  • @SwanandKulkarni2194
    @SwanandKulkarni2194 7 years ago +18

    Is 'Introduction to CUDA GPU Programming' a new series that is going to come up ?

    • @nands4410
      @nands4410 7 years ago +5

      Swanand Kulkarni I hope so

    • @SirajRaval
      @SirajRaval  7 years ago +7

      I wasn't planning on it, but this vid seems to be doing well, so I may consider it

  • @paulhansen5053
    @paulhansen5053 6 years ago +47

    Nice talk, but your history is wrong: the first graphics-on-a-chip was done by Silicon Graphics, Inc. in about 1985, with the "geometry engine" chip. SGI dominated the industry in the late '80s and '90s, much like Apple and Google do now. Nvidia was a later spin-off in the late '90s formed mostly by people from SGI.

    • @NUCLEARARMAMENT
      @NUCLEARARMAMENT 4 years ago +1

      Evans & Sutherland was doing hardware-accelerated, rasterized graphics in the '60s and '70s. SGI did pioneer the bringing of the technology to the general workstation markets in the '80s, though.

    • @paulhansen5053
      @paulhansen5053 4 years ago +1

      @@NUCLEARARMAMENT I kinda don't think so; I actually don't have time to check it out. Pretty sure that E&S in the '60s and '70s was all vector graphics.

    • @paulhansen5053
      @paulhansen5053 4 years ago +1

      @@NUCLEARARMAMENT "Raster graphics" on a chip may have been done earlier than SGI, but not fully 3D polygonal rasterization, which is the basis of modern 3D computer graphics.

    • @NUCLEARARMAMENT
      @NUCLEARARMAMENT 4 years ago +1

      @@paulhansen5053 Also, the CT5 simulator from 1981 may not count as being from the '70s or '60s, but from what I understand, the CT5 was capable of realtime, rasterized, 3D polygonal rendering and cost $20 million at the time. It used Gouraud shading, if memory serves. There were several other CT (continuous tone) simulators developed by E&S in the '70s that did something similar, or of much lower capability than the CT5 of '81. There were also the Digistar planetariums that date back to the early '80s, and the Picture System goes back to at least the early '80s. Might be vector or raster, not entirely sure myself, though.

  • @otonanoC
    @otonanoC 7 years ago

    The Knowledge of Siraj knows no bounds.

  • @tofani-pintudo
    @tofani-pintudo 7 years ago

    Video quality is awesome! Great job

  • @RealDrivernileproductions
    @RealDrivernileproductions 4 years ago

    Excellent presentation! Thanks for sharing this vital knowledge!

  • @MataioPoChing
    @MataioPoChing 7 years ago +17

    CUDAs to you, for covering... CUDA!

  • @pbally9457
    @pbally9457 2 years ago

    Thank you so much for this video. It has helped me massively to prepare for my computer science exam.

  • @ivanr7725
    @ivanr7725 7 years ago

    Love your videos. Please don't stop!

  • @srgkzy1294
    @srgkzy1294 7 years ago

    I have no idea what kind of videos I am watching ... but I sure will learn

  • @rakeshmanathana
    @rakeshmanathana 6 years ago

    Oh Siraj, your tutorials are incredible and inspiring....

  • @jiaming5269
    @jiaming5269 4 years ago +1

    Thank you for this video

  • @ther6989
    @ther6989 5 years ago +3

    I didn't know TES V was available for the NES platform. I wonder if it has mod support.

  • @daggawagga
    @daggawagga 7 years ago +2

    I was just starting to learn OpenGL (with the SuperBible) to get a feel for how GPUs work. So far it's been very fun. I had always assumed it was waaay more complicated to make anything work on a GPU, but that doesn't seem to be the case.

    • @Lucas_Simoni
      @Lucas_Simoni 2 years ago

      Lmao, I thought I had to write some sort of GPU assembly. But they don't even publish it; in practice, the GPU driver is what exposes what you can do with it, from what I read in a quick Google search.

  • @henrygagejr.-founderbuildg9199
    @henrygagejr.-founderbuildg9199 5 years ago

    Siraj, thanks for taking time to create these videos. It is unfortunate that people view your videos and then feel inspired to complain about a free gift. Folks could just keep it moving or add helpful insights.

  • @aidenstill7179
    @aidenstill7179 5 years ago +1

    What do I need to know to create my own deep learning framework like PyTorch? What are the books and courses to get knowledge for this?

  • @jackpeterson6295
    @jackpeterson6295 7 years ago +20

    You can calculate the nth fibonacci number without calculating the previous fibonacci numbers with its closed form.
    ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2005/readings/ln11.pdf (page 9)

    • @qwew2244
      @qwew2244 7 years ago +1

      Nice read, thanks.

    • @randyorton06
      @randyorton06 7 years ago +1

      siraj is noob

    • @TeutonTwin
      @TeutonTwin 7 years ago

      You beat me to it...
      Fib(n) = ( 1.6180339..^n - (-0.6180339..)^n ) / 2.236067977..

    • @zbaaby
      @zbaaby 7 years ago +3

      it's only an approximation

    • @SirajRaval
      @SirajRaval  7 years ago

      see the top answer here codereview.stackexchange.com/questions/163354/using-2-threads-to-compute-the-nth-fibonacci-number-with-memoization

  • @thesage1096
    @thesage1096 4 years ago +1

    I saw this in my recommended feed. The most I knew about GPUs was that you need one for the cool games. Now imagine me watching this video.

  • @VeiledVerities
    @VeiledVerities 4 years ago

    Just got a Jetson Nano - looking forward to learning more!!

  • @stomachcontentz
    @stomachcontentz 4 years ago +1

    Say what you want about Siraj's bad choices recently, the dude is a master teacher.

  • @NickEnchev
    @NickEnchev 5 years ago +1

    I like how you pointed at the vertex shader image when you said "CPU"

  • @Dsync
    @Dsync 7 years ago +2

    Hey Siraj, I love watching your videos because of the way you tell the story. Great graphics mate. Love the reference to rocket man too... lol keep up the good work.

  • @himanshupoddar1395
    @himanshupoddar1395 6 years ago +1

    Siraj, I'll be waiting for your reply...
    Can you please make a video on how to use a Google Colab notebook to train our machine learning project?

  • @Cyphlix
    @Cyphlix 6 years ago +1

    these memes increase production quality by 10x

  • @SandeepKumarX
    @SandeepKumarX 3 years ago

    Hi Siraj, can you guide me on what to install in order to execute CUDA code? I've been scratching my head for a long time trying to get the program to run successfully on Windows using Visual Studio.

  • @diegoantoniorosariopalomin4977
    @diegoantoniorosariopalomin4977 7 years ago +5

    I hope the Khronos Group goes through with its plan of making Vulkan a replacement for OpenCL

    • @DaveAxiom
      @DaveAxiom 7 years ago +1

      Vulkan is the successor of OpenGL.

    • @diegoantoniorosariopalomin4977
      @diegoantoniorosariopalomin4977 7 years ago +1

      Dave Axiom but since it got rid of most of that API's (OpenGL's) baggage, it also makes sense to use it for computation

    • @diegoantoniorosariopalomin4977
      @diegoantoniorosariopalomin4977 7 years ago +1

      Dave Axiom www.phoronix.com/scan.php?page=news_item&px=IWOCL-2017-OpenCL

    • @diegoantoniorosariopalomin4977
      @diegoantoniorosariopalomin4977 7 years ago +1

      Dave Axiom www.phoronix.com/scan.php?page=news_item&px=clspv-OpenCL-To-Vulkan

  • @radinkayorgova9407
    @radinkayorgova9407 4 years ago

    Did you check the results of the addition? I have run the code and the values of 'y' before and after the add function remained ... what is wrong?

  • @UjwalNayak
    @UjwalNayak 5 years ago

    You don't need a "for loop". Each thread computes sum of a[blockIdx.x * blockDim.x + threadIdx.x] + b[blockIdx.x * blockDim.x + threadIdx.x].

    • @Lord2225
      @Lord2225 5 years ago

      Thread and block counts are not infinite, and as far as I know it is more profitable to process several elements per thread in a loop than to create very large meshes of threads and blocks.
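
The two replies above can be reconciled with the grid-stride-loop idiom common in CUDA tutorials: each thread processes one element per loop iteration, striding by the total number of launched threads, so one launch covers any n without an enormous grid. A sketch (the `add` name and signature are assumed from the typical vector-add example, not necessarily the video's exact code):

```cuda
// Grid-stride loop: thread `index` handles elements index, index+stride, ...
__global__ void add(int n, const float *x, float *y) {
    int index  = blockIdx.x * blockDim.x + threadIdx.x;  // global thread id
    int stride = blockDim.x * gridDim.x;                 // total threads in grid
    for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
}
```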

  • @smartmineofficial
    @smartmineofficial 7 years ago +16

    What's the link for the GitHub at the end of the video?

    • @smartmineofficial
      @smartmineofficial 7 years ago +12

      OK, found it: github.com/alberduris/SirajsCodingChallenges/tree/master/Stock%20Market%20Prediction

    • @alberjumper
      @alberjumper 7 years ago +3

      Thanks! ^^

    • @sophyval5437
      @sophyval5437 7 years ago +1

      thanks bro!

    • @daggawagga
      @daggawagga 7 years ago +2

      Awesome!

    • @SirajRaval
      @SirajRaval  7 years ago +17

      just updated, can't believe i forgot it, won't happen again

  • @rougegorge3192
    @rougegorge3192 7 years ago

    Love your style... Still the same Siraj, and a very interesting subject!!

  • @mathematicssolved
    @mathematicssolved 7 years ago

    Good stuff Siraj!

  • @Nova-Rift
    @Nova-Rift 5 years ago

    Do we need to learn CUDA if we plan on using TensorFlow? I know CUDA allows us to process on many cores, but I heard something along the lines that TensorFlow already has this idea built in. Is this true, or do we need to pair CUDA and TensorFlow in our deep learning programs?

  • @joses.5943
    @joses.5943 7 years ago

    Love your videos bro! Time to put down that Red Bull though lol, just kidding. Happy holidays!

  • @MagicAndReason
    @MagicAndReason 7 years ago

    Freaking
    Awesome.
    I regret I have but one up-thumb to give.

  • @yellowbeard1
    @yellowbeard1 5 years ago

    I understand that this video is meant to be an introduction but can you make a video or two where you go into more detail and do some coding of this? I like your format of short videos with memes as introduction and longer videos for the coding but I can’t find the longer coding video for using CUDA or GPUs

    • @SirajRaval
      @SirajRaval  5 years ago

      Yeah, I need to do a long CUDA video. Good idea

  • @Barnardrab
    @Barnardrab 7 years ago

    In reference to 8:00, what happens if you specify a number of GPU threads/cores that is greater than that of the GPU itself? Is there a GET function that retrieves these values so as to prevent exceptions?

    • @ankurpandey7947
      @ankurpandey7947 3 years ago

      There must be some built-in min(no_of_cores, input_cores): if you pass more cores than the device has, it'll take the device's core count; otherwise it'll take your input
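
CUDA does expose these limits at runtime through `cudaGetDeviceProperties` (a real runtime API); note that exceeding `maxThreadsPerBlock` makes the launch fail with an error code rather than silently clamping, so any min() guard is something you write yourself. A host-side sketch:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);  // properties of device 0
    std::printf("max threads per block: %d\n", prop.maxThreadsPerBlock);
    std::printf("max grid dim x:        %d\n", prop.maxGridSize[0]);
    std::printf("multiprocessors:       %d\n", prop.multiProcessorCount);
    return 0;
}
```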

  • @cehanalexandru
    @cehanalexandru 5 years ago

    Very helpful video. Thank you :) !

  • @gaureesha9840
    @gaureesha9840 7 years ago

    At the end, how does the kernel cover the entire array if the 'stride' variable is the size of a block? Won't some elements of the array be skipped?

  • @Jajalaatmaar
    @Jajalaatmaar 6 years ago

    I spent several hours yesterday trying to install tensorflow-gpu in Anaconda so I could use my GPU to speed up training of our neural network. It's an incredible pain in the ass, as there are 7565 dependencies that all need to be installed in a specific way, and you're forced to register for the Nvidia Developer Spam Program. I didn't manage to do it in the end.
    Anyone have any tips for painlessly installing tensorflow-gpu? I was really surprised and annoyed by how difficult it was.

  • @BachPhotography
    @BachPhotography 5 years ago +2

    Holy cow, GPU programming is complicated

  • @GothsOnTop
    @GothsOnTop 4 years ago

    So if I were to make my own GPU, would I have to learn C++ or something else?

  • @АлександрЧухров-я3й
    @АлександрЧухров-я3й 6 years ago

    Could you help me? I need the best GPU for science purposes, not more expensive than $2500 (it should work well with double precision): could you tell me what characteristics I should look at and what the best option is?

  • @timshinkle2782
    @timshinkle2782 7 years ago +5

    What machine do you use for your GPU programming? Can you tell us how to port what you are doing from your on-premises computer to a Cloud provider? And which Cloud providers support this? That would be interesting to know.

    • @BohumirZamecnik
      @BohumirZamecnik 7 years ago +3

      The easier way to start is a cloud instance, e.g. an AWS p2.xlarge (aws.amazon.com/ec2/instance-types/p2/). If you do a lot of computation and want to save on cloud expenses or train faster, you can build your own computer (e.g. timdettmers.com/2015/03/09/deep-learning-hardware-guide/).

    • @SirajRaval
      @SirajRaval  7 years ago

      th-cam.com/video/Bgwujw-yom8/w-d-xo.html

    • @timshinkle2782
      @timshinkle2782 7 years ago

      Thanks.

    • @timshinkle2782
      @timshinkle2782 7 years ago

      Great video, thank you.

  • @VictorGallagherCarvings
    @VictorGallagherCarvings 7 years ago +13

    Can this be done with Python?

    • @maxschafer4510
      @maxschafer4510 7 years ago +8

      Victor Gallagher yes, you can use pygpu or, my recommendation, numba

    • @VictorGallagherCarvings
      @VictorGallagherCarvings 7 years ago +5

      Cool! Thanks for replying. I only started learning Python 3 months ago and just discovered numba this week. Regards

    • @SirajRaval
      @SirajRaval  7 years ago +6

      yes mathema.tician.de/software/pycuda/

    • @Tinkula
      @Tinkula 7 years ago +3

      You can also skip the middleman and directly create a Python extension which uses CUDA.
      In my submission you can find a very minimal implementation of it:
      github.com/tterava/Mandelbrot
      It has all the necessary functions to allow your C (or CUDA) code to be called from Python code directly. All you need to do is use the nvcc compiler to build the extension, and Python knows how to use it automatically.

    • @samueln300
      @samueln300 7 years ago

      As stated, you can use numba. Just decorate your function with "vectorize"

  • @412kev2
    @412kev2 7 years ago

    Awesome video, as usual. You should do a video on what makes a good feature and how to select good features (feature engineering). Have a good one!

  • @ghsrz8222
    @ghsrz8222 6 years ago

    Very useful video, thanks a lot. Keep going!

  • @AkashMishra23
    @AkashMishra23 7 years ago

    Another Awesome Video, Loved it

  • @herougo
    @herougo 7 years ago +3

    Here's an idea. You can speed up Fibonacci number calculation by using the closed-form formula. You distribute the multiplication of the powers over multiple threads. To ensure accuracy, instead of using floats, you can write a class to represent numbers of the form a + b * sqrt(5), where a and b are rational numbers.

  • @IggyBing
    @IggyBing 5 years ago +2

    Step 1: do NOT install GPU driver updates until you're dead certain that CUDA is supported on the given driver version.

  • @chasegraham246
    @chasegraham246 7 years ago

    The benchmark for using 256 threads completed 171 times faster. Is that typical of parallelized operations in cuda?

  • @akashdeepjassal3746
    @akashdeepjassal3746 7 years ago +63

    OpenCL please

    • @SirajRaval
      @SirajRaval  7 years ago +12

      ill put that in the queue

    • @mrarv6417
      @mrarv6417 6 years ago

      That would be awesome! Keep it open, guys! (Even if CUDA is pretty awesome...)

    • @dewijones92
      @dewijones92 5 years ago +1

      THIS!!
      We should not be promoting proprietary tech

    • @aidenstill7179
      @aidenstill7179 5 years ago

      @@SirajRaval What do I need to know to create my own deep learning framework? What are the books and courses to get knowledge for this?

  • @NickEnchev
    @NickEnchev 5 years ago +1

    Oh, and Nvidia didn't release the first graphics processing unit

  • @ruimauaie6623
    @ruimauaie6623 5 years ago

    Hi everybody, I need help. I have Python code which includes CUDA, but I'm a beginner with Python. When I run it I get an error ("Torch not compiled with CUDA enabled"). Can someone help me solve this issue? I'm doing my final research. I look forward to hearing from you

  • @williamowens7510
    @williamowens7510 7 years ago

    This was really, really helpful to me. Thank you.

  • @FernandoMumbach
    @FernandoMumbach 7 years ago +1

    Seems like CUDA tries to make it as easy as possible to access the memory from both GPU and CPU, but what's the tradeoff?

    • @KevinHosford
      @KevinHosford 7 years ago

      It can be very wearing on a GPU, reducing its lifespan

    • @byobcello
      @byobcello 7 years ago

      Source?

    • @sleeepyjack89
      @sleeepyjack89 7 years ago

      There are some performance trade-offs when using Unified Virtual Memory (UVM). Memory pages are migrated between the host and the GPU on-demand (plus some nifty prefetching heuristics) but frequent page faults often stall the system. I suggest using explicit memcpy semantics between the host and the GPU.

    • @FernandoMumbach
      @FernandoMumbach 7 years ago

      Thank you Daniel, that was the kind of trade-off explanation I was looking for.
      Kevin, I'd like to read more about that. New GPUs seem to be _meant_ for that kind of stuff, don't they? I'd be surprised if it would really kill your GPU.
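
The explicit-copy alternative sleeepyjack89 describes looks like this in outline (a sketch; `add` is a hypothetical kernel, not code from the video):

```cuda
#include <cuda_runtime.h>

// Explicit transfers: data crosses the bus only where you say so, instead of
// through the on-demand page migrations that cudaMallocManaged can trigger.
void vector_add_explicit(int n, const float *host_x, float *host_y) {
    size_t bytes = n * sizeof(float);
    float *dev_x = nullptr, *dev_y = nullptr;
    cudaMalloc((void **)&dev_x, bytes);
    cudaMalloc((void **)&dev_y, bytes);
    cudaMemcpy(dev_x, host_x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_y, host_y, bytes, cudaMemcpyHostToDevice);
    // add<<<(n + 255) / 256, 256>>>(n, dev_x, dev_y);  // hypothetical kernel
    cudaDeviceSynchronize();
    cudaMemcpy(host_y, dev_y, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev_x);
    cudaFree(dev_y);
}
```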

  • @sasssass1985
    @sasssass1985 3 years ago

    thanks for the clear explanation

  • @kjs205
    @kjs205 1 year ago

    I have a CS degree but this shit is an entire level above my understanding

  • @JupiterRoom
    @JupiterRoom 7 years ago

    hey Siraj, do you write your own scripts? who funds all this?

  • @tanbirsohail
    @tanbirsohail 7 years ago +28

    Hey Siraj, 2007 called and they want their memes back.

    • @SirajRaval
      @SirajRaval  7 years ago +4

      +Tanbir Sohail haha oh shit noted thx

    • @unabonger777
      @unabonger777 7 years ago +7

      the calling-and-wanting-things-back meme is older than that.

  • @gunapandian2332
    @gunapandian2332 7 years ago +1

    CUDA on a Mac?! 🤔

  • @wifekmezny9805
    @wifekmezny9805 7 years ago

    I need help please, it's urgent. I want to get the prediction for each class, but my softmax gives me only one result for one class. Is there a multilabel softmax instruction?

  • @syedali2494
    @syedali2494 6 years ago

    I don't understand anything except that there are CUDA cores in GPUs and some graphics pipelines.

  • @curtischong2459
    @curtischong2459 7 years ago

    Since you are adding the same numbers in the array, won't branch prediction affect the speed of addition?

  • @segunvictorawoma2888
    @segunvictorawoma2888 3 years ago

    Hi guys. Please, I ran the code on Google Colab and all it output was "Match error: 0".
    Can anyone help me?

  • @RamonChiNangWong078
    @RamonChiNangWong078 7 years ago

    Nice, can you do an OpenCL version and maybe explain the differences between CUDA and OpenCL?

  • @druklist9983
    @druklist9983 7 years ago

    What about AMD graphics cards? Do we have any ML package that runs on AMD graphics? Thank you in advance.

  • @magnuscritikaleak5045
    @magnuscritikaleak5045 4 years ago

    Can you show me parallel programming inside C# Unity scripts, please?

  • @ultraderek
    @ultraderek 6 years ago

    I thought you were going to write a NCHW convolution function.

  • @Nobodyrocked
    @Nobodyrocked 7 years ago

    this might be a noob question, but what does 1<<20 mean?

    • @henricx1
      @henricx1 6 years ago

      it's bit shifting to the left on the binary representation of the number; for example, 1<<20 is 2^20

  • @TheJdip123
    @TheJdip123 7 years ago +1

    Did you use an eGPU for your Mac? How did you manage to write a CUDA program on a Mac?
    P.S. Thanks for your introduction to GPUs. I think this is one of your best videos. Keep up the good work.

  • @waterearthmud4116
    @waterearthmud4116 6 years ago

    2:22 - wasn't 3Dfx around before Nvidia?

  • @RaktimMondolandJoya
    @RaktimMondolandJoya 7 years ago

    Make a video regarding "Reinforcement Learning". Thanks in advance.

  • @AmanSharma-hi3fd
    @AmanSharma-hi3fd 6 years ago

    Your videos are awesome, thanks a lot for this quality content :)

  • @xDCloudStrifexD
    @xDCloudStrifexD 7 years ago +3

    Will I be able to use CUDA if I use an external Nvidia GPU with my MacBook Pro? Is this recommended?

    • @rossmauck8254
      @rossmauck8254 7 years ago +1

      Andrew, what GPU?

    • @KevinHosford
      @KevinHosford 7 years ago

      Mac doesn't support CUDA, I think? Maybe if you use Windows 10 (or even better, Ubuntu) on the MacBook through Boot Camp, with a 3.0-minimum-level GPU card through Thunderbolt, it might work

    • @vkoskiv
      @vkoskiv 7 years ago

      macOS supports CUDA.

    • @godofbiscuitssf
      @godofbiscuitssf 7 years ago

      The Metal 2 shading language from Apple is best for Apple hardware.

    • @xDCloudStrifexD
      @xDCloudStrifexD 7 years ago

      I haven't bought the external GPU yet, but I'm looking at the 1080. Just wanted to know if it's feasible before getting it

  • @yashpandey350
    @yashpandey350 4 years ago +4

    It's like missiles of knowledge coming through his brain.😂😂😂😂

  • @donbasti
    @donbasti 7 years ago

    Hey Siraj, I am a massive fan of yours and just started drinking coffee to be able to keep up with your speed of talking... just kidding :D, but to the point.
    I am now starting the third year of my Software Engineering degree and would like to create a text summarizer for my Final Year Project. I decided to use the seq2seq model that TensorFlow provides, because I recently read a paper saying that a Seq2Seq + Attention + Bidirectional model would outperform every other model for the moment. At the end the paper stated that you can improve its performance by training the model on more and more data.
    So my question is: can I use TensorFlow's model and then improve its accuracy by training it with one dataset, then pickling it, then training it further with another dataset, and so on?
    I am stuck for advice and would love it if you answered.
    All the best

  • @VrajPandya
    @VrajPandya 7 years ago

    You don't need device sync after kernel executions.

  • @jasonreviews
    @jasonreviews 7 years ago

    you're my new favorite youtuber. LOLs.

  • @MygenteTV
    @MygenteTV 5 years ago

    The real question here is how you are using a GPU on a Mac. Can you make a tutorial on how to make it work? As many know, Nvidia and Apple are at war.

    • @existentialchild698
      @existentialchild698 5 years ago

      There are external GPUs that are compatible with MacBooks. Favorite of mine, the Razer Core X. Almost every eGPU is compatible with Nvidia.

    • @MygenteTV
      @MygenteTV 5 years ago

      @@existentialchild698 Yeah I know, but it's a shame having to buy one when you already have one in your system. Anyway, I just made it work: compiled TensorFlow with it and now I have the GPU enabled on my mid-2012 MacBook Pro

    • @existentialchild698
      @existentialchild698 5 years ago

      @@MygenteTV Okay. There's also a service made by Google called "Google Colab" that allows you to run your python code on an Nvidia Tesla K80. (That's a really good GPU that would otherwise cost you over $2000 depending on where you buy it).

    • @MygenteTV
      @MygenteTV 5 years ago +1

      @@existentialchild698 I see, thank you.

    • @existentialchild698
      @existentialchild698 5 years ago +1

      @@MygenteTV No problem, happy to help! 😊

  • @F96-t6s
    @F96-t6s 6 years ago +5

    "nVidia came out with the first Graphics Processing Unit in 1999..."
    That is so incorrect.

  • @erwinrommel5593
    @erwinrommel5593 4 years ago

    Can someone tell me how to execute Python code using the GPU, through cmd, without using any virtual environment?

  • @z3r0legend42
    @z3r0legend42 4 years ago +2

    For Fibonacci you can also just use a memoizer and store the previously calculated values in a cache to make your algorithm faster xd just a trick 👍

    • @polite3606
      @polite3606 2 years ago

      You could also use a faster asymptotic algorithm variant if you only want to compute a single term of the sequence. It suffices to use square-and-multiply on a certain matrix. It doesn't make it parallel though :(

  • @Xexorian
    @Xexorian 6 years ago

    What's the practical usage of this?

  • @themightyquinn1343
    @themightyquinn1343 7 years ago

    Could you also make one for Radeon OpenCL programming?

  • @ShashankSaiSangu
    @ShashankSaiSangu 5 years ago

    Can you please make a tutorial on how an AMD graphics card can be used for training a neural network? I have an AMD Radeon R5 M335 4GB but I am unable to use it

  • @mendezcreative
    @mendezcreative 6 years ago

    Looks cool. Wish I understood it.

  • @intr0vrt639
    @intr0vrt639 7 years ago

    thanks for this, Siraj

  • @bikramthapa2687
    @bikramthapa2687 6 years ago

    Awesome tutorial, thanks.

  • @jasminep4423
    @jasminep4423 6 years ago

    CUDA was originally named after a car, but then the marketing team thought that was lame so they changed the meaning

  • @frenchpet
    @frenchpet 7 years ago

    Will CUDA work on my 3DFX Voodoo 3000?

  • @memoriasIT
    @memoriasIT 6 years ago

    the cpu is the powerhouse of the pc