Depending on the structure of the neural network being trained, 2 GPUs do not always mean a speedup or the ability to process a neural network twice as large. Generally, a 2nd GPU will nearly double the speed of training if there is enough data to overcome the overhead of two GPUs. Using the RAM of both GPUs on a single neural network requires customizing the neural network code. So it is hard to say for sure which option is best without trying the hardware. I tend to go for the newer, bigger GPUs, but that is not always necessary.
@@HeatonResearch Thanks for your answer and sorry for my late reply but due to circumstances I was out of office for a while. Ok, let's assume we stay with our data set under the 8GB using the cards in simple parallel. Any NVIDIA card equal or faster than the 1080 with 8GB or more will show some improvement I assume or should I use exactly identical cards? The reason I am asking is we first want to get some cheap experience with parallel GPUs before we start building a larger cluster with newer hardware.
I guess you answered this in "5 Questions about Dual GPU for Machine Learning (with Exxact dual 3090 workstation)" (th-cam.com/video/_d3xs1L4jeA/w-d-xo.html). So a 1080 won't work together with a 2060 or 3090. Well, that limits the choices. Many thanks!
My GTX 1050 has a compute capability of 6.1, and the GTX 1060 has 6.1 too. Does that mean the GTX 1050 has the same performance as the GTX 1060 for ML training?
So, I'm looking at getting a Vision 3090 OC to pair with my 3080 Ti. I have Gigabyte's B550 Vision D-P and was wondering: if I put the 3090 in the first PCIe 4.0 x16 slot and connect the 3080 Ti to the 4.0 x8 slot, would that make a significant difference in training time, given that I'm cutting the available PCIe lanes to one GPU in half?
What do you think about the RX 6900 XT for deep learning using Python, R, and Octave for financial analysis? Do I really need an NVIDIA card for it? As I just started getting into ML, I lean towards the RTX 3090, RTX 3080, and RX 6900 XT, as I noticed it has 16 GB VRAM.
I'm working on an NLP-related deep learning project on a 6 GB RTX 3060 in a notebook PC. I frequently run into errors like "CUDA out of memory". Any suggestions? Should I change the hardware or is there a workaround?
I had hoped to pick up a 30-series GPU at launch, but that didn't happen. Right now I'm waiting to see what kind of availability we will see when the new SKUs come to market featuring different RAM/core count combinations. Like you, I prefer the larger RAM sizes in order to work with larger models and I'll trade off core counts for better RAM. Unfortunately, 20-series cards aren't an option right now because of the lack of availability of 30-series cards for the general market.
Same here.. thinking about the A6000, overkill for me at the moment but it is actually available for purchase it seems. I had hoped to get two 3090s for my multi GPU setup but that seems like a pipedream at this point.
I have the budget for either an RTX 2060 or a GTX 1660 Ti, both 6 GB, but the RTX has 'tensor cores'. How much performance gain can I expect from RTX over GTX with as little as 6 GB of memory? Is it worth spending more on the RTX 2060? Edit: for laptops only.
Hi, I'm working on computer vision and I need to improve my setup. What about an eGPU? I like the portability of working with a laptop, but I have some doubts about the performance of a laptop with an external RTX card (eGPU). I would like to know your thoughts on doing deep learning with a laptop + eGPU. Best regards
hey Jeff! awesome content, really enjoyed the video. I've got a question about the stylegan training table - what do they mean by 25000 kimg? Is it 25 thousand or 25 million images? Thanks!
Correct me if I am wrong, but typical software requires consistency across similar hardware that's more meaningful than just the brand (2 MSI 3090s or 2 ASUS 3090s; it can't be one of each, or even 2 different brands/models). This is what I've come to understand. I've never used SLI. I own a 3090, and a 1080 before this. But I thought software such as "OpenMPI" requires system consistency when combining machines (I have another PC I would bring into the mix). I honestly haven't confirmed this, just seen it in documentation at various times, either implied or stated, and am curious if this is *technically* not the case.
I hope more people start developing for Ampere using tensorflow 2.x I wish I could train Stylegan2-ada on my Ampere card, but the Nvidia code is only using old CUDA10 and Tensorflow 1.x
Hi Jeff! Thanks for the useful information. Sorry if there's already a question about gaming laptops. I saw the MSI GE76 Raider model with an NVIDIA 3080 with 16 GB of memory. Can it be compared with a desktop card, or do mobile cards never compete with the PC versions? Thank you
Sir, I am about to start my master's program in AI. Could you please recommend which workstation I should buy with a limited budget (5-6k)? I used Google Colab, but it is difficult to do computer vision projects. Thank you!
The NVIDIA 3060 (non-Ti) has been announced and it will come with 12 GB ($329). This is the lowest tier so far from the 3060 series. Does it make sense to get a slower GPU if it has more memory compared to the 3080 and 3070? I will be starting my learning now and am not sure if I should invest in something less powerful now and upgrade next year to a high-end 4000 series.
I really have not gotten into Jetson at this point. Comparison videos can be difficult unless I own all of the things compared, so I generally focus on how to use technology.
You could make a video on how to deal with big data, for example 100,000,000 rows. That would be interesting. At this size even a simple SQL count takes forever in PostgreSQL.
Nvidia does not recommend using the RTX 30 series for DL saying it's a consumer grade card for gaming and cannot be used for DL. For DL they recommend using Tesla and Quadro series card which are much costlier with almost same amount of RAM and cuda cores. They say you can use consumer cards for DL but their performance will drop over a period of time. If this is true this might be due to form factor or other parameters of the circuit. But I'm not sure. I'd appreciate if you can explore and make a video on the feasibility of consumer cards for DL.
@@RAVIRAJPRAJAPATmech Nvidia software-locks GeForce cards for certain tasks in order to be able to sell the pro series at stellar prices. However, for over 90% of tasks, you won't notice. But I also want to point out that if you're going to use the GPU intensely, non-stop, always working, etc., you may want to go for pro series cards, because they are built to last longer under stress and in tight, hot spaces for a long time. They are easier to stack up and you'll get better customer support in emergencies. GeForce is for consumers, who aren't keeping their PCs running high-end tasks non-stop.
Thanks for the video. As someone who's trying to get in on a budget, and with no end in sight for the GPU shortages, what do you think about picking up an old Tesla K80? They can be had for a pittance on eBay right now (under 200 bucks).
There is a problem with choosing a RTX 3080 over a RTX 3070. The additional RAM is nice, but it comes at the cost of buswidth. To this day I don't know why the 3080 has lower buswidth than the 3070 or 3090, but I do think that this is worth considering when buying a GPU. It is one of the (if not the) main bottleneck for throughput on optimized pipelines.
Hello, great content. I am getting into data science and am aiming to go for deep learning. I have heard that cloud services like Google Colab give nice performance for GPU work in deep learning. I am on a budget and can't afford an RTX version of an Nvidia card that is actually able to do deep learning work. I am getting a good deal on a non-GPU laptop, a Lenovo ThinkPad E series gen 3 with an AMD Ryzen 7 5800U and Vega 8 onboard GPU. Will it be sufficient to learn data science and do machine learning? Thanks in advance.
So coming at this from a gamer and AI cloud engineer, I am curious. I was thinking of a setup for a 2 x 3090 NVlinked with a 3rd GPU using the RTX A6000 (this will be my main GPU for all deep/machine learning duties). My main OS will be a Linux OS but curious to know if anyone has a similar setup and what their experience is.
Steve, were you able to complete your build with 2 x 3090 NVLinked together and a standalone A6000? Asking because I am along that pathway, currently with 2x 3090 NVLink and contemplating getting an A6000 primarily for DL. Also, what motherboard would you recommend (Intel processor)? Thanks
@@evolvere9021 Not yet. I actually built most of it but have not settled on local storage. I am using the 3995WX AMD Ryzen Threadripper Pro (I require lots of cores and a high amount of RAM). The motherboard I am using is the ASUS Pro WS WRX80E-SAGE SE (extended). I required a large motherboard (and large case) so it can fit the 3 GPUs (2 x 3090 NVLinked and the A6000). I got it all running on a single M.2 NVMe SSD, but with all of the work I will be doing, I am looking for a hardware RAID card that can take up to 4 NVMe SSDs (Samsung Pro, 2 TB each) in RAID 10. On the single drive, the setup works pretty well. I haven't really bench-tested it, but it screams. I was able to run some machine learning instances at decent speed. I will be looking to run some neural net cases on the A6000, but so far so good. The NVLinked 3090s work well, but keep in mind I only use them for video editing and watching some 8K movie releases (not a lot so far), and the cards work fine. I put this project on hold as I am waiting on the 3090 Ti cards as well as the hardware RAID card for the NVMe drives (RAID 10). Also, I am waiting on getting 4 x 8K monitors. I am currently using a friend's and it's beautiful. Also, my contacts at AMD are telling me to wait another 6-12 months, as the new Threadripper Pro CPU is coming out with double the cores. I require a lot, as I am also running multiple VMs (Windows and other Linux distros) for other purposes, as well as some Docker clusters for various container-related work I do. I am still thinking about secondary storage (8 x 4 TB SATA SSDs in RAID 10). This is where I will store my VMs and container-related work. If all goes well, hopefully by the end of 2022 it will be done. With the high cost, I am treading carefully here to make sure I spec out the right high-end workstation.
In gaming, SLI has basically been dead for some years. The differences between the 3090 and 3080 are marginal, until the 3080 runs into memory limits, as it does more often than not with Microsoft Flight Simulator... THX! 💙
If you get the 3090, don't forget to "copper mod" your memory. GDDR6X can run at something like 110°C, and after applying copper it can go as low as 85°C or so.
Can't buy a GPU card? Try Continuous Gray Code Optimization (PDF online) on a CPU cluster. Each CPU has the full neural model and part of the training set (which can be local and private to the CPU). Each CPU is sent the same short sparse list of mutations to make and returns the cost for its part of the training set. The costs are summed, and if the total is an improvement an accept message is sent to each CPU, else a reject message. Very little data is moving around per second, hence the CPUs can be anywhere on the internet. The mutation operator is random plus or minus a*exp(-p*rnd()). If the weights are constrained between -1 and 1, then a=2 to match the interval. The so-called precision p is a problem-dependent positive number. The function rnd() returns a uniform random number between 0 and 1. If the mutation is too big, it is simplest just to reject it.
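A rough sketch of the scheme described in this comment, run in a single process for illustration. The "workers" are just lists of data points, the model is a plain linear fit, and the names `mutation_step`, `shard_cost`, and `optimize`, plus the defaults for `steps` and `p`, are assumptions made for the example rather than anything taken from the referenced PDF.

```python
import math
import random

def mutation_step(a, p, rng):
    """Random plus or minus a*exp(-p*rnd()), with rnd() uniform in [0, 1)."""
    magnitude = a * math.exp(-p * rng.random())
    return magnitude if rng.random() < 0.5 else -magnitude

def shard_cost(weights, shard):
    """Squared error of a linear model on one worker's part of the data."""
    return sum(
        (sum(w * x for w, x in zip(weights, xs)) - y) ** 2
        for xs, y in shard
    )

def optimize(shards, n_weights, steps=5000, a=2.0, p=10.0, seed=0):
    """Accept a mutation only if the summed cost over all shards improves."""
    rng = random.Random(seed)
    weights = [0.0] * n_weights
    best = sum(shard_cost(weights, s) for s in shards)
    for _ in range(steps):
        # The "short sparse list of mutations" is a single (index, delta) here.
        index = rng.randrange(n_weights)
        new_value = weights[index] + mutation_step(a, p, rng)
        if not -1.0 <= new_value <= 1.0:
            continue  # mutation too big for the [-1, 1] range: reject it
        candidate = list(weights)
        candidate[index] = new_value
        # In a real cluster each worker would compute and return its own cost.
        cost = sum(shard_cost(candidate, s) for s in shards)
        if cost < best:  # "accept" message, otherwise "reject"
            weights, best = candidate, cost
    return weights, best

if __name__ == "__main__":
    data_rng = random.Random(1)
    true_w = [0.5, -0.25]  # hypothetical target weights for the demo
    points = []
    for _ in range(20):
        xs = [data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)]
        points.append((xs, sum(w * x for w, x in zip(true_w, xs))))
    found, cost = optimize([points[:10], points[10:]], n_weights=2)
    print(found, cost)
```

Only the mutation list and one cost number per worker would cross the network in a distributed version, which is the low-bandwidth property the comment points at.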
Dude, ReLU is a switch. f(x)=x connect, f(x)=0 disconnect. The dot product of a number of dot products is still a dot product. Go figure things out from there.
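One way to see the "switch" claim in the comment above, as a small sketch with made-up weights: for a given input, each ReLU either passes its value through or outputs zero, so the whole two-layer net collapses to a single matrix for that input. The helper names and the example numbers are assumptions for illustration.

```python
def matvec(matrix, vector):
    """Plain matrix-vector product (each output is one dot product)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def relu(vector):
    """f(x) = x when the switch is closed (x > 0), f(x) = 0 when it is open."""
    return [x if x > 0 else 0.0 for x in vector]

def two_layer(w1, w2, x):
    """Ordinary forward pass: linear, ReLU, linear."""
    return matvec(w2, relu(matvec(w1, x)))

def collapsed_matrix(w1, w2, x):
    """Single matrix equal to the net for the switch pattern that x produces."""
    pre_activation = matvec(w1, x)
    # Keep a row of w1 if its neuron is "connected", zero it out otherwise.
    gated = [
        row if h > 0 else [0.0] * len(row)
        for row, h in zip(w1, pre_activation)
    ]
    columns = list(zip(*gated))
    return [[sum(a * b for a, b in zip(row2, col)) for col in columns] for row2 in w2]

if __name__ == "__main__":
    w1 = [[1.0, -2.0], [0.5, 1.0], [-1.0, -1.0]]  # hypothetical weights
    w2 = [[1.0, 2.0, 3.0]]
    x = [2.0, 1.0]
    print(two_layer(w1, w2, x))                    # forward pass
    print(matvec(collapsed_matrix(w1, w2, x), x))  # same value from one matrix
```

The collapsed matrix only holds for inputs that produce the same on/off pattern; a different input can flip switches and give a different matrix.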
Analysing the price: it only makes sense to buy an expensive graphics card if you can earn that money back by working with the GPU, because the depreciation is too high. Take a look at the TITAN RTX: it has 24 GB of memory and 4608 CUDA cores, cost $2499, and was released in 2018. Two years later came the RTX 3090, with the same amount of memory but more than twice the CUDA cores (10496), at $1499. So it is too much money to spend if you don't get that return. It could be cheaper; that would move the market more, making deep learning environments more accessible, and people would change their graphics cards more frequently. But that depends on Nvidia. To sum up: buy an RTX 3090 only if you will earn the money back or if you have enough money to burn.
I think of it this way: will my cloud compute costs exceed the price of a GPU? If not, then just do cloud. If yes, just get the GPU. You own it, not rent it. You could sell it used later. With cloud, you could spend thousands and have no equipment afterwards. I tend to keep computers for many, many years, so it's not like I buy a new one every year.
I have been looking at your videos to learn about Nvidia GPUs. I recently upgraded from a Quadro P2000 and Quadro M1200 to a 3090. It is a serious jump. Backwards, it seems. It may be because I use remote desktop and I use Matlab for deep learning, and I can't take advantage of parallelization as I use large LSTMs. Anyway, the new system is slower, and sometimes half as fast in training. Did I make a mistake by not buying an A5000? (I had originally ordered an A5000, but gave up on getting it delivered.) Can I recover if I go to the office? I'd love to have your expert advice or that of any Nvidia contacts, please. Thanks in advance.
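The depreciation argument above can be put into numbers with a few lines of arithmetic. This sketch uses only the prices, core counts, and memory sizes quoted in the comment; the dictionary layout and the `cost_ratios` helper are assumptions for illustration, and CUDA cores are not strictly comparable across generations, so the per-core figure is a rough indicator at best.

```python
# Figures as quoted in the comment above.
CARDS = {
    "TITAN RTX": {"price_usd": 2499, "cuda_cores": 4608, "vram_gb": 24},
    "RTX 3090": {"price_usd": 1499, "cuda_cores": 10496, "vram_gb": 24},
}

def cost_ratios(card):
    """Dollars per CUDA core and per GB of VRAM for one card."""
    return {
        "usd_per_core": card["price_usd"] / card["cuda_cores"],
        "usd_per_gb": card["price_usd"] / card["vram_gb"],
    }

if __name__ == "__main__":
    for name, card in CARDS.items():
        ratios = cost_ratios(card)
        print(
            f"{name}: ${ratios['usd_per_core']:.2f} per CUDA core, "
            f"${ratios['usd_per_gb']:.2f} per GB of VRAM"
        )
```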
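The cloud-versus-purchase rule of thumb in this reply can be sketched as a break-even calculation. Every number passed in is a placeholder you would supply yourself; the function name, the optional resale value, and the optional local running cost (for example electricity) are assumptions added for the example.

```python
import math

def break_even_hours(gpu_price, cloud_rate_per_hour,
                     resale_value=0.0, local_cost_per_hour=0.0):
    """Hours of GPU use after which buying is cheaper than renting.

    Returns math.inf when running locally saves nothing per hour.
    """
    net_outlay = max(gpu_price - resale_value, 0.0)
    saving_per_hour = cloud_rate_per_hour - local_cost_per_hour
    if saving_per_hour <= 0:
        return math.inf
    return net_outlay / saving_per_hour

if __name__ == "__main__":
    # Hypothetical inputs: a $1500 card versus a $1.00/hour cloud instance.
    print(break_even_hours(1500, 1.00))
    # Same card, assuming it can be sold used for $500 later.
    print(break_even_hours(1500, 1.00, resale_value=500))
```

If your expected usage is well below the break-even figure, renting is likely the cheaper route; well above it, owning is.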
Thx for the info. I am just starting my ML/Deep Learning and just recently ordered a notebook with RTX 3050 Ti due to my limited budget. I guess with this RTX 3050 I am pretty limited on doing DL. bummer!
In my test the 1080ti is 5 times FASTER than the 3080. 64sec per epoch 1080ti, 324sec per epoch 3080. The same RNN model, the same code and data. It seems like software (Conda, TensorFlow) is lagging badly. I configured the system following this video guide and left a comment there as well th-cam.com/video/nMmCu3Nm1xA/w-d-xo.html
Well, it's all unobtanium now, so those prices are purely hypothetical. Crypto-miners and scalpers have bought all the good cards. Desperate gamers have bought the rest.
I updated this video for 2023! th-cam.com/video/F1ythHjdWI0/w-d-xo.html
Sir, can you please tell me: I am building a PC for my deep learning tasks, and possibly I will run LLMs also. I don't have much in the budget, but I have considered an MSI 4060 Ti 16 GB. Will it be good, or can I get better performance for a cheaper or the same price? For processing I have a Ryzen 7 7900X and the motherboard is a Gigabyte B650M XAX.
Can you please give a solution, sir? It's kind of urgent 😢
Between cryptominers and scalpers I feel like I should rename this video to "What GPU to Dream about Getting in 2021" 😞
very strong competition for the more scientific applications of modern GPUs. But prices seem to crash atm, so I might actually get my hands onto the higher GeForce series for all those juicy CUDA cores. Really a shame the different manufacturers don't have cards with more/less memory on board like they used to "in ages past". 3070/80 with an extra 4GB can be a gamechanger for deep learning workloads.
😶🌫🤩
I feel like I am poking at stuff way beyond my grade. I have been looking a bit at Nvidia Tesla v100 but I pretty much have no idea how it would hold up compared to anything else.
I do not even know if I would be capable of figuring out a meaningful use for it. I want to do some deep learning stuff and make my own AI text adventure model nonsense. But I already got stumped at the basic coding stuff. And I am still unsure how I managed to screw over my gtx 1080 so it now lags when trying to play in vr. I know it probably is because I tried to make my poor 8gb card try to run some basic text adventure stuff. And something among the drivers and other stuff I had to install to get that going probably messed up something one way or another... If it was not something related to the coding program nonsense. :/
I guess what I want at this point probably is a combination of AI text processing, VR running? Bitcoin mining? Something? ... I have no idea how bitcoin mining even works. Is it even possible to turn such a process on and off? I mean, bitcoin is money, right? If I first bought such an expensive setup, it would not hurt if it could pay back at least a tiny bit of itself over time, the 98% of the time it's not in use.
... Checked out some of your other videos. I am absolutely subscribing! You seem to go over a lot of interesting topics.
still relevant to this date ;0
NOT NO MO!
I don't think i've seen a better explanation on this. thanks a lot!
Thanks!
For people dipping their toes into ML and DL and those who are not 100% sure that this is their future, Colab/Kaggle Notebooks and even AWS/GCP/Azure instances are excellent options in my opinion as the time required for the return on investment is pretty high.
Also, Prof Heaton, could you compare the performance increase between the 20 and 30 series? I am interested in knowing if the price hike (and general lack of availability) of the 30 series is worth it.
I agree 100%, I own higher-end GPUs and often use Colab myself.
@@HeatonResearch Colab was nice, but last month they cut the hours per month very significantly :( GPU prices have also gone crazy. Kind of 2 problems hitting at the same time.
Okay, looks like NVIDIA will be providing me with an RTX A6000! Thank you NVIDIA! Look for an unboxing coming soon! Subscribe, so you do not miss it. Also, one correction to the video: there is no NVLink on the RTX 3080.
For free? How if I may ask? I guess because you are a known academic.
@@Garycarlyle It's a number of factors. I am an influencer, which is a combination of academic and social media.
@@HeatonResearch One good thing about tech is most people will have a GPU even better than that soon enough. I mean that in a nice way. Looking forward to more of your videos. :)
Can you talk about tensor cores?
Congratulations
Great video Jeff. Thank you! Trying to add some info to your great video:
I'm using NVLink on 2x RTX 8000. I have done lots of experiments on TensorFlow with NVLink on and off.
In summary, on the same model resolution, settings, and batch size:
2x 8000 with NVLink: around 700 ms vs 2x 8000 without NVLink: around 1050 ms (so you get 30% - 40% more speed when using NVLink)
Ampere vs Turing:
Limiting the model to use only 24 GB of VRAM, I compared the 3090 vs the 8000:
You can get around the same speed as 1x 3090 using 2x 8000, but only with NVLink. Without NVLink, 2x 8000 are around 25% slower than Ampere.
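For anyone wanting to check the NVLink figures quoted above, here is a small arithmetic sketch. It only uses the two timings from the comment (roughly 700 ms and 1050 ms); the function name is an assumption for illustration. It shows the two common ways of stating the same result: time saved per step versus throughput gained.

```python
def compare_step_times(baseline_ms, improved_ms):
    """Express a timing improvement as a speedup and as a fraction of time saved."""
    return {
        "speedup": baseline_ms / improved_ms,
        "time_saved_fraction": (baseline_ms - improved_ms) / baseline_ms,
    }

if __name__ == "__main__":
    result = compare_step_times(baseline_ms=1050, improved_ms=700)
    print(f"speedup: {result['speedup']:.2f}x")
    print(f"time saved per step: {result['time_saved_fraction']:.1%}")
```

The "30% - 40%" in the comment matches the time-saved reading (about a third less time per step), which is the same thing as about 1.5x the throughput.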
Thanks, very useful.
What do you think about the new RTX 3060s for deep learning? They have 12 GB of VRAM. Would that outperform a 3080? My guess is no, as it also depends on the CUDA cores. Could you do a video talking about this? I am not sure if you already have.
I would've liked to see an RTX 3060 thrown into the mix.
I would like to know if bus-width matters with deep learning applications as a 3060 has 12GB VRAM.
Yeah, definitely good price efficiency with the normal RTX 3060 for deep learning
@@thomassieghold2698 but not so many CUDA cores
I have the answer to save you some time: You don’t choose a GPU in 2021.
You just wait for the new gen, not only that, you’ll probably need to wait till 2022 summer because people and scalpers will panic buy again.
👍2022
What do you think about the new Quadro RTX A4000? A lot fewer CUDA cores than say RTX 3080, but that 16GB of ram will fit bigger batch sizes. Also the low end Quadro cards are easier to find.
Very useful, most of my tech channels focus on gaming performance for this kind of hardware and completely disregard ML, DL and other data science uses of this kind of hardware. I was looking for a compromise between gaming and productivity, thanks.
11:15 I don't think that 3080 supports NVLink
Yeah there’s no nvlink support on 3080
Good point, you could still run two of them, though. But yes, no NVLink.
I agree with your take on the 3090 but good luck finding one for $1500. They don’t exist.
Thank you very much for this video, It is probably the best one about this topic and I´m very clear about what I need for my PhD!!
Wow!!! Thanks a ton!! Fantastic video!!
Super informative video!
It's generally the same considerations for computational physics GPU simulations. VRAM capacity is the most important factor, and VRAM bandwidth comes second. I'm using 8x Instinct MI200 to pool their 512GB VRAM for super large resolution fluid simulations.
The TFLOPs of the GPU itself are irrelevant, and so is the GPU generation. If multi-GPU is supported by the software, then it's always model parallelism, pooling/combining the VRAM of multiple GPUs. Looking at affordable options, the 3090 (non-Ti) is super appealing here.
This is pure gold!
Best explanation on TH-cam, Thank you!
Wow, thanks!
Thanks for the thorough guide!
The vendor at 16:45 is System76, just in case you want to know where to get really good Linux systems. They are the Apple of Linux companies, but with open source firmware and exchangeable parts. No Quadros yet :(
Thanks for the info, I was looking for something for Deep Learning (and also Deep Mining on the side...hehe)
Thanks Jeff. Hi from Colombia 🤙
You are most welcome.
What are your thoughts on the M1 Max chips?
How do the AMD Radeon 6600 GPUs perform? Are they as good as Nvidia?
They are an order of magnitude less compatible with machine learning libraries and not available in mainstream cloud. These are the main reasons I've shied away from AMD. Not on the CPU side, I am a major AMD "fanboy".
Thank you for your explanation and sharing!
Thanks, Jeff!
You are welcome.
Would it be better to buy 4 x K80 or M40 24 GB cards, or buy the overpriced RTX 3090? What's your professional suggestion?
Can we have the link to the GitHub page that mentions the config details?
Thank you for sharing this! How does the clock speed and number of CUDA cores affect the choice apart from the RAM? Is it true that more CUDA cores = more parallelization, hence speed up? Thanks again!
If you're comparing cards based on the same architecture then yes. Just beware that specifically what a CUDA core is tends to change slightly from generation to generation.
For example, the RTX 3000-series cards have simpler cores than the RTX 2000-series, so despite the 3000 cards having roughly twice the cores as their 2000 equivalent, they're far from twice as powerful in most workloads. The 3000-cards are really good, but not overwhelmingly better than the older cards. Especially at current prices.
You're generally better off comparing benchmark scores rather than core count and frequency.
@@fnorgen Thank you for the explanation
@@fnorgen Hi, should I consider getting a used Quadro card instead of a new 3000 series card to save some money, or would it be better to spend a little more and get the new architecture GPU?
@@sumitbali9194 I suppose it is worth looking into. Depends on what kind of prices you find. If a used Quadro gets you significantly better performance per $ for your intended use then I say go for it. If performance per $ is about the same, then you are probably better off buying one of the new cards if it is still within your budget. If you can find one for sale.
Hi Heaton. The RTX 2060 has more tensor cores than the RTX 3060, but the RTX 3060 has more CUDA cores and memory. Which one do you think is better to buy for deep learning work?
Would you choose a Quadro P6000 over a 2080 or titan Xp? Pascal Quadros are much cheaper now. The 24G VRAM is really pulling me towards a P6000. 🤔
For the price of one RTX 3090 you can buy between 5 and 8 used Tesla M40 cards
They have 24GB as well
Very informative videos. Thanks. Would you please point me to your sources that show that RTX 3080s can use NVLink? The info I see only shows the RTX 3090s having NVLink capability. Any light you could provide would be greatly appreciated.
would nvlinking two 3090s work as having 48gigs of VRAM for training?
I am not fully knowledgeable about deep learning yet, but you are talking only about image processing. What if I want to train my model on something with, e.g., 20 inputs? That means my matrix is, e.g., 20x100, so RAM is not so important for me and I should look at the number of cores? Right now I'm thinking about a 3060 Ti because of its ~5k CUDA cores, but we'll see what changes with the new GPUs on the market in December.
Any thoughts on the new Quadro RTX A5000? Or more generally on the benefits of tensor cores? I ask because the RTX A5000 actually has fewer tensor cores than the older RTX 5000. Which would you get if you were building a TensorFlow workstation?
I like the A5000, I would get along with it just fine. I rarely use the full 48GB that my A6000 has, and the A5000 has around 80% of the cores of an A6000. Not everything always scales to using all of those cores at 100%.
@@HeatonResearch similar question: what are your thoughts on an A4000? In an ideal world, I would aim for a 3080 at that price point, but in the 2021 world I have only found the A4000
What's the difference between the M40 and P40 besides the price?
I am still a bit confused after watching your videos. In my research group, we have a 24GB PC with a GTX 1080 8GB for deep learning using TensorFlow/Keras. What would make more sense to increase speed: buying another GTX 1080 with 8 or 11GB, or will (almost) every newer (and cheaper) higher-spec NVIDIA card make a larger contribution?
Depending on the structure of the neural network being trained, 2 GPUs do not always mean a speedup or the ability to process a neural network twice as large. Generally a 2nd GPU will nearly double the speed of training, if there is enough data to overcome the overhead of two GPUs. Using the RAM of both GPUs on a single neural network requires customizing the neural network code. So it is hard to say for sure which option is best without trying the hardware. I tend to go for the newer, bigger GPUs, but that is not always necessary.
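As a rough illustration of why a second GPU only "nearly" doubles the speed, here is a toy model, assuming the per-epoch compute splits evenly across GPUs and a fixed synchronization overhead is added on top. The function name and the numbers are hypothetical; real overhead depends on the model, the framework, and the interconnect.

```python
def data_parallel_speedup(compute_s: float, overhead_s: float, n_gpus: int = 2) -> float:
    """Estimated speedup over one GPU under a simple toy model.

    compute_s:  time one GPU needs for an epoch (seconds)
    overhead_s: extra time per epoch spent coordinating the GPUs (assumed fixed)
    """
    if compute_s <= 0 or n_gpus < 1:
        raise ValueError("compute_s must be positive and n_gpus at least 1")
    return compute_s / (compute_s / n_gpus + overhead_s)


# Plenty of work per epoch: close to 2x.
print(round(data_parallel_speedup(100.0, 5.0), 2))
# Very little work per epoch: the overhead dominates and two GPUs are slower.
print(round(data_parallel_speedup(2.0, 5.0), 2))
```

Under this model the second GPU pays off only when the compute time per epoch is large compared with the overhead, which matches the "enough data" condition above.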
@@HeatonResearch Thanks for your answer and sorry for my late reply but due to circumstances I was out of office for a while. Ok, let's assume we stay with our data set under the 8GB using the cards in simple parallel. Any NVIDIA card equal or faster than the 1080 with 8GB or more will show some improvement I assume or should I use exactly identical cards? The reason I am asking is we first want to get some cheap experience with parallel GPUs before we start building a larger cluster with newer hardware.
I guess you answered this in "5 Questions about Dual GPU for Machine Learning (with Exxact dual 3090 workstation)" (th-cam.com/video/_d3xs1L4jeA/w-d-xo.html). So a 1080 won't work together with a 2060 or 3090. Well, that limits the choices. Many thanks!
My GTX 1050 has a compute capability score of 6.1, and the GTX 1060 has a score of 6.1 too. Does that mean the GTX 1050 has the same performance as the GTX 1060 for ML training?
Can you please make an updated video?
update please for 2022
What about the 3060? it has 12gb of ram
Hello Sir,
what do you think about the use of Nvidia GTX 1650 super for deep learning?
So, I'm looking at getting a Vision 3090 OC to pair with my 3080 Ti. I have a Gigabyte B550 Vision D-P and was wondering: if I use the 3090 to occupy the first PCIe 4.0 x16 slot and then connect the 3080 Ti to the 4.0 x8 slot, would that result in a significant difference in training time, given that I'm cutting the available PCIe lanes in half for one GPU?
What is the actual difference between Quadro RTX 3090 vs. before RTX 3090? What do you think of the RTX 5000 16GB vs. the GeForce RTX 3090 24GB?
What do you think about the RX 6900 XT for deep learning using Python, R, and Octave for financial analysis? Do I really need an NVIDIA card for it? As I am just getting started in ML, I lean towards the RTX 3090, RTX 3080, and RX 6900 XT, as I noticed it has 16GB of VRAM.
I'm working on NLP related deep learning project on 6 GB RTX 3060 for notebook PC. Frequently running into errors like "CUDA out of memory". Any suggestions? Should I change the hardware or is there a workaround?
What would be cheaper in the long run: buying your own $10,000 GPU or just outsourcing the hardware to the cloud?
Hi I already have a GEFORCE RTX 2080 Ti and planning to expand by adding a RTX 3080 / 3090 . Is it possible to link two GPUs with NVLink?
To set up a server with 2 x Intel Xeon Silver 4114 (10 cores) in a local data center, which Nvidia GPU would be best?
What about tensor cores and FP16, FP32, INT8...?
I had hoped to pick up a 30-series GPU at launch, but that didn't happen. Right now I'm waiting to see what kind of availability we will see when the new SKUs come to market featuring different RAM/core count combinations. Like you, I prefer the larger RAM sizes in order to work with larger models and I'll trade off core counts for better RAM. Unfortunately, 20-series cards aren't an option right now because of the lack of availability of 30-series cards for the general market.
Have similar thoughts myself on this.
Same here.. thinking about the A6000, overkill for me at the moment but it is actually available for purchase it seems. I had hoped to get two 3090s for my multi GPU setup but that seems like a pipedream at this point.
I have the budget for either an RTX 2060 or a GTX 1660 Ti, both 6GB, but the RTX has "tensor cores".
How much performance gain can I expect from RTX over GTX with a memory size as small as 6GB? Is it worth spending more on the RTX 2060?
Edit: for laptops only
I am a student. Is an RTX 3060 6GB okay for machine learning?
Hi,
I'm working on computer vision and I need to improve my setup. What about an eGPU?
I like the portability of working with a laptop but I have some doubts about the performance of using a laptop with an external RTX (e-GPU).
I would like to know your thoughts on using deep learning with a laptop + eGPU.
Best regards
Dr. Jeff, thoughts on the GeForce RTX 3070 in the new Razer Blade 14?
That is actually one of the computers I am considering for my own laptop, when I upgrade my Windows machine. Looks like a solid machine.
hey Jeff! awesome content, really enjoyed the video.
I've got a question about the stylegan training table - what do they mean by 25000 kimg? Is it 25 thousand or 25 million images?
Thanks!
Correct me if I am wrong, but typical software requires consistency across similar hardware that goes beyond just the brand (2 MSI 3090s or 2 ASUS 3090s; it can't be one of each, or even 2 different brands/models).
This is what I've come to understand. I've never used SLI. I own a 3090 and had a 1080 before this. But I thought software such as "OpenMPI" requires system consistency when combining machines. (I have another PC I would bring into the mix.)
I honestly haven't confirmed this, just seen it in documentation at various times either implied or stated. And am curious if this is *technically* not the case.
Are quadro cards good for machine learning?
Yes, very much so.
I hope more people start developing for Ampere using TensorFlow 2.x. I wish I could train StyleGAN2-ADA on my Ampere card, but the Nvidia code only uses the old CUDA 10 and TensorFlow 1.x.
Check out the new StyleGAN2 ADA, now for Pytorch. It uses Ampere. th-cam.com/video/BCde68k6KXg/w-d-xo.html
@@HeatonResearch Thank you! Going to try this out!
Hi Jeff! Thanks for the useful information. Sorry if there's already a question about gaming laptops. I saw the MSI GE76 Raider model with an Nvidia 3080 with 16GB of memory. Can it be compared with a desktop card, or do mobile cards never compete with the PC versions? Thank you
Supermicro has workstations that carry 4 x DW GPUs.
Sir, I am about to start my master's program in AI. Could you please recommend which workstation I should buy with a limited budget (5-6k)?
I used Google Colab, but it is difficult to do computer vision projects. Thank you!
I have the money to buy 1 RTX 3090, but I could buy 2 RTX 3080s with the same money. Which would you choose?
The Nvidia 3060 (non-Ti) has been announced and it will come with 12GB ($329). This is the lowest tier so far from the 3060 series. Does it make sense to get a slower GPU if it has more memory than the 3080 and 3070?
I will be starting my learning now and am not sure if I should invest in something less powerful now and upgrade next year to a high-end 4000 series.
RAM matters more. What type of model are you training, how much RAM does it need? 12GB should be great to start.
If you just want to try out and learn ML, then yes, more RAM means more options to choose and to experience.
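To put a number on "how much RAM does it need?", here is a back-of-the-envelope sketch. It is an assumption-laden lower bound, not a measurement: it counts FP32 weights, gradients, and two optimizer states per parameter (as an Adam-style optimizer would keep), and it ignores activations, batch size, and framework overhead, which often dominate.

```python
def training_memory_gb(n_params: int, bytes_per_value: int = 4,
                       optimizer_states: int = 2) -> float:
    """Rough lower bound on training memory in GiB.

    Counts one copy each for weights and gradients plus `optimizer_states`
    extra copies (assumed: 2, as for Adam). Activations are NOT included.
    """
    copies = 1 + 1 + optimizer_states
    return n_params * bytes_per_value * copies / 1024**3


def fits_in_vram(n_params: int, vram_gb: float) -> bool:
    """True if the rough estimate is below the card's memory."""
    return training_memory_gb(n_params) < vram_gb


# Hypothetical model sizes checked against a 12GB card.
for n in (100_000_000, 500_000_000, 1_000_000_000):
    print(n, round(training_memory_gb(n), 2), fits_in_vram(n, 12.0))
```

Because activations are left out, a model that barely "fits" by this estimate will most likely still run out of memory in practice.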
Hi Jeff, what do you think about Jetson? Can you make a video comparing Nano, NX, AGX and TX2? Regards from Argentina!
I really have not gotten into Jetson at this point. Comparison videos can be difficult unless I own all of the things compared, so I generally focus on how to use technology.
What is the Quadro RTX P4000's FP64 performance?
You could make a video on how to deal with big data, for example 100,000,000 rows. That would be interesting. At this size even a simple SQL count takes forever in PostgreSQL.
Run VACUUM or use Teradata/Spark/Redshift
Mr. Heaton, THANK YOU. I work mostly on a MacBook Pro 2020 (Thunderbolt). What is your recommendation for an external Nvidia GPU?
If it's an M1 I think they're not compatible with eGPU.
Thank you, Sir
MacBooks don't work with Nvidia... There are tricks, but generally Apple does not support Nvidia drivers.
@@denysivanov3364 To my knowledge, they do if you install Windows on them using Boot Camp and if they have an Intel CPU...
@@habasch51 Maybe it's just a macOS thing.
Nvidia does not recommend using the RTX 30 series for DL saying it's a consumer grade card for gaming and cannot be used for DL. For DL they recommend using Tesla and Quadro series card which are much costlier with almost same amount of RAM and cuda cores. They say you can use consumer cards for DL but their performance will drop over a period of time. If this is true this might be due to form factor or other parameters of the circuit. But I'm not sure. I'd appreciate if you can explore and make a video on the feasibility of consumer cards for DL.
good luck with 14k usd a100 and 7k usd used v100.
@@denysivanov3364 haha Nvidia performance cuts in 3090 and 3080. i used 3080 and tesla K80.
I'm also looking for clarity on this. Please do mention here if anyone has an answer to this question.
@@RAVIRAJPRAJAPATmech Nvidia software-locks GeForce cards for certain tasks in order to be able to sell the pro series at stellar prices. However, for over 90% of tasks, you won't notice.
But I also want to point out that if you're going to use the GPU intensely, non-stop, always working, etc., you may want to go for pro series cards, because they are built to last longer under stress in tight, hot spaces for a long time. They are easier to stack, and you'll get better customer support in emergencies. GeForce is for consumers, who aren't keeping their PCs running high-end tasks non-stop.
Thanks for the video. As someone who's trying to get in on a budget, and with no end in sight to the GPU shortages, what do you think about picking up an old Tesla K80? They can be had for a pittance on eBay right now (under 200 bucks).
new PyTorch doesn't support k80 anymore, but you can use the older version, but it is painful.
There is a problem with choosing a RTX 3080 over a RTX 3070. The additional RAM is nice, but it comes at the cost of buswidth. To this day I don't know why the 3080 has lower buswidth than the 3070 or 3090, but I do think that this is worth considering when buying a GPU. It is one of the (if not the) main bottleneck for throughput on optimized pipelines.
Hello, great content. I am getting into data science and am aiming to go into deep learning. I have heard that cloud services like Google Colab do give nice GPU performance for deep learning. I am on a budget and can't afford an Nvidia RTX card that is actually able to do deep learning work. I am getting a good deal on a non-GPU laptop, a Lenovo ThinkPad E series gen 3 with an AMD Ryzen 7 5800U and Vega 8 onboard GPU. Will it be sufficient to learn data science and do machine learning? Thanks in advance.
Thank you for the nice explanation
Is it possible to use an eGPU over Thunderbolt 3/4 to train deep learning models on a laptop?
That is not something I've tried. Though I am thinking of trying an eGPU with my previous GPU.
@@HeatonResearch Thank you so much
So coming at this from a gamer and AI cloud engineer, I am curious. I was thinking of a setup for a 2 x 3090 NVlinked with a 3rd GPU using the RTX A6000 (this will be my main GPU for all deep/machine learning duties). My main OS will be a Linux OS but curious to know if anyone has a similar setup and what their experience is.
That would be an interesting setup, and I guess in theory that might work. I've never tried a heterogeneous multi-GPU setup
Following...I was wondering the same thing (but running on Win 10)
Steve, were you able to complete your build with 2 x 3090 NVLinked together and a standalone A6000? I am asking because I am on that pathway, currently with 2 x 3090 NVLink, and am contemplating getting an A6000 primarily for DL. Also, what motherboard would you recommend (Intel processor)? Thanks
@@evolvere9021 Not yet. I actually built most of it but have not settled on local storage. I am using the AMD Ryzen Threadripper Pro 3995WX (I require lots of cores and a high amount of RAM). The motherboard I am using is the ASUS Pro WS WRX80E-SAGE SE (extended). It required a large motherboard (and large case) to fit the 3 GPUs (2 x 3090 NVLinked and the A6000). I got it all running on a single M.2 NVMe SSD, but with all of the work I will be doing, I am looking for a hardware RAID card that can take up to 4 NVMe SSDs (Samsung Pro, 2 TB) in RAID 10. On the single drive, the setup works pretty well. I haven't really benchmarked it, but it screams. I was able to run some machine learning instances at decent speed. I will be looking to run some neural net cases on the A6000, but so far so good. The NVLinked 3090s work well, but keep in mind that I only use them for video editing and watching some 8K movie releases (not a lot so far), and the cards work fine. I put this project on hold as I am waiting on the 3090 Ti cards as well as the hardware RAID card for the NVMe drives (RAID 10). I am also waiting on getting 4 x 8K monitors; I am currently using a friend's and it's beautiful. Also, my contacts at AMD are telling me to wait another 6-12 months, as the new Threadripper Pro CPU is coming out with double the cores. I require a lot, as I am also running multiple VMs (Windows and other Linux distros) for other purposes, as well as some Docker clusters for various container-related work I do. I am still thinking about secondary storage (8 x 4 TB SATA SSDs in RAID 10). This is where I will store my VMs and container-related work. If all goes well, it will hopefully be done by the end of 2022. With the high cost, I am treading carefully here to make sure I spec out the right high-end workstation.
In gaming, SLI has basically been dead for some years. The differences between the 3090 and 3080 are marginal until the 3080 runs into memory limits, as it does more often than not with Microsoft Flight Simulator... THX! 💙
Which GPU to choose for Laptop - Quadro 5000 16GB or RTX 3080 - 16GB?
I've been considering that myself. I would go with 3080, so you have the new Ampere architecture.
What do you think about eGPUs?
They seem very interesting; however, it is not something that I have worked with.
If you get the 3090, don't forget to "copper mod" your memory. GDDR6X can run at around 110°C, and after applying copper it can go as low as 85°C or so.
Great video. Unfortunately, GPU prices in 2021 are very high :(
Can't buy a GPU card? Try Continuous Gray Code Optimization (PDF online) on a CPU cluster. Each CPU has the full neural model and part of the training set (which can be local and private to the CPU). Each CPU is sent the same short, sparse list of mutations to make and returns the cost for its part of the training set. The costs are summed, and if there is an improvement, an accept message is sent to each CPU; otherwise a reject message is sent. Very little data moves around per second, so the CPUs can be anywhere on the internet. The mutation operator is random plus or minus
a.exp(-p.rnd())
If the weights are constrained between -1 and 1 then a=2 to match the interval. The so called precision p is a problem dependent positive number. The function rnd() returns a uniform random between 0 and 1. If the mutation is too big it is simplest just to reject it.
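The accept/reject loop and mutation operator described above can be sketched on a single machine as follows. This is a minimal illustration under assumptions, not the published algorithm's reference code: the function names, the default p=10, and the number of weights mutated per step are hypothetical, and a single local `cost` function stands in for the summed costs returned by the CPUs.

```python
import math
import random


def mutation(a: float, p: float, rng: random.Random) -> float:
    """Random plus or minus a*exp(-p*rnd()), with rnd() uniform in [0, 1)."""
    sign = rng.choice((-1.0, 1.0))
    return sign * a * math.exp(-p * rng.random())


def optimize(cost, weights, steps, a=2.0, p=10.0, n_mutations=1, seed=0):
    """Keep a mutation only if the total cost improves; otherwise reject it."""
    rng = random.Random(seed)
    weights = list(weights)
    best = cost(weights)
    for _ in range(steps):
        trial = list(weights)
        in_range = True
        # Short sparse list of mutations: a few randomly chosen weights.
        for i in rng.sample(range(len(trial)), n_mutations):
            trial[i] += mutation(a, p, rng)
            if not -1.0 <= trial[i] <= 1.0:
                in_range = False  # mutation too big: simplest to reject it
        if not in_range:
            continue
        trial_cost = cost(trial)
        if trial_cost < best:  # "accept" message
            weights, best = trial, trial_cost
    return weights, best


# Toy cost standing in for a neural network loss: distance from 0.5.
w, c = optimize(lambda ws: sum((x - 0.5) ** 2 for x in ws), [0.0] * 3, steps=200)
print(w, c)
```

In the distributed version described in the comment, only the mutation list and the accept/reject decision would travel over the network, while each CPU evaluates `cost` on its own slice of the training set.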
Dude, ReLU is a switch. f(x)=x connect, f(x)=0 disconnect. The dot product of a number of dot products is still a dot product. Go figure things out from there.
Would you choose a NVIDIA Jetson over say a cheapie RTX 3060ti if you're just a student?
Maybe; the thought really has crossed my mind. However, I am not really familiar with Jetson and have never owned one.
Looking at the price, it only makes sense to buy an expensive graphics card if you can earn the money back by working with it, because the depreciation is too high. Take the TITAN RTX: it has 24GB of memory and 4608 CUDA cores, cost $2499, and was released in 2018. Two years later the RTX 3090 arrived with the same amount of memory, more than twice the CUDA cores (10496), and a $1499 price. So it is too much money to spend if you don't get that return. Cards could be cheaper, which would grow the market, make deep learning environments more accessible, and let people change their graphics cards more often. But that depends on Nvidia.
In summary: buy an RTX 3090 only if you will earn the money back, or if you have money to burn.
I think of it this way: will my cloud compute costs exceed the price of a GPU? If not, then just do cloud. If yes, just get the GPU. You own it, not rent it. You could sell it used later. With cloud, you could spend thousands and have no equipment afterwards. I tend to keep computers for many, many years, so it's not like I buy a new one every year.
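The cloud-versus-purchase question above comes down to a break-even point, sketched here under simple assumptions: a flat hourly cloud rate, an optional resale value for the card, and no electricity or other running costs. The hourly rate and resale value below are hypothetical placeholders.

```python
def break_even_hours(gpu_price: float, cloud_rate_per_hour: float,
                     resale_value: float = 0.0) -> float:
    """Hours of cloud GPU time that cost as much as buying (and later reselling) a card."""
    if cloud_rate_per_hour <= 0:
        raise ValueError("cloud rate must be positive")
    return max(gpu_price - resale_value, 0.0) / cloud_rate_per_hour


# $1499 card, assumed $1.00/hour cloud rate, assumed $500 resale value.
print(break_even_hours(1499.0, 1.0, resale_value=500.0))
```

If you expect to train for fewer hours than the result, cloud is cheaper under these assumptions; beyond it, owning the card wins.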
totally worth it
Amusing and sad at the same time, watching this at the end of April. These prices, lol.
They dropped 😁
@@hamzaabdikadir2657 Yus! Let me tell all my dogecoin friends! Been waiting to get into Cryptomining!!! XD
Thanks for taking the trouble to make this
no mention of AMD's SSG
I have been looking at your videos to learn about Nvidia GPUs. I recently upgraded from a Quadro P2000 and Quadro M1200 to a 3090. It is a serious jump. Backwards, it seems. It may be because I use remote desktop and I use MATLAB for deep learning, and I can't take advantage of parallelization as I use large LSTMs. Anyway, the new system is slower, sometimes half as fast in training. Did I make a mistake by not buying an A5000? (I had originally ordered an A5000, but gave up on getting it delivered.) Can I recover if I go to the office? I'd love to have your expert advice or that of any Nvidia contacts, please. Thanks in advance.
How do I make money from this?
Thx for the info. I am just starting my ML/Deep Learning and just recently ordered a notebook with RTX 3050 Ti due to my limited budget. I guess with this RTX 3050 I am pretty limited on doing DL. bummer!
why do you say so? is it due to the vram memory? then what do you think about my gtx1650?
Man, I wish I could afford an A100 rn
In my test the 1080ti is 5 times FASTER than the 3080. 64sec per epoch 1080ti, 324sec per epoch 3080. The same RNN model, the same code and data. It seems like software (Conda, TensorFlow) is lagging badly. I configured the system following this video guide and left a comment there as well th-cam.com/video/nMmCu3Nm1xA/w-d-xo.html
Came to the same conclusion as you to get 2x 3090's... now... has anyone seen one???
If I were building the system (and paying for the GPUs myself), that is likely what I would get.
Only if we can find one :-(
@@sohailshaikh7454 3090's aren't so bad to find. It's the lower end cards which are impossible
Rtx 3060 ti 399$?!😭😭😭😭😭😭
MSRP...
This year, who even knows.
No NVLINK on rtx 3080 :)
Yes quite true.
Well, it's all unobtanium now, so those prices are purely hypothetical. Crypto-miners and scalpers have bought all the good cards. Desperate gamers have bought the rest.
Mastery of Colab I believe is paramount