Downgrading My GPU For More Performance

  • Published May 25, 2023
  • Checking out an older Nvidia Tesla card that can meet my needs for AI.
    ○○○ LINKS ○○○
    Nvidia Tesla M40 ► ebay.us/ED5oqB
    Nvidia Tesla P40 ► ebay.us/HWpCZO
    ○○○ SHOP ○○○
    Novaspirit Shop ► teespring.com/stores/novaspir...
    Amazon Store ► amzn.to/2AYs3dI
    ○○○ SUPPORT ○○○
    💗 Patreon ► goo.gl/xpgbzB
    ○○○ SOCIAL ○○○
    🎮 Twitch ► / novaspirit
    🎮 Pandemic Playground ► / @pandemicplayground
    ▶️ novaspirit tv ► goo.gl/uokXYr
    🎮 Novaspirit Gaming ► / @novaspiritgaming
    🐤 Twitter ► / novaspirittech
    👾 Discord chat ► / discord
    FB Group Novaspirit ► / novasspirittech
    ○○○ Send Me Stuff ○○○
    Don Hui
    PO BOX 765
    Farmingville, NY 11738
    ○○○ Music ○○○
    From Epidemic Sounds
    patreon @ / novaspirittech
    Tweet me: @ / novaspirittech
    facebook: @ / novaspirittech
    Instagram @ / novaspirittech
    DISCLAIMER: This video and description contains affiliate links, which means that if you click on one of the product links, I’ll receive a small commission.
  • Science & Technology

Comments • 104

  • @syspowertools3372
    @syspowertools3372 3 months ago +7

    I picked one up on eBay for $45 shipped. I also had an FTW 980 Ti cooler lying around. As long as the cooler fits the stock PCB of any 970-to-Titan X card, you can just swap it. You may need to cut out or re-solder the 12V power connector in the other orientation though; in my case I moved it from the back to the top. I also thermal-glued heatsinks onto the backplate, because not being in a server case means the VRAM gets warm.

    • @yungdaggerdikkk
      @yungdaggerdikkk 2 months ago

      Holy moly bro, $45? Any link or tips for getting one that cheap? Ty, and hope you enjoy it x)

    • @joshuachiriboga5305
      @joshuachiriboga5305 2 months ago

      @yungdaggerdikkk Newegg has them at about that price.

    • @joshuachiriboga5305
      @joshuachiriboga5305 2 months ago

      Running Stable Diffusion, does it run out of VRAM at 12 GB or at 24 GB?
      The tech docs claim the card is two systems, each with its own CUDA cores, VRAM, etc...

  • @KomradeMikhail
    @KomradeMikhail 1 year ago +16

    SD, GPT, and other AI apps are _still_ not taking advantage of Tensor cores...
    Literally what they were invented for.

    • @gardenerofthesun
      @gardenerofthesun 4 months ago +1

      As far as I know, llama.cpp can use Tensor cores.
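
      In case it helps anyone trying this, llama-cpp-python exposes the GPU offload knob directly. A minimal sketch (not from the video): the model path is a placeholder, and whether Tensor cores actually get used depends on how the library was built (e.g. with CUDA/cuBLAS enabled).

      ```python
      # Minimal llama-cpp-python sketch; assumes a CUDA-enabled build and a
      # local GGUF file (the path below is a placeholder, not a real download).
      from llama_cpp import Llama

      llm = Llama(
          model_path="./models/llama-13b.Q4_K_M.gguf",  # hypothetical file
          n_gpu_layers=-1,  # offload all layers to the GPU
          n_ctx=2048,       # context window size
      )

      out = llm("Q: Why do Tesla cards need extra cooling? A:", max_tokens=64)
      print(out["choices"][0]["text"])
      ```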

  • @gregorengelter1165
    @gregorengelter1165 1 year ago +8

    I also got myself an M40 a few months ago, but cooling it with air is not really a good solution in my opinion. I was lucky enough to get a Titan X (Maxwell) water block from EK for €40 / ~$44. With it, the card runs perfectly and tops out at 60 °C / 140 °F under full load.
    If you are not so lucky, I would still recommend one of those AIO CPU-to-GPU adapters (e.g. from NZXT).
    Air coolers are comparatively huge and extremely loud (most of the time).

  • @simpernchong
    @simpernchong 11 months ago

    Great video. Thanks!!

  • @StitchTheOtter
    @StitchTheOtter 1 year ago +7

    I did get myself a P40 for €170. RTX 2080-class gaming performance and 24 GB of GDDR5 at 694.3 GB/s. Stable Diffusion on my 2080 runs around 5-10x faster than on the P40, but it would make a good price/performance cloud-gaming GPU.

  • @KiraSlith
    @KiraSlith 1 year ago +16

    I'm using a trio of P40s in my headless Z840, kinda risking running into the PSU's power limit, but there's nothing like having a nearly real-time conversation with a 13B or 30B parameter model like Meta's LLaMA.

    • @jaffmoney1219
      @jaffmoney1219 11 months ago +1

      I am looking into buying a Z840 too. How are you able to keep the P40s cool enough?

    • @KiraSlith
      @KiraSlith 11 months ago +3

      @jaffmoney1219 Air ducting and cranking the PCIe zone intakes to 100%. If you buy the HP-branded P40s, supposedly their BIOS will tell the motherboard to ramp the fans automatically. I'm using a pair supposedly from PNY, so I don't know.

    • @strikerstrikerson8570
      @strikerstrikerson8570 11 months ago +2

      @KiraSlith Hello! Can you make a short video on how it works for you, on both the hardware side and the language-model side (such as LLaMA)?
      If you can't or don't want to make a video, could you briefly describe your hardware configuration here, and what is best to buy for this?
      I'm looking at the old LGA 2011-v3 platform with an 18-22 core CPU, a gaming motherboard from ASUS or ASRock, and 128/256 GB of DDR4 ECC RAM. At first I wanted to buy a modern RTX 30xx/40xx card, but then I came across the Tesla server accelerators, which have large amounts of VRAM (16/24/32 GB)
      and go for about 150/250/400 euros here.
      Unfortunately there is little information, and in the videos you come across on YouTube people run Stable Diffusion, which gives very deplorable results even on a Tesla V100, which an RTX 3060 outperforms.
      Thanks in advance!

    • @KiraSlith
      @KiraSlith 11 months ago

      @strikerstrikerson8570 Sure, when it comes down for maintenance next; it's currently training a model. If you want new cards only and don't have a fat wallet to spend from, you're stuck with consumer cards either way. Otherwise, what you want depends entirely on what your primary goal is. Apologies in advance for the sizable wall of text you're about to read, but it's necessary to understand how to actually pick a card.
      I'll start by breaking it down by task demand:
      - Image recognition and voice synthesis models want fast CUDA cores but still benefit from higher core counts, and the larger the input or output, the more VRAM they need.
      - Image generation and voice recognition models also want fast CUDA cores, but their VRAM demands expand exponentially faster.
      - LLMs want enough VRAM to fit the whole model uncompressed, plus lots of CUDA cores. They aren't as affected by core speed, but still benefit from it.
      - Model training always requires lots of VRAM and CUDA cores to complete in a reasonable amount of time; it doesn't really matter what the model you're training does.
      Some models bottleneck harder than others (though the harshest bottleneck is always VRAM capacity; see the sketch after this list), but ALL CUDA-compute-capable GPUs (basically anything made after 2016) can run all models to some degree. So I'll break it down by degree of capability, within the same generation and product tier.
      - Tesla cards have the most CUDA cores and VRAM, but have the slowest cores and require your own high-CFM cooling solution to keep them from roasting themselves to death. They're reliably the 2nd-cheapest option for their performance used, and the only really "good" option for training models.
      - Tesla 100 variants trade VRAM capacity for faster HBM2 memory, but don't benefit much from that faster memory outside enterprise environments with remote storage. They're usually the 2nd most expensive card in spite of that.
      - Quadro cards strike a solid balance between Tesla and consumer: fewer CUDA cores than Tesla but more than consumer, faster CUDA cores than Tesla but slower than consumer, more VRAM than consumer but usually less than Tesla. Thanks to "RTX Experience" providing solid gaming on these cards too, they're the true "jack of all trades" option and appropriately end up with a used price right in the middle.
      - Quadro "G" variants (e.g. GP100) trade their VRAM advantage over consumer cards for HBM2 VRAM at absurd clock speeds, giving them a unique advantage in image generation (and video editing). They're also reliably the most expensive card in their tier.
      - Consumer cards are the best used option for the price if you want bulk image generation, voice synthesis, or voice recognition. They're slow with LLMs, and if you try to feed them a particularly big model (30B or more), they'll bottleneck even more harshly on their lacking VRAM (be it capacity or speed), with the potential to bottleneck even further paging out to significantly slower system RAM.
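
      To put a rough number on that VRAM rule, the arithmetic is just parameters × bytes per parameter, plus some working overhead. A minimal sketch (the ~20% overhead factor is my own rough assumption, not a measured value):

      ```python
      # Back-of-envelope VRAM estimate: params * bytes-per-param, plus a rough
      # ~20% allowance for activations / KV cache (an assumed fudge factor).
      def vram_needed_gb(params_b: float, bytes_per_param: float, overhead: float = 1.2) -> float:
          return params_b * bytes_per_param * overhead

      for params_b in (7, 13, 30):
          fp16 = vram_needed_gb(params_b, 2.0)   # fp16: 2 bytes per parameter
          int4 = vram_needed_gb(params_b, 0.5)   # 4-bit quant: 0.5 bytes per parameter
          print(f"{params_b}B model: ~{fp16:.0f} GB fp16, ~{int4:.0f} GB 4-bit")
      ```

      By that estimate, a 30B model only fits a 24 GB card like the P40 once quantized, which matches why these big-VRAM Teslas are popular for LLM work.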

    • @og_tokyo
      @og_tokyo 7 months ago

      Stuffed a Z440 mobo into a 3U case; will be putting 2x P40s in here shortly.

  • @joo9125
    @joo9125 1 year ago +46

    Turing, not TURNing lol

    • @fffUUUUUU
      @fffUUUUUU 1 year ago +3

      He's a Pro, don't tell him he's wrong 😂

    • @igyysdaddy191
      @igyysdaddy191 2 months ago +1

      you just turinged him on

    • @subsubl
      @subsubl 2 months ago

      😂

  • @zilog1
    @zilog1 10 months ago +4

    They are going for $50 currently. Get a server rack and fill it up!

  • @schifferu
    @schifferu 11 months ago +1

    Got my Tesla M40 a while back, and now have a fan cooler on it (an EVGA SC GTX 980 Ti cooler) to mess around with, but just look at that power consumption 😅😅

  • @timomustamaki5407
    @timomustamaki5407 1 year ago +3

    I have been planning this move as well, as the M40 is dirt cheap on eBay. But I worry about one thing you did not touch on in this video (or at least I did not notice if you did): how did you solve the power cabling issue? I believe the M40 does not take a regular PCIe GPU power cable but needs something different, an 8-pin CPU cable?

    • @KiraSlith
      @KiraSlith 1 year ago +1

      That's right, the Tesla M40 and P40 use an EPS (aka "8-pin CPU") cable, which can thankfully be resolved with an adapter cable. Just a note: the 6-pin-PCIe-to-8-pin-EPS cables some Chinese sellers offer should ONLY be used with a dedicated cable run from the PSU, to avoid cable meltdowns! Thankfully this isn't an issue if you're using an HP Z840 (which also conveniently solves the airflow issue) or a custom modular PSU with plenty of PCIe power connections, but it can quickly become an issue for something like a Dell T7920.

  • @SpottedHares
    @SpottedHares 7 months ago +2

    According to Nvidia's own specs, the M40 uses the same board as the Titan X and the 900 series, so theoretically any cooling system that works for either of those two should also work on the M40.

  • @edgecrush3r
    @edgecrush3r 1 year ago +3

    I purchased a Tesla P4 some weeks ago and am having a blast with it. The low-profile card even fits in the QNAP 472XT chassis. Passthrough works fine (minor tweaks). Currently compiling a kernel to get vGPU support (if I ever succeed).

  • @vap0rtranz
    @vap0rtranz 1 month ago +1

    Great explanation. It's basically gamers vs. AI hackers. The AI models want to fit into VRAM but are huge, so the 8 GB or 12 GB VRAM cards can't run them, and a new GPU with huge VRAM is hella expensive right now. So an older card with lots of VRAM works. Also, gamers tend to overclock/overheat, but the Tesla and Quadro cards are usually datacenter liquidations, so there's less risk of getting a fried GPU. BTW: the P40 is the newer version of the M40.

  • @KratomSyndicate
    @KratomSyndicate 1 year ago +6

    I just bought an RTX 4090 last night, plus all the parts for a new desktop (i9-13900K, MSI MEG Z790, 128 GB DDR5, 4 Samsung 990 Pros), just to do SD and AI. Maybe overkill.

    • @Mark300win
      @Mark300win 6 months ago +1

      Dude you’re loaded 😁$

    • @sa_med
      @sa_med 4 months ago +1

      Definitely not overkill if it's for professional use.

  • @madman1397
    @madman1397 9 months ago +3

    Tesla P40 24 GB cards are on eBay for sub-$200 now. Considering one for my server.

  • @beholder4465
    @beholder4465 10 months ago +2

    I have an ASUS H410 (HDV M.2) board with an Intel chipset; is compatibility good with the Tesla M40?
    Ty.

  • @carlosmiguelpimientatovar8458
    @carlosmiguelpimientatovar8458 5 months ago

    Excellent video.
    In my case I have a workstation with an MSI X99A TOMAHAWK motherboard and an Intel Xeon E5-2699 v3 processor (and I currently use 3 monitors). For this I installed an AMD FirePro W7100 GPU, which works very well for me in SolidWorks.
    The RAM is 32 gigabytes of non-ECC.
    The problem is that I am learning to use ANSYS, and this software is married to Nvidia. Looking at the ANSYS GPU compatibility lists for calculation acceleration, I see that the K80 is supported, and given the second-hand price, I am interested in purchasing one.
    How can I configure my system to install an Nvidia Tesla K80 while the AMD GPU keeps driving my monitors as it currently does? The Nvidia K80 has 24 GB of RAM; can this be affected when using it in conjunction with the AMD GPU, which only has 8 GB? Would the K80 be restricted to the RAM of the FirePro W7100?
    My PSU is 700 watts.
    Thank you.

  • @DanRegalia
    @DanRegalia 8 months ago +3

    So, I picked up a P40 after watching this video... Thanks! Do you have any videos that talk about loading these LLMs, or whether I should go with Linux/Windows/etc., or maybe install JetPack from the Nvidia downloads? I've screwed around a little with Hugging Face, and that made me want to get the card to run better models, but rabbit hole after rabbit hole, I'm questioning my original strategy.

    • @NovaspiritTech
      @NovaspiritTech 8 months ago +3

      I'm glad you were able to pick up a P40 and not the M40, since the Pascal arch can run the 4-bit modes most LLMs use now. LLMs change so rapidly I can't even keep up myself, but I have been running the Docker container from github.com/Atinoda/text-generation-webui-docker. But yes, this is a deep rabbit hole; I feel your pain.
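
      For anyone following along, a hedged sketch of what launching that container might look like; the image name, tag, port, and volume path here are assumptions based on common defaults, so check the repo's README for the real values.

      ```python
      # Start the text-generation-webui container (assumed image name/tag;
      # 7860 is the usual Gradio port). Equivalent to a one-line `docker run`.
      import subprocess

      subprocess.run([
          "docker", "run", "-d",
          "--gpus", "all",               # pass the Tesla through to the container
          "-p", "7860:7860",             # expose the web UI
          "-v", "./models:/app/models",  # keep downloaded models on the host
          "atinoda/text-generation-webui:default",  # assumed image:tag
      ], check=True)
      ```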

    • @vap0rtranz
      @vap0rtranz 1 month ago

      The easiest out-of-box apps for running local LLMs are GPT4All and AnythingLLM. Hugging Face requires lots of hugging to not sink into rabbit holes :) The apps I mention keep things simple, and both have active Discord channels that are helpful too.
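
      As a taste of how little code GPT4All needs, here is a minimal sketch; the model filename is just an example from the GPT4All catalog and is downloaded on first use.

      ```python
      # Minimal GPT4All sketch (pip install gpt4all); runs a local model with
      # no server setup. The model name below is an example, not a requirement.
      from gpt4all import GPT4All

      model = GPT4All("mistral-7b-instruct-v0.1.Q4_0.gguf")
      with model.chat_session():
          print(model.generate("What GPU do I need to run you locally?", max_tokens=128))
      ```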

  • @FlexibleToast
    @FlexibleToast 11 months ago +3

    You say you need a newer motherboard to use the P40. Does any motherboard with a PCIe 3.0 x16 slot work?

    • @k-osmonaut8807
      @k-osmonaut8807 9 months ago +2

      Yes, as long as it supports Above 4G Decoding.

  • @seanoneill9130
    @seanoneill9130 1 year ago +9

    Home Depot has free delivery.

    • @NovaspiritTech
      @NovaspiritTech 1 year ago +4

      😂

    • @garthkey
      @garthkey 1 year ago +4

      With their choice of the worst wood? No thanks.

  • @TheRainbowdashy
    @TheRainbowdashy 3 months ago

    How does the P40 perform for video editing and 3D design programs like Blender?

  • @charleswofford5515
    @charleswofford5515 10 months ago +3

    For anyone wanting to do this: I found the best cooling solution is a Zotac GTX 980 AMP! Edition 4 GB model. It has the exact same footprint, the circuit board is nearly identical, and it bolts right on with very little modification. You will need to use parts from both the Tesla and the Zotac GPU to make it work. Been running mine for a while now without issue.

  • @bopal93
    @bopal93 5 months ago

    What's the idle power consumption of the M40? I'm thinking of using it in my server but can't find details on the internet. Thanks.

  • @fuba44
    @fuba44 1 year ago +2

    But wait, I was under the impression that both the M40 and the P40 are dual-GPU cards, so the 24 GB of VRAM is split between the two GPUs. Or am I mistaken? When I look up the specs it looks like only 12 GB per GPU.

    • @unicronbot
      @unicronbot 11 months ago +4

      The M40 and P40 are single-GPU cards.

    • @yb801
      @yb801 9 months ago +2

      I think you are talking about the K80 GPU.

  • @MWcrazyhorse
    @MWcrazyhorse 8 months ago

    How does this compare to an RTX A2000?

  • @Robstercraw
    @Robstercraw 11 months ago +1

    You can't just plug that card in and go; there are driver issues. Did you get it working?

    • @gardenerofthesun
      @gardenerofthesun 4 months ago +1

      Owner of a P40 and a 3090 in the same PC here.
      No problems whatsoever; just install the Studio driver.

  • @joshuachiriboga5305
    @joshuachiriboga5305 2 months ago

    The Tesla K80, with 24 GB of VRAM, claims a setup of 2 systems, each with its own CUDA cores and VRAM. When running Stable Diffusion, does it behave as one GPU with 24 GB or as two? Does it run out of VRAM at 12 GB or at 24 GB during image production?

    • @truehighs7845
      @truehighs7845 2 months ago

      That's exactly my question.

  • @cultureshock5000
    @cultureshock5000 9 months ago

    Is the 8 GB low-profile card good for my SFF Dell? I like my RX 550, but I could play a lot more stuff; I bet I could play Starfield at 1080p on low on the 8 GB card... is it worth the 90 bucks?

  • @sergiodeplata
    @sergiodeplata 1 year ago +4

    You can use both cards simultaneously. There will be two CUDA devices.
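
    A quick way to confirm that, assuming a PyTorch install with CUDA support (my sketch, not from the video):

    ```python
    # List every CUDA device PyTorch can see; with a Tesla plus a gaming
    # card installed you should get two entries, each usable independently.
    import torch

    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"cuda:{i}: {props.name}, {props.total_memory / 1024**3:.1f} GB")

    # Pin work to a specific card with an explicit device string, e.g.:
    # x = torch.zeros(1024, 1024, device="cuda:1")
    ```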

  • @davidburgess2673
    @davidburgess2673 6 months ago

    What about HBCC on a Vega 64 for an "unlimited" boost in RAM? Albeit a little slower, but with video out etc.

  • @alignedfibers
    @alignedfibers 10 months ago +3

    I went with the K80, but Stable Diffusion only runs with torch 1.12 and CUDA 11.3, and right now it only runs on 12 GB, half the memory and half the GPU, because the K80 is a dual-GPU card. The M40 should allow a modern CUDA version and Nvidia driver, with none of the workarounds the K80 needs to access its full 24 GB.

    • @joshuachiriboga5305
      @joshuachiriboga5305 2 months ago

      Thank you, I have been looking for this info.

    • @truehighs7845
      @truehighs7845 2 months ago

      Does it use the whole 24 GB of VRAM? Since it's basically multiple GPUs put together, does the VRAM work as one?
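
      One way to check both points yourself, assuming PyTorch is installed with CUDA support (a sketch, not from the thread): the K80 should enumerate as two separate ~12 GB devices, i.e. the VRAM does not pool into one 24 GB space.

      ```python
      # Verify the torch/CUDA pairing and how the K80 enumerates: expect
      # two distinct ~12 GB devices rather than one 24 GB device.
      import torch

      print(torch.__version__, torch.version.cuda)  # e.g. 1.12.x / 11.3 on a K80
      for i in range(torch.cuda.device_count()):
          p = torch.cuda.get_device_properties(i)
          print(f"cuda:{i}: {p.name}, {p.total_memory / 1024**3:.0f} GB")
      ```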

  • @alignedfibers
    @alignedfibers 10 months ago

    M40?

  • @bulcub
    @bulcub 11 months ago

    I have a server that I'm going to repurpose as a video renderer with a multiple-drive storage bay (24 bays). I wanted to know if this is possible? Would I need Proxmox, etc.? Would the P40 model be sufficient?

    • @NovaspiritTech
      @NovaspiritTech 11 months ago +1

      I have a video on this topic using Tdarr.

  • @joshuascholar3220
    @joshuascholar3220 4 months ago

    I'm about to try it with a 32 GB Radeon Instinct MI50.

  • @jerry5566
    @jerry5566 1 year ago +1

    The P40 is good; my only concern is that it has probably been used for mining.

    • @Antassium
      @Antassium 10 months ago +4

      Mining has been shown not to cause any more significant wear than regular duty cycles.
      In fact, in some situations a mining rig is a cleaner and safer environment than a PC case on the floor of someone's home, with toddlers sloshing their chocky milk around, for example 😂

  • @chjpiu
    @chjpiu 3 months ago

    Can you suggest a desktop workstation that can take a Tesla M40? Thank you so much.

    • @truehighs7845
      @truehighs7845 2 months ago

      Look for an HP Z840, but buy the GPU separately, because you are probably going to pay way more if it's included.

  • @R0TFEAST
    @R0TFEAST 11 months ago +1

    2:21 Actually, that Tesla card has 1,150 more CUDA cores than that 2070...
    3,072 - 1,922 = 1,150.
    The only thing I'm curious about is how well it can mine. 🤔
    If anything, why the hell wouldn't you just get a 3090 Ti? It has 10,496 CUDA cores, which puts it far beyond the Tesla in both work and gaming capabilities.
    If it's due to sheer price, I get it, but the specs are still beyond what you currently have.

    • @Antassium
      @Antassium 10 months ago

      Cost:Performance...

  • @MrHasie
    @MrHasie 1 year ago

    Now, I have Fit, what’s its comparison? 🤭

  • @zygge
    @zygge 1 year ago +3

    A PC doesn't need an HDMI output to boot. Any display interface is OK: VGA, DVI, or DP.

  • @user-pq8tn8yu1k
    @user-pq8tn8yu1k 1 year ago +2

    What is the idle power draw of that card, if it's on 24/7 in a server? Can it power down? Can't find info on that online.

    • @execration_texts
      @execration_texts 1 year ago +1

      My M40 idled at ~30 watts; the P40 is closer to 20.
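
      For anyone who wants to measure this on their own card, nvidia-smi exposes the power readout; a small polling sketch:

      ```python
      # Poll the driver's power readout every few seconds (nvidia-smi must be
      # on PATH; see `nvidia-smi --help-query-gpu` for the available fields).
      import subprocess, time

      while True:
          out = subprocess.check_output([
              "nvidia-smi",
              "--query-gpu=name,power.draw,power.limit",
              "--format=csv,noheader",
          ], text=True)
          print(out.strip())
          time.sleep(5)
      ```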

  • @tomaszmaciaszczyk2116
    @tomaszmaciaszczyk2116 2 months ago

    CUDA cores, my friend. I have this card on my table right now. Greetings from Poland.

  • @robertfontaine3650
    @robertfontaine3650 5 months ago

    That is a heck of a lot cheaper than the 3090s.

  • @gileneusz
    @gileneusz 5 months ago

    Isn't the 4090 faster?

  • @garrettnilan5609
    @garrettnilan5609 1 year ago

    Can you run a Stable Diffusion test and show us how to set it up, please?
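
    Until that video exists, a minimal smoke test with Hugging Face diffusers looks roughly like this (my sketch, not the video's setup; the model id is the usual example checkpoint, so substitute whatever you use). Note that older Maxwell cards like the M40 are slow at fp16, so this loads the default fp32 weights and simply leans on the 24 GB of VRAM.

    ```python
    # Quick Stable Diffusion smoke test (pip install diffusers transformers
    # accelerate). Generates one 512x512 image on the first CUDA device.
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    pipe = pipe.to("cuda")

    image = pipe("a server rack glowing in a dark basement",
                 num_inference_steps=25).images[0]
    image.save("test.png")
    ```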

  • @hardbrocklife
    @hardbrocklife 11 months ago +1

    So P40 > M40?

    • @b_28_vaidande_ayush93
      @b_28_vaidande_ayush93 10 months ago +2

      Yes

    • @ghardware_3034
      @ghardware_3034 10 months ago

      @b_28_vaidande_ayush93 For training or FP16 inference get the P100; it has decent FP16 performance. The P40 is horrible at that, as it was specialised for INT8 inference.

  • @FreakyDudeEx
    @FreakyDudeEx 7 months ago

    Kind of sad that the prices of these cards in my region are ridiculous... it's actually cheaper to get a second-hand RTX 3090 than a P40... and the M40 is double the price of the one in this video...

  • @idcrafter-cgi
    @idcrafter-cgi 1 year ago +6

    My 4090 takes 2 seconds to make a 512x512 image at 25 steps. It only has 24 GB of VRAM, which means I can only make roughly 2000x2000 images with no upscaling.

  • @TheRealBossman7309
    @TheRealBossman7309 1 year ago +1

    Great video 👍

  • @jameswubbolt7787
    @jameswubbolt7787 1 year ago +1

    I never knew. Thanks!

  • @akissot1402
    @akissot1402 1 year ago +1

    Finally, I will be able to fine-tune and upgrade my gynoid. BTW, the 3090 has 10,496 CUDA cores, and the cheapest one on the market is about $850 brand new.

  • @mateuslima788
    @mateuslima788 14 days ago

    You could've made an actual comparison.

  • @markconger8049
    @markconger8049 1 year ago +2

    The Ford F150 of graphics cards. Slick!

  • @112Famine
    @112Famine 1 year ago +4

    Has anyone been able to get this server graphics card to play video games? Or can you only get it to work the way you have it, running tasks? It's a "smart" card, like how cars are able to drive themselves.

    • @llortaton2834
      @llortaton2834 1 year ago +5

      All Tesla cards can play games; the problem with them is cooling, because there is no heatsink fan. You have to either buy your own 3D-printed shroud or have a server that shoots air across the chassis.

  • @trumpsextratesticle8590
    @trumpsextratesticle8590 7 months ago

    Too bad you can't slap this thing in with a gaming GPU in an SLI config and use the VRAM and computational power of the secondary card.
    MODDERS, WHERE ARE YOU!!!

  • @shlomitgueta
    @shlomitgueta 1 year ago

    I have an NVIDIA GeForce GTX 1080 Ti with 3,584 CUDA cores, and I was thinking it is so old lol.

  • @skullpoly1967
    @skullpoly1967 1 year ago

    Yay rmiddle

  • @user-gu2sh1ke8n
    @user-gu2sh1ke8n 1 month ago

    He bought a Maxwell and brags about it. If only it were at least a Pascal...

  • @MaikeLDave
    @MaikeLDave 1 year ago +3

    First!

  • @itoxic-len7289
    @itoxic-len7289 1 year ago +3

    Second!

  • @unclejeezy674
    @unclejeezy674 11 months ago

    Try --medvram or --lowvram. 24 GB should be able to get you 2048x2048 with --lowvram.
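
    Those flags belong to the AUTOMATIC1111 web UI; if you are using the diffusers library instead, the closest equivalents are attention slicing and CPU offload (a hedged sketch, trading speed for VRAM headroom):

    ```python
    # Rough diffusers analogues of --medvram / --lowvram: compute attention
    # in chunks and page submodules out to system RAM to cut peak VRAM use.
    # Requires accelerate for the offload hook (pip install accelerate).
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    pipe.enable_attention_slicing()        # chunked attention (~ --medvram)
    pipe.enable_sequential_cpu_offload()   # aggressive offload (~ --lowvram)

    image = pipe("test render", height=2048, width=2048,
                 num_inference_steps=25).images[0]
    image.save("big.png")
    ```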