1. Take the cover off the GPU.
2. Plug the little fan into a fan extension cable, and route the cable out of the cover as you reinstall it.
3. Plug the extension cable into the motherboard after you put the card back in the computer.
4. ~5 watts unlocked. 5/70 is over 7%, so you can get a nice little boost in the card's performance by freeing up that bit of the power envelope.
I read about someone doing this on the A2000, but I can't remember who it was to give credit.
5. Poop on it
Okay, I saw this and had to try it with my 4060 low-profile card. I found that the fans draw ~3.2 W at max power. Still waiting on some parts before I'm able to test in the rig, but this is definitely a great suggestion! It's also only 3.2/115 W, so less than 2%, but every little bit helps with these cards. Going to be experimenting with some overclocking as well.
4000 ADA! Literally my favorite GPU of the last few years. Glad you got one to play with finally 🙂
can't wait for the next Beer review at the end of your video 😁
Just a minor point: the application 'nvtop' is not Nvidia top. It's 'Neat Videocard Top', as it works on all GPUs. Everyone just assumes it's Nvidia top, but it's not.
nvitop is far better.
btop is my go-to, but I like seeing all these other tools in case I'm missing something by using btop.
Nvtop does not seem to work with Intel iGPU according to the app. When it detects an Intel iGPU, it will throw a warning at the start.
It's an older meme, sir, but it checks out
"kids these days just don't appreciate rubber hand puppets."
Only level 40+ Memers catch that reference
Timestamp?
He never addresses whether or not he actually pooped on it. The people deserve to know.
That's what she said.
RTX 4000 ADA + Minisforum BD790i = the dreamiest ITX build combination.
I'd love to see what that combination would do.
@@heirtothethrone000 it's so tempting. If I wasn't on assignment in Tokyo for the next year I'd build it this summer. Computer parts are so hard to acquire here, and even when they are available the markup is insane.
^^^Current dream build. And a Noctua passive cooler.
Note that NVTop stands for "Neat Video TOP", not "NVidia TOP"
It works on AMD and NVIDIA, and recently got support for Intel!
@@Real-Name..Maqavoy Those two monopolies are exactly why AMD and ATI were allowed to merge
Aha! I was wondering about that. I only knew about nvidia-smi and thought it was the only option from Nvidia.
That's a relatively recent development. IIRC, AMD support only came late last year. th-cam.com/video/ehm2hmk_p3E/w-d-xo.htmlfeature=shared The 1.0 release was a top for Nvidia GPUs only. But I'm glad it's evolved into 'Neat Videocard' with support for more than NV.
7:16 Props to Wendell for calling out the incredible danger of curl piped into SH. Did you know, for example, that a server can often tell via side-channels if you just curl a command, or if you curl it into a shell? It's very possible to create a downloadable URL that delivers one script if you just curl to stdout/a file, and another if you curl pipe bash. Always download shell scripts, review them, *then* run them!
The easiest example would be the user agent header.
Yeah, I don't like it when websites give a curl-to-sh pipe as the primary install method.
@@fgregerfeaxcwfeffece Ah, almost! The user agent can tell you that somebody is using curl, but not whether they are looking at the output! What I'm talking about here are things like: if you pipe curl into a shell, the shell accepts input in a very specific pattern, which can be detected. It's not about detecting that you're using curl, it's about detecting the difference between `curl whatever | sh` and `curl whatever > output.sh`. One should never pipe curl to a shell, even if you've verified the script before, since the script you have reviewed is not necessarily the script being run! Even something like `set -x` can be trivially worked around (e.g. with ANSI escape codes). Using `curl | sh` is literally giving the remote webserver complete shell access, and it knows whether you're looking at the script or not!
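For anyone who wants the safe habit spelled out, here's a minimal sketch (the URL is a placeholder):

```sh
# Download first, so the server sees the same access pattern either way,
# then review the exact bytes you're about to execute.
curl -fsSL https://example.com/install.sh -o install.sh
less install.sh        # read it; watch for obfuscation, encoded blobs, ANSI escapes
sha256sum install.sh   # optionally compare against a published checksum
sh ./install.sh        # only after review
```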
I am willing to bet my eyeballs that barely anyone ever reviews what the Homebrew setup script really does. I'd argue most users couldn't even tell. And that's just one example.
@@Maxjoker98 If you are talking about timing, I would say: yes, absolutely possible.
And also: is there a known case of that having happened in the wild? Most attacks are boringly low-skill, low-effort. A custom webserver looking for timings seems way above the average YouTube paygrade.
Of course that doesn't mean people should be careless; I am just curious.
I don't understand half of what Wendell is saying... but my god, it's well-spoken, deep tech talk, and I'm a simple guy... I see a new Wendell video, I click on it, I watch it.
That m.2 being held in by the pressure of the gpu made me chuckle :)
More like cringe.
Got an A2000 16GB on the way for my transcoding galore; can't wait to tinker with it. Wendell dropping appropriately timed videos, as usual.
There's a big difference between the A2000 and the A2000 Ada, btw.
@@TazzSmk There is no 16GB A2000 except for the Ada gen :)
@@SuperMari026 So it's a franken card with the mobile GPU? I've got the 12GB desktop variant.
@@linuxpirate Pretty much. I got it for €600, which is similarly priced to the non-Ada 12GB variant.
@@TazzSmk Which is why one is called "RTX A2000" and the other is called "RTX 2000 Ada."
For years I have asked about better graphics cards for imaging, design, 3D, etc. I use a lot of apps, and I do things that aren’t normally done with graphics. Whenever I ask around on forums, I’m either flamed or given vague answers, always boiling down to “just get a gaming GPU”.
THIS video touches on some of the stuff I’ve hoped someone, somewhere, would cover.
I love my macs, but I’ve wanted to build a powerful machine so I can do more iterations, of higher quality, as fast as I can think of them. Fitting in SFF and not using a ton of wattage would be great too.
Thank you! More content like this please!
I have the RTX A2000 12GB for similar purpose. It’s less performant in basically every regard but also much cheaper and can easily chill in a 2U chassis :)
Another interesting aspect of this lower power usage is that you could install several of them into a workstation without having to seek out specialist power supplies. Four of these would come to 80 gigs of VRAM at just 280 watts. Too bad they're so expensive.
As Wendell mentioned though, in Quadro land this is a bargain.
RTX 6000, when it launched, I think was on the order of like $10k USD EACH.
Something like that.
And if my memory serves, the original DGX-1 system, which had eight GPUs, was priced at $150k USD.
So, this is 1% of that price.
Per card, it might have 1/3 the compute resources, but if you buy enough of these, the efficiency is significantly higher than the bigger (and more powerful/power-hungry) cousins.
So unless your model NEEDS 80 GB of VRAM on a single card, having a BUNCH of 20 GB cards would be absolutely AWESOME.
You'll run out of PCIe slots (being that the card is double slot width) way faster than you'll run out of anything else.
Get one of the 4U or 5U 8-10 double-slot GPU server chassis and just drop a whole bunch of these in.
Would be a killer system.
Is there enough power budget to do that? To run multiple of these w/o power connectors? I have no idea what total power the board might have available for each slot.
@@chrisspellman5952
"Is there enough power budget to do that? To run multiple of theses w/o power connectors? I have no idea what total power the board might have available for each slot."
Depends on the motherboard, but I mean, I would imagine that if a Threadripper motherboard can have 7 PCIe x16 (physical) and a mix of x16 and x8 (electrical) slots, presumably, if they build it, then the board needs to be able to supply that much power.
Plus the fact that a Threadripper, for example, can pull 300-400 W via the CPU socket, so I would imagine that a motherboard is already designed to be able to supply this much power to the things that are plugged into it.
@@chrisspellman5952 Motherboards are designed to supply 75 watts to every PHYSICAL x16 slot and 25 watts to every PHYSICAL x1/2/4/8 slot.
If a motherboard has >3 x16 physical slots, they have a PEG connector or other supplementary input for additional wattage.
@chrisspellman5952 Chris replied and is nearly right; to clarify a little: WS motherboards tend to have additional PCIe power inputs on the board - this is the type of board that will power multiple cards with no issue. A consumer motherboard will not work with 8 of these (even if you had the lanes, they do not provide enough power).
I help contribute to open-webui, and we recently refactored the Dockerfile and added a feature to include Ollama via build args, plus building the container for GPU-accelerated embedding. So you don't have to install Ollama in a separate container, unless you require that.
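A hedged sketch of what that looks like, treating USE_OLLAMA/USE_CUDA as my recollection of the build-arg names (check the repo's current Dockerfile before relying on them):

```sh
# Build open-webui with Ollama bundled and CUDA-accelerated embedding.
git clone https://github.com/open-webui/open-webui.git
cd open-webui
docker build --build-arg USE_OLLAMA=true --build-arg USE_CUDA=true -t open-webui:cuda-ollama .
docker run -d --gpus=all -p 3000:8080 -v open-webui:/app/backend/data open-webui:cuda-ollama
```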
That’s handy, thanks!
excellent revelatory comment
I can only afford the RTX 2000 Ada. The RTX 4000 Ada seems to be a pretty good option. Good video!
I own one for a small PC gaming build; it's more powerful than my 2080 Ti. It's amazing.
And with DLSS3
wut really
Awesome video as always, but you are just teasing us with $1500 hardware. For the next video, please do an Arc 770 video showing how to enable SR-IOV!
same thoughts
Yesssa
That was mentioned like 7 months ago. He never made a video about it?!
Damn, that's a pretty potent little professional-class GPU. It might be the same price on paper as a 4090, but you can pretty much guarantee it will work with any recent server or workstation without any concerns about new components like a PSU, generating a crazy amount of heat, or simply fitting in your existing systems/taking up multiple PCIe slots or risers.
So in that sense it actually makes a lot of sense over something like a 4090.
I can't wait to pick this up in a few years for my Plex box.
This should be a $500 product.
Woah! Super interesting card!
Since it's a card with such a focus on power, I would've loved to also see some numbers on its idle consumption (e.g. for always-on home systems).
Still such an amazing video! Keep up the great work @ level1techs!
Factor in the external watercooling I need for my dual 4090s, and I'm really regretting not having had this card as an option when I built :D
Lol do you really NEED the water-cooling? 😂
@clutchboi4038 indeed, not enough room on the motherboard (h13) for dual stock coolers
in 2 years this will be an awesome deal
20:22 -- no NVLink, no vGPU... imagine if it had that, too?! Level1Techs has been on fire 🔥 lately. Incredible content 👍😎👏
If you had a ton of surveillance cameras (a mall?), this would make for a great real-time tensor object-detection tool.
I don't understand any of the stuff going on but I believe Wendell when he says it's cool and awesome
Lol
Time to get two for a home server. One for Plex, another for everything else.
Virtualize Plex... or run it in Docker... then just 2 homelab servers, 1 for backup.
@@rezenclowd3 TrueNAS exists and uses graphics cards in the same fashion.
If you run into a W6800 (an older card now, but similar price) or, hopefully, one of the W7000-series cards, it'd be interesting to see how the value proposition works out for those playing with ROCm.
This card potentially has a huge market for enterprises that want to upgrade their non-GPU servers to add AI inference support. There's tons of Dell/HPE/etc servers out there that have 75W PCIe slots, but no PSU/PCIe-12V-connector for higher-power GPU (gaming or otherwise). And upgrading a rack of 20x 2U servers only adds 1400W, which you probably wouldn't even need to upgrade PDUs or add cooling to support.
I was really looking at possibly a pair of these to build a personal GeForce Now + virtual AI workstation, but at $1,250 each it's a bit much for me to spend on my homelab. Then I considered a pair of the A2000 12GB, but for a price between them you can get a pair of 4060 Ti 16GB cards. At the moment I'm tempted to pull the trigger on the 4060 Ti route next payday, unless I run into a reason to go another direction.
Just get the 4060 Ti... it's a gem of a card. All of YouTube and Reddit shits on it, but that card can do wonders. I use mine undervolted and it consumes ~90-100 watts max.
Looks pretty cool. I've been looking for a new GPU for machine learning, but other than some Tesla P100s on eBay I've not found much. I want about 16 GB or more of VRAM (preferably HBM2).
$1,250 not cheap !!!!!
Definitely not cheap,
but I still can't buy a 4090 (it will be a last-gen card in less than a year).
It's kinda disturbing seeing this kind of price on non-Quadro GPUs right now.
Compared to a $2k RTX 4090, it is!!!
In Quadro land where the RTX 6000 launched at $10k USD, it is cheap.
And the original DGX-1 system that Nvidia made, which had eight GPUs in it, was priced at $150k USD.
So this is 1% of that price.
To that end, this is very cheap, relatively speaking.
An A580 is overkill for plex.
YOU DON'T NEED HIGH-END HARDWARE FOR YOUR HOME SERVER!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@@Splarkszter
"YOU DON'T NEED HIGH-END HARDWARE FOR YOUR HOME SERVER!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
I find it amusing what you think people do or don't need, for their home server.
(This ASSUMES that you know what people are using/running on their said home server.)
Meanwhile, I'm trying to get SR-IOV up and running on my Mellanox ConnectX-4 dual port 100 Gbps Infiniband network card, so that I can containerise my HPC applications because IBM decided to kill CentOS.
(One CFD application is certified to migrate to Rocky Linux 9.3, one still needs at least CentOS 7.9.2009, and another uses Ubuntu 22.04 LTS.)
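For anyone attempting the same, the SR-IOV half is mostly firmware config plus a sysfs write; a hedged sketch, with the device path and VF count as examples:

```sh
# Enable SR-IOV in the ConnectX-4 firmware (needs the Mellanox MFT tools), then
# expose virtual functions through sysfs after a reboot. Paths/counts are examples.
sudo mst start
sudo mlxconfig -d /dev/mst/mt4115_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=8
sudo reboot
# after reboot:
echo 8 | sudo tee /sys/class/infiniband/mlx5_0/device/sriov_numvfs
```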
Question: How does the AMD W7900 (48GB) compare to the Nvidia RTX 6000 Ada (48GB) for a similar "Swerk-station" local AI setup?
twerkstation
At $2,269 AUD it sits between the price of the RTX 4080 and RTX 4090, while the RTX 4090 outperforms the RTX 4000 SFF Ada Generation by an impressive 88%.
I was very intrigued by this card, but then I saw the price, and I IMMEDIATELY lost all interest. Fuck you, Nvidia. I get that there's a premium on Quadro cards but give me a fucking break. Also, while I'm talking about this, bring back the Quadro branding already. Your idea to axe it was dumb as hell.
One consumes 70W, the other 600W, the math checks out.
You have to consider performance per watt. This thing runs off the PCIe slot without supplemental power, meaning no more than 75 watts. What is it going to cost you to run a 4090 versus two RTX 4000s that draw, together, less than one third the power to do the same task?
For professional work these things crush the 4090.
@@KenS1267 I really don't understand the performance/watt argument. The RTX 4000 is 1/3 the performance and 1/5 the power usage of an RTX 4090. That would be an interesting trade-off if it were half the price. But it's not.
(Unless you plan to run it non-stop for 8-9 years; then it will break even.)
@@Ardren I agree, the cost of the hardware makes it useful only in circumstances where the power consumption is critical imo... otherwise the value prop for the 4090 is just better.
I don't plan on using any Nvidia hardware with their proprietary drivers, so I just hope this brings a bit more competition to the SFF space. I love small and tiny GPUs, as I don't play any AAA games and love SFF PCs.
The form factor is the big advantage here. If you want to build a nice little home server with inference capabilities, this card is a great choice.
I think that this is probably THE most powerful SFF card, in existence, right now.
A used GV100 would be a little bit more than this, price-wise, but it also consumes 250 W vs. 70 W, and it's a FHFL card rather than a HHHL card, so it doesn't fit in a lot of shorter server chassis.
I think this is probably THE most powerful SFF card in existence right now.
Fixed.
@@tolpacourt
I am fairly confident that there are other SFF cards that aren't public, that are probably more powerful than this.
It's just not public.
(By the time this card hit the shelves, you can bet that Nvidia is already working on the next one that's going to be even more powerful than this.)
I just built a system with a Threadripper 7960X and one of these cards. The software isn't all there yet, but I'm planning on running a bunch of VMs with XCP-ng including an Ollama instance. It's insane that you can get something like this for 70 watts, even if it is a bit expensive.
I've learned so much by watching these videos; I need to check out their forums.
It's nice to see some non-gaming GPU discussion!
I'm very happy with my RTX A4500: 200 W max, 20 GB VRAM, etc.
I think it's often overlooked.
Very cool, but I agree with the cons also. This could be quite good for a multi-model scenario (e.g. AutoGen). It is also a potentially good sign of something better to come.
I couldn't find a preorder for less than $1,200 this generation, versus the A2000 before it, which was often available well below MSRP due to the crypto bust.
Could you run these with or without gaming GPUs on the Gigabyte TRX50 with a 24-core Threadripper, please? I would love to see you stream a demanding game while running image-generation stuff in the background.
I am building a Z790-A i9 system with a two-fan CPU radiator, a 1500 W PSU, a 2 TB M.2... and 64 GB RAM (possibly 128 GB in a month, shipping permitting). I couldn't imagine needing a $1,800-2,200 GPU. If I wanted to make an ornate wood-enclosure PC for someone, I would recommend it. Not exactly something you can get replaced on the fly; I am already two to three weeks out on shipping as is, so I would want onboard graphics in that case.
Perhaps it would be the equivalent of replacing a Tesla S glass roof vs the windshield. Start buying insurance, right?
Stable Diffusion can easily use more VRAM if you are generating multiple images in parallel (use batch generation).
Do you have any tips for enabling Ollama AMD compute for the GPU integrated into AM5 chips? I've got a Ryzen 5 7600 in an ASRock Rack B650 motherboard running my Proxmox home server with an Ollama instance. It works fine running on the CPU, but I'd be curious to compare the token generation rate between the iGPU and the CPU.
AMD doesn't support integrated graphics for ROCm. In fact, there are a lot of conflicts (which is why they don't want to touch it).
People have gotten it working with actually good iGPUs (the 780M, not the weak Ryzen 5 7600 iGPU, which is essentially just a video-out adapter), but it's not useful at this point; it introduces new bottlenecks in place of the old ones and shifts "what is slow" somewhere else.
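If you want to experiment anyway, the commonly cited (and officially unsupported) workaround is overriding the ROCm target; treat the exact value as an assumption that depends on your APU:

```sh
# Unsupported: tell ROCm to treat the iGPU as a supported gfx target.
# 10.3.0 is the value commonly cited for RDNA2 iGPUs -- verify for your chip.
export HSA_OVERRIDE_GFX_VERSION=10.3.0
# Ollama's --verbose flag prints eval timing, handy for CPU-vs-iGPU token-rate comparisons.
ollama run llama3 --verbose
```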
Fan sound output differences between the cards are an important factor too.
I'm still trying to decide if the mispronunciation of "epitome" was deliberate or not :)
Would love it if Wendell, since he mentioned Plex, would test out an Intel Arc Pro A40 (A-series), as it's the only card I'm aware of that claims to support Dolby Vision... It seems the PC world has given up on home theatre: pretty much no Blu-ray decoding since Intel Core gen 10, no HDR10+ or Dolby Vision, and I think only one company makes a CEC device for PCs. You'd think by now, since SBCs like the Pi have had CEC for 5 years, PC motherboard makers would be trying to add this feature and ride the relevant hype to get their pound of flesh... So much for progress. Maybe the AI on the RTX 4000 can figure out a decent movie or entertainment.
Any GPU only uses max wattage when necessary. It does not always use max watts. It's a common misconception that the power requirement is constant. In my experience, many prompts may never use the full power and resources of the GPU. Less than 100 watts is very impressive!
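Easy to watch for yourself while a prompt runs, for example:

```sh
# Log instantaneous board power and utilization once a second.
nvidia-smi --query-gpu=timestamp,power.draw,utilization.gpu --format=csv -l 1
```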
18:13 But I think 3x RTX 4000 wouldn't match a 4090; the model would be split across the devices and be even slower. You could run much bigger models, though, but at 3x the price too.
I'm thinking of buying a W6800 rather than this RTX 4000 Ada, as they are pretty close in pricing where I live.
However, I don't know if that's going to be a useful purchase, given that the W6800 is a bit of an older card.
The W6800 PRO has 32 GB VRAM vs the 20 GB VRAM on the RTX 4000 Ada.
Any thoughts about this?
Cool product demo! I'll have to pay closer attention to Quadro features and support as opposed to the gaming cards. I'm newer to ML; most people seem to recommend the 4070 Super for a homelab, but it's limited to 12 GB of VRAM and 200 W.
I could see this added to a NAS. Containerize here and there and you have a solid home lab
I hope they make a Blackwell RTX version of this, with a little less VRAM and without ECC so it's more affordable. And then obviously a Quadro version too. There would be quite a few gamers that would want this for SFF, plus home lab users, and I bet some system integrators would make some cool systems too.
The next thing that is going into my homelab rack is either a Dell R7910 or a Dell R7920. When that decision is made, I want one of these cards in it.
I'm REALLY interested in ROCm and AMD's W7500 RDNA3 based workstation GPU for my home server!
Proof that a 70-watt, 128-bit-bus, RTX 3060 Ti-class card is possible. 👍
Can't wait for 35-50 W CPU + 70 W iGPU Zen 5 Strix Halo APUs, and whatever comes after for Zen 6!
Great video! I have been planning to build an easily nomadic eGPU setup for running Stable Diffusion (with ComfyUI), doing pure inference, just image generation. Would you have any speed comparison for that use case vs a 4090... without Ollama running at the same time? Or would you say the performance in this video would apply to me? Thanks!
For AI and lower power? I would love to get an RTX 4000 Ada over an RTX 4090! I need this for my low-power home lab!
Still waiting for someone to release this with non-ECC memory, non-pro microcode, and a big passive heatsink at a reasonable price.
Keep the PCB though; it's important for the airflow.
Also, it's a beast; it's basically a 4070 at 70 W. Come on, it doesn't get any better no matter how much you pay extra.
What's the next level down in this GPU series that I can buy as a consumer? I am not a gamer and am building a creator build with the AMD 7950X. I've been looking at an RTX 4070 Super, but that card is more gamer-focused. I plan to edit old film and use AI tools a lot, like Topaz.
I have one in my Lenovo P3 Ultra; I wish it supported vGPU profiles so I could move it to my server. Handles games nicely. Extremely quiet and 75 W, who can complain?
I wonder if the SFF PNY version will fit in a Dell R30/R640? If it would, it could be amazing with Proxmox
Did the dolphincoder output already run on the RTX 4000? If so, did you have anything else claiming GPU resources at the same time? The reason I ask is that my 6900 XT on ROCm 6 is way faster than that, which is surprising to me. My setup is Ollama inside an Ubuntu 22.04 distrobox running on an Ubuntu 22.04 host.
I was just talking about Triumph the Insult Comic Dog recently. Nice. Regardless, I love these types of vids. Don't need it, but why not? Well done.
Almost dream specifications, needs to be single slot to fit into my server.
Swapped out my 1060 6GB for an RTX A2000 in my Unraid server a while back, and it's been working perfectly. I can do at least 6 transcodes at a time with no issues; I'm sure it could handle even more. It was a steal at $270 after tax.
for a 12GB A2000?
Hard to find distributors in India that sell this masterpiece for an eGPU or mini-ITX build. Nvidia's Ada naming for workstation cards is a mess: you've got the 4000, 5000, and 6000 Ada. Very confusing.
Thank you so much for the comparison. I just need to find an A2000 comparison for AI now.
Been using the 12GB A2000 for a while now; it's perfect for me so far.
I wish I had such a card for my SFF PC, without paying the workstation premium.
I kinda want to see this compared to AMD's small encoder card, just to see how many AV1 streams each can reasonably handle at quality.
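A rough way to benchmark that yourself with ffmpeg (settings are examples; av1_nvenc is the NVENC AV1 encoder on Ada, and the AMD card would use its own encoder):

```sh
# Run several of these in parallel and watch where encode speed drops below 1.0x
# to estimate how many real-time AV1 streams the card sustains.
ffmpeg -y -hwaccel cuda -i input.mp4 -c:v av1_nvenc -preset p5 -b:v 6M -an out.mkv
```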
Can we get a 4070 version for around USD 600 less, but with native HDMI 2.1 ports?
Pretty please.
Can't you use 2-3 RTX 4000s to reach the compute power of the 4090? I don't know if it's possible to do the inference on multiple GPUs.
Good question 🤔
I really wanted to get one, but then they released the RTX 2000 Ada Generation: 16 GB at 75 watts for $625.00.
It's just too bad it costs as much as a 4090, where the A2000 12GB was MUCH more affordable. Granted, they came out with the RTX 2000 Ada, but it's extremely cut down from the A4000, and I've not seen anyone test it yet to see how it compares.
70 (SEVENTY) WATTS !!!
wew
the vram + efficiency is just,,
it’s rly wild
hard to imagine preferring bulk deployments of 4090s over these the efficiency is just,,
wow
like maybe for a workstation type video editing or game dev rig,, ok 4090 might make more sense but
this is a rly cool card
and honestly i’m happily surprised it’s 1,250 not like 6,000 (nvidia don’t read this)
Why not run Ollama in Docker as well? Makes setup & cleanup way easier... For ease of networking, shove every additional tool into the same compose file and connect them by service name :)
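Something like this, as a minimal compose sketch; the image tags, ports, and OLLAMA_BASE_URL variable are the stock ones as I understand them, so verify against each project's docs:

```sh
# Ollama + Open WebUI in one compose file, reachable by service name.
cat > compose.yaml <<'EOF'
services:
  ollama:
    image: ollama/ollama
    volumes: ["ollama:/root/.ollama"]
  webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434   # service name, not localhost
    ports: ["3000:8080"]
volumes:
  ollama:
EOF
docker compose up -d
```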
Yeah, with the Ada media stack the card is a no-brainer for a Plex server.
Great video, Wendell... I wonder if it'll fit in an MS-01... Probably too big... but I can dream 🙂
I have the A2000 (Ampere) 12GB version.
Can't wait for this to turn into the future Quadro K620 that's a dime a dozen all over eBay.
Using an A2000 for main graphics in a Windows VM on an Unraid setup. It's no 4090, but it does medium/high/ultra settings in most games at 1080p, steady down to 30 FPS, barely any stutters. Solid little card. Looks like it's time for an upgrade.
I'm a gamer. I am a quiet freak. I am space-constrained. Power is expensive, and I use my gaming PC for my accounting work also, which uses no GPU horsepower. I figure I'm pretty sensitive to Nvidia's stingy VRAM buffers on their gaming cards, and I was in the market in April 2023 for an Nvidia card. I was upgrading from an RX 6650 XT. I did not want to stop at 12 GB, but I hated the 4080, and didn't much like the 7900 series from AMD at the time either. This meant my options on the super-efficient side of the market were, indeed, the 4090 or this. This was taken away from me because I could not find one. I got a 4090 and strangled it to a 33% power limit, where I leave it unless I notice poor performance, which has only happened in Portal with RTX so far. I'm very pleased with the 4090's behavior. My MSI Suprim X air-cooled model is very nearly silent outside of games, and quiet enough that you won't notice it while playing. All the same, I would love the lower idle power consumption this boasts. I have a fanless power supply that probably isn't specced to handle the 4090 at full load, but at reduced power, it's more than enough. Even when I boost the 4090's power, I don't exceed 66% / 300 watts.
I wish NVIDIA would enable OC Scanner on Linux; you can shave quite a bit of power consumption off the 4000 series by altering the voltage curve. I did that to run F@H on my Windows box with a 4090, and with the clock speed down to around 2430 MHz it only loses about 8% performance at 2/3 the power consumption. I probably should see how that compares on Stable Diffusion.
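Until then, the closest thing on Linux that I know of is capping clocks and power instead of editing the curve; a hedged sketch, values being examples (needs root):

```sh
# No voltage-curve editor on Linux, but locking max clocks plus a power cap
# approximates the same behavior.
sudo nvidia-smi -lgc 0,2430   # lock GPU clocks to a 0-2430 MHz range
sudo nvidia-smi -pl 300       # cap board power at 300 W
```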
Use the NVIDIA Container Runtime. It helps you spin up Ollama inside a container, which you can then put behind a reverse proxy and address by hostname.
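A minimal sketch of that, assuming the NVIDIA Container Toolkit is already installed (the port and volume are the stock ones from Ollama's docs):

```sh
# Ollama in a container with GPU access; front it with any reverse proxy afterwards.
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```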
Here's something weird. I was on the wrong system and installed nvtop. It still shows GPU info for the Intel Iris Xe. How cool.
nvtop is not Nvidia software
Was my slight critique about workstation dGPUs not having HDMI outputs, requiring potentially janky adapters (when it’s needed to go to HDMI 2.1) too much for the comment section?
Strange.
The next generation of NAS devices will move to M.2... but they will also need a strong GPU for AI indexing... so a low-power, small form factor GPU is great!
Hope this could fit in a Lenovo M70s chassis and make an SFF AI powerhouse!
Sorry, 13mm too long, 168mm vs 155mm :c Might be able to squeeze it in with some cutting though.
However, you can get a second one and link them together; the VRAM will pool.
Brother, I gotta say I love seeing the AMD Wraith RGB cooler!! I'm still using mine from 2018.
@Level1techs Possible to cover the Alveo MA35D? 32x AV1 streams at 35 watts for ~1,599 USD?
I really do prefer this type of cooler over the dinky little milled/extruded axial coolers generally found on low-profile GeForce GPUs.
I picked up a refurbished Precision 3660 with an RTX A4000 16GB for barely more than the price of the card. Is it weird that that's my idea of summer fun?
Make a review of the Intel Arc a310 or a380 for transcoding and stuff like that. Also, can you make a video about what AMD and Intel are doing with regards to their implementation of CUDA?
I want that card for a plex server.... 😊
Me too lol
really? big server lol
I have the smaller cousin (A2000 6 GB SFF) for my Plex server. Bought it off Amazon @ $250; absolutely worth it!
The lower VRAM CAN potentially be a limiting factor, but if you're not trying to do the LLM/AI stuff, for Plex it works great!
I have that running via an Ubuntu 22.04 LXC container, running on Proxmox 7.4-17.
@@ewenchan1239 Same here, but with the 12GB model on TrueNAS Scale Dragonfish. If you're not doing AI work, I don't know what this card would get you over the A2000 for Plex alone, other than potentially more transcodes.
Get an A2000 instead. A lot cheaper on ebay and plenty of horsepower for plex.
It'd be kinda cool to use this as a gaming video card if you didn't care about the $$$. In my case I have an i7 Intel Beast Canyon NUC that at first had an 8GB 3060, which I then upgraded to a 4070 (yes, it actually fits and works fine). Would be fascinating to see the 4000 compared to a 4070, though I think the RTX 6000 vs the 4070 would be the closer performance match.
Hmmm. Wonder if I could mod the Minisforum MS-01 and toss this bad boy in?