@bamtoday mentioned the BD790i in his comment and that's actually what my brain was revolving around except, how would this RTX 4000 SFF fare in encoding/transcoding against the Alveo MA35D? i am really hoping myself to be able to use that particular setup for two system streaming but i have yet to confirm software support for such a configuration. obviously there is already software support for NVENC so this card will work but should the MA35D get support, will it outperform this card in the intended arena?
Very cool, but I agree with the cons also. This could be quite good for a multi-model scenario (e.g., autogen). It is a potentially good sign for something better to come, also.
great video! I have been planning to build an easily nomadic egpu setup for running stable diffusion (with ComfyUI), doing pure inference, just image generation. Would you have any speed comparison on that use case vs a 4090... without ollama running at the same time? or would you say the performance in this video would apply to me? Thanks!
I was really looking at possibly a pair of these to build a personal geforce now + virtual AI workstation but at $1250 each it's a bit much for me to spend in my homelab. Then I considered a pair of the A2000 12G but for a price between them you can get a pair of 4060ti 16gb. At the moment I'm tempted to pull the trigger on the 4060ti route next payday unless I run into a reason to go another direction.
just get the 4060Ti... it's a gem of a card. All of youtube, reddit shits on it but that card can do wonders. I use mine undervolted and it consumes ~90-100 watts max.
I just built a system with a Threadripper 7960X and one of these cards. The software isn't all there yet, but I'm planning on running a bunch of VMs with XCP-ng including an Ollama instance. It's insane that you can get something like this for 70 watts, even if it is a bit expensive.
Real question but kinda dumb. Can i put x2 of these in a build and run a GPU per game, sure might have issues here and there but if its doable, sounds kinda neat for my use case, i have black desert often running and do have another GPU dedicated to play another game at the same time would be insane for my silly ideas.
I was very intrigued by this card, but then I saw the price, and I IMMEDIATELY lost all interest. Fuck you, Nvidia. I get that there's a premium on Quadro cards but give me a fucking break. Also, while I'm talking about this, bring back the Quadro branding already. Your idea to axe it was dumb as hell.
One consumes 70W, the other 600W, the math checks out.
You have to consider performance per watt. This thing runs on the PCIe slot without supplemental power, meaning no more than 75 watts. What is it going to cost you to run a 4090 versus 2 RTX 4000s that draw, together, less than one third the power to do the same task? For professional work these things crush the 4090.
@@KenS1267 I really don't understand the performance/watt argument. The RTX4000 is 1/3rd performance and 1/5th the power usage of a RTX4090. That would be an interesting trade-off if it was half the price. But it's not. (Unless you plan to run it non-stop for 8-9 years, then it will break even)
@@Ardren I agree, the cost of the hardware makes it useful only in circumstances where the power consumption is critical imo... otherwise the value prop for the 4090 is just better.
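The trade-off argued in this thread can be put into numbers. This is a rough sketch only: the 1/3 performance and 1/5 power ratios come from the comment above, while the price gap, watts saved, and electricity rate in the usage note are made-up placeholders, and both function names are hypothetical.

```python
def perf_per_watt_ratio(perf_fraction: float, power_fraction: float) -> float:
    """Performance per watt of the small card relative to the big one.

    perf_fraction:  small card's performance as a fraction of the big card's
    power_fraction: small card's power draw as a fraction of the big card's
    """
    return perf_fraction / power_fraction


def break_even_years(price_gap_usd: float, watts_saved: float,
                     usd_per_kwh: float, hours_per_day: float = 24.0) -> float:
    """Years of running until the electricity saved pays off a price gap."""
    kwh_saved_per_year = watts_saved / 1000.0 * hours_per_day * 365.0
    return price_gap_usd / (kwh_saved_per_year * usd_per_kwh)


# Ratios quoted in the thread: 1/3 the performance at 1/5 the power.
ratio = perf_per_watt_ratio(1 / 3, 1 / 5)
print(f"perf per watt vs. the bigger card: {ratio:.2f}x")
```

Calling `break_even_years(1000, 180, 0.15)` with your own price gap, power saving, and electricity rate gives a feel for how long non-stop operation takes to pay back; the answer swings a lot with the rate you pay per kWh.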
Could you run these with or without gaming GPUs on the TRX50 Gigabyte with a 24-core Threadripper please? I would love to see you stream a demanding game while running image stuff in the background
I think that this is probably THE most powerful SFF card, in existence, right now. A used GV100 would be a little bit more than this, price-wise, but it also consumes 250 W vs. 70 W, and it's a FHFL card rather than a HHHL card, so it doesn't fit in a lot of shorter server chassis.
@@tolpacourt I am fairly confident that there are other SFF cards that aren't public, that are probably more powerful than this. It's just not public. (By the time this card hit the shelves, you can bet that Nvidia is already working on the next one that's going to be even more powerful than this.)
18:13 But I think 3x4000 wouldn't match a 4090, the model would be split across the devices and be even slower. You could run much bigger models though, but at x3 the price too.
Cool product demo! I’ll have to pay closer attention to Quadro features and support as opposed to the gaming cards. Newer to ML, most recommend the 4070 Super it seems for a homelab, but limited to 12gb of RAM and 200W
I couldn't find a preorder for less than $1200 this generation versus the A2000 before it that was often available for well well below MSRP due to the Crypto bust.
This card potentially has a huge market for enterprises that want to upgrade their non-GPU servers to add AI inference support. There's tons of Dell/HPE/etc servers out there that have 75W PCIe slots, but no PSU/PCIe-12V-connector for higher-power GPU (gaming or otherwise). And upgrading a rack of 20x 2U servers only adds 1400W, which you probably wouldn't even need to upgrade PDUs or add cooling to support.
It doesn't support vGPU. Shame. As it's most likely to be used in systems with just one PCI-e slot. So having a second GPU for passthrough to VMs is out of the question. Any other way to use this card for both the host and guest VM?
I am building a Z790-A I-9 with 2 fan cpu radiator and a 1500 psu, 2tb m.2...and 64GB RAM (possible 128GB in a month~s/h). I couldn't imagine needing a $1800-2200 GPU. If I wanted to make an ornate wood enclosure PC for someone, I would recommend. Not exactly something you can get replaced on the fly, I am already two-three weeks out on shipping as is, I would want onboard graphics in that case.
I hope they make a Blackwell RTX version of this. With a little less VRAM and without ECC so it’s more affordable. And then obviously a Quadro version too. There would be quite a few gamers that would want this for SFF, home lab users, and I bet some system integrators would make some cool systems too.
Proof that a 70-watt, 128-bit bus RTX 3060TI-class card is possible. 👍 Can't wait for 35-50W CPU + 70W iGPU Zen5 Strix Halo APUs and whatever comes after for Zen6!
Would love if Wendell , since he mentioned Plex, would test out an Intel Arc Pro A40 (AXX series), as it's the only card(s) I'm aware of that claims to support Dolby Vision... It seems the PC world has seemingly given up on Home Theatre... pretty much no Blu-ray decoding since the Intel Core Gen 10, no HDR10+ or Dolby Vision, and I think only one company makes a CEC device for PCs, you'd think by now since SBCs like the Pi have had CEC for 5 years, PC Motherboard makers would be trying to add this feature and the relevant hype to get their pound of flesh... so much for progress.. maybe the AI on the RTX 4000 can figure out a decent movie or entertainment
Why not run ollama in docker as well? Makes setup & cleanup way easier... for ease of networking, shove every additional tool into the same compose file and connect them by the service name :)
Swapped out my 1060 6GB for an RTX A2000 in my Unraid server a while back and it's been working perfectly, and I can do at least 6 transcodes at a time with no issues. I'm sure it could even handle more. It was a steal at $270 after tax.
I don't plan on using any nvidia hardware and their proprietary drivers, so I just hope this brings in a bit more competition in the SFF space. I love small and tiny GPUs, as I don't play any AAA games and love SFF PCs.
Make a review of the Intel Arc a310 or a380 for transcoding and stuff like that. Also, can you make a video about what AMD and Intel are doing with regards to their implementation of CUDA?
The next thing that is going into my homelab rack is either a Dell R7910 or a Dell R7920. When that decision is made, I want one of these cards in it.
70 (SEVENTY) WATTS !!! wew the vram + efficiency is just,, it’s rly wild hard to imagine preferring bulk deployments of 4090s over these the efficiency is just,, wow like maybe for a workstation type video editing or game dev rig,, ok 4090 might make more sense but this is a rly cool card
It's just too bad it costs as much as a 4090, where the A2000 12GB was MUCH more affordable. Granted, they came out with the Ada A2000, but it's extremely cut down from the A4000 and I've not seen anyone testing that yet to see how it compares.
Any GPU only uses max wattage when necessary. It does not always use max watts. It's a common misconception that the power requirement is constant. In my experience, many prompts may never use the full power and resources of the GPU. Less than 100 watts is very impressive!
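One way to see this for yourself is to sample the reported draw while a prompt runs. Below is a minimal sketch, assuming an NVIDIA card with `nvidia-smi` on the PATH; the helper names are made up for illustration, and some cards report `[N/A]` for this field, which the parser simply skips.

```python
import subprocess
from typing import List


def parse_power_draw(csv_text: str) -> List[float]:
    """Turn nvidia-smi CSV output (one line per GPU, watts) into floats.

    Lines that are not plain numbers (for example "[N/A]") are skipped.
    """
    readings = []
    for line in csv_text.splitlines():
        line = line.strip()
        try:
            readings.append(float(line))
        except ValueError:
            continue
    return readings


def read_power_draw() -> List[float]:
    """Ask nvidia-smi for the current power draw of every GPU, in watts."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_power_draw(result.stdout)
```

Calling `read_power_draw()` once a second during idle and again during generation should show how far below the rated limit a typical prompt sits.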
Try the Sparkle Eco, as it is low cost at around $100 USD. Oh, and it's low profile and single slot. For me, I was wanting AV1 and a line-in sound card for a used Dell Optiplex 7060.
Ok, but this isn't "how to get a really cheap display output with AV1 acceleration", it's "the best compact AI accelerator add in card for power and space efficiency". The A310 is not a high performance AI accelerator
The next generation of NAS devices will move to M.2.... but they will also need strong GPU for the AI indexing... So low-power, small form factor GPU is great!
1. Take the cover off the gpu,
2. plug the little fan into a fan extension cable and route cable out the cover as you reinstall the cover
3. plug extension cable into mobo after you put the card back in the computer
4. ~5 watts unlocked. 5/70 is over 7%, so you can get a nice little boost on the card's performance freeing up that bit of the power envelope
I read about someone doing this on the A2000 but I can't remember who it was to give credit
5. Poop on it
Okay, I saw this and had to try it with my 4060 low profile card. I found that the fans draw ~3.2W at max power. Still waiting on some parts before I'm able to test in the rig, but this is definitely a great suggestion! It's also only 3.2/115W, so just under 3%, but every little bit helps with these. Going to be experimenting with some overclocking as well.
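A quick check of the two ratios mentioned in this thread, as a small sketch. The wattages are the ones quoted by the commenters; the function name is just for illustration.

```python
def fan_share(fan_watts: float, board_watts: float) -> float:
    """Fraction of the card's power limit that the fan would free up."""
    return fan_watts / board_watts


cases = [
    ("70 W card, ~5 W fan", 5.0, 70.0),
    ("115 W card, ~3.2 W fan", 3.2, 115.0),
]
for label, fan, board in cases:
    print(f"{label}: {fan_share(fan, board):.1%}")
# 70 W card, ~5 W fan: 7.1%
# 115 W card, ~3.2 W fan: 2.8%
```

Whether that headroom turns into real performance depends on how the card's power limit is enforced, so treat the percentages as an upper bound on the gain.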
4000 ADA! Literally my favorite GPU of the last few years. Glad you got one to play with finally 🙂
can't wait for the next Beer review at the end of your video 😁
Just a minor point? The application 'nvtop' is not Nvidia top. It's 'Neat Videocard Top' as it works on all GPUs. Everyone just assumes it's nvidia top but it's not.
nvitop is far better.
btop is my go-to, but I like seeing all these other tools in case I'm missing something by using btop.
Nvtop does not seem to work with Intel iGPU according to the app. When it detects an Intel iGPU, it will throw a warning at the start.
It's an older meme, sir, but it checks out
"kids these days just don't appreciate rubber hand puppets."
Only level 40+ Memers catch that reference
Timestamp?
He never addresses whether or not he actually pooped on it. The people deserve to know.
That's what she said.
7:16 Props to Wendell for calling out the incredible danger of curl piped into SH. Did you know, for example, that a server can often tell via side-channels if you just curl a command, or if you curl it into a shell? It's very possible to create a downloadable URL that delivers one script if you just curl to stdout/a file, and another if you curl pipe bash. Always download shell scripts, review them, *then* run them!
The easiest example would be the user agent header.
Yeah I do not like when websites give a curl to sh pipe as primary install method
@@fgregerfeaxcwfeffece Ah, almost! The user agent can tell you that somebody is using curl, but not if they are looking at the output! What I'm talking about here are things like if you pipe curl into a shell, the shell accepts inputs in a very specific pattern, which can be detected. It's not about detecting that you're using curl, it's about detecting the difference between `curl whatever | sh` and `curl whatever > output.sh`. One should never pipe curl to shell, even if you've verified the script before, since the script you have reviewed is not the script being run! Even something like `set -x` can be trivially worked around (e.g. ANSI escape codes). Using `curl | sh` is literally giving the remote webserver complete shell access, and it knows if you're looking at the script or not!
I am willing to bet my eyeballs that barely anyone ever reviews what the homebrew setup script really does. I'd argue most users can't even tell. And that's just one example.
@@Maxjoker98 If you are talking about timing, I would say: Yes, absolutely possible
and also:
Is there a known case of that having happened in the wild? Most attacks are boringly low skill, low effort. A custom webserver looking for timings seems way above the average YouTube paygrade.
Of course that doesn't mean people should be careless, I am just curious.
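The "download, review, then run" advice from this thread can be sketched roughly as follows. The point is that the bytes you execute are the same bytes you read, which a pipe into a shell cannot guarantee. Function names, the URL, and the filename are placeholders, not part of any real installer.

```python
import hashlib
from pathlib import Path


def save_for_review(script_bytes: bytes, dest: Path) -> str:
    """Write downloaded bytes to disk without running them; return their SHA-256."""
    dest.write_bytes(script_bytes)
    return hashlib.sha256(script_bytes).hexdigest()


def matches_reviewed_copy(dest: Path, reviewed_digest: str) -> bool:
    """True only if the file on disk is byte-for-byte what was reviewed."""
    return hashlib.sha256(dest.read_bytes()).hexdigest() == reviewed_digest


# Hypothetical usage (URL and filename are placeholders):
#   import urllib.request
#   data = urllib.request.urlopen("https://example.com/install.sh").read()
#   digest = save_for_review(data, Path("install.sh"))
#   ... open install.sh in an editor and actually read it ...
#   if matches_reviewed_copy(Path("install.sh"), digest):
#       run the saved file, e.g. `sh install.sh`
```

The same effect is available with plain `curl whatever > output.sh` followed by reading the file and running the local copy; the sketch only adds a digest so a later run can confirm nothing changed in between.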
Note that NVTop stands for "Neat Video TOP", not "NVidia TOP"
It works on AMD and NVIDIA, and recently got support for Intel!
@@Real-Name..Maqavoy Those two monopolies are exactly why AMD and ATI were allowed to merge
Aha! I was wondering about that. I only knew about nvidia-smi and thought it was the only option from Nvidia.
That's a relatively recent development. Iirc amd support only came late last year. th-cam.com/video/ehm2hmk_p3E/w-d-xo.htmlfeature=shared the 1.0 release was top for nvidia gpus. But I'm glad it's evolved into neat video with support for more than NV
RTX 4000 ADA + Minisforum BD790i = the dreamiest ITX build combination.
I'd love to see what that combination would do.
@@heirtothethrone000 it's so tempting. If I wasn't on assignment in Tokyo for the next year I'd build it this summer. Computer parts are so hard to acquire here, and even when they are available the markup is insane.
^^^Current dream build. And a Noctua passive cooler.
Got an A2000 16GB on the way for my transcoding galore, can't wait to tinker with it. Wendell dropping appropriately timed videos as usual.
there's a big difference between A2000 and A2000 Ada btw
@@TazzSmk There is no 16gb A2000 except for ADA Gen :)
@@SuperMari026 So it's a franken card with the mobile GPU? I've got the 12GB desktop variant.
@@linuxpirate Pretty much. I got it for €600 which is similarly priced as the non-ada 12gb variant.
@@TazzSmk Which is why one is called "RTX A2000" and the other is called "RTX 2000 Ada."
I don't understand half of what Wendell is saying... but my god, it's well spoken, deep tech talk, and I'm a simple guy... I see a new Wendell video, I click on it, I watch it
That m.2 being held in by the pressure of the gpu made me chuckle :)
More like cringe.
I have the RTX A2000 12GB for similar purpose. It’s less performant in basically every regard but also much cheaper and can easily chill in a 2U chassis :)
I help contribute to open-webui, and we recently refactored the Dockerfile and added the feature to include ollama using build args and build the container for GPU accelerated embedding. So you don't have to install Ollama in a separate container, unless you require that.
That’s handy, thanks!
excellent revelatory comment
For years I have asked about better graphics cards for imaging, design, 3D, etc. I use a lot of apps, and I do things that aren’t normally done with graphics. Whenever I ask around on forums, I’m either flamed or given vague answers, always boiling down to “just get a gaming GPU”.
THIS video touches on some of the stuff I’ve hoped someone, somewhere, would cover.
I love my macs, but I’ve wanted to build a powerful machine so I can do more iterations, of higher quality, as fast as I can think of them. Fitting in SFF and not using a ton of wattage would be great too.
Thank you! More content like this please!
Another interesting aspect of this lower power usage is that you could install several of them into a workstation without having to seek out specialist power supplies. Four of these would come to 80 gigs of VRAM at just 280 watts. Too bad they're so expensive.
As Wendell mentioned though, in Quadro land -- this is a bargain.
RTX 6000, when it launched, I think was on the order of like $10k USD EACH.
Something like that.
And if my memory serves, the original DGX1 system which had eight DGX GPUs, was priced at $150k USD.
So, this is 1% of that price.
PER CARD, it might have 1/3rd the compute resources, but you buy enough of these, the efficiency is significantly higher than the bigger (and more powerful/power hungry) cousins.
So unless your model NEEDS 80 GB of VRAM on a single card, having a BUNCH of 20 GB cards would be absolutely AWESOME.
You'll run out of PCIe slots (being that the card is double slot width) way faster than you'll run out of anything else.
Get one of the 4U or 5U 8-10 double-slot GPU server chassis and just drop a whole bunch of these in.
Would be a killer system.
Is there enough power budget to do that? To run multiple of these w/o power connectors? I have no idea what total power the board might have available for each slot.
@@chrisspellman5952
"Is there enough power budget to do that? To run multiple of these w/o power connectors? I have no idea what total power the board might have available for each slot."
Depends on the motherboard, but I mean, I would imagine that if a Threadripper motherboard can have 7 PCIe x16 (physical) and a mix of x16 and x8 (electrical) slots, presumably, if they build it, then the board needs to be able to supply that much power.
Plus the fact that a Threadripper, for example, can pull 300-400 W via the CPU socket, so I would imagine that a motherboard is already designed to be able to supply this much power to the things that are plugged into it.
@@chrisspellman5952 Motherboards are designed to supply 75 watts to every PHYSICAL x16 slot and 25 watts to every PHYSICAL x1/2/4/8 slot.
If a motherboard has >3 x16 physical slots, they have a PEG connector or other supplementary input for additional wattage.
@chrisspellman5952 The reply above is nearly right; to clarify a little: WS motherboards tend to have additional PCIe power inputs on the board - this is the type of board that will power multiple cards with no issue. A consumer motherboard will NOT work with 8 of these (even if you had the lanes, they do not provide enough power)
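To put the slot-power discussion into numbers, here is a small sketch using only figures quoted in this thread: 70 W and 20 GB per card, and 75 W per physical x16 slot. Whether a given board can actually deliver the total is the open question raised above, so check the board's manual; the names below are illustrative.

```python
SLOT_LIMIT_W = 75.0   # per x16 slot, as quoted in the thread
CARD_POWER_W = 70.0   # per card, slot powered
CARD_VRAM_GB = 20     # per card


def build_summary(cards: int) -> dict:
    """Totals for a build with `cards` slot-powered cards."""
    return {
        "cards": cards,
        "vram_gb": cards * CARD_VRAM_GB,
        "power_through_board_w": cards * CARD_POWER_W,
        "headroom_per_slot_w": SLOT_LIMIT_W - CARD_POWER_W,
    }


for n in (1, 4, 8):
    s = build_summary(n)
    print(f"{s['cards']} cards: {s['vram_gb']} GB VRAM, "
          f"{s['power_through_board_w']:.0f} W drawn through the board")
```

Each card sits under the per-slot figure on its own; the issue with many cards is the sum that has to pass through the motherboard, which is why boards with supplementary power inputs are suggested for dense builds.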
Factor in the external watercooling I need for my dual 4090s, and I'm really regretting not having had this card as an option when I built :D
Woah! Super interesting card!
Since it's a card with such a focus on power, I would've loved to also see some numbers on its idle consumption (e.g. for always-on home systems).
Still such an amazing video! Keep up the great work @ level1techs!
I own one, for a small PC gaming build. It's more powerful than my 2080 Ti, it's amazing.
And with DLSS3
wut really
Damn, that's a pretty potent little professional-class GPU. It might be the same price on paper as a 4090, but you can pretty much guarantee it will work with any recent server or workstation without any concerns about new components like a PSU, generating a crazy amount of heat, or simply fitting in your existing systems/taking up multiple PCIe slots or risers.
So in that sense it actually makes a lot of sense over something like a 4090.
Awesome video like always, but you are just teasing us with $1500 hardware. For the next video please do an Arc A770 video showing how to enable SR-IOV!
same thoughts
Yesssa
That was mentioned like 7 months ago. He never made a video about it?!
I can only afford the RTX 2000 ADA. The RTX 4000 ADA seems to be a pretty good option. Good video!
I can't wait to pick this up in a few years for my plex box
Did the output of dolphincoder already run on the RTX 4000? If so, did you have anything else claiming GPU resources at the same time? Reason I ask is that my 6900XT on ROCm 6 is way faster than that, which is surprising to me. Setup is ollama inside of an Ubuntu 22.04 distrobox running on an Ubuntu 22.04 host.
20:22 -- no NV link, no vgpu . . . imagine if it had that, too ?! Level1Techs has been on fire 🔥 lately. Incredible content 👍😎👏
Do you have any tips for enabling ollama amd compute for the gpu integrated into AM5 chips? I've got a Ryzen 5 7600 in an asrock rack b650 motherboard running my proxmox homeserver with an ollama instance. It works fine running on cpu, but i'd be curious to compare the token generation rate between the iGpu and the Cpu.
AMD doesn't support integrated graphics for ROCm. In fact, there's a lot of conflicts(which is why they don't want to touch it).
People have gotten it working with actually good iGPUs(780M, not the bad Ryzen 5 7600 iGPU which is essentially just a video out driver), it's not useful at this point, introducing new bottlenecks in place of old and shifting "what is slow" somewhere else.
What's the best next-level-lower GPU in this series that I can buy as a consumer? I am not a gamer and am building a creator build with the AMD 7950X. I've been looking at an RTX 4070 Super, but that card is more gamer focused. I plan to edit old film and use AI tools like Topaz a lot.
I'm thinking of buying a W6800 rather than this RTX 4000 ADA as they are pretty close in pricing where I live.
However, I don't know if that's going to be a useful purchase given that this is a bit of an older card too.
W6800 PRO has 32 GB VRAM vs the 20 GB VRAM on the RTX 4000 ADA.
Any thoughts about this?
in 2 years this will be an awesome deal
Question: How does the AMD W7900 (48gb) compare to the Nvidia RTX A6000 ADA (48gb) for a similar "Swerk-station" Local AI setup?
twerkstation
If you run into a W6800 (older card now but similar price) or hopefully one of the W7000 series cards it'd be interesting to see how the value proposition works out for those playing with Rocm
Time to get two for a home server. One for Plex, another for everything else.
virtualize plex..or run it in docker...then just 2 homelab servers, 1 for backup.
@@rezenclowd3 TrueNAS exists and uses graphics cards in the same fashion.
Can't you use 2-3 RTX 4000s to reach the compute power of the 4090? I don't know if it's possible to do inference on multiple GPUs.
Good question 🤔
The form factor is the big advantage here. If you want to build a nice little home server with inference capabilities, this card is a great choice.
$1,250 not cheap !!!!!
Definitely not cheap
but still can't buy a 4090 (will be last gen card in less than a year)
Which is kinda disturbing, seeing the kind of prices we have on non-Quadro GPUs right now
Compared to a $2k RTX 4090, it is!!!
In Quadro land where the RTX 6000 launched at $10k USD, it is cheap.
And the original DGX-1 system that Nvidia made, which had eight Tesla GPUs in it, was priced at $150k USD.
So this is 1% of that price.
To that end, this is very cheap, relatively speaking.
An A580 is overkill for plex.
YOU DON'T NEED HIGH-END HARDWARE FOR YOUR HOME SERVER!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@@Splarkszter
"YOU DON'T NEED HIGH-END HARDWARE FOR YOUR HOME SERVER!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
I find it amusing what you think people do or don't need, for their home server.
(This ASSUMES that you know what people are using/running on their said home server.)
Meanwhile, I'm trying to get SR-IOV up and running on my Mellanox ConnectX-4 dual port 100 Gbps Infiniband network card, so that I can containerise my HPC applications because IBM decided to kill CentOS.
(One CFD application is certified to migrate to Rocky Linux 9.3, one still needs at least CentOS 7.9.2009, and another uses Ubuntu 22.04 LTS.)
Looks pretty cool. I've been looking for a new GPU for machine learning, but other than some Tesla P100's on eBay I've not found much. I want about 16GB or more of VRAM(preferably HBM2).
@bamtoday mentioned the BD790i in his comment and that's actually what my brain was revolving around except, how would this RTX 4000 SFF fare in encoding/transcoding against the Alveo MA35D? i am really hoping myself to be able to use that particular setup for two system streaming but i have yet to confirm software support for such a configuration. obviously there is already software support for NVENC so this card will work but should the MA35D get support, will it outperform this card in the intended arena?
I don't understand any of the stuff going on but I believe Wendell when he says it's cool and awesome
Very cool, but I agree with the cons also. This could be quite good for a multi-model scenario (e.g. autogen). It is a potentially good sign for something better to come, also.
great video! I have been planning to build an easily nomadic egpu setup for running stable diffusion (with ComfyUI), doing pure inference, just image generation. Would you have any speed comparison on that use case vs a 4090... without ollama running at the same time? or would you say the performance in this video would apply to me? Thanks!
I was really looking at possibly a pair of these to build a personal geforce now + virtual AI workstation but at $1250 each it's a bit much for me to spend in my homelab. Then I considered a pair of the A2000 12G but for a price between them you can get a pair of 4060ti 16gb. At the moment I'm tempted to pull the trigger on the 4060ti route next payday unless I run into a reason to go another direction.
just get the 4060Ti... it's a gem of a card. All of youtube and reddit shit on it, but that card can do wonders. I use mine undervolted and it consumes ~90-100 watts max.
I’ve learned so much by watching these videos, I need to check out their forums
I just built a system with a Threadripper 7960X and one of these cards. The software isn't all there yet, but I'm planning on running a bunch of VMs with XCP-ng including an Ollama instance. It's insane that you can get something like this for 70 watts, even if it is a bit expensive.
It's nice to see some non-gaming GPU discussion!
I'm very happy with my RTX A4500, 200W max, 20GB ram, etc.
I think it's often overlooked.
If you had a ton of surveillance cameras (a mall?) this would make for a great realtime tensor object detection tool
sorry i think i missed it but can't find it in the video.... how many transcodes can it handle?
Can we get a 4070 version for around USD 600 less, but with native HDMI 2.1 ports?
Pretty please.
Real question but kinda dumb.
Can I put two of these in a build and run one GPU per game? Sure, there might be issues here and there, but if it's doable it sounds kinda neat for my use case: I often have Black Desert running, and having another GPU dedicated to playing a second game at the same time would be insane for my silly ideas.
@level1techs will 3 of these run well on trx50 gigabyte?
$2,269 AUD puts it in between the price of the RTX 4080 and RTX 4090, with the RTX 4090 outperforming the RTX 4000 SFF Ada Generation by an impressive 88%.
I was very intrigued by this card, but then I saw the price, and I IMMEDIATELY lost all interest. Fuck you, Nvidia. I get that there's a premium on Quadro cards but give me a fucking break. Also, while I'm talking about this, bring back the Quadro branding already. Your idea to axe it was dumb as hell.
One consumes 70W, the other 600W, the math checks out.
You have to consider performance per watt. This thing runs on the PCIe slot without supplemental power, meaning no more than 74 watts. What is it going to cost you to run a 4090 versus 2 RTX 4000s that draw, together, less than one third the power to do the same task?
For professional work these things crush the 4090.
@@KenS1267 I really don't understand the performance/watt argument. The RTX4000 is 1/3rd performance and 1/5th the power usage of a RTX4090. That would be an interesting trade-off if it was half the price. But it's not.
(Unless you plan to run it non-stop for 8-9 years, then it will break even)
@@Ardren I agree, the cost of the hardware makes it useful only in circumstances where the power consumption is critical imo... otherwise the value prop for the 4090 is just better.
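The break-even argument above can be sketched in a few lines of Python. This is only a rough model under stated assumptions: it takes the thread's own figures ($1,250 and 70 W per RTX 4000 SFF, roughly $2k for a 4090, 1/3 the performance at 1/5 the power), assumes the cards run flat out 24/7, and uses a hypothetical electricity price of $0.15/kWh that you should replace with your own.

```python
HOURS_PER_YEAR = 24 * 365


def break_even_years(extra_hardware_usd: float,
                     power_saved_watts: float,
                     usd_per_kwh: float) -> float:
    """Years of non-stop use before power savings repay the extra hardware cost."""
    if power_saved_watts <= 0 or usd_per_kwh <= 0:
        return float("inf")  # no savings, so it never breaks even
    kwh_saved_per_year = power_saved_watts / 1000 * HOURS_PER_YEAR
    return extra_hardware_usd / (kwh_saved_per_year * usd_per_kwh)


# Figures from the thread; the electricity price is an assumption.
rtx4000_price, rtx4000_watts = 1250, 70
rtx4090_price = 2000
rtx4090_watts = 5 * rtx4000_watts   # "1/5th the power usage"
cards_for_parity = 3                # "1/3rd performance"

extra_cost = cards_for_parity * rtx4000_price - rtx4090_price
power_saved = rtx4090_watts - cards_for_parity * rtx4000_watts

years = break_even_years(extra_cost, power_saved, usd_per_kwh=0.15)
print(f"break-even after about {years:.1f} years of 24/7 load")
```

The result moves a lot with the electricity price and with how many hours a day the cards are actually loaded, so treat it as a way to plug in your own numbers rather than a verdict.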
Could you run these with or without gaming GPUs on the TRX50 Gigabyte with a 24-core Threadripper, please? I would love to see you stream a demanding game while running image stuff in the background
I think that this is probably THE most powerful SFF card, in existence, right now.
A used GV100 would be a little bit more than this, price-wise, but it also consumes 250 W vs. 70 W, and it's a FHFL card rather than a HHHL card, so it doesn't fit in a lot of shorter server chassis.
I think this is probably THE most powerful SFF card in existence right now.
Fixed.
@@tolpacourt
I am fairly confident that there are other SFF cards that aren't public and are probably more powerful than this.
They're just not public.
(By the time this card hit the shelves, you can bet that Nvidia is already working on the next one that's going to be even more powerful than this.)
18:13 But I think 3x4000 wouldn't match a 4090, the model would be split across the devices and be even slower. You could run much bigger models though, but at x3 the price too.
@Level1Techs any ideas why/when could we hope to use the AD104 20G as vgpu on pve? Thanks, great video!
I wonder if the SFF PNY version will fit in a Dell R30/R640? If it would, it could be amazing with Proxmox
This should be a $500 product.
Does it fit in Minisforum MS-01? Edit: Nevermind, lost me when you said it costs as much as RTX4090. :(
Cool product demo! I’ll have to pay closer attention to Quadro features and support as opposed to the gaming cards. Newer to ML, most recommend the 4070 Super it seems for a homelab, but limited to 12gb of RAM and 200W
Maybe a stupid question, but can this GPU fit into a 1U server?
How would this compare to any of the Intel graphics cards for a media server?
Is this gpu not massive overkill for plex or jellyfin etc?
I couldn't find a preorder for less than $1200 this generation versus the A2000 before it that was often available for well well below MSRP due to the Crypto bust.
How is it running these GPUs in an external GPU case? Most times i don't need the power. But the days i do i would love to plug in and use it.
Almost dream specifications, needs to be single slot to fit into my server.
Fan sound output differences between the cards are an important factor too.
Stable diffusion can easily use more VRAM if you are generating multiple images in parallel (use batch generation).
I'm REALLY interested in ROCm and AMD's W7500 RDNA3 based workstation GPU for my home server!
Thank you so much for the comparison. I just need to find an A2000 comparison for AI now.
This card potentially has a huge market for enterprises that want to upgrade their non-GPU servers to add AI inference support. There's tons of Dell/HPE/etc servers out there that have 75W PCIe slots, but no PSU/PCIe-12V-connector for higher-power GPU (gaming or otherwise). And upgrading a rack of 20x 2U servers only adds 1400W, which you probably wouldn't even need to upgrade PDUs or add cooling to support.
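The rack arithmetic in that comment is easy to check. A small Python sketch, assuming one card per server and the 70 W board power quoted in the thread; the 75 W slot limit is the figure from the comment, and the function names are just for illustration.

```python
def added_rack_power_watts(servers: int,
                           cards_per_server: int = 1,
                           watts_per_card: float = 70.0) -> float:
    """Worst-case extra power if every added card runs at its limit."""
    return servers * cards_per_server * watts_per_card


def fits_slot_power(watts_per_card: float, slot_limit_watts: float = 75.0) -> bool:
    """True if the card can run from the slot alone, with no extra power connector."""
    return watts_per_card <= slot_limit_watts


print(added_rack_power_watts(20))   # 20 servers x 1 card x 70 W
print(fits_slot_power(70.0))
```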
It doesn't support vGPU. Shame. As it's most likely to be used in systems with just one PCI-e slot. So having a second GPU for passthrough to VMs is out of the question.
Any other way to use this card for both the host and guest VM?
Brother I gotta say I love seeing the AMD wraith RGB cooler!! I'm still using mine from 2018
I am building a Z790-A I-9 with 2 fan cpu radiator and a 1500 psu, 2tb m.2...and 64GB RAM (possible 128GB in a month~s/h). I couldn't imagine needing a $1800-2200 GPU. If I wanted to make an ornate wood enclosure PC for someone, I would recommend. Not exactly something you can get replaced on the fly, I am already two-three weeks out on shipping as is, I would want onboard graphics in that case.
Perhaps it would be the equivalent of replacing a Tesla S glass roof vs the windshield. Start buying insurance, right?
I hope they make a Blackwell RTX version of this, with a little less VRAM and without ECC so it’s more affordable. And then obviously a Quadro version too. There would be quite a few gamers that would want this for SFF, plus home lab users, and I bet some system integrators would make some cool systems too.
I could see this added to a NAS. Containerize here and there and you have a solid home lab
Is it possible to run two in parallel for inference, to use an LLM larger than 20GB?
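As a back-of-the-envelope check for questions like this, you can estimate how many 20 GB cards a model needs purely on VRAM. This Python sketch is a sizing aid only: it assumes the model can be split evenly across cards, and the 10% overhead for context/KV cache and runtime buffers is a made-up placeholder, not a measured value. It says nothing about speed, which (as noted elsewhere in the thread) will not scale like a single bigger GPU.

```python
import math


def cards_needed(model_gb: float,
                 vram_per_card_gb: float = 20.0,
                 overhead_fraction: float = 0.10) -> int:
    """Minimum card count by VRAM alone, with a rough allowance for overhead."""
    if model_gb <= 0 or vram_per_card_gb <= 0:
        raise ValueError("sizes must be positive")
    required_gb = model_gb * (1 + overhead_fraction)
    return math.ceil(required_gb / vram_per_card_gb)


# Hypothetical model sizes in GB (e.g. the size of the weights on disk).
for size in (13, 35, 48):
    print(f"{size} GB model -> {cards_needed(size)} x 20 GB card(s)")
```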
@Level1techs possible to cover Alveo MA35D ? 32x AV1 streams at 35 watt for ~1599 usd ?
Proof that a 70-watt, 128-bit bus RTX 3060TI-class card is possible. 👍
Can't wait for 35-50W CPU + 70W iGPU Zen5 Strix Halo APUs and whatever comes after for Zen6!
Would love if Wendell , since he mentioned Plex, would test out an Intel Arc Pro A40 (AXX series), as it's the only card(s) I'm aware of that claims to support Dolby Vision... It seems the PC world has seemingly given up on Home Theatre... pretty much no Blu-ray decoding since the Intel Core Gen 10, no HDR10+ or Dolby Vision, and I think only one company makes a CEC device for PCs, you'd think by now since SBCs like the Pi have had CEC for 5 years, PC Motherboard makers would be trying to add this feature and the relevant hype to get their pound of flesh... so much for progress.. maybe the AI on the RTX 4000 can figure out a decent movie or entertainment
why not run ollama in docker as well? makes setup & cleanup way easier... for ease of networking, shove every additional tool into the same compose file and connect them by the service name :)
Swapped out my 1060 6GB for an RTX A2000 in my Unraid server a while back and it's been working perfectly. I can do at least 6 transcodes at a time with no issues, and I'm sure it could handle even more. It was a steal at $270 after tax.
for a 12GB A2000?
What about running the LLM's under Proxmox? Is that feasible at this point with this GPU?
Via docker? Sure. Via pcie passthrough also works if you just need a docker host to share it
Why should I get this over the A380 low profile cards, which are a fraction of the cost?
Been using the 12GB A2000 for a while now, it's perfect for me so far.
I don't plan on using any nvidia hardware and their proprietary drivers, so I just hope this brings in a bit more competition in the SFF space. I love small and tiny GPUs, as I don't play any AAA games and love SFF PCs.
I was just talking about Triumph the Insult Comic Dog recently. Nice. Regardless, I love these types of vids. Don't need it, but why not? Well done.
How are they with Folding@Home? :)
Make a review of the Intel Arc a310 or a380 for transcoding and stuff like that. Also, can you make a video about what AMD and Intel are doing with regards to their implementation of CUDA?
The next thing that is going into my homelab rack is either a Dell R7910 or a Dell R7920. When that decision is made, I want one of these cards in it.
the 4000 and 2000 are not on amazon so 🙂 where do you get it with international shipping
I'm still trying to decide if the mispronunciation of "epitome" was deliberate or not? :)
I picked up a refurbished Precision 3660 with an RTX A4000 16GB for barely more than the price of the card. Is it weird that is my idea of summer fun?
Can’t wait for this to turn into the future Quadro K620 that’s dime a dozen all over eBay.
70 (SEVENTY) WATTS !!!
wew
the vram + efficiency is just,,
it’s rly wild
hard to imagine preferring bulk deployments of 4090s over these the efficiency is just,,
wow
like maybe for a workstation type video editing or game dev rig,, ok 4090 might make more sense but
this is a rly cool card
and honestly i’m happily surprised it’s 1,250 not like 6,000 (nvidia don’t read this)
Great Video Wendell .... I wonder if it'll fit in a MS-01 ... Prob. too big ... but I can Dream 🙂
It's just too bad it costs as much as a 4090, where the A2000 12GB was MUCH more affordable. Granted, they came out with the Ada A2000, but it's extremely cut down from the A4000 and I've not seen anyone testing that yet to see how it compares.
Any GPU only uses max wattage when necessary. It does not always use max watts. It's a common misconception that the power requirement is constant. In my experience, many prompts may never use the full power and resources of the GPU. Less than 100 watts is very impressive!
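If you want to see this on your own card, `nvidia-smi` can report the current draw next to the configured limit. A small Python sketch, assuming an Nvidia driver with `nvidia-smi` on the PATH and a card that reports both fields as plain numbers; some cards print `[N/A]` for these, in which case the parser below raises `ValueError`.

```python
import subprocess

# Query current power draw and power limit, in watts, one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=power.draw,power.limit",
    "--format=csv,noheader,nounits",
]


def parse_power_line(line: str) -> tuple[float, float]:
    """Parse one 'draw, limit' CSV line into (draw_watts, limit_watts)."""
    draw, limit = (float(field.strip()) for field in line.split(","))
    return draw, limit


def read_gpu_power() -> list[tuple[float, float]]:
    """Run nvidia-smi once and return (draw, limit) for each GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return [parse_power_line(line)
            for line in result.stdout.splitlines() if line.strip()]


# Example (needs an Nvidia GPU and driver):
#   for draw, limit in read_gpu_power():
#       print(f"{draw:.1f} W of {limit:.1f} W ({draw / limit:.0%})")
```

Polling this while a prompt runs shows how far below the limit a given workload actually sits.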
What kind of Jellyfin or Plex server do you run?
I wish I had such a card for my SFF PC, without paying the workstation premium.
Doesn't docker require like 100 gb storage for install? Not a fan.
Try the Sparkle Eco, as it is low cost at around $100 USD. Ohh, and it's low profile and single slot. For me, I was wanting AV1 and a line-in sound card for a used Dell Optiplex 7060.
Ok, but this isn't "how to get a really cheap display output with AV1 acceleration", it's "the best compact AI accelerator add in card for power and space efficiency". The A310 is not a high performance AI accelerator
@@bosstowndynamics5488 AV1 is all I need it for. maybe for your application it is not very good.
hows it handle gaming and streaming of games?
The next generation of NAS devices will move to M.2.... but they will also need strong GPU for the AI indexing... So low-power, small form factor GPU is great!