These AI Accelerator Cards Hope To Be The Next 3dfx

  • Published Jan 12, 2024
  • AI was a huge topic at CES 2024, and there is a crop of companies showing off desktop-based accelerator solutions that hope to make huge waves in the market, à la 3dfx.
    Buy PCWorld merch: crowdmade.com/collections/pcw...
    Follow PCWorld for all things PC!
    --------------------------------
    SUBSCRIBE: th-cam.com/users/pcworld?sub_c...
    TWITTER: / pcworld
    WEBSITE: www.pcworld.com
    #ces2024 #ai #lenovo
  • Science & Technology

Comments • 418

  • @tr1ck5h07
    @tr1ck5h07 5 months ago +224

    They want to decline and be bought on the cheap by Nvidia?

    • @CaptainVKanth
      @CaptainVKanth 5 months ago +4

      This

    • @BoltRM
      @BoltRM 5 months ago +6

      Lenovo is a Chinese company & China is seeing countries limiting its supply of AI hardware.

    • @mattizzle81
      @mattizzle81 5 months ago +8

      Lol yeah not sure why you would want to be the next 3Dfx and only exist for a brief time and then become history. 😂

    • @lexwaldez
      @lexwaldez 5 months ago

      Thought exact same.

    • @Splarkszter
      @Splarkszter 5 months ago +7

      3Dfx failed because of a bad investment decision at a bad time; it was poor management and bad luck.

  • @sloo6425
    @sloo6425 5 months ago +206

    Wow, the last time I heard about accelerators it was for the math co-processor, then 3dfx, then PhysX; now we have AI accelerators. What a massive leap in the space of 25 years.

    • @RicochetForce
      @RicochetForce 5 months ago +7

      The march of technology never ceases to amaze.

    • @jwdickieson
      @jwdickieson 5 months ago +13

      And two of those were bought and absorbed by Nvidia so I'm looking forward to this happening again

    • @harrytsang1501
      @harrytsang1501 5 months ago +7

      Still a math co-processor, just that instead of just floating point it focuses on linear algebra

    • @DaemonForce
      @DaemonForce 5 months ago +2

      It's just code for enterprise gear.
      Seagate WarpDrive? Flash accelerator.
      Headless Quadro/Tesla/Radeon/Arc GPU? Graphics accelerator.
      nVidia Data processor? General purpose network/storage/service accelerator.
      Those cool encoder cards that can do like x264 very slow but LIVE? Specialized accelerator.
      There's a bit of AI potential in most of these and a lot of it is lost on the public because nobody notices any of it in the traditional desktop space. Also for what it's worth, GPUs are a co-processor.

    • @ericneo2
      @ericneo2 5 months ago +7

      Yeah, except PhysX was a proper card with power draw, cooling and drivers. This just screams snake oil. This looks like nothing more than memory chips and a controller on an M.2 board. There isn't even a heat sink for cooling like there is for an NVMe.

  • @Gigalisk
    @Gigalisk 5 months ago +84

    This feels a lot like what Ageia did for PhysX before Nvidia bought them and incorporated them into their GeForce GTX line.

    • @hentosama
      @hentosama 5 months ago +2

      And heavily nerfed the application... seriously, if you ever tried a real PhysX card vs what Nvidia released, even to date nothing compares to the original demos with the hardware PhysX, it was amazing.

    • @najeebshah.
      @najeebshah. 5 months ago +3

      Just not true lmao, even GPUs just a generation or two after that far outstripped the PhysX cards @@hentosama

    • @fujinshu
      @fujinshu a month ago +1

      @@hentosama Problem is, not many are using PhysX in games because they're not cross-compatible with other GPUs, and something like UE5 has physics simulators that are frankly good enough for most people and that are easy to implement across an entire range of computers rather than just NVIDIA GPUs. PhysX is still used in non-gaming applications, though.

  • @MW97058
    @MW97058 5 months ago +92

    3dFX - dang, that takes me back 20+ years! I had two running in SLI. Played Quake like no other! Great video!

    • @ImportRace
      @ImportRace 5 months ago +2

      Same over here, Great times

    • @thelaughingmanofficial
      @thelaughingmanofficial 5 months ago +7

      I had 2 of the 12MB Voodoo 2's. Quake and Quake 2 at 2048 x 1024 on a 17" 4:3 CRT. Mmmm so good. 😙👌

    • @mirek190
      @mirek190 5 months ago

      Lol
      Not possible for the Glide 3dfx driver with SLI
      Max with 2 SLI cards was 1024x768 @@thelaughingmanofficial

    • @jtjames79
      @jtjames79 5 months ago +1

      @@thelaughingmanofficial I had a friend with a similar rig. All the anti-aliasing in the world couldn't match just using more pixels back then.

    • @elektrosmokes1911
      @elektrosmokes1911 5 months ago

      2048x1024 huh? Lol. @thelaughingmanofficial

  • @wtflolomg
    @wtflolomg 5 months ago +157

    Two key things AI accelerator cards will need to provide: processing AND memory. I can foresee an accelerator offloading the need for graphics cards to share VRAM with LLM models. It's an exciting opportunity for card vendors and chip vendors... imagine an AMD or Nvidia AI card, with 24GB+ "IRAM" (Inference Memory), leaving video cards to devote their AI processing to rendering duties, while the AIA handles the AI for characters (actions and dialog) as well as interpreting speech input from the player. Of course, there is much more it could do, I'm just knocking the big obvious things out there now.

    • @PySnek
      @PySnek 5 months ago +12

      They will just integrate it to existing GPUs.

    • @wtflolomg
      @wtflolomg 5 months ago +16

      @@PySnek At the cost of performance and precious VRAM? Again, it's not a bad thing that GPUs have AI support (makes sense for DLSS/XeSS/FSR), but why should a game's rendering needs compete with an LLM taking up space and processing? A good AI accelerator would probably only need 4 PCI-e lanes and enough memory to host whatever models may be thrown at it. You could drop it in with ANY GPU/CPU, and get an immediate benefit from games and apps that will leverage AI.

    • @PySnek
      @PySnek 5 months ago +12

      @@wtflolomg Why? Because Nvidia wants the whole cake and has enough resources and brains to reach that goal

    • @wtflolomg
      @wtflolomg 5 months ago +9

      @@PySnek Nvidia is far better off creating a whole new category of PC add-ons. An AI card won't cannibalize GPU sales, it will add to Nvidia's bottom line by introducing another thing people will purchase for their PCs, and improve the performance of their GPUs in games, since the AI won't be leeching performance. This isn't rocket science, it's simple opportunity. LLMs have NOTHING TO DO with graphics. The only reason we are using GPUs to process LLM models is because of the tensor processing on GPUs - and in the process, STEALING PERFORMANCE. Separation is logical and inevitable.

    • @Mimeniia
      @Mimeniia 5 months ago +1

      I'd rather call it AIRAM; just limiting it to inference and not training as well is a bit... well, limiting.

  • @jaffarbh
    @jaffarbh 5 months ago +25

    These "AI Startups" all aim for one thing: to get acquired - otherwise file for Chapter 11!

    • @percy9228
      @percy9228 5 months ago +1

      It's strange how the author compares this to 3dfx; there are so many things wrong with that analogy.
      For one, no one knew how much demand there would be for graphics cards to play games.
      No one knew (in the mainstream) how computing would take over.
      Whereas everyone knows how important and how big AI will be, and thus how lucrative it would be.
      Secondly, the other big thing wrong is how powerful Nvidia is and how they will make sure they keep their AI crown.
      The only reason they are valued at that level is because of demand for AI. So a startup that has 30 million or whatever isn't competing with Nvidia; they are already working out how the next 5-10 years of this will pan out.
      They already have an advantage with their own tech to guide them.
      So no. Intel, AMD, and Nvidia are already making sure to get that piece of the pie.

    • @jaffarbh
      @jaffarbh 5 months ago

      @@percy9228 You make good points. Nvidia actually learned about their gaming cards being used for computing way back. That's why they built CUDA a decade ago. The point with startups in this field is that they can't survive 5 years, let alone 10 years without continued funding, which is difficult. That's why their best bet is to get noticed by Intel or AMD and get acquired in my opinion.

  • @donboyfisher
    @donboyfisher 5 months ago +90

    Imagine you had a GPU design which had a spare M.2 slot on it onto which you could add an AI accelerator like those. Like a PhysX add-in card.

    • @Slav4o911
      @Slav4o911 5 months ago +21

      The actual problem with consumer GPUs is that there is not enough VRAM, not the power of the GPU, so adding more TFLOPS does not help at all. People add more GPUs because they don't have enough VRAM in 1, 2, or even 3 GPUs; if the RTX 4090 had 48GB of VRAM it would be much better for AI acceleration than it is now.

    • @zivzulander
      @zivzulander 5 months ago +11

      For training large models that's definitely the case, but for running inference with smaller, more efficient models you don't necessarily need the same capacity as large GPUs. Most of the near-term use cases for _average_ consumers running local LLMs will be for things like chat assistants that can schedule tasks, summarize emails, generate images, etc. You don't need a ton of VRAM for that.
      A lot of the popular AI services right now are running in the cloud, anyway; the things that are going to run locally will be more for privacy with limited training (meaning smaller datasets/fewer parameters), requiring a smaller buffer. The home enthusiasts of course will be driven to buy dedicated GPUs with more VRAM, but that's not really the primary market for these types of machines with these cards.

    • @auturgicflosculator2183
      @auturgicflosculator2183 5 months ago +5

      @@Slav4o911 A6000 and W7900 both have 48GB of VRAM... if you feel like paying 2-3 times as much for a GPU. 😄

    • @CantankerousDave
      @CantankerousDave 5 months ago +11

      I think it was Asus that demoed a lower-end GPU that had an m.2 slot meant for an SSD. The thinking is that the card isn’t making full use of those pci-e lanes, so why not add in some system storage that can?

    • @Slav4o911
      @Slav4o911 5 months ago

      @@auturgicflosculator2183 Paying 3x more for 2x more VRAM is not a good investment. The problem is, the moment you go above VRAM you get from 10 to 20 times less performance. The most efficient AI GPU at the moment is a second-hand RTX 3090. Of course if you have an unlimited budget you go for the big Nvidia AI accelerators like the A100 and H100, but those are totally outside of the consumer space.
      An AI accelerator without a big chunk of VRAM is basically useless. These "accelerators" shown in the video are niche products catering to people who don't know what AI accelerators they actually need. They certainly don't need these M.2 accelerators; they would be very frustrated when they buy these hoping they can run Stable Diffusion or LLM models on them. I see even in this comment section people don't understand what these things are and think they can run LLM models on these... nah, without VRAM these are useless. Efficient models just run on less VRAM, not without VRAM; these "accelerators" don't have any VRAM, and going through 4x PCIe to the regular RAM will be extremely slow.

  • @denvera1g1
    @denvera1g1 5 months ago +41

    The Coral TPU from like 2019 had something like 5 TOPS and was under $100, though that may have been more of an ASIC, because every time I saw it referenced it was about image recognition, though up until ChatGPT that was mostly what people were using AI for.
    I used PhotoPrism for the same thing: identify, organize and search my photos.

    • @glabifrons
      @glabifrons 5 months ago +7

      They came down in price a lot. The M.2 version (PCIe, as this guy refers to it) is dramatically more compact and only $25. One with a pair of those chips (so same performance as the one he's holding) is only $40 now.

    • @quantuminfinity4260
      @quantuminfinity4260 5 months ago +3

      The dual Coral M.2 is $39 MSRP with 8 TOPS (4 per chip). Also, it's only 22 mm x 30 mm (M.2-2230-D3-E)!

    • @Archie3D
      @Archie3D 5 months ago +3

      Coral support seems to have stopped like 3 years ago. The Coral chip has 8MB of memory, which limits model size significantly, and they operate on 8-bit integers only. The dual Coral requires 2 PCIe lanes, which most motherboards don't provide (so you'll only be able to use one TPU out of the two).

    • @denvera1g1
      @denvera1g1 5 months ago +1

      @@Archie3D Most M.2 slots are x2-x4, but yes, while a bifurcation card would physically mount 4 of the TPUs for 8 total, you could only address one of them because most boards do not support 2x bifurcation, only 4x4x4x4 or 8x8, some 8x4x4.

  • @robertlawrence9000
    @robertlawrence9000 5 months ago +52

    Cool! They are showing us something that an NPU can be used for! I think I like the idea of having that be a separate device and not be taking up space inside of a CPU.

    • @TazzSmk
      @TazzSmk 5 months ago +2

      M.2/PCIe (even 5.0) is relatively very slow though...

    • @robertlawrence9000
      @robertlawrence9000 5 months ago +5

      @@TazzSmk yeah but that doesn't hurt the GPUs with AI. Why do we need this in a CPU?

    • @crazybeatrice4555
      @crazybeatrice4555 5 months ago

      @@robertlawrence9000 bandwidth is very high on die

    • @TazzSmk
      @TazzSmk 5 months ago

      @@robertlawrence9000 I guess it depends on scale, most AI-designed GPUs use HBM and tons of VRAM, where something like "casual" 24GB GDDR6X is uselessly slow and inefficient.
      I can't really answer why we need AI units in a CPU, but smartphone manufacturers have been doing that for years and Apple now does that in their Macs, and it seems to be fairly efficient

    • @auturgicflosculator2183
      @auturgicflosculator2183 5 months ago +4

      @@robertlawrence9000 NPU is faster at processing basic AI than either CPU or GPU, makes sense to have it located centrally.

  • @QuentinStephens
    @QuentinStephens 5 months ago +12

    I'm thinking more PhysX than 3dfx.

  • @Alice_Fumo
    @Alice_Fumo 5 months ago +20

    I would want something like this to bypass vram requirements of very large models, but apparently this chip only takes 4x10MB of model size, which unfortunately makes its uses extremely limited if it doesn't actually let you run stupidly large models... :(

    • @Slav4o911
      @Slav4o911 5 months ago +1

      It can't even run small models; this AI add-on card is not for running LLMs.

    • @Alice_Fumo
      @Alice_Fumo 5 months ago

      @@Slav4o911 It's really weird, though. If this thing actually performs like 40 teraflops and you can't use it on large models, it would run literally anything it can run at like 1000 fps. Thus I am inclined to think the 10MB might just be more like Cache instead of RAM.

    • @Slav4o911
      @Slav4o911 5 months ago +1

      @@Alice_Fumo It doesn't matter how fast it is, a GPU is faster than this yet when there's not enough VRAM the model runs slow. These things are like GPUs without any VRAM, which makes their actual usability extremely limited.

    • @Alice_Fumo
      @Alice_Fumo 5 months ago +1

      ​@@Slav4o911 I would disagree on the usability argument. It just needs to outperform a CPU by something like 6x.
      Not many people run models in CPU mode, but for example mixtral on a ryzen 5600x is almost usable. Same for whisperv3.

    • @Slav4o911
      @Slav4o911 5 months ago

      @@Alice_Fumo I don't think 3 min waiting for a single answer is "usable". Everything above 30 seconds is too much. Yes you can run the model... but that does not help much, you can run all GGUF models on any relatively new CPU... but the waiting time gets progressively longer. Also from the discussion on the forums it seems the bottleneck is the memory bandwidth and not so much the raw power of the CPU. Also most models have specific optimizations for Nvidia hardware and are not well optimized to run on the CPU or other GPUs. That's why an RTX 3060 is as fast as a 7800XT in Stable Diffusion and is much faster in LLMs (if you can compile and run them at all); almost nobody runs LLMs on Radeon graphics cards, because they have to know a lot about programming and Python code, otherwise they might not even be able to run the model. In the consumer space, at least for now, only Nvidia is viable for AI.

  • @tshackelton
    @tshackelton 5 months ago +14

    They wanna be the next PhysX?

  • @CNC-Time-Lapse
    @CNC-Time-Lapse 5 months ago +19

    I am very excited about this. Anything that can accelerate local AI applications like Stable Diffusion and LLMs will be a game changer. If they can help accelerate and offload tasks from the GPU, these are going to sell. I am wondering if they will eventually allow for parallel processing or dedicated VRAM modules purely on these (to be truly dedicated offloading), as it would be nice to not even require a GPU that has CUDA (or other tech) or use large VRAM sizes.

    • @Slav4o911
      @Slav4o911 5 months ago

      Sadly these are not for that. Without onboard VRAM they can't accelerate Stable Diffusion or LLM models.

    • @jonmichaelgalindo
      @jonmichaelgalindo 5 months ago +4

      They literally had Stable Diffusion running on the right monitor, slower than my laptop that doesn't even have a GPU. Without memory, these things are 100% trash.

    • @CNC-Time-Lapse
      @CNC-Time-Lapse 5 months ago +1

      @@jonmichaelgalindo First gen tech usually is. I'm hoping they can parallelize them and add their own memory to improve its performance. This is something I'm sure they will eventually do.

    • @Slav4o911
      @Slav4o911 5 months ago +1

      @@CNC-Time-Lapse I don't think that would happen, these are just niche products for visual recognition. Stable Diffusion can run marginally good on almost any GPU. LLM models are the real problem, where it's ridiculously expensive to run them if you don't have your own hardware. You either have to wait 200 seconds or more if you run them on a CPU, or you have to get 3090... better 2 of them to run a good model and not wait a few minutes.

  • @adamlongaway
    @adamlongaway 5 months ago +12

    Notice NPU percentage wasn't shown. My guess is you don't use it much.

  • @double0cinco795
    @double0cinco795 5 months ago +20

    Aren't the Coral AI cards the first "era" of AI accelerator cards?

    • @CheapSushi
      @CheapSushi 5 months ago +1

      thought so too, and they have a few different variants with different connector keying

    • @365tage7
      @365tage7 5 months ago +2

      TPUs are optimized for specific machine learning algorithms and use cases, while MPU AI accelerators can handle a wider range of AI workloads, but may not be as efficient at specific tasks as TPUs. But I guess the key difference is that TPUs are proprietary hardware developed and used primarily by Google, while MPU AI accelerators are available from a variety of vendors and can be used on any computer with an MPU processor. Another problem is that TPU are mostly supported for Linux systems.
      It is also true that some MPU AI accelerators do have dedicated memory specifically designed to handle AI workloads. This "on-chip memory" or "local memory" is designed to store the data and intermediate results of the AI algorithms being processed by the accelerator. This allows the accelerator to access the data it needs quickly, without having to rely on the main system memory.

    • @smorrow
      @smorrow 25 days ago

      @@365tage7 Is "MPU" a new thing I haven't heard of or are you just consistently misspelling "NPU"?

    • @365tage7
      @365tage7 25 days ago

      @@smorrow MPU is short for Main Processing Unit, NPU means Neural Processing Unit. Those are different kinds of accelerators. MPU is for multiple tasks. NPU is specialized for certain tasks in NN or ANN.

  • @gavincstewart
    @gavincstewart 5 months ago +4

    Wow, awesome coverage! I had no idea this technology was unveiled at CES this year, nobody else seems to have covered it. Thanks for shedding some light on this, I truly appreciate it. Stuff like this makes me really excited, I would love to have an AI accelerator card in my PC!

  • @bloepje
    @bloepje 5 months ago +6

    I can remember that "AI accelerator" cards have been available for a while now, even in M.2 format.
    Maybe the difference is that they have added Windows support, instead of it just working on Linux.

    • @MaddJakd
      @MaddJakd 5 months ago +1

      This.
      This is not new by any stretch. Just that what we have is ungodly expensive and in different form factors.
      Guess someone has to play on the craze though

    • @MaddJakd
      @MaddJakd 5 months ago

      @ts757arse I clearly meant "craze" as in capitalizing on the buzzword.
      This hardware isn't exactly new. I'm not even sure the form factor is totally new, considering you can find M.2... anything, if you need it.
      I was looking into dedicated accelerators before, and AMD has the Alveo, I think they're called. That's the obvious one. Everything else is custom solutions.
      Add how everyone's launching CPUs with dedicated AI hardware, and honestly I have to wonder where these really fit in overall outside of special cases and "extra because why not." I guess there are also the folks who simply refuse to update hardware overall but happen to have an open slot, but even then, if they're serious about this, step into the future already instead of 1/4 stepping. I can't imagine this would be useful without at least 64 gigs of system RAM.

  • @johndoh5182
    @johndoh5182 5 months ago +8

    These are going to be important. I'm glad all the system builds I've done for myself, an Internet cafe, and a school have 2 NVMe ports, and all the better-quality boards I bought have 3 (all 500 series AM4, either B550(M) or X570), so those 3-NVMe-port boards can run dual OS and STILL have an NPU.
    It won't matter for the next couple of years, but in 3+ years it will.

  • @pig1800
    @pig1800 5 months ago +6

    This kind of product has already existed in the China market for years... The first major product was released around 2019 I think, and provided 8 TOPS of AI computing performance.

  • @deadinside777
    @deadinside777 5 months ago +3

    Can't wait for the must-have gamer AI accelerator cards with AI RGB.

  • @kozad86
    @kozad86 5 months ago +6

    Intel has a stand-alone NPU on a PCIe M.2 card already, but it's not something you can easily just buy. I think the best application for these NPUs will be to upgrade older PCs lacking an NPU, or eliminating expensive GPUs from machines focused solely on AI.

    • @Slav4o911
      @Slav4o911 5 months ago +2

      You can't do that because there is not enough bandwidth from the PCIe bus to the RAM. GPUs have plenty of power, but the moment you offload part of the model from VRAM to RAM, there is a very big slowdown. I'm running LLM models and the moment they go outside of VRAM they become progressively slower, even if only, for example, 20% is "outside" of VRAM. For example the same model takes 10 seconds if fully inside VRAM and 20 seconds if partially outside VRAM... if it's 50% outside it's 60 seconds... and so on... working only on the CPU it's 200 seconds...
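
      A quick way to see why the slowdown is so steep: token generation is largely memory-bandwidth-bound, so any layers that spill out of VRAM get streamed at system-RAM/PCIe speed instead. The sketch below is only a rough illustration; the bandwidth and model-size numbers are assumptions, not the commenter's setup.

      VRAM_BW_GBPS = 900.0   # assumed high-end GPU memory bandwidth
      RAM_BW_GBPS = 60.0     # assumed dual-channel system RAM bandwidth

      def time_per_pass(model_gb: float, frac_in_vram: float) -> float:
          """Seconds to stream the whole model once, split between VRAM and RAM."""
          in_vram = model_gb * frac_in_vram
          in_ram = model_gb * (1.0 - frac_in_vram)
          return in_vram / VRAM_BW_GBPS + in_ram / RAM_BW_GBPS

      model_gb = 20.0  # hypothetical quantized model size
      for frac in (1.0, 0.8, 0.5, 0.0):
          t = time_per_pass(model_gb, frac)
          print(f"{frac:.0%} in VRAM -> ~{t:.3f}s per full pass over the weights")

      Even with only 20% of the weights spilled out, the slow portion dominates total time, which matches the pattern of numbers quoted above.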

    • @ikemkrueger
      @ikemkrueger 5 months ago

      That depends on the price.

    • @johnq.public2621
      @johnq.public2621 5 months ago

      @@Slav4o911 What's your setup?

  • @RifterDask
    @RifterDask 5 months ago +5

    Feels like a solution looking for a problem at the moment. I could see these things being used to enable some really crazy procedural experiences in games, but it feels like that could be done with the tensor cores on a GPU as-is.

    • @SaHaRaSquad
      @SaHaRaSquad 5 months ago +1

      If Nvidia keeps being this stingy with VRAM their tensor cores won't help.

  • @Finite-Tuning
    @Finite-Tuning 5 months ago +25

    This makes sense I guess for adding AI-specific compute power, but I still fail to see or understand how "AI" is any different than any other program or algorithm. If a code base is sufficiently large with enough options and variables, then it's just a matter of picking/placing the correct variable with the correct option, just like any program always has. If it's really actually any different than that, well then I guess I have a lot to learn.
    Cheers 🍻

    • @cxsey8587
      @cxsey8587 5 months ago +15

      It’s more that you can create a chip that’s optimized for a specific algorithm. Same way that you use a GPU and a CPU for different operations. You COULD use a CPU to perform graphics operations but it would be very slow.

    • @granatengeorg
      @granatengeorg 5 months ago +3

      Well, just like graphics can also be run on any CPU, AI can also be run on any CPU. But that doesn't mean it's gonna be ideal. CPUs are good as general purpose processors, but since their internals need to be generic enough to be used for anything, they will never be as efficient or optimized for a specialized task as other pieces of hardware built for it. GPUs started becoming a thing because loads of parallelized vector math require a focus on different kinds of instructions and a different overall processor layout to run better. And it just so happened that GPUs were better suited for AI than CPUs out of the box. However, GPUs still contain a lot of stuff that AI doesn't really need, hence why specialized AI accelerator cards can make sense.
      Whether or not it will be a big enough difference to warrant different hardware vs just keeping it on a GPU, only time will tell. Similar developments happened with raytracing and before that PhysX, and those all just got incorporated into the GPUs after all.

    • @zivzulander
      @zivzulander 5 months ago +5

      My layperson explanation: Artificial Intelligence/Machine Learning is largely different from traditional computing algorithms in that it is self-training and stochastic, meaning that it can at least approximate some level of human-like inference. In other words, AI can often (not always - "hallucinations" being a more visible failure) successfully carry out a task in different ways without being deterministically programmed with those decision routes. This is why something like Stable Diffusion or ChatGPT isn't going to necessarily spit out the same image or answer every time, even if given identical prompts: someone didn't program each and every output - the AI model is arriving at those different results on its own.
      The reason why you might want dedicated AI hardware is similar to the reason why dedicated graphics were developed even though CPUs can technically do the same functions: hardware that is purpose-built tends to be more efficient and will have resources free just to do the tasks they are given.
      AI/ML like Large Language Models (LLMs) rely on matrix multiplication. GPUs happen to be pretty good at this - this is why Nvidia is ruling the roost right now in terms of AI hardware - but a GPU might be busy with other operations, especially if it's a low-end GPU in a power constrained laptop or an all-in-one desktop with only integrated graphics. CPUs can also perform AI functions, but even less efficiently than either GPU or dedicated AI hardware might (some locally running AI software will let you run on CPU but it can be painfully slow to do that).
      Right now it's not a big deal to just run AI tasks on GPU, but given a very near future where video games, streaming software, local assistants, etc might all be competing for resources on desktops, it makes sense to at least start off exploring dedicated AI hardware as an accelerator. We have seen some dedicated accelerators pretty much go nowhere in the past - PhysX comes to mind - but AI/ML does have a lot of burgeoning use cases, even for home users.
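
      To make the matrix-multiplication point concrete, here is a minimal NumPy sketch of the kind of operation that dominates LLM inference and that tensor cores and NPUs are built to accelerate. The layer sizes are made up for illustration; a real model stacks dozens of such layers.

      import numpy as np

      hidden = 4096  # hypothetical model width
      x = np.random.randn(1, hidden).astype(np.float32)              # one token's activations
      w_up = np.random.randn(hidden, 4 * hidden).astype(np.float32)
      w_down = np.random.randn(4 * hidden, hidden).astype(np.float32)

      def feed_forward(x):
          # Two large matmuls plus a cheap nonlinearity: ~134M multiply-adds here.
          h = np.maximum(x @ w_up, 0.0)  # ReLU stands in for the real activation
          return h @ w_down

      y = feed_forward(x)
      print(y.shape)  # (1, 4096); this repeats per layer, per generated token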

    • @TheGuyWhoGamesAlot1
      @TheGuyWhoGamesAlot1 5 months ago +3

      From my non-expert knowledge (anyone correct me if I am wrong).
      The thing with these machine learning algorithms is that they are a lot easier to implement, in a way, than writing a specific algorithm. Sometimes it is nigh impossible (like writing a deterministic, more "traditional" algorithm for a large language model). You, more or less, give training data, and the model gets better and better over time at mimicking the desired results.
      Training a ML (machine learning) algorithm to do basic math is sort of worthless, since we already know how to do it in computer science and math. Determining the sentiment of a review and extracting key words or generating an aesthetic image based on a prompt? Much more difficult.
      ML is essentially a general brute force sort of method of solving certain kinds of problems that are either too difficult or costly to do in a different way.
      Also to note, there are a variety of different ML techniques as well, which are better for different kinds of tasks.
      LLMs (large language models, like ChatGPT) are increasingly popular and focused on because of their ability to do a lot of language-understanding-based tasks fairly well. And with some newer models also being able to make API calls and execute code, they can in theory do a lot. It is the ultimate generalist.
      This is sort of why people were and still are excited for Boston Dynamics robots. There are better specific ones, but a generalist one adds a lot of flexibility, since hopefully they can do most tasks that humans can (in the physical sense).
      Dedicated NPUs could maybe be useful for keeping costs down in building a system that needs to do ML computations without having to buy a more expensive set up, or in something that needs to be energy efficient (drones or RC like stuff). They are targeting a small niche with it in desktops, IMO, at least currently, because if you were wanting to do AI ML stuff, you would buy either a GPU or just run it off a CPU slowly with more RAM. It seems, at least currently a weird middle ground, which isn't even that middle (more so like running on a CPU than a GPU with large VRAM).

    • @Finite-Tuning
      @Finite-Tuning 5 months ago

      @@cxsey8587:
      Yeah, exactly.... Or rather that's what I'm thinking. No different than a CPU vs ARM vs GPU vs APU, it's all just specific hardware and the specific code that drives it, same as it ever was. To me, "AI" just sounds like a new term for some old common everyday tasks. Just like "The Cloud" is a server and storage the same as it ever was, but Microsoft gave those two components a new name and now it's something special.!?
      Cheers 🍻

  • @SuperAaronbennett
    @SuperAaronbennett 5 months ago +9

    So what does a consumer get out of an AI accelerator card exactly? Like, what's the WIFM? (What's In it For Me)

    • @POVwithRC
      @POVwithRC 5 months ago +1

      You mean you don't know?

    • @zivzulander
      @zivzulander 5 months ago +6

      The first gen of these, probably nothing. But that was also true of early SSDs with miniscule capacities and high prices. In the future, though, they might just be another add-in card to handle AI compute loads more efficiently.
      Right now most useful AI is prosumer level, but as hardware proliferates then we should see more software proliferate to use it. Chicken-and-egg. But it's early days, otherwise.

    • @brendago4505
      @brendago4505 5 months ago +2

      If you run an open-source, locally hosted home security system with something like Frigate, adding an AI card lets you add shape recognition.

    • @pctrashtalk2069
      @pctrashtalk2069 5 months ago +4

      You get to spend money on something you don't need to support the hype.

    • @marshallmcluhan33
      @marshallmcluhan33 5 months ago

      An ideal AI accelerator for me would generate more tokens per second in large language models (Llama 2, Mistral, etc.) and faster image generation with Automatic1111/Stable Diffusion. If it had enough RAM/power it could also be used to train or fine-tune AI models.

  • @trashman1358
    @trashman1358 5 months ago

    Sorry, so you need a dedicated motherboard to plug this thing into? Will they work in tandem? Could we externalize these and jack them through a dedicated port?

  • @4.0.4
    @4.0.4 5 months ago +2

    The big problem with AI accelerators is that you need a ton of VRAM for useful AI (I have 24GB GPU and it's barely enough); which is likely the limiting factor in terms of price. Though this might have a market for Raspberry Pi kind of AI stuff.

  • @Ultrajamz
    @Ultrajamz 5 months ago

    Is this needed if a strong GPU is already in the desktop?

  • @mstreurman
    @mstreurman 5 months ago +3

    "We're looking at the Lenovo ThinkPad, which is a desktop"... No, nope, you're not looking at a ThinkPad which is a desktop... You're looking at a ThinkCentre... which is a desktop... A ThinkPad is always a mobile device...

  • @NetvoTV
    @NetvoTV 5 months ago

    There's something for desktop but it's built in, and it's on Apple Silicon, isn't it? Is this module going to be able to scale with more of them installed, and is any CPU able to talk with them directly?

  • @vladislavkaras491
    @vladislavkaras491 5 months ago

    Thanks for the video!

  • @rkeantube
    @rkeantube 5 months ago +1

    Remember in Terminator 2, one plot point was to destroy the AI CPU that Skynet needed to run on...

  • @sharkinahat
    @sharkinahat 5 months ago +1

    'The next 3dfx' sounds like 'the next Hindenburg'.

  • @NewsLetter-sq1eh
    @NewsLetter-sq1eh 5 months ago +1

    Are there more details about the performance with Stable Diffusion you can see on the right-hand screen? An accelerator for Stable Diffusion or LLaMa would be very interesting!

  • @dibu28
    @dibu28 5 months ago

    Seems like the second is used to generate images with stable diffusion and you can easily compare its performance with the performance of the current GPUs in Stable Diffusion

  • @popcorny007
    @popcorny007 5 months ago +4

    Difference between this and Google's Coral TPU?

    • @365tage7
      @365tage7 5 months ago +1

      TPUs are optimized for specific machine learning algorithms and use cases, while MPU AI accelerators can handle a wider range of AI workloads, but may not be as efficient at specific tasks as TPUs. But I guess the key difference is that TPUs are proprietary hardware developed and used primarily by Google, while MPU AI accelerators are available from a variety of vendors and can be used on any computer with an MPU processor. Another problem is that TPU are mostly supported for Linux systems.
      It is also true that some MPU AI accelerators do have dedicated memory specifically designed to handle AI workloads. This "on-chip memory" or "local memory" is designed to store the data and intermediate results of the AI algorithms being processed by the accelerator. This allows the accelerator to access the data it needs quickly, without having to rely on the main system memory.

  • @WCF06
    @WCF06 5 months ago +1

    Would love to see Mythic with its analog AI processor.

  • @PracticalAI_
    @PracticalAI_ 5 months ago +12

    This will be fantastic for local LLM projects; GPU cards are still expensive.

    • @Slav4o911
      @Slav4o911 5 months ago +9

      This thing has 10MB of memory... and no onboard VRAM, good luck beating any GPU. Nvidia RTX cards have more cache memory than this.

    • @mirek190
      @mirek190 5 months ago

      LLMs need a lot of fast memory... nowadays we need 80GB+ to run an uncompressed LLM.

    • @PracticalAI_
      @PracticalAI_ 5 months ago

      @@mirek190 you can use a Raspberry Pi to run it with 8GB; you need 80GB+ to retrain (but you need more than that, and few people do that)

    • @mirek190
      @mirek190 5 months ago

      On a Raspberry Pi you can only run a 7B LLM extremely compressed to 4-bit q4... better to use q5k_m or q6 @@PracticalAI_

  • @vulcan4d
    @vulcan4d 5 months ago +2

    So my Coral TPU ain't cutting it anymore? :)

  • @fajarn7052
    @fajarn7052 5 months ago

    How about a card with additional GDDR5/6 RAM on it, used as an additional VRAM frame buffer before we use system memory? Would that be possible?

  • @huyked
    @huyked 5 months ago +1

    Very interesting! What a time to be alive!

  • @kjyhh
    @kjyhh 5 months ago

    Lol, the pose sensor pointing at the camera guy.

  • @jeffrydemeyer5433
    @jeffrydemeyer5433 5 months ago +1

    How generic are these accelerators?
    Can a workload meant for the NPU on, say, MTL run on one of these?

  • @cem_kaya
    @cem_kaya 5 months ago

    The memory limitations are the main problem with add-in AI cards. Large AI models require a lot of memory (2GB-200GB).

  • @Roriloty
    @Roriloty 5 months ago +1

    Not for long; AI accelerator cards will expand in performance and also in size, so that means in the future we might have an AI card the size of a graphics card.

  • @hindesite
    @hindesite 5 months ago

    it makes sense to iterate the development of AI hardware outside of the CPU/SOC upgrade cycle and we'll produce less landfill this way.

  • @denniskliewer4
    @denniskliewer4 5 months ago

    For the Raspberry Pi 5 over PCIe this would fit quite well. For example, tasks like speech recognition on the edge would be feasible.

  • @spankeyfish
    @spankeyfish 5 months ago +1

    Google Coral chips have been available as m.2 cards and usb dongles for a year.

  • @nosirrahx
    @nosirrahx 5 months ago

    I also remember physics accelerator add in cards.

  • @MrArrmageddon
    @MrArrmageddon 5 months ago

    Will these boost VRAM for AI applications or only processing speed? Love to see these have like 4,8,12GB of VRAM that can combine with your main GPU. And also boost some speed. Be great if they could be used for like Stable Diffusion but also Local LLMs.

  • @dragosbogdan3450
    @dragosbogdan3450 5 months ago

    Cool. This can run on linux?

  • @EnochGitongaKimathi
    @EnochGitongaKimathi 5 months ago +7

    Unless the desktop has no dedicated GPU, I don't see their long-term usefulness. AMD just announced a desktop APU with an integrated NPU, and Intel will have something for desktop with Arrow Lake. Even without them, Nvidia will tell you every desktop with an RTX 30 onwards has on-device AI already.
    On desktop we don't really care for efficiency as we are plugged in.

    • @marshallmcluhan33
      @marshallmcluhan33 5 months ago +1

      Yeah what can be integrated into CPUs will be better than this. The new Snapdragon chips are probably stronger and the new androids will all have it by the end of the year.

    • @marvinmallette6795
      @marvinmallette6795 5 months ago

      @@marshallmcluhan33 What can be integrated into CPUs may be less upgradeable. How rapidly are we expecting this technology to evolve? Old school CPU technology is not likely to evolve anywhere near as fast.
      And on the GPU front, I don't see these NPU cards replacing GPUs anytime soon, if ever. GPUs are already heavily optimized for paralleled operations and are generally being marketed as AI cards anyway.
      NPUs are more likely to replace classical CPUs for operations that a machine learning algorithm can optimize for parallel computing better than an intern seeking a degree in computer programming. So they are more likely to replace the CPU, than the GPU.

    • @marshallmcluhan33
      @marshallmcluhan33 5 months ago

      @@marvinmallette6795 Well there are lots of software optimizations that are still being done, quantization can help AI models run on legacy hardware for example. In terms of hardware Windows is doing a push towards ARM chips. Chiplets and RISC-V are things that could unlock better hybrid CPUs. Also we might see a new AVX instruction set made specifically for popular AI applications. There's a large change coming to laptop and desktop CPUs.

    • @marvinmallette6795
      @marvinmallette6795 5 months ago

      @@marshallmcluhan33 I see these M.2 add-on cards as a form of "hybrid CPU".

    • @marshallmcluhan33
      @marshallmcluhan33 5 months ago

      @@marvinmallette6795 Yeah these may be a stop gap.

  • @adamo1139
    @adamo1139 5 months ago +2

    Is it really the same kind of FP16/FP32 5-10 TFlops per card that can be generally accessible with OpenCL as with AMD/Nvidia gpu's? So far I am dubious of the claims. It also doesn't seem to have any memory outside of the on-die 10MB, so you can run only shitty tiny models that can be easily handled by CPU anyway.

  • @dragomirivanov7342
    @dragomirivanov7342 5 months ago

    TFLOPS is floating-point operations, 16-bit or 32-bit floats. TOPS is usually INT8, because for inference you can usually go with INT8. Training AI is another story, and needs floats. So basically all players market TOPS, because people will use already-trained AI models, and INT8 is much more energy- and space-efficient.
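
    As a toy illustration of why INT8 TOPS and FP16/FP32 TFLOPS aren't directly comparable, here is a minimal symmetric INT8 quantization sketch in Python (a generic scheme for illustration, not any particular vendor's).

    import numpy as np

    w = np.random.randn(256, 256).astype(np.float32)  # stand-in for trained FP32 weights

    scale = np.abs(w).max() / 127.0                    # map the max magnitude to 127
    w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    w_back = w_int8.astype(np.float32) * scale         # dequantized approximation

    print("bytes fp32:", w.nbytes, "bytes int8:", w_int8.nbytes)  # 4x smaller
    print("mean abs error:", float(np.abs(w - w_back).mean()))    # small; usually fine for inference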

  • @kamillatocha
    @kamillatocha 5 months ago

    Why is there Stable Diffusion running in the background?

  • @adriangunnarlauterer4254
    @adriangunnarlauterer4254 5 months ago

    A low-power but high-memory GPU-like card specialized for ML, like a TPU, but with proper Torch support and like 32GB of fast memory, would be really interesting.

  • @Kwipper
    @Kwipper 5 months ago +5

    It will be interesting to see Automatic 1111 Stable Diffusion take advantage of this.

    • @streamtabulous
      @streamtabulous 5 months ago +2

      It won't work; SD uses CUDA or MLA. It would need to be patched to work, and even then an RTX 3060 is ~12.7 TFLOPS, so it would want to be very, very cheap to compete, and then you'd probably be like, well, a few bucks more and I can have a gaming card.
      I feel it's more business-aimed, for small form factor PCs as shown.

  • @uroy8665
    @uroy8665 5 months ago

    Will it work in laptop ?

  • @HuntaKiller91
    @HuntaKiller91 5 months ago +1

    Great for apu like 8700G offloading the AI work

  • @ludovicbon5903
    @ludovicbon5903 5 months ago +1

    "Jensen ! Jensen ! We are here, please buy us !"

  • @theworddoner
    @theworddoner 5 months ago +1

    How do they solve for memory bandwidth?
    I’d gladly buy an ai accelerator card if it has access to high memory bandwidth. The more the better.
    If they can make a product with access to 128gb memory and 60 tops then I’m down to buy it. Need to have good software support too.

  • @danielpetrov9179
    @danielpetrov9179 5 months ago

    This is more like the NEC PCX2 chip; it was used in the Apocalypse 3Dx and Matrox m3D and was a PCI video accelerator card without video input or output.

  • @CuttingEdgeRetro
    @CuttingEdgeRetro 5 months ago +1

    It's Ageia PhysX add-in cards all over again...

  • @KiraSlith
    @KiraSlith 3 months ago

    MemryX's product is based on their own MX3 chip, which can handle 10 million parameters per chip, 8 chips per M.2 package, that's still 88 M.2 devices (350 PCIe lanes) to address at once for a wimpy 7 billion parameter language model. In an ideal world, these chips would be able to address their own GDDR6x pools per-chip, but I don't know their architecture.
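
    For what it's worth, the arithmetic above is internally consistent; a quick sanity check in Python (using the commenter's claimed per-chip capacity, not verified specs):

    import math

    params_per_chip = 10_000_000     # the commenter's claimed MX3 on-chip capacity
    chips_per_m2 = 8
    params_per_m2 = params_per_chip * chips_per_m2   # 80M parameters per module

    model_params = 7_000_000_000     # a "7B" language model
    modules = math.ceil(model_params / params_per_m2)
    print(modules, "M.2 modules,", modules * 4, "PCIe lanes at x4 each")
    # -> 88 M.2 modules, 352 PCIe lanes (the comment rounds to ~350)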

  • @user-oz4ud9te4h
    @user-oz4ud9te4h 3 months ago

    3dfx legend is Back

  • @hartoz
    @hartoz 4 months ago

    Looks interesting, but what everyday application supports them. i.e. Why would I currently want one?

  • @mawkuri5496
    @mawkuri5496 5 months ago

    can it accelerate machine learning and deep learning training?

  • @dracos24
    @dracos24 5 months ago

    I don't really see the point in these. People are saying "great, now my graphics card is free for my games while I run AI", but in actuality you're still competing for the same VRAM, in $$$ if nothing else. Until there is actually a dominance of games that actually use AI while running, I can't imagine most people would be doing heavy AI generation while tying their graphics cards up to running a game on the same platform. It'd make more sense to have two separate computers if that was the case. Seems like they have to find a niche to fill, maybe some commercial application that wouldn't call for a graphics card normally, but has a need for AI.

  • @Bob-of-Zoid
    @Bob-of-Zoid 5 months ago

    Why are they calling that thing a "Think Pad" when it's clearly a "Think Box"?

  • @tombolts
    @tombolts 4 days ago

    When will these NPU add-on cards be available to buy and will they make more powerful versions say 100 TOPs and more?

  • @reinerheiner1148
    @reinerheiner1148 5 months ago +2

    The big problem for desktops is not AI computing power, but enough RAM with a high enough bandwidth to the GPU and/or the CPU or a dedicated AI chip to load the models into. Just for perspective, a 4090 with 24 gigs of RAM has too small a VRAM pool for many LLMs. Any serious AI accelerator would need huge amounts of fast RAM to outcompete a GPU. And again, today's GPUs are severely VRAM-size-limited relative to what we need for decently performing LLMs such as Mixtral, which in its original size needs about 48GB. Even a 1080 Ti could run Mixtral OK-ish when it comes to speed, if it had enough VRAM... I highly doubt these accelerators will be able to run any decent LLM. The only one so far going the right way is Apple with its M2 192GB RAM, 800GB/sec bandwidth hardware. Great for inference of big models. Not the fastest, because it's an M2, but it outperforms any consumer hardware because of RAM size and bandwidth. A 4090 would be much faster, but cannot fit big models in its RAM. All AMD and Nvidia would have to do is give consumer GPUs huge VRAM increases. But they probably won't, because this would make their server GPUs less attractive.
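
    A rough way to quantify the bandwidth argument: for memory-bound token generation, each new token requires streaming roughly the whole set of weights once, so tokens per second is bounded by bandwidth divided by model size. The figures below are approximate assumptions for illustration, not benchmarks.

    def tokens_per_sec_upper_bound(bandwidth_gb_s: float, model_gb: float) -> float:
        # Memory-bound rule of thumb: every token streams ~all weights once.
        return bandwidth_gb_s / model_gb

    model_gb = 48.0  # e.g. a Mixtral-class model at the size quoted above
    for name, bw in [("dual-channel DDR5", 80.0),
                     ("M2 Ultra unified memory", 800.0),
                     ("RTX 4090 GDDR6X", 1008.0)]:
        print(f"{name:>24}: ~{tokens_per_sec_upper_bound(bw, model_gb):.1f} tokens/s")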

  • @normanmadden
    @normanmadden 4 months ago

    "I am sorry Dave, I can't close the DVD door."
    /s

  • @user-mx4ij7hi2c
    @user-mx4ij7hi2c 5 months ago +1

    To beat GPUs, these vendors need to provide about an order of magnitude more high bandwidth VRAM!

  • @sempertard
    @sempertard 5 months ago

    I'd love to just have a compression/decompression add on card that is 50x faster than a "hot" multicore CPU. Something I could hand a 1/2TB VM file to and have it squeezed down in 2 minutes.

  • @alextrebek5237
    @alextrebek5237 5 months ago

    *Google Coral released in 2018*: Am I a joke to you?

  • @federicocatelli8785
    @federicocatelli8785 5 months ago +3

    Practical use of A.I. 🤔

    • @brodriguez11000
      @brodriguez11000 5 months ago +1

      @@SC-hk6ui Digital friends.

  • @Alan_Gor_Forester
    @Alan_Gor_Forester 5 months ago

    Great news!

  • @Stabby666
    @Stabby666 5 months ago

    That first demo is pretty much what we were doing with the Xbox Kinect over 10 years ago :) Not sure why it needs an AI chip.

  • @SlyNine
    @SlyNine 5 months ago

    The A.I. acceleration will be done on video cards just like physX. Tho I do like the idea as I play with local AI a lot.

  • @Fractal_32
    @Fractal_32 5 months ago

    Google did it earlier (2019) with Coral ai m.2 (TPU) cards.

    • @semicuriosity257
      @semicuriosity257 5 months ago

      Does Coral ai show up in the Windows 11 task manager?

  • @starmanmia
    @starmanmia 5 months ago

    That's it!!... I'm off to install my 3dfx Voodoo graphics card now... cya x

  • @Steamrick
    @Steamrick 5 months ago

    LLMs need a lot of memory. How does that work with accelerator cards like this?

    • @Slav4o911
      @Slav4o911 5 months ago

      It doesn't, it's a niche product for image recognition and things like that, it's not for LLMs or Stable Diffusion.

  • @coderhex1675
    @coderhex1675 5 months ago +1

    I think he was one of the legendary people who j3rked off the old Lara. Epic

  • @SahilP2648
    @SahilP2648 5 months ago

    People are misunderstanding what this card is doing. This is like an NPU (or Apple neural engine) which runs matrix multiplications hardware accelerated with efficiency in mind. Anyways, if you want to do anything ML related right now, the best option is to get a Mac since the Apple silicon Macs have unified memory which serves the purpose of both RAM and VRAM. I can run 70b models on my work 64GB M2 Max MBP 14" which is just bonkers. Sure that thing costs $3700 if bought new from Apple directly. If you want the absolute best, you can get an M3 Max 128GB unified memory MBP 14" for $4800 which is steep but with 128GB VRAM you can run 120b+ models like Goliath (Goliath takes up 70GB VRAM, and Goliath is amazing). I can see it as a long term investment and if you are a developer, buying a Mac studio Ultra or a MBP with a ton of memory makes sense if you are thinking 5+ years in the future. Macs go up to 192 GB for the ultra btw. The inference speed is also quite nice. My RTX 3070 only has 8GB VRAM for example, but my personal M2 MBA 24GB has memory equivalent to a 4090 (which I got for $1600 on ebay). Also considering how efficient Apple silicon is, you are going to save a lot of money by running your inferences and training on a Mac in terms of energy costs.

  • @saricubra2867
    @saricubra2867 5 months ago

    We need dedicated NPUs or neural cards like we have graphics cards.

  • @Kordanor
    @Kordanor 5 months ago

    What is the big advantage over GPUs doing AI? More power? More specifically made for that task?

    • @marvinmallette6795
      @marvinmallette6795 5 months ago

      Seeing as M.2 NPUs are on display here, I would expect significantly less power. It might be intended as an upgrade path forward for "legacy" x86 IBM/ATX computers.
      GPUs may be less marketable as an upgrade, being expensive, loud/thermally demanding, and having a reputation for being for "gamers" and not general purpose office work.
      I'm seeing them more likely having an advantage over x86 Intel CPUs, using legacy x86 code which is centered around ordered instructions across a limited number of cores. NPUs would provide a larger "core count" for heavily paralleled workloads that could be executed out of order with the oversight of an AI engine.

    • @Kordanor
      @Kordanor 5 months ago

      @@marvinmallette6795 Cool, thank you for the explanation!

  • @shephusted2714
    @shephusted2714 5 months ago +4

    AI is hype for SMB/SME market segments for 5 years, until we get affordable CXL 4 and more I/O. The HW and SW have to catch up; once they do, expect major economies of scale. But CES and this presentation were both depressing and a nod to the PC shipment nadir, plus nascent AI for business.

  • @aniksamiurrahman6365
    @aniksamiurrahman6365 5 months ago

    Only if these were available as separate hardware for end users, to be used with any PC!

    • @marvinmallette6795
      @marvinmallette6795 5 months ago

      PC Building experts will likely need to review the hardware to ensure there is enough PCI-Express bandwidth to ensure proper Operating System stability. It seems like just "any PC" would be likely to suffer crashes and reboots due to SSD disconnects should the shared PCI-Express bus experience excessive congestion.
      It does look from the thumbnail that such is the notion, separate M.2 hardware compatible with any PC, that has the available shared PCI-Express bandwidth... Probably best not to pair it with an NVIDIA Geforce or AMD Radeon Graphics Processor.

  • @StreetComp
    @StreetComp 5 months ago

    Make your AI Companion not only super hot but super smart with our AI Accelerator cards!!

  • @LelandHasGames
    @LelandHasGames 5 months ago

    I wonder if this technology could be used to bring ray tracing to older hardware or hardware that's incapable of ray tracing.

  • @WimukthiBandara
    @WimukthiBandara 5 months ago +1

    PhysX, not 3dfx. People forget that PhysX was originally an add-on card.

  • @highvis_supply
    @highvis_supply 5 months ago +1

    my year-old hailo-8 m.2 AI accelerators are sad that they missed out on being called first gen ._.

  • @frankkratosvlogs3469
    @frankkratosvlogs3469 5 months ago

    I didn't know that was available.

  • @alfblack2
    @alfblack2 5 months ago

    Awesome news! Woohoo!

  • @kylehaas864
    @kylehaas864 5 months ago

    Will this bring significant performance benefits to people who are just running regular desktop PCs? Or is this purely going to be a technology focused on AI-developer PCs? Honestly, a lot of the technologies that people are mentioning, like PhysX or 3dfx, lasted only several years before they were *mostly* made irrelevant by better CPUs/GPUs. In fact, the proprietary drivers associated with both of those technologies have mostly divided the community and created development/support nightmares for the companies who custom-tailor their software to take advantage of them.

  • @SinisterSpatula
    @SinisterSpatula 5 months ago

    If these dedicated NPU's are compelling enough, I imagine nvidia will just add them in to their GPU's rather than us having to buy yet another dedicated additional card.

  • @wettuga2762
    @wettuga2762 5 months ago

    I would call it the "next step in evolution" after SSDs and Ray Tracing , not the "next 3Dfx". By the way, I gamed on my 3Dfx Voodoo 2 just this week 🙂

  • @400ccmiruku4
    @400ccmiruku4 5 months ago +4

    wake me up when they're running Llama 70b on this

  • @psygnale
    @psygnale 5 months ago

    Great…now I gotta go to my storage unit and dig up my Quantum Obsidian X-24…

  • @rodzandz
    @rodzandz 5 months ago

    "What are AI accelerator cards? Why the newest gimmick to scam you out of your money of course!" No doubt these will soon go the way of those dedicated physx cards 😂

  • @wouldntyaliktono
    @wouldntyaliktono 5 months ago

    Wait, wasn't Coral the first to make an M.2 accelerator?