I was interested in the VROC key and bought one, but the compatible M.2 SSDs were old and hard to find, in Europe at least. I sent it back after 4 days, exasperated by this nonsense. I kept following up on the list of compatible SSDs, but months went by without any change. Not only was there very little info about the VROC key, you couldn't even find any way to inform yourself on the subject at the time. No contact, no service. This was with an ASUS Extreme Deluxe motherboard. I'm one of the sad, stupid guys who bought the 18-core 9980XE at its highest price. 6 months after I bought it, the price dropped by more than 50%.
I'm disgusted and don't want to hear anything more from them.
Features like VROC appeal at first because they're promising, and then as a customer you feel abandoned, resourceless, and devalued. What a shame!
This is the scenario for a lot of enthusiasts; VROC had some more success in the OEM space with light rebranding from Dell and HP but.. yeah
pretty crazy to think they're releasing tech without documentation
I gave up on VROC when looking into getting a new server last year. The documentation is beyond bad
Wendell being a better rep than any actual Intel rep says a lot.
Wendell is just awesome!
In my best cardboard cutout stilted TV reporter voice: Agreed, back to you Steve!
@@franklinlora7380 100%
Wendell isn't hamstrung by a stupid marketing boss.
Tells you why Intel's server revenue is down >100%
Between seeing this and the Solidigm video yesterday, I really have to admire just how much Intel gets in Intel's way.
It's not about accelerators being awesome or not-- having them in the silicon but locked out in some licensing scheme feels wrong in several ways 💸
Then don't buy it. Problem solved.
This is how IBM has been doing business for decades
@@Clobercow1 indeed. Buy a CPU that does not need to rely on accelerators in the first place. Pay for that CPU, use it, it is all yours; no recurring fees or post-sale transactions. According to sales numbers in the enterprise world, most companies have made their decision.
I'm not going to buy something that needs separate licensing, as there will be so much worse support for it in software. Why buy something I can barely use?
Then don't buy it. Problem solved.
This is a common issue I find with marketing. They dont understand the product so cant understand how to market it.
What they need to do is get the engineers and such to set the direction and then the marketing guys come in to dress it nicely for sales.
Especially with a technical product theres just too much obfuscation going on.
How marketing is structured depends on the organization. A large software company I worked for had a dedicated marketing team for the sales teams and leveraged us engineers for the deeper technical stuff. They handled everything else. Intel does go more in depth with partners, but Wendell makes a point: a company like Intel should do a better job with public marketing given their place in the market. Web marketing at some of these companies tends to stay in the "we've always done it this way" mode.
This video was just dumb.. it's not the marketing.. it's the exclusivity.. nobody implements features that only run on one type of CPU. That's why OpenGL and the like exist.. as long as Intel doesn't open source this and make it a standard.. nobody cares! But what do you expect from a guy who thinks Microsoft SQL is a thing.. lol..
"execu-lizard bean counter" 🤣
"You don't pay extra for it." Well we probably do, they just decided charge everyone and give it to them needed or not.
Why is this comment 13 days old?
Sometimes certain people can see videos before they are made public. Maybe this video is that old
@@SaperPl1 patron early access
Must be the reason an equivalent Dell EMC AMD system with 64 cores and 768 GB of memory costs $30k less
Except for the ones you need licensing to unlock.
As one of those engineers you mention at around 14:40, I really appreciated this video. Trust me, we have the same questions and concerns you do about Intel's marketing and executive team. Also, thanks for calling them out on including AVX in their accelerator list. We don't get that one either.
The unlock model isn't just annoying; in the long run, it harms the utility of a company's solutions, and many customers will migrate elsewhere. Buyers often feel like the company is trying to scam them, and this undermines the morale of long-term devotees of a product line. To sum things up, I hope the executives enjoy their yachts, because the customers are being made to (needlessly) suffer, so someone should benefit, and it isn't the average customer or even the average stockholder. Oh well…
We need a follow-up on this video. It's been 7 months since the release and I still can't find much on the topic.
Wendell is such an impressive human being
Pretty much everyone in the hardware deep dive journalism space is. Kudos & Congrats! 🥳👍
I honestly don't remember Intel Pentiums without MMX. I think the AMD K6-2 300 was around at that time, and I don't think it had it. Thank you for the informative video; there is so much yet to learn.
...once again the great act of Intel. Here are these accelerators; also Intel: we don't know wtf we're doing; also Intel: our engineers know exactly what they're doing.
If Intel's engineers were in charge and not media teams and penny pinchers, it's unlikely AMD would ever have made a comeback.
@@donkey1271 it'll be close.
@@donkey1271 that said we can be glad they did because the competition in the market has been super exciting.
I've actually been looking forward to upgrading in the last couple of years.
@@donkey1271 if AMD hadn't made a comeback, we would still be stuck at 4 cores and purely x86 chips. No thanks 😆 (warning: not a 100% serious comment. Please refrain from posting stupid "AMD fanboy" comments)
@@bernds6587 If bean counters didn’t run Intel they’d be where they are now, but like 3-4 years ago.
AMD's return to market forced said bean counters to let the engineers do their thing. They could be the Nvidia of the x86 world if it weren't for the top execs worrying about profit margins per SKU.
But competition is always good.
Isn't this the whole reason why AMD bought Xilinx?
Intel either needs to launch a product and keep it around or just remove the licensing burden from their products. They keep making things that sound neat and then trying to build a walled garden around them. VROC, Optane, and now the "Y" pay-as-you-go chips. They need to take the lessons from the last 4 decades of computing and adopt a 2-step approach: some super-simplified "basic" version in all CPUs, and then its bigger, badder, paid-for brother in the Xeon class of CPUs. This would require the integration for both pieces to be the same to work. For example, a basic RAID 0/1 version of VROC could be free on all Intel CPUs, with a more feature-rich version with RAID 5/6/10/etc. available for $20-$50 via a license key. If VROC had been deployed this way instead of everyone having to pay to even test the thing, I think it would have been a commercial success for them.
Maybe Intel's messaging is askew because their marketing team doesn't understand their engineering teams and they don't have anyone to translate between them?
This is exactly the problem.
Corporate politics is wack 🤪
The accelerators share memory bandwidth with the cores, and in server workloads memory bandwidth is almost always the bottleneck. I wonder how useful those "idle" cores really are while the accelerators are doing the work; I imagine they're crippled. That's a nice idea for a benchmark: run a workload that doesn't use accelerators while another workload that uses accelerators is running, and compare the results to when each of them runs alone. I think there's a lot of smoke and mirrors behind those cores looking idle.
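A minimal sketch of the contention experiment proposed above, in plain C with pthreads: time a memory-bandwidth-bound "triad" loop on a few threads, once by itself and once while an accelerator-offloaded job runs. The offload step is only a placeholder comment (there's no portable API for it); everything else is self-contained.

```c
/* Bandwidth-contention probe: run it once alone, then again while an
 * accelerator-backed workload (QAT/IAA/DSA job, etc.) runs on the same
 * socket, and compare the numbers. Build: cc -O2 probe.c -lpthread */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1u << 24)          /* 16 Mi doubles = 128 MB per array */
#define THREADS 4
#define REPS 10

static double *a, *b, *c;

static void *triad(void *arg)
{
    (void)arg;
    /* Threads deliberately hammer the same arrays: this is a memory-traffic
     * generator, not a numerical kernel. */
    for (int r = 0; r < REPS; r++)
        for (size_t i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];
    return NULL;
}

int main(void)
{
    a = malloc(N * sizeof *a);
    b = malloc(N * sizeof *b);
    c = malloc(N * sizeof *c);
    for (size_t i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

    pthread_t th[THREADS];
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < THREADS; i++) pthread_create(&th[i], NULL, triad, NULL);
    /* ...kick off the accelerator-backed workload here for the contended run... */
    for (int i = 0; i < THREADS; i++) pthread_join(th[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    /* 2 reads + 1 write of 8 bytes per element, per rep, per thread */
    printf("approx traffic: %.1f GB/s\n",
           3.0 * 8 * (double)N * REPS * THREADS / s / 1e9);
    return 0;
}
```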
You might think that whatever traffic needs to go through the load balancer has to go through a core, defeating the purpose of the accelerator. But Intel has the Data Streaming Accelerator, which is like a small processor that only executes moves to the WC memory buffer. So if you have a network card, an optimized driver will redirect traffic to the Dynamic Load Balancer with the MOVDIR64B instruction; the only "pollution" to the system would be the up to two instructions per cycle, which would only impact performance if you are scheduling 3 512-bit loads and 2 512-bit stores per cycle. No smoke and mirrors; the documentation is out there, the patents as well. There's a reason it's so hard to get one of these CPUs while every cloud company out there has them.
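For the curious, MOVDIR64B is exposed as a compiler intrinsic, so you can see how small the submission path is. A hedged sketch: the descriptor layout and the `portal` pointer below are hypothetical stand-ins for what a real driver gets from the kernel; only the `_movdir64b` intrinsic itself is the documented interface (GCC/Clang with -mmovdir64b).

```c
#include <immintrin.h>
#include <stdint.h>

/* 64-byte, cache-line-aligned work descriptor. The field layout here is
 * illustrative only; real DSA/DLB descriptors are defined in Intel's specs. */
struct __attribute__((aligned(64))) work_desc {
    uint64_t opcode;
    uint64_t src, dst, len;
    uint64_t reserved[4];
};

/* Submit one descriptor to a device portal (an MMIO page the driver maps).
 * MOVDIR64B stores all 64 bytes as a single direct, atomic write, so the
 * device never sees a torn descriptor. */
static inline void submit(void *portal, const struct work_desc *d)
{
    _movdir64b(portal, d);
}
```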
Yep, it's the new battleground.
It's all about effective compute per watt. Sure your general purpose cores are faster per watt, but are they faster and more efficient per the specific use case?
Intel was planning pay as you go for the consumer desktops before Ryzen 1 came out.
Intel didn't learn their lesson with Itanium
You were able to put into words a thought that had been lingering in my mind for years. oneAPI seems great, but Intel does not really know how to explain it to developers. It is just a flood of emails and invitations to online events, and you just don't know if it will be around in 5 years. Xeon Phi was also a good product, and they scrapped it without mercy.
One thing is sure, though... With that CPU under my butt I won't have to pay BMW for heated car seats.
After you graduate from high school
@@Teluric2 I don't follow...
If I were buying CPUs, I would choose a chip that doesn't artificially limit itself depending on how big my wallet is.
Don't know why, but this is the first time on video I've really noticed Wendell's weight loss, and he's looking real good!
Accelerators are basically ASICs, right? If you can make a die package that gets the right balance of General Compute with Specific Compute, you hit the jackpot.
It is not easy to get the balance right. Who will use AVX-512? I think less than 1% of buyers will use this feature, and yet you burden the other 99% of users with the die space taken and the power consumed. It is easier said than done.
@@catchnkill for server users, I'd guess more like 75% will take advantage of AVX-512; it's just a question of whether they know they're using it or not.
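That "using it without knowing" happens through runtime dispatch: libraries probe CPUID once and silently pick a 512-bit kernel. A minimal sketch in C (GCC/Clang; the function and variable names are mine, not from any particular library):

```c
#include <immintrin.h>
#include <stddef.h>

/* Compiled with AVX-512F enabled for this one function only. */
__attribute__((target("avx512f")))
static void add_avx512(float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {            /* 16 floats per 512-bit op */
        __m512 va = _mm512_loadu_ps(a + i);
        __m512 vb = _mm512_loadu_ps(b + i);
        _mm512_storeu_ps(a + i, _mm512_add_ps(va, vb));
    }
    for (; i < n; i++) a[i] += b[i];          /* scalar tail */
}

/* Callers just call add(); whether AVX-512 ran is invisible to them. */
void add(float *a, const float *b, size_t n)
{
    if (__builtin_cpu_supports("avx512f"))    /* runtime CPUID check */
        add_avx512(a, b, n);
    else
        for (size_t i = 0; i < n; i++) a[i] += b[i];
}
```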
The first 'accelerator' we had was the FPU chip, and with that you had a CPU socket and then an FPU slot. I can totally see having an optional accelerator slot for very common tasks: GPU encoding/decoding, compression/decompression, RAID, search (minimum-length matching or regex), as well as BigInt and cryptography (1024+ bit).
You talked a bit about PCIe peripherals... I wonder how much slower an accelerator-on-a-card would be compared to an on-CPU accelerator.
If not much slower, this could be worth it to increase market penetration by allowing old systems to support these accelerators, while still adding them by default to newer systems so devs can be sure that whatever they develop won't be an instant dead end.
Is this the return of the oddly specific expansion cards? I recall back in the early days of the IBM compatibles, before they took over the "PC" moniker (no, I'm not THAT old --barely an adult-- just interested in those fossils, don't ask), those were decently common -- separate sound cards, network cards, _even the floppy and hard drive controllers_ were on separate cards, and I'm fairly certain some high-bandwidth devices would have used them.
A key problem with this approach was how many expansion slots these took up -- by the time you had a functional system, you'd already spent 6 of your 8 expansion slots, so if you wanted to add a new-fangled network card, serial for a mouse, and parallel for your printer, OOPS, you're all out of slots.
I wonder if such a fate can be avoided by using M.2 as a sort of "micro-PCIe" slot for accelerators, somewhat like it's being used now for some wifi dongles.
PS: Can you talk about the Gaussian Neural Accelerator found in some CPUs? I wonder how fast it is and whether it's a dead end or something useful for _some_ use case.
Yes, I'd love to see a video on the GNA. At least that seems to be included free of charge :o)
5:32 Yeah, “on stage” it ran at 5 GHz with its 28 cores, cough…
IBM has been doing accelerators literally for decades with mainframes. It provided great value for their customers and ultimately worked very well for IBM.
I haven't been here in a few months and... HOLY SHIT WENDELL LOST SO MUCH WEIGHT 🤩!!!
A lot of tech is being obscured and is getting deep in the wool.
The reality is that unless the vendors really clear their fog, this stuff won't be understood, won't be bought, and won't survive.
The whole cloud and fewer and fewer people with knowledge and access is kinda self defeating.
As for accelerators, nothing new here. Bolting on capability is taking place across vendors. If you can get an advantage, it has to be applicable. Making something no one uses is a dead end.
I think the only solution, really, is for the big cloud vendors to take the lead on this, offer (and promote) new cloud instances that use the new hardware incorporating these cool new technologies, and put together a support team to help their customers use those new instances.
Accelerators could be awesome, but Intel is too greedy not to paywall them, and thus I don't think they'll catch on, at least for a while.
I hope accelerators catch on (not just at intel) but I really hope the whole "lock hardware features with software" crap won't catch on.
Like, paying to unlock seat heating.... what world is this
Intel needs to revamp their entire website from the ground up.
Yeah. Communication on Intel's side is really lacking here. For a couple of months I was not able to get any straight answer on what IAA actually does, what the API will be, etc.
As for VROC, it's awful right now. RAID 5 sequential write performance is terrible and no fix is in sight. Not to mention the ping-pong of abandoning and then restoring it as a product in early 2023.
Those features do exist, but the adoption rate is terrible. There is no one but Intel to blame here.
That IAA stuff is extremely cool, that is, if you have an application that needs it. It's somewhat of a chicken-and-egg problem, though: how are you going to know that you need it unless Intel can somehow figure out how to tell you that it exists, and, related to that, until the various software vendors that could really benefit from it start implementing support for it when it's present in your system? But then again, why would they do that if exactly zero of their customers have the IAA feature in their systems?

IAA would be a total killer for software that implements in-memory database systems. Those vendors should be going out and buying an IAA development system, implementing IAA support in their in-memory database, then updating their marketing material to say "Our software works great on your run-of-the-mill computer, but if you want to set up an in-memory database server that really screams, you have the option of installing our software on a system that has the new Intel IAA..." etc. But don't hold your breath waiting to see that kind of coordination between hardware vendor Intel (who can't even tell you what IAA is) and software vendors.

The one IAA question I had that Wendell didn't answer was: if you have (say) some plug-in PCIe card that has an IAA subsystem on it, does that thing share the memory space of the host computer, or would you put a whole bunch of separate memory on the card (and if so, what might be the size limit of such memory)?
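For anyone wanting to experiment, the software story for IAA runs through Intel's open-source Query Processing Library (QPL), which can fall back to a CPU path when no accelerator is present. A hedged sketch of a compress call, with API names as I recall them from QPL's headers; verify against qpl.h before relying on any of this:

```c
/* Sketch only: QPL function/flag names from memory, not a checked build.
 * qpl_path_hardware requests the on-die IAA; qpl_path_software runs the
 * same job on the CPU, which is what makes gradual adoption plausible. */
#include <qpl/qpl.h>
#include <stdint.h>
#include <stdlib.h>

int compress_with_iaa(const uint8_t *src, uint32_t src_len,
                      uint8_t *dst, uint32_t dst_cap, uint32_t *out_len)
{
    uint32_t size = 0;
    if (qpl_get_job_size(qpl_path_hardware, &size) != QPL_STS_OK)
        return -1;

    qpl_job *job = malloc(size);
    if (qpl_init_job(qpl_path_hardware, job) != QPL_STS_OK)
        return -1;

    job->op            = qpl_op_compress;
    job->next_in_ptr   = (uint8_t *)src;
    job->available_in  = src_len;
    job->next_out_ptr  = dst;
    job->available_out = dst_cap;
    job->flags         = QPL_FLAG_FIRST | QPL_FLAG_LAST
                       | QPL_FLAG_DYNAMIC_HUFFMAN;

    int rc = (qpl_execute_job(job) == QPL_STS_OK) ? 0 : -1;
    *out_len = job->total_out;      /* compressed bytes produced */
    qpl_fini_job(job);
    free(job);
    return rc;
}
```

The nice part of that design is that swapping qpl_path_hardware for qpl_path_software changes nothing else in the calling code, which is exactly the kind of fallback story that could break the chicken-and-egg problem described above.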
They've mostly been unused because getting access to them is too hard for developers.
If they were in our AMD or Intel consumer systems, we could take better advantage of these APIs.
Business/industry adoption will always fail, and economies of scale will too, unless Intel/AMD starts putting them in desktop chips. Years ago they said we'd get FPGAs in CPUs; still waiting on this... Integrating the FPGA and accelerator tasks into consumer chips would be the ideal scenario and really a better innovation IMHO.
At the minute there is not much reason to use Xeon/Epyc just for this one feature; consider that most contractors would have to buy this CPU and only be able to run this code on the one machine.
How can I implement an accelerator if I can't run it on my machine? And I hate when they make a crutch device that just gives you API access, like the Jetson for example, where they give you a dev kit but you'd prefer to just have the same chip in your CPU.
15:50 The thing that blew my mind the most was learning that those robots were guided by QR codes on the floor. That's how they know where they are, not any sort of guidance or telemetry software. And occasionally the bots would get confused because a sticker was dirty or torn and would misreport where the thing was, so the mechanics would have to go out with a handle, drive the bot to a known good spot, replace the sticker, and restart it.
Waooo Wendell you really have lost some weight 🎉
I believe it's a tick bite that started it all. He has a lot of restrictions on what he can consume, so I don't think it's healthy.
Another great video from Wendell. Intel has certainly confused people about these features.
Intel should appoint Wendell as the official decryptor of Intel's encrypted marketing material.
I have two Xeon Scalable 6138P chips with FPGAs
AMD could put a Xilinx FPGA chiplet into the CPU die
I would dare to say that's exactly AMD's plan.
I paid extra for an MMX back in the day and LOVED it; smoothest playback of the Weezer video of anyone I knew -- but I might have been the outlier
Everyone else just thought Weezer used cheap video equipment to make their music video. If they had seen it on your machine, they would have been amazed.
interesting to see how all this progresses
Intel needs to do 2 things:
1. Put one person in charge with one vision and one message, and spin off any division that's got a problem with that
2. Fix their website and documentation portal to make it easier for people to find out how to take advantage of their products. Even if that person's not an Intel partner and isn't interested in having an account (yet)
And that's all the consulting I'm doing for one of the world's biggest tech companies for free. Call me.
AMD has certainly noticed, with the "AI Engine" that the upcoming Phoenix APUs will have.
With the rumoured Phoenix Halo having a much stronger one as well.
Accelerators are the way to go for future products without adding a ton of cores for specific features and doing it as efficiently as possible.
The rumored one is Strix Halo. Phoenix has already released; no other products for that, except maybe a rebrand next year
WE LOVE WENDELL!
2 Wendel : you keep on keeping on 😂
sounds almost like Intel has implemented ASICs on their chips, and if you want to use them, you can unlock the microcode for them.
this would be an advancement, especially if/when AMD implements them as a standard feature.
Although many or most of us probably agree with Wendell that this idea of paying extra later for something that's already there on your chip is a dumb idea, there is another way to look at it. That is: why should I pay for something that I'm not using right now? Think of all the (now hundreds, probably) AV* and other add-on instructions that both Intel and AMD are now pretty much stuck with adding to their CPU chips, lest their chips look weak relative to their competitor's. Not that inclusion is uniform; in a lot of cases a given chip only implements some subset, so it's market confusion in any event (did you see that AVX-512 chart Wendell showed!?!). Think about the fact that in today's CPUs, maybe half the chip real estate goes to implementing that stuff, and 98% (estimate) of all users never touch it. To me that's close to a decent argument that variants of tomorrow's CPUs ought to be built without those features, but with twice as many cores instead. Rather than deleting those extra instructions completely, you leave a stub on the CPU that just decodes these performance instructions, then either dispatches them to an (optional) coprocessor chip that executes them or, if the coprocessor is not there, emulates them in software (see the toy sketch after this thread). Motherboard makers could include (or not) the socket for this coprocessor chip. Now, I doubt very much whether the industry will go this direction. However, it's at least interesting to ask the question: why not? Why are they making the everyday Tom, Dick, and Mary pay for the accelerator instructions that only Sven the Geek is using?
@@jimatperfromix2759
The suggestion you are making was kinda executed on the Intel Arc GPUs. DirectX 10 and older weren't directly implemented, but rather handled with emulation (I may not be 100% accurate on that); that's why their performance on such titles was really bad.
It'd be interesting to see how a software emulation of AVX would perform.
Maybe a solution would be to place such coprocessors on the motherboard, similar to how they do it with chipsets...
Or even simpler: make such coprocessors in the M.2 form factor. Got 3 M.2 slots? Use two for SSDs and one for your coprocessor.
Or one for an SSD, one for wifi, and the third for your coprocessor. Maybe future boards will have more M.2 slots.
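A toy illustration of the trap-and-emulate idea from this thread, on Linux/x86-64: catch the SIGILL an unsupported opcode raises, "emulate" it, and resume. A real emulator would decode the faulting instruction and update register state through the context; this sketch just skips over a deliberately planted ud2.

```c
/* Toy trap-and-emulate demo (Linux, x86-64, glibc). Roughly the mechanism
 * OS-level x87 math emulation used on FPU-less 386/486SX machines. */
#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <ucontext.h>

static void on_sigill(int sig, siginfo_t *si, void *ctx)
{
    (void)sig; (void)si;
    ucontext_t *uc = ctx;
    /* A real handler would decode the instruction at the fault address and
     * perform its effect in software. We just skip the 2-byte ud2 (0F 0B). */
    uc->uc_mcontext.gregs[REG_RIP] += 2;
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = on_sigill;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGILL, &sa, NULL);

    __asm__ volatile("ud2");               /* the "missing instruction" */
    puts("emulated and resumed");
    return 0;
}
```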
Why would anyone (mainly data centers) give up silicon die space (and performance) for a single-use-case accelerator, when you can have the silicon use the die space for fast all-purpose cores and put that accelerator in a PCIe slot? It just doesn't make sense economically.
Because if you have a data center with a specific use it's far easier? If you just need tons of general compute power, sure, they make no sense.
As for on-die vs PCIe, well, there's a gigachad performance benefit to on-die accelerators and their access to the rest of the CPU.
You can also get vastly better performance per watt using accelerators, and money is everything in the enterprise.
Do you use your CPU to generate images, or do you use your GPU to do that? Same principle.
@@donkey1271 well, I would rather have the CPU and GPU separate than give up CPU die space to make room for a GPU. Sure, specific use cases like Netflix/Twitch can benefit from single-purpose CPUs, but that's a niche. The majority of data centers (AWS, Azure, Google) would rather have as much CPU as possible and leave the accelerators to PCIe.
@@asishreddy7729 The amount of money Intel has put into these accelerators (many of which are not new) indicates there is a solid market for them. Things like AVX-512 cannot be done via PCIe. The majority of CPUs in the consumer space have iGPUs, so there's evidence that on-die is also the preferred option in many cases.
Because if your accelerator is on chip then you can avoid the PCIe space and pack in another few nodes.
@@donkey1271 But if you have a data center with a specific use case, why would you waste so much silicon on relatively high-performance general compute, not to mention all of the accelerators for tasks other than the one you're focused on...
2:30 Data explains how to be second in command. Great TV.
I think fundamentally these would be really interesting as a sort of IO-processor, similar to on mainframes
@5:05 Wendell, the X58 Xeons were overclockable.
All of the W36xx chips were the silicon equivalent of the famous X56xx, but better... cuz they had unlocked multipliers........
But yeah, X58 is ancient technology.
“Eyeballs deep” made my day
Can we talk about how wendell just looks fly as fuck these days?
Intel should have a combination of ASIC and FPGA accelerators. What's the point of buying Altera if you're not going to embed FPGA onto the CPU? Imagine a more configurable CPU that can be updated to support new codecs.
For the same reason AMD bought Xilinx. Potential and greed.
FPGAs aren't that fast, though. For application-specific tasks they *can* outperform CPUs in some cases, but they're never going to hit the point where they're bottlenecked by being a separate add-in card, and to even get there you need to spend a lot of money on FPGA silicon and on developing code for it. And the only benefit is that your CPU might last slightly longer, which is bad for Intel.
I bought my first computer 2 months before MMX was released. I really missed it.
Intel-Engineer: Hey, we finally have something to make us more competitive with AMD and regain market-share!
Intel-Management: Did you figure out a way to glue extra-cores, to the outside of the heat-spreader?
Intel-Engineer: No, no, this is actually a clean and innovative solution with like zero-overhead!
Intel-Management: Wow, we can charge a butt-load extra for this feature!
I love Wendell and Level1Techs!
Great intro vid! Thanks for posting! Wondering what you believe the easiest, simplest accelerator for AI compute would be? Like for Stable Diffusion, etc.? Something that could accelerate older hardware, possibly? Such as 6th-gen Intel with AMD 480/580 8 GB GPUs, etc.? Thanks!
I hope Intel would glue those accelerators for good
The Xeon E5-1600 series also had unlocked CPUs; I'm still on one now, a 1680 v2 8-core @ 4.4 GHz (3.9 stock).
Microsoft just dropped the news that they're working with AMD directly to spin up AI GPUs code-named Athena to help diversify supply
How much power do active accelerators use which is potentially removed from the regular cores’ power budget?
Far less than what the cores would use to achieve the same throughput without the accelerators, i.e. they free up thermal budget and reduce electricity costs.
That’s a given, I’m wondering about mixed workloads (it’s unlikely you can do everything over accelerators all the time) compared to AMD’s higher multi-core performance with lower power consumption during generalized workloads.
Intel has been on a downward trajectory since Andy Grove retired.
Just learned more from your vid than 10 hours of browsing, you RoKK!
11:19 that had me rolling lmao
Can you please do a video on AMD's PSP? It's AMD's version of SGX, but built on ARM's TrustZone.
They also have AMD SEV (per-VM memory encryption).
Intel is pretty much dead; I'd like to use AMD's accelerators
I wish I could get an audio file of Wendell just saying 'You're Fabulous!' to use as a ring tone on my cell phone. =^__^=
Can we please just have a bunch of programmable DSP/DMA cores inside the CPU? Then we could do with them as we will, rather than waiting to be spoon-fed proprietary black boxes.
looking good Wendz!? Did you lose weight brother?
When will the Xeon Max 9480 be released?
But enterprise RAID is a dumpster fire by your own admission... the only way I could see something like DSA actually being used by people in the know is if it were implemented in a big storage system, preferably ZFS, but I really don't see that happening in the near future.
I'm working on that.. some NVMe 1.3 drives do actually support the data integrity field! But... there is someone doing it differently you must know about
Could anyone please properly implement OpenZFS on Windows?
AMD would be using FPGA for their accelerators.
I don't like accelerators being bundled on the CPU. I wish they were connected via PCIe or some other lower-latency link.
That's exactly what I'm thinking; it was already like that. Buy a CPU if you need a CPU, buy an accelerator if you need one, but don't use space in either for what isn't useful. You're still going to pay for those accelerators even if you don't need them; I doubt Intel is "giving" those away for free, even if they plan to make money with them later on.
@@EvilDesktop totally! I do think the cost is in IP rather than manufacturing, though, so I wouldn't be surprised if they put the transistors on every die and just accepted a low yield (and binned accordingly). There are some obvious bandwidth advantages to having it on-die. I still don't like it as a customer.
It's a bit silly that Intel doesn't know their own tech
Intel is done. Microsoft is pumping big money into AMD for AI chips.
I'll take cores.
Of course big tech prefers more efficient techniques, wth are you talking about?! Especially if it turns out cheaper than getting the same performance by just buying more brute force...
Also, plenty of people bought MMX CPUs...
Time to sell those pumped AMD stocks, boyz...
17:30 Trying so hard not to laugh when IPP or I'm going to have to clean up.
Is this the optical bridging promised 15 years ago?
H.266 for AVX-512
Ipp "you know me".
Spend as much or more per socket, then pay extra on top of that for/to unlock accelerators. Somehow I don't think there will be much widespread adoption or industry support.
very curious for Epyc Turin
PAYG licensing is just wrong. You own the silicon.
Eye ball deep. Clever.
Wow... how did I miss this intel (Intel??), and someone's been working out! That's what's up... ✌🏽
Awesome video. This is what I've been waiting for from Intel themselves. Next, do a deep dive into their HBM!
I listened for more than 5 minutes, and I am afraid I still have no idea what accelerators are. I give up
I think the RGB LEDs in the background mess with the color of either the video or my screen 😅. Otherwise, really great video.
Consider me still confused; I wonder if this is even sellable.
ISA extensions are far easier to understand: a chip either has one or it does not.
Memory-mapped I/O is ez. It's legit clever... if it gets wide adoption
So they are making cpus obsolete faster now?
I hear accelerator and think Amiga
Intel trying to be like Apple now with their pricing schemes?
More like Sun/Oracle than Apple. Those guys charged out the nose and nickel and dimed for every feature.
I was wondering who Glen Berry was lol
They fact that "pay-as-you-go" made it out into the wild shows that despite Pat G being back in charge the finance people are the ones really running the company. That bodes ill for their long-term future.
After dozens of false starts... will Intel finally deliver?
Basically it's AMD's Fusion concept, right?
makes no difference, the prices are so ridiculous.
wifi better than a LAN connection?!?