USELESS GARBAGE, STOP RUINING THE PLANET BY CONSUMING ALL THE ELECTRICITY WITH YOU SHIT BOX COMPUTERS, SERVERS SERVER FARMS AND DATA CENTERS AND FUCK YOUR AI.
I SEE YOU ARE A IPA DRINKING DOUCHE BAG, IT IS OK TO ADMIT YOU DON'T LIKE IPA, NO ONE LIKES TO DRINK IT AND ONLY HIPSTER DOUCHE BAGS DRINK IT. FUCKING DOUCHE
It appears something in my video has made you upset. Text HOME to 741741 to connect with a volunteer Crisis Counselor, especially if you think you may be a danger to yourself or others. They offer support for Self-Harm, Suicide, Depression, Eating Disorders, Anxiety and Gun Violence.
🤣
Damn, looks like poor Chappy has caught the rage... Guess it's time we take him behind the shed before he hurts anyone... 😥
@@CraftComputing MVP response! I like it!
That escalated quickly
Oh, perfect for my new Pihole server...
With a TrueNAS server
@@VistasSrinagarun will probably eat 90% of the ram with just truenas lol
@@flop-z4l that's how zfs works......
I run Pi-hole on a Raspberry Pi; it's been working for years now. However, I want to pick up some old Xeons for a bit and a byte. Thanks for the video and for supporting Intel, lots of FUD out there. Looking at you, MLID on snoozetube; his days are numbered. Intel will come back, and when they do MLID will go broke. He begs for money on snoozetube, pathetic
@@flop-z4l buddy, try Unraid if you hate ZFS; you will not like the performance of non-ZFS NASes
To be honest, I really like the comparison against such old servers almost more, because it shows how much performance improvement you can get a couple of generations later, instead of taking the current or last-gen top tier and comparing against that. This is a more real-world approach because, as you said, most companies and people don't buy a new server every year, more like every 5-7 years. Of course these synthetic benchmarks don't tell the whole truth, but in general a comparison like this is more useful in practical terms.
Still a much higher chance of finding a Xeon E5 in a homelab anyway
Agreed
The cloud hosting for patrons is the best performance-gauging idea I have heard in years.
Despite this being your first new flagship server review, this might be my favorite one I've seen so far. I don't have the kind of budget to upgrade hardware constantly, and it really gives me a unique perspective that tech media like Serve The Home does not. I still really like Patrick's content to be clear, but this was just far more applicable to how I'd want to use servers.
Also, super well written in my opinion. Like your style! Great job!
Xeon 6 is even more fun with 6TB of 8800MT/s MRDIMMs! I'm currently demoing such a maxed-out server at SC24 in Atlanta, with FluidX3D HPC :)
Basically pinned that server to 100% AVX512 load on all 512 threads, 100% memory capacity load, and 100% memory bandwidth load, for an entire month, for one gigantic fluid dynamics simulation.
What are you using this monster for, and who pays the bills 😂
@@notaras1985 says that in the last 5 words
edit: also, he works at Intel
Hell yeah dude! Put those things to work and greetings from R&D! We had our fun with GNR a while ago, and now it's in the hands of people doing really interesting things with them.
Progress in fluid dynamics has become much more compute-intensive. There are a lot of non-linear equations you can't solve any other way, except in numerical simulations.
Good to know someone is still working on DA Hyperdrive 🤣
Congrats Jeff! Happy you get to get your hands on server equipment some of us only dream of. Thank you for all of the content over the years and congrats that it's paying off!
"Unlimited power" - who do you think you are, Palpatine? Crossing over from Spaceballs back to Star Wars?
Oh and 1:23 - that was smooth!
0:05 That sweet sound of a server spooling up its fans, getting ready to takeoff...
WHAT?!?!!?!?!?!
Just like Top Gun. 😆
@@alchemystn2o Basically Top Gun. Just need Danger Zone playing in the background...
85 dB(A) is like old-metro-train levels of noise pollution 😅
The noise of a server's fans spooling up during startup gets my blood pumping, like when they push the engines to takeoff thrust on an airplane
Now THAT is a server. good grief. I want.
As someone who actively manages anything from Haswell Xeons up to Emerald Rapids, I can indeed confirm the SMB space is very stingy and will keep a server around for close to 10 years as long as the parts are still available. The step up from Haswell and Broadwell to even Skylake was pretty decent because of all the extra cores and better pricing on Xeon Silver especially.
See this is the thing though, if your needs haven't changed and the hardware still supports you, and especially when you only have a couple and the power costs are less than a rounding error, why would you commit capital to an upgrade you won't notice rather than using it to build the business?
Because it will bite you a few years in. And then the capex isn't planned in because "it's worked so far, why change"
IT life cycle refreshes can also be seen as a form of preventative maintenance, even if there's no particular need for a performance upgrade, as they might ensure availability of replacements or even support.
The sound is fine and all, but when you have 50 racks of machines like that and you drop a workload on them, that screeching gets to your soul pretty quickly, and that's if they have a properly configured BMC... This is the type of sound that no amount of professional hearing protection with active NC is going to neutralize; it gets transmitted through your bones. After a while it really grinds on you and you get tired from it extremely quickly. I spent several years as a tech and engineer in a datacenter / server lab, and in the end I had to switch positions, because despite loving my job I developed a hardcore revulsion to going into high-noise environments. Like, I'd get physically sick after 15 minutes in.
Stay safe, and never go into a datacenter without hearing protection, even for a "quick 30 seconds". It's not worth it.
This is also why I am so happy about liquid cooling getting more popular. Products like Mitac's all-liquid system, which has literally a single fan for the PSU distro board electronics, reignite my passion. Not only is it more power efficient, it also allows humans back into the DC. It also lowers the cost of sound-proofing (imagine NOT having to leave an empty floor between offices and the datacenter because of noise).
Hopefully more people start wearing protection even for other lines of work like using machinery in a workshop…
Hello Jeff, about the openssl bench at ~11:00: SHA256 is not an encryption algorithm, it's a HASHING algorithm ... please use AES next time ^^ RSA is asymmetric encryption; it's not used much for bulk data, mostly to encrypt a symmetric key ^^
I tested both SHA256 and RSA4096.
@@CraftComputing SHA256 hashes data to secure its integrity
You're correct that SHA is hashing, not encryption. It's a minor mistake. But RSA is a fine test to have for a server. If the data throughput per client is low but the number of clients is high, which is common enough, the SSL handshake can certainly dominate. Especially considering AES is accelerated on all modern CPUs.
I also wasn't advocating what you should use, just showing performance for both.
@@CraftComputing indeed you did, and thank you for that. AES encryption is a very valid test, as it's hardwired into the Xeon CPU
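For anyone wanting to reproduce the three benchmarks discussed in this thread (hashing, symmetric AES, and RSA), a minimal sketch, assuming the openssl CLI is installed; the thread count is just an example to match this box:

```python
# Runs the three openssl speed benchmarks discussed above.
# The thread count is an example; match it to the machine under test.
import subprocess

THREADS = "512"

for alg in (
    ["-evp", "sha256"],       # hashing (integrity, not encryption)
    ["-evp", "aes-256-gcm"],  # symmetric bulk encryption (AES-NI accelerated)
    ["rsa4096"],              # asymmetric sign/verify (handshake-heavy loads)
):
    subprocess.run(["openssl", "speed", "-multi", THREADS] + alg, check=True)
```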
At this point we need to be testing these things with 25 VMs all running Cinebench at the same time.
I like the perspective you've taken here; love the real-world angle instead of just "new number is bigger"
Not as fast as my Dell optiplex I just upgraded recently with a SSD 😎
It's the same boost in performance, honestly 😅
Broadwell and Skylake as comparisons make a ton of sense. Work still has plenty of boxes nearly that old in production right now; 6-7 years is our standard, outside of a few groups who run 5 like clockwork and a few who run more like 10...
The likely reason why Skylake-SP did so poorly on the pi test is AVX-512. The clock speed throttles so significantly on AVX-512 workloads that well-optimized AVX2 would often be faster. This is on top of Skylake-SP's already reduced clock speeds compared to Haswell-EP/Broadwell-EP.
As for tests that could load all the cores, there genuinely aren't many outside of HPC that can do it via a single software instance. However, with that many cores consolidated into a single system, you can start thinking about running entire production environments. Think a big backend database, plenty of middleware services, and front-end web servers that scale elastically in the cloud to spread the load. Then run metrics like concurrent users or transactions under a minimum SLA time, as sketched below.
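To make that last metric concrete, here's a rough sketch of an SLA-style load test; the endpoint URL is hypothetical and the user counts are made up:

```python
# Hit an endpoint with N concurrent "users" and report what fraction of
# transactions complete within a latency budget (the SLA).
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://testbox.local:8080/"  # hypothetical front-end under test
SLA_SECONDS = 0.200                 # 200 ms budget per transaction
USERS, REQUESTS_EACH = 256, 100

def one_user(_):
    latencies = []
    for _ in range(REQUESTS_EACH):
        t0 = time.perf_counter()
        urllib.request.urlopen(URL).read()
        latencies.append(time.perf_counter() - t0)
    return latencies

with ThreadPoolExecutor(max_workers=USERS) as pool:
    lats = [l for user in pool.map(one_user, range(USERS)) for l in user]

ok = sum(l <= SLA_SECONDS for l in lats)
print(f"{ok / len(lats):.1%} of {len(lats)} transactions met the SLA")
```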
Okay, “a eye a eye a eye a eye” on a beer channel was not on my bingo card… 😂😂😂 Also, it's not really the cost, but the qualification for newer servers. Newer generations of servers could be cheaper, and most likely were, but companies will stick with what they have installed until the tech is obsoleted and they are forced to move.
Would make a hell of a proxmox server(commenting before seeing the video)
Great video! Love seeing the power of systems like these. Thank you!
Thank you for putting forward a long-term review for this. You're right, you can't just run a series of benchmarks to really comprehend how this thing can improve upon the previous generations.
I'd be curious how ollama performs in terms of tokens per second; I didn't understand the chart at 2:45. The larger Qwen 2.5 & Llama 3 models are 70GB, and Llama 3 has a 405B model which is difficult to fit onto a consumer GPU. If the server compares well to a bunch of NVIDIA 4090s, it may still be cheaper than an H100. For reference, it takes 2 4090s to run a 70B model using 4-bit quantization, or 12 4090s to run the 405B model. I rented a computer with four 4090s for a couple hours and tried Mistral & Llama, and they were pretty slow.
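For what it's worth, the back-of-envelope VRAM math behind those GPU counts looks roughly like this; the 20% overhead factor is an assumption, and real deployments budget more headroom for KV cache at long context (presumably where the figure of 12 cards comes from):

```python
# Weights-only VRAM estimate at 4-bit quantization (0.5 bytes/parameter).
GIB = 1024**3
for params in (70e9, 405e9):
    weights = params * 0.5           # bytes of weights at 4-bit
    total = weights * 1.2            # assumed ~20% overhead for KV cache etc.
    cards = -(-total // (24 * GIB))  # ceil-divide by a 4090's 24 GiB
    print(f"{params / 1e9:.0f}B -> ~{total / GIB:.0f} GiB, ~{cards:.0f}x 4090")
```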
I'd be very interested to compare SVT-AV1 encoding performance...
This might be a good reason to subscribe to your Patreon! I have a pair of 2699v3s and 2697v4s! I was wanting to upgrade to a newer platform, but this right here is why I haven't. Definitely want to move several generations for sure now. Thanks for posting this Jeff! Long-time viewer, first-time poster.
As an OpenStack engineer, I would love to see a video on how you’re going to deploy it.
I'll be walking through some of the process when I move it to the Colo.
Since you enjoyed the DMT (Dare Mighty Things), they have nearly a dozen variations of that beer where they only change the hops being used. You may also want to get your hands on their several other IPAs (hazy, NE, etc) including their flagship Gunpowder IPA.
I had a There Their They're the other day, and that was fantastic as well. A member of my Patreon sent over a small care pack :-)
@@CraftComputing Good deal! You're going to have to convince that Patreon to send more. Hopefully, the care package also included a Terminal IPA.
Jeff, you're completely correct that CPU power is just one dimension of scalability. In my world (SaaS webapps) CPU is rarely the primary aspect I look at; frequently storage or networking (or both) is the high pole. If you have a use case that scales in parallel, then the Xeon 6 is your boy. More likely in my cases I have lower CPU needs, but latency is important, and I would be spreading load across many boxes for HA and disaster scenarios. You could load this server up for density, but I'd have to see the numbers as to whether that even made sense, given other business requirements outside of the pure hardware specs. Xeon 6 crushes, even in single threads, and would be worthwhile for lots of scenarios I'm sure, but I would be wary of too many eggs in one basket, and plan for redundancy. And then things get quite expensive, and my Skylake Gen 2 Xeons make much more sense at absolute cost and power usage.
I'm at 6:07 and I think I have to turn it off or else the broadwells in the rack in the corner of the room will hear what you're about to say and...well...I don't KNOW if robots cry but whatever happens next is gonna be close to that I think....
It would be interesting to see performance per watt and performance per $$ if built today. I'm in the process of building two Xeon E5 v4 servers, and they are dirt cheap to put together.
14:00 One thing I do to test CPUs is grab a bunch of recorded TV and movies that haven't been transcoded yet (MPEG2 .TS) and use Tdarr to transcode them to H.265 or AV1, then time a bunch of different systems and see how much power each draws from the wall during the test. (This was how I found out my M1 Mac Mini was less efficient than the Lenovo 4750U-based laptops they assign at work, though probably only because group policy forces efficiency mode / disables turbo / something else.)
You can run these tests concurrently by having a single source folder and telling each machine to output the files locally, or to their own output folder on the server. This also makes the test easily repeatable if you screw something up, like not having enough concurrent transcodes to take advantage of the thread count; forgetting to pass through all audio channels, so it strips the 5.1 English and replaces it with the mono/stereo voice-over or Spanish track because the broadcast mislabeled those channels; accidentally leaving anamorphic or auto-cropping on; or forgetting to remove the output resolution limit.
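A stripped-down version of that wall-clock test, calling ffmpeg directly instead of going through Tdarr; the folder names are placeholders and ffmpeg is assumed to be on PATH:

```python
# Time a batch of MPEG2 .ts recordings transcoded to H.265, keeping all
# streams so audio tracks don't get silently dropped.
import pathlib
import subprocess
import time

SRC = pathlib.Path("source")   # shared folder of untranscoded recordings
OUT = pathlib.Path("output")   # per-machine output folder
OUT.mkdir(exist_ok=True)

t0 = time.perf_counter()
for ts in sorted(SRC.glob("*.ts")):
    subprocess.run([
        "ffmpeg", "-y", "-i", str(ts),
        "-map", "0",        # keep every stream (video, all audio, subs)
        "-c:v", "libx265",  # software H.265, loads up the cores
        "-c:a", "copy",     # pass audio through untouched
        str(OUT / (ts.stem + ".mkv")),
    ], check=True)
print(f"batch finished in {time.perf_counter() - t0:.1f} s")
```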
Jeff -- Giving a thumbs up because of your excellent review and because of one rant below. This is America, a free country, so if you want to run up your electric bill you do not need anybody's permission to do so. Most computer geeks would agree running a bleeding-edge server makes for interesting benchmark reviews. Testing for hardware reliability is also important. The numbers on this box are amazing. They had to spend some money mating hardware for that performance; not something a home computer crafter could easily build. I will be back for the update and your analysis on how this all works out. :)
I love your server builds and I love exploring the world of tech with you. I just wanted to share some positivity to a fellow tech enthusiast and true professional.
I thought this was a flight sim video at the beginning! 😂
I think this is a great comparison for a lot of IT folks/homelabbers who are familiar with 'old' Xeons but maybe don't get to play with the new hotness as soon as it comes out. Very interested to see what the hosting data looks like. It will probably be a while before I get to play with the new Xeons, but they look pretty decent regardless of AMD stomping all over them right now (and NVIDIA stomping all over everything).
In 5 or so years time, yes
Back in 2016/2017 I was inspired by y-cruncher so much that I decided to build a giant workstation just to try to calculate a record number of pi digits. As time went on, I discovered I needed an ever bigger setup, until I finally acquired 2x E5-2699A v4 CPUs, paired with an Asus workstation board and 1TB of DDR4 RAM.
I got a pile of Optane PCIe drives, a SATA RAID controller, some Intel DC SATA drives, and some Hitachi enterprise 10TB drives. I was all set, and then the v5 Xeons came out. I was disappointed they would outperform my v4 rig and someone else would beat me to calculating the record Pi.
I got massively set back when the record calculation was done using cloud computing. So I just abandoned my dream of calculating a huge Pi number. The workstation sits next to me, and every now and then my son and I would fire up a minecraft server on it, dedicating all of the RAM to minecraft (which works really well, actually).
What's interesting about that story is that, according to your tests, the v5 Xeon wasn't outperforming your v4 Xeon, at least for y-cruncher!
It's awesome to see Intel letting Jeff here test the new gear, about bloody time actually.
I'm thinking a good chunk of homelabbers would rather run AI on their own machine than on someone else's. So CPU-based AI might be useful in that scenario.
How much did you pay for your dual Xeon 6 server? I'm curious.
cool video, it would also have been interesting to see how much power each system used on each of the tests.
great video more like this please!!!
That's the sound of computing!
Our organization has gone all-in since the early 90s on AI and machine learning, going back to Bell Labs and a UNIX/AMD processor from AT&T's information technology platform, so I will also be interested in seeing the applications and solutions in machine learning, as well as the network capabilities, of hardware similar to what you were displaying on this podcast today
Great video!!
Is it possible to include "NetBench" in your benchmarks?
Cheers!
Nice review :-) Can't wait to see it compared to AMD Epyc / Ampere ARM.. 😁 Though, real-life scenario: replacing, say, 15 dual Xeon Scalable servers with 3 of those sounds nice, but how do you maintain Ceph/vSAN redundancy/consistency in case a server goes down? How many NVMe drives fit in each unit? It's not meant to reduce server count, but to increase performance per unit.. you'll probably end up mixing this hardware with some used Epyc if you wish to provide some kind of redundancy to your Patreon :-) (or you'll run into trouble when you need to reboot this host to load a new microcode and someone is using an instance as a shared git runner.. 😅) + Will these die like Intel 13th/14th gen CPUs? Only time will tell.. 😊
New Server Day is one of my favorite days. I have a new server on its way--not as bonkers as yours, but new is new--and I'm looking forward to it. In the meantime I live vicariously.
You might not believe it, but I still get just as excited about new $100 server days 😊
@@CraftComputing I've been there, and I believe it! It's not the price that dictates the excitement. It's the novelty.
3:10 I, at the very least WOULD be interested in that. Mainly in the context of say, using such an AI to control a smarthome or turn it into a virtual secretary by feeding it stuff like emails, lists, documents, and allowing it some level of autonomy to do things like respond to emails, manage the documents, outright formulate new ones, etc.
Super interesting! I wonder how much cpu / ram makes a difference in gpu compute. I’m guessing not a ton outside of initial loading.
That’s wild. What is the power consumption at 0/50/100% on this thing?
What about compiling all of Gentoo? I think there are slave compiler images out there; you could boot them up on a lot of virtual machines and then run one master VM... Maybe even cross-compile, if that's more difficult.
Can I ask you to maybe also try Folding@home or BOINC or something of the same type on this? Basically for giggles, and perhaps split across a multitude of VMs, but I think it'd be fun to see the results
This is crazy. I use dual E5 v4 CPUs for my homelab system. I expected them to be slower; I wasn't expecting this insane of a gap!
Jeebus, you are understating the bloodbath part... that performance is insane
That sound in the beginning... like sirens luring a sailor to their doom.
For anyone who struggles with alcohol like I do, join me with a cup of tea or coffee! Sober gang here and we still love craft computing! Keep it up Jeff!
I love me a cup of tea too. Lapsang Souchong is my go-to.
@@CraftComputing I will definitely be searching that up!
Great video - I would like to see how this system runs LLMs (Ollama + OpenWebUI) w/o a GPU.
Jesus, it's about to take off!
This is an amazing piece of hardware, and an amazing review 👍 though I can only dream of getting it
It would be interesting to see the price comparison too... How many X-es are these apart on that front? :D
For someone like me who got their hands on older HP DL160 and DL360 servers to experiment on.... I want this.
Any chance you can post the Phoronix test links so I can compare my Milan & Genoa CPUs?
github.com/phoronix-test-suite/phoronix-test-suite
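If anyone else wants to run the same comparisons, here's a sketch of driving the suite non-interactively; the profile names are examples of real PTS tests, not necessarily the exact ones used in the video:

```python
# Run `phoronix-test-suite batch-setup` once beforehand so the runs
# don't prompt for input; results land in ~/.phoronix-test-suite/.
import subprocess

for profile in ("pts/openssl", "pts/compress-7zip", "pts/build-linux-kernel"):
    subprocess.run(["phoronix-test-suite", "batch-benchmark", profile], check=True)
```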
How about testing some distributed compute project, like Folding@home or BOINC? And what is the power efficiency in such tasks compared to consumer/older hardware?
FAH has a backlog of CPU-only WUs. Would be cool. The PPD would be the benchmark though, and it wouldn't be as impressive as you'd expect relative to GPUs. Not apples to apples at all, but I've got a quad Xeon Phi 7210 machine (4 nodes) and it performs about like an RTX 2080 Ti over 1024 threads.
That fan ramp sound is the cue that my job is about done. 🤓
How much current does it need, though?
0:05 Lllllladies and gentlemen, from the flight deck, this is your captain speaking...
You could try running some genomic data through that system. That's a boatload of data. What OS did you run it on?
I'd like to see you actually run some type of genetic data through that system. And again, what OS did you use?
Look forward to it
Jeff, if you need colo space, I might have a pretty badass option for you, and no, it’s not my homelab lol. An actual MSP
Like, can it run Windows 12?
How much would this server be? 50k USD?
My two main reference points for server performance are how fast the server can generate a Minecraft Forge modded chunk and how fast it can run an ERP SQL Server stack.
Level1Techs showed in their last video an ExpressSlot Slide MB204MP-1B and said it's useful for some server / data center applications with eight memory channels, or with Flight Simulator. Could you add this to your setup and try some tests again, please?
You're gonna need that Bat'leth to handle fighting all that noise.
Hands up for Eau Claire, WI
10:05 y-cruncher likes a large cache
I used to run a half-rack for clients (2000-2007) and this would have been 100x over my entire rack.
How about MAFFT for whole-genome alignments? I used to do this on a Z820 with dual Xeons & 256 GB RAM. It took a couple of days. Might need some GPUs; I think I did it with a very pedestrian Quadro 2000 back in 2015.
Running on Windows Server?
@@Teluric2 No, Linux (Ubuntu).
0:11 - looks like you want to put the 'loud' in your own personal 'cloud' :D
- in something like this perf/watt and perf/core are probably "reasonable" ways to normalise these numbers - as $/core also scales with quite a vicious curve ...
5-7 years? Where do they have hardware that new? One of our servers will celebrate its 14th birthday in a few months (I should really push for a refresh; the electricity savings may pay for it quite fast)
An F@h run on the CPUs? HPC is also a really interesting topic IMO
I think we need to normalize a new term for AI, something like intelligence simulation, or simulated intelligence. Because even though it is artificial, it isn't actual intelligence. It's a simulation of an intelligence-style response. Behind the scenes it's still just math and algorithms based on previous human interaction.
Algebraic Modeling is what I propose. Under the hood, it is a bunch of matrix/tensor math
AI is just marketing bull stuff that salespeople use to sell their latest SaaS, to milk as much money as possible out of unwitting CEOs that fall for the buzzwords every time....
neural network is the term.
Excellent point; at this stage machine learning is not truly artificial intelligence as the original definition of it describes. However, do not forget that existence itself as we know it is also just math and algorithms under the hood.
@@ilmeryn it's more accurate to say that it *seems* like reality is math under the hood. Until we have a Theory of Everything (of at least physics but ideally also for e.g. hard problem of consciousness) we can't say for certain that there isn't something beyond mathematical description. Or at least, mathematics as could be developed and understood by humans.
Please just run a Jellyfin server off it without hardware acceleration enabled :-D
AI is OK, but really it's like 20% at most for most markets [really it's less, but because so many deployments are model training/dev work they skew the numbers]. They are just chasing AI because the margin is so much higher.
vGPU is still a big deal (especially if you are avoiding the Nvidia tax). Then it's micro-containers. Not much VM outside of parts of the Windows stack anymore.
I'm interested in AI, but not in running it on a server CPU. A Llama 3 8B model (from the Intel slide) easily runs very fast on a 4060.
If AMD's upcoming Strix Halo does support ROCm, I would love to see LLM and diffusion performance, particularly running a 70B or larger LLM.
Don't the Xeons have the 'Quick Assist' accelerators that are meant for encoding/decoding/transcoding of videos? I'm pretty sure they have some accelerator on them for that, I just can't remember the name of it off hand...
Well, I mention this to tell you to MAKE SURE this is used in some way during your 'real world' stress testing. You'd be leaving out one of the GREATEST BENEFITS of choosing Xeon over Epyc....
Running a Plex media server should suffice; I think they transcode just to stream/play a video.. this way you're putting SOME of that accelerator horsepower to use when you're doing your stress testing :)
Run a Minecraft server with 100 clients, change all blocks to TNT and make them go sploady. See if it crashes.
Double fan guards. Why?!
256 Cores, we're slowly moving into Beowulf cluster territory... 😊
I don't actually need a big CPU server like that for machine learning; my 'old banger' gaming desktop works fine, as long as the model size and datasets aren't too big.
I created one that gets 87.7% mAP50 on COCO. I know what I'm talking about, and I don't need a big server to do that.
But can it run Crysis?
I worked on the Intel 3 node that these are made on! Super cool to see hardware leave the labs and actually doing things. Personally, I can't wait to see what people do with Clearwater Forest. Imagine 288 cores/threads in each socket.
It's a little sad, but... nothing much. Sierra Forest and CWF are niche low-volume products. They have their use case very clearly defined, and they're unlikely to be seen outside of it. They're very good at it, but it's not general computing.
On the other hand, seeing the generational uplift between Sapphire/Emerald and Granite, I am really excited for Diamond. Any word on PCIe Gen 6 or CXL 3.0?
@Vatharian I can't say where those are just yet, but they are in the works in some form.
As for the Xeon-Es being sad, I don't agree. Skymont is a powerful core as is, and Darkmont bulks it up a little bit more. 288 of those is going to make for a beastly chip for whatever task can leverage that many threads.
Sierra Forest is out now, and CWF is DEFINITELY going to make an impact when it launches next year. High efficiency is in demand right now for hyperscalers.
@@CraftComputing Efficiency and core density is what those chips are all about. Putting 144 E-cores on a modern node in a socket with a reasonable TDP is already quite a strong competitor for the right segment, and going up to 288 with all the IPC gains of Darkmont will be huge.
Astringency might be the word you were looking for to describe the beer
How long might this colo opportunity last? I personally would rather become a patron to support you, for the benefit of running something on this, than pay a service provider for a VPS. Especially since I am a professional service provider and would actually put my stuff on my own infrastructure, but I might have personal experiments or non-production-critical stuff I'd want to play with that I would not typically want to run on my own production servers. I don't even really know what my use case is at the moment, but from time to time I have an idea for something I want to try out, but not enough to justify putting it in my production environment or paying for a VPS. The additional nudge of knowing it is actually supporting you as a creator might just swing it to being worth paying some money to experiment with whatever idea. Some of the things I have been entertaining include LibreChat, various internet archiving tools like ArchiveBox, and setting up Manyfold to manage my 3D printing models.
For right now, I'm looking for people to help test. Long term might develop into Loooooong term though. We'll have to see how it all goes.
engage ludicrous speed
If you're looking for a real-world stress test, make it a Minecraft server capable of thousands of players, then tell your audience to join at the same time. That would be fun: a 10,000-player Minecraft server. Just be sure you've got some god-tier internet speeds.
I'm just upgrading to E5 v3/v4, because they've just gotten to the point where... I can afford them (and I don't need to pay for power at work), plus I get a lot of free upgrade parts from my customers.
YABS. These will be refurbs in 5 years. 2-3 GPUs would be mandatory to run nested VMs; I think Turin will do better, but they are both capable. You probably want a 10G uplink; Turin's consolidation ratio is 7:1.
Oh boy do I have the perfect application to test this system. Hmm.
Can it beat a 4090 in Llama 3.2?
Perfect for a large Minecraft modpack server.