AI/ML/DL with the Dell PowerEdge R720 Server - Energy, Heat, and Noise Considerations

  • Published Jan 1, 2024
  • Dive deep into the world of high-performance computing with our thorough examination of the Dell PowerEdge R720 Server, a powerhouse for AI, ML, and DL applications. This video is not just a guide; it's an insightful exploration into the server's operational dynamics, focusing on power consumption, heat production, and acoustic management.
    What You'll Uncover in This Video:
    Power Consumption Insights: Discover the intricate details of how the PowerEdge R720 consumes power under various AI/ML/DL workloads. I'll guide you through its energy usage and help you estimate the operational costs for your home lab or business setting.
    Heat Output Analysis: Understand the thermal behavior of the R720. We delve into how different server components generate heat under multiple load scenarios, providing you with crucial knowledge for efficient thermal management.
    Acoustic Footprint: Explore the noise levels of the R720 in different operational states. I take a hands-on approach to demonstrate the sound profile of the server, helping you anticipate and manage the acoustic impact in your working environment.
    This video is an essential resource for IT professionals, AI researchers, and tech enthusiasts aiming to integrate the Dell PowerEdge R720 into their computational projects. We bridge the gap between technical know-how and practical application, ensuring you have all the information needed for a well-balanced, efficient, and harmonious server setup. Don't forget to like, share with your network, and subscribe for more in-depth and practical tech content.
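As a rough companion to the operating-cost discussion above, here is a minimal sketch of the watts-to-dollars arithmetic. The wattage and electricity rate below are placeholder assumptions for illustration, not measurements from the video:

```python
# Rough operating-cost estimate for a server running 24/7.
# The average draw and utility rate are placeholders, not measured values.

def monthly_cost_usd(avg_watts: float, usd_per_kwh: float, hours: float = 730.0) -> float:
    """Energy cost: convert watts to kWh over the period, then multiply by the rate."""
    kwh = avg_watts / 1000.0 * hours
    return kwh * usd_per_kwh

# Example: a hypothetical 350 W average draw at $0.12/kWh over ~730 hours (one month)
cost = monthly_cost_usd(350, 0.12)
print(f"~${cost:.2f} per month")  # 350/1000 * 730 * 0.12 = $30.66
```

Plugging in your own measured wall draw (e.g. from iDRAC or a power meter) and local rate gives a quick monthly estimate.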
    📚 Additional Resources:
    Spread Sheet Used In Video:
    docs.google.com/spreadsheets/...
    GPU Max Utilization Test Script
    drive.google.com/file/d/1GyhZ...
    Dell Technical Manual (Page 40)
    dl.dell.com/manuals/all-produ...
    Best AI/ML/DL Rig For 2024 - Most Compute For Your Money!
    AI/ML/DL GPU Buying Guide 2024: Get the Most AI Power for Your Budget
    Dell PowerEdge R720XD GPU Upgrade: Installing Tesla P40 with NVIDIA Drivers
    Installing Tesla P100 GPU on Dell PowerEdge R720 Server with Driver Installation
    Installing DUAL Tesla P100 GPU on Dell PowerEdge R720 Server with Driver Installation
    HOW TO SUPPORT MY CHANNEL
    If you found this content useful, please consider buying me a coffee at the link below. This goes a long way in helping me through grad school and allows me to continue making the best content possible.
    Buy Me a Coffee
    www.buymeacoffee.com/TheDataD...
    Thanks for your support!
  • Science & Technology

Comments • 18

  • @alberth4890 · 5 months ago · +4

    Thanks for covering DL which is often neglected!

    • @TheDataDaddi · 5 months ago · +1

      Hi there! Thanks so much for the feedback. So glad the video was helpful to you!

  • @ultraplexplextor · 5 months ago · +1

    Thank you for a really nice video, and happy new year to you! :)

    • @TheDataDaddi · 5 months ago

      So glad you enjoyed it! Happy new year to you as well!

  • @fabsync · 4 months ago · +1

    Pure gold! Thanks for this video! I am considering adding the same GPU to my Dell R730. I wonder if you are able to train some models with the Tesla P40, and if the speed is decent for that?

    • @fabsync · 4 months ago · +1

      It would be great if you created a video on your stack. What software do you use to manage your servers, etc.?

    • @TheDataDaddi · 4 months ago

      Hi there. Thanks so much for your feedback! I am actually in the process of creating a deep learning benchmark suite to evaluate this and other GPUs. Hopefully I will be able to create some videos in a couple of weeks!

    • @TheDataDaddi · 4 months ago

      @fabsync Okay, I will try to make a video on this as well!

  • @IntenseGrid · 5 months ago · +1

    If you don't plan on adding to that rack anytime soon, you may want to leave a 1U or 2U gap between the servers. This should lower the temps for the lower two servers.

    • @TheDataDaddi · 5 months ago

      Yeah that is definitely a great suggestion. I have thought about doing that. I am looking for another server though, and I don't think I will have space after that. Otherwise I would definitely do that.

  • @IntenseGrid · 5 months ago · +1

    For your purposes, did you find that the P40 was a better fit than the P100, or vice versa?

    • @TheDataDaddi · 5 months ago

      Great question. I have found both useful for different purposes. I have been using the P100s mainly for computer vision work and the P40s for ML/DL on massive graph networks. Both seem well suited to their purposes so far. Overall, though, I think I prefer the P40s at this point. They have more VRAM and higher single-precision throughput, so generally speaking I think they are more applicable to almost any task I would work on. I still have yet to fully benchmark both, so this opinion may change with further testing.

  • @IntenseGrid · 5 months ago

    Along the same lines, on models that would fit in 16 GB, was the P100 faster with its HBM2 memory?

    • @TheDataDaddi · 5 months ago

      So, I have not run the same models on both yet, so I can't say for sure on this one. What I have noticed initially is that the high bandwidth makes a big difference when training across both GPUs; it saves a lot of time on data transfer. When just using one GPU, though, I think the P40 is faster in raw terms. However, I reserve final judgment until I can directly compare the same model across both.

  • @aishaproject · 5 months ago

    You might have another issue: the Xeon E5-2690 v2 is a 130 W chip, and the maximum supported CPU TDP with 2 GPUs on the R720 is 115 W. I bought a similar arrangement but went for the E5-2680 v2 at 115 W each, and still 10 cores.

    • @TheDataDaddi · 5 months ago

      Hi there! Thanks so much for your feedback. This is interesting. Is this a constraint imposed by the motherboard? I have 2 1100 W PSUs, and each riser plus the 75 W from the PCIe slot should in theory provide more than enough power for the GPUs. So unless the motherboard has specific limitations, everything in the system should be getting adequate power. I have also not observed any power limitation in practice so far on either the GPUs or CPUs. If you could provide a bit more info here, I will absolutely go back and check my setup to make sure everything is being powered properly.

    • @KiraSlith · 4 months ago · +1

      @TheDataDaddi The spec he is looking at is for "failover" mode with the more common dual 850 W power supply configuration, rather than the dual 1100 W power supplies you're using. Even if you were using the 850 W PSUs in the same dual-GPU, dual 130 W CPU configuration, Dell's controller would just force the same "tandem" mode you already have it running in.
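To make the PSU discussion above concrete, here is a back-of-the-envelope power-budget sketch. The CPU and GPU figures are nominal spec-sheet TDPs (E5-2690 v2 at 130 W, Tesla P40 at 250 W), and the 100 W allowance for RAM, drives, fans, and board is a rough assumption, not a measured value:

```python
# Back-of-the-envelope power budget for a dual-CPU, dual-GPU R720.
# TDP figures are nominal spec-sheet values; real draw under load varies.
components = {
    "cpu_e5_2690_v2": 130,  # watts each (nominal TDP)
    "gpu_tesla_p40": 250,   # watts each (nominal TDP)
}

total = 2 * components["cpu_e5_2690_v2"] + 2 * components["gpu_tesla_p40"]
total += 100  # rough allowance for RAM, drives, fans, motherboard (assumption)

psu_watts = 1100
print(f"Estimated peak load: {total} W")
print(f"Fits within one 1100 W PSU (failover mode): {total <= psu_watts}")
```

Under these nominal figures the load fits within a single 1100 W supply, which is why redundancy still works in this configuration; with 850 W supplies the same sum leaves much less headroom.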

    • @TheDataDaddi · 3 months ago · +1

      @KiraSlith Ah, gotcha. Yeah, I misunderstood the question. Thanks so much for the clarification. Really appreciate it.