AMA: 1000's of LPUs, 1 AI Brain. Scaling with the Fastest AI Inference

  • Published Nov 4, 2024
  • Learn how the Groq architecture powering its LPU™ Inference Engine is designed to scale from the ground up. This AMA will dive into the scaling capabilities of Groq AI infrastructure across hardware, compiler, and cloud. We'll also discuss the unique Groq approach to overcoming scaling limitations of traditional legacy architectures.

Comments • 12

  • @dyter07
    @dyter07 5 months ago +5

    Groq is amazing — the speed leaves me speechless. Is it possible to see some samples with a diffusion model soon?

  • @merchantsvillage
    @merchantsvillage 5 months ago

    Thank you, great intro to your tech!

  • @glorified3142
    @glorified3142 5 months ago +2

    Multimodal with voice and image, plus live video/camera capture, would surely help advance R&D.

  • @joannot6706
    @joannot6706 5 months ago +3

    The future of AI is not just LLMs, it's multimodal. Does the LPU work with any type of data that AI can process? (It's all tokenized, after all.)
    Are you going to rename it the MPU — Multimodal Processing Unit?

    • @MarkHeaps-iu9si
      @MarkHeaps-iu9si 5 months ago +3

      Interesting suggestion — maybe we'll do that with our V2 silicon.
      We're already testing multimodal, and we have a well-documented history of doing inference for many types of data-heavy workloads. Look at the work done with national labs, etc.

  • @vishwamartur
    @vishwamartur 5 months ago

    Need to invest in it.

  • @lokeshart3340
    @lokeshart3340 5 months ago +1

    Can you also do image and audio gen, or video gen?

  • @QinghuaLi-wd1tk
    @QinghuaLi-wd1tk 5 months ago

    Groq said they have the lowest TTFT, and it turns out it is 180 ms, as shown in their slide. That number really sucks. Even GPUs can do it in 100 ms. SambaNova is also doing much better than 180 ms — around 100 ms as well.

  • @thesimplicitylifestyle
    @thesimplicitylifestyle 5 months ago

    Decentralized AGI with virtual substrate independent Machine Learning LLM Nodes working on Multiple servers connected to decentralized search engines being accessed with personal LLM and LAM computers that have WiFi and bluetooth and can learn to operate household appliances and inexpensive interchangeable robot chassis that can be controlled remotely. 😎🤖

  • @QinghuaLi-wd1tk
    @QinghuaLi-wd1tk 5 months ago

    Groq is a liar: it says its 1250 tokens/s for Llama 3 8B is 4x higher than other providers, but they obviously know SambaNova can do 1000+ tokens/s as well.
    Well, liar.

    • @BooleanDisorder
      @BooleanDisorder 5 months ago

      An outlier!

    • @QinghuaLi-wd1tk
      @QinghuaLi-wd1tk 5 months ago

      @@BooleanDisorder oh, right, they are liars