Cerebras @ Hot Chips 34 - Sean Lie's talk, "Cerebras Architecture Deep Dive"

  • Published on Aug 28, 2022
  • Neural networks have grown exponentially in recent years, from state-of-the-art networks of 100 million parameters in 2018 to the famous GPT-3 with 175 billion parameters. Meeting this Grand ML Demand Challenge requires substantial improvements, an order of magnitude or more, across a broad spectrum of components. Read Sean's blog for a written version of this Hot Chips 34 talk. It gives a deep dive into the Cerebras hardware to show you how our revolutionary approaches in core architecture, scale-up, and scale-out are designed to meet this ML demand: www.cerebras.net/blog/cerebra...
    Learn more about Cerebras: cerebras.net
    #deeplearning #ai #artificialintelligence #hotchips #hotchips34 #hc34
  • Science & Technology

Comments • 13

  • @aitech5710
    @aitech5710 16 days ago

    I didn't catch much about the routing protocol, or how the dies actually communicate on the WSE-2. You guys have a lot going on, congratulations 🎊 😊

  • @centuriomacro9787
    @centuriomacro9787 1 year ago +4

    Very interesting presentation, thx

  • @whyjay9959
    @whyjay9959 9 months ago +4

    Hi. There's something that a few people were wondering about: Why is the Wafer-Scale Engine square? Since it looks like there's room for ~28 more complete, attached tiles.

    • @CerebrasSystems
      @CerebrasSystems  9 months ago +6

      It's a good question! The answer is rather prosaic, we're afraid. If the WSE weren't rectangular, power delivery, I/O, mechanical integrity and cooling would become much more difficult, to the point of impracticality.
      Take a look at the virtual teardown on our website and you may get a feel for some of these challenges: www.cerebras.net/cs2virtualtour
      The upshot is that a mere 850,000 cores will just have to suffice. ;)

    • @whyjay9959
      @whyjay9959 9 months ago +2

      @CerebrasSystems I think I get the idea, thanks.

  • @aitech5710
    @aitech5710 16 days ago

    Was wondering if MemoryX is actually an independent device outside of the WSE-2 wafer? The fact that it has better sparse performance at the hardware level is very interesting.

  • @JoeLion55
    @JoeLion55 several months ago

    Re: the die-to-die interface at about 15:15.
    You mentioned you use an upper metal layer to cross the scribe lines between the dies. What does the reticle look like for this? Is this a regular mask, but the alignment for the mask is just offset so it straddles the scribe lines for the rest of the wafer? Is this something TSMC does regularly for other products? Or is this a new process to have reticles on the same wafer that don't align on top of each other?

  • @808bigisland
    @808bigisland 1 year ago +2

    Aloha and thanks! Way to go! Just imagine what you will be doing ten years from now! Do you have a public roadmap?

    • @CerebrasSystems
      @CerebrasSystems  1 year ago

      Thanks, 808 Big Island! Sadly, no public roadmap. You'll just have to keep watching!

  • @piscocuk2011
    @piscocuk2011 several months ago

    00:04 Cerebras aims to revolutionize AI compute with a co-designed architecture
    02:06 Architecture focused on neural networks
    06:25 Memory bandwidth enables full performance in neural network computation.
    08:36 Cerebras core hardware architecture flexibility
    13:08 Cerebras chip has 84 die with 850,000 cores on a single 300mm wafer.
    15:27 Homogeneous array of cores across the wafer for unprecedented fabric performance
    19:21 Cerebras architecture utilizes dataflow mechanisms for weight computations
    21:12 Single chip enables high-performance neural networks
    25:02 Scalable clustering and wafer-scale chips enable large model access to everyone
    Crafted by Merlin AI.

  • @RalphDratman
    @RalphDratman 1 year ago +2

    Is the CS-2 used only for training?
    Will a time come when, for massively concurrent inference, this architecture will be applicable?

    • @CerebrasSystems
      @CerebrasSystems  1 year ago +2

      Hi Ralph, good question. The vast bulk of our customers have used our systems for training LLMs or for HPC applications.
      We have had a couple of projects using it for inference, like one with Lawrence Livermore National Laboratory where they offloaded an unwieldy inference step from many nodes of their Lassen supercomputer to one of our systems. You can read the case study here: www.cerebras.net/cerebras-customer-spotlight-overview/spotlight-lawrence-livermore-national-laboratory/
      But in principle, our architecture should make a terrific concurrent inference platform because we can run many instances (hundreds or even thousands depending on the model) in parallel across our massive array of cores.

  • @billykotsos4642
    @billykotsos4642 1 year ago +3

    👀👀👀👀👀👀