ModCon 2023 Breakout Session: MAX Engine Performance

  • Published on Feb 11, 2025
  • In this session, Modular engineers Abdul Dakkak and Hengjie Wang discuss Modular AI Engine performance across models and hardware architectures. They dive deep into how the AI Engine works, compare its performance against PyTorch and TensorFlow, and demonstrate how it scales to models of all sizes, including LLMs.
    00:00 Introduction and performance numbers
    04:33 Runtime, compiler, and kernels working in unison
    05:30 Runtime parallelism and memory management
    05:50 Moving transforms out of inference to model initialization
    06:04 Automatic fusion of graphs to a single op
    06:30 Specialized kernels on dimensions
    08:49 Simplification with Mojo
    10:26 Generality across hardware
    11:51 Cross-platform development example
    13:22 Kernel JIT
    13:43 Developer friendly
    14:02 Autotuning, Custom Ops, Multi-model support
    14:58 Stable Diffusion example
