Optimize GPU performance for AI - Prof. Gennady Pekhimenko

  • Published on Jan 6, 2025

Comments • 19

  • @MachineLearningStreetTalk
    @MachineLearningStreetTalk months ago +2

    REFS:
    [0:00:15] Bill Gates and Mark Zuckerberg - Examples of technical founders who dropped out of Harvard to pursue business opportunities, both with strong programming backgrounds. (Harvard Gazette)
    news.harvard.edu/gazette/story/2017/05/make-a-difference-zuckerberg-tells-harvard-graduates/
    [0:06:15] Jensen Huang's NVIDIA Management - Discussion of flat organizational structure with ~50 direct reports, demonstrating unconventional tech company management approach. (Harvard Business Review)
    hbr.org/podcast/2023/11/nvidias-ceo-what-it-takes-to-run-an-a-i-led-company-now
    [0:11:05] Llama 2 - Meta's open-source large language model collection ranging from 7B to 70B parameters. (Touvron et al.)
    arxiv.org/abs/2307.09288
    [0:25:35] Mistral 7B - 7B-parameter language model outperforming Llama 2 13B using grouped-query attention and sliding-window attention; see the sketch after this list. (Jiang et al.)
    arxiv.org/abs/2310.06825
    [0:34:45] Blocksworld Problems - Research on self-verification and chain-of-thought limitations in LLMs. (Kambhampati et al.)
    www.arxiv.org/pdf/2402.08115
    [0:35:55] LLM Arithmetic Limitations - Research demonstrating gaps in mathematical reasoning capabilities, particularly in three-digit multiplication. (Dziri et al.)
    arxiv.org/abs/2305.18654
    [0:41:35] AI Energy Consumption - Quantitative analysis of computational costs in modern AI training compared to biological systems. (Strubell et al.)
    arxiv.org/abs/1906.02243
    [0:46:20] Cursor.sh - AI-first code editor featuring multi-file diff review and AI-assisted code generation.
    cursor.sh
    [1:07:25] NVIDIA CUDA - Parallel computing platform and programming model for NVIDIA GPUs, discussed in context of kernel optimization evolution.
    docs.nvidia.com/cuda/cuda-c-programming-guide/
    [1:09:15] TVM Compiler - Automated end-to-end optimizing compiler for deep learning workloads across diverse hardware backends. (Chen et al.)
    arxiv.org/abs/1802.04799
    [1:17:20] No Free Lunch Theorem - Proves that all optimization algorithms have identical average performance across all possible problems, with implications for ML optimization; a formal statement follows this list. (Wolpert & Macready)
    arxiv.org/abs/2007.10928
    [1:52:40] BERT - Introduction of deep bidirectional representations from unlabeled text, discussed in context of Google scaling bidirectional attention models. (Devlin et al.)
    arxiv.org/abs/1810.04805
    [1:56:00] The Hardware Lottery - Analysis of how hardware availability historically influenced AI research success. (Hooker)
    arxiv.org/abs/2009.06489
    [1:56:35] Geoffrey Hinton Nobel Prize - Awarded 2024 Nobel Prize in Physics for pioneering work in deep learning and neural networks.
    www.technologyreview.com/2024/10/08/1105221/geoffrey-hinton-just-won-the-nobel-prize-in-physics-for-his-work-on-machine-learning/
    [2:03:57] Chomsky's LLM Critique - Argues that language models do not constitute genuine linguistic theory as they fail to properly delineate language possibilities.
    arxiv.org/pdf/2401.03910
    [2:06:15] NVIDIA Market Share - Analysis showing 98% revenue share in data-center GPU market with $36.2B revenue in 2023.
    www.hpcwire.com/2024/06/10/nvidia-shipped-3-76-million-data-center-gpus-in-2023-according-to-study/
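
    A minimal sketch of the sliding-window attention idea from the Mistral 7B entry above, assuming a single attention head, no batching, and NumPy only; the window size, function name, and shapes here are illustrative, not Mistral's actual implementation:

        import numpy as np

        def sliding_window_attention(q, k, v, window=4):
            # Causal attention in which query position i attends only to key
            # positions j with i - window < j <= i, so per-token cost stays
            # O(window) rather than O(sequence length).
            seq_len, d = q.shape
            scores = q @ k.T / np.sqrt(d)                 # (seq_len, seq_len)
            i = np.arange(seq_len)[:, None]
            j = np.arange(seq_len)[None, :]
            allowed = (j <= i) & (j > i - window)         # banded causal mask
            scores = np.where(allowed, scores, -np.inf)
            scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
            weights = np.exp(scores)
            weights /= weights.sum(axis=-1, keepdims=True)
            return weights @ v

        # Toy usage: 8 tokens, 16-dim vectors, each attending to at most 4 positions.
        rng = np.random.default_rng(0)
        q, k, v = (rng.standard_normal((8, 16)) for _ in range(3))
        print(sliding_window_attention(q, k, v).shape)    # (8, 16)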
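
    A minimal formal statement of the No Free Lunch result referenced at [1:17:20], in Wolpert & Macready's notation, where d_m^y denotes the sequence of m cost values an algorithm observes and f ranges over all objective functions on a finite domain:

        \sum_f P(d_m^y \mid f, m, a_1) = \sum_f P(d_m^y \mid f, m, a_2)

    That is, averaged uniformly over all objectives f, any two search algorithms a_1 and a_2 induce the same distribution of observed cost sequences, so neither can outperform the other across all problems.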

  • @uw10isplaya
    @uw10isplaya months ago +3

    MLST, where even the sponsored content is banger certified

  • @enkidughom2508
    @enkidughom2508 months ago +7

    I interviewed with them, and they have the most fun and interesting interview experience.

    • @tedlasso2887
      @tedlasso2887 months ago

      Can you share some?

    • @pedrogorilla483
      @pedrogorilla483 months ago +1

      You can’t just say that and leave, bro. Please tell us more.

    • @Outplayedqt
      @Outplayedqt months ago

      Mind sharing 1 or 2 tips or insights? Or anything that surprised you, in particular?

  • @ErmekD
    @ErmekD months ago

    Always great to see technical guests break down complicated concepts in simple language!

  • @tommybtravels
    @tommybtravels months ago

    Hi Dr. Scarfe, thanks for another great episode. Some additional questions you might have considered asking Dr. Pekhimenko:
    - End-to-end neural nets all the way to AGI, or is something like neurosymbolic required?
    - His take on the ARC challenge?
    - Are any chip startups such as Cerebras, Groq, and/or SambaNova doing anything interesting in terms of architecture/chip design, and could any of them (or others) threaten Nvidia in training and/or inference?
    - How much of a threat to Nvidia’s market dominance are custom ASICs made by Google, Amazon, and soon OpenAI via Broadcom?
    - How much of a moat is CUDA now, and how much staying power is that moat likely to have?
    Big fan of your work. Thanks again.

  • @yurona5155
    @yurona5155 months ago +3

    Once a new MLST episode drops (even a sponsored one), I'm very 'suspectible' to staying up way past my bedtime...*scnr*

    • @luisluiscunha
      @luisluiscunha months ago

      Use a good YouTube summarizer

    • @luisluiscunha
      @luisluiscunha months ago

      If subscribed to ChatGPT 😊

    • @MachineLearningStreetTalk
      @MachineLearningStreetTalk months ago +2

      @luisluiscunha We go to great lengths to produce PDF shownotes, take a look: www.dropbox.com/scl/fi/w9kbpso7fawtm286kkp6j/Gennady.pdf?rlkey=aqjqmncx3kjnatk2il1gbgknk&st=2a9mccj8&dl=0 - you can feed them into Claude and ask specific questions.

  • @AhmedMOHAMMEDAHMED-hm2rx
    @AhmedMOHAMMEDAHMED-hm2rx months ago +1

    I wonder if it will be possible to offload the verification process in the future.

  • @Reversed82
    @Reversed82 months ago

    49:02 I guess you are talking about tracing? I hope you already know about tracing and correlating it with logs + metrics. (From a software dev turned operations/SRE.)

  • @DelandaBaudLacanian
    @DelandaBaudLacanian months ago +1

    I wonder why Cassie Kozyrkov left Google. Her projects are something to watch out for.

  • @Pingu_astrocat21
    @Pingu_astrocat21 months ago +1

    Cool stuff 🔥

  • @shubhamarle96
    @shubhamarle96 months ago

    Isn't Mamba better than Transformers?

  • @DailyTuna
    @DailyTuna months ago +1

    The goal is to be less technical. As these tools do the technical work, it makes sense to simplify so people can focus on creativity and problem solving.

  • @BuFu1O1
    @BuFu1O1 months ago

    lol is he Gwern??