Optimize GPU performance for AI - Prof. Gennady Pekhimenko

  • Published on Jan 6, 2025

Comments • 19

  • @MachineLearningStreetTalk
    @MachineLearningStreetTalk months ago +2

    REFS:
    [0:00:15] Bill Gates and Mark Zuckerberg - Examples of technical founders who dropped out of Harvard to pursue business opportunities, both with strong programming backgrounds. (Harvard Gazette)
    news.harvard.edu/gazette/story/2017/05/make-a-difference-zuckerberg-tells-harvard-graduates/
    [0:06:15] Jensen Huang's NVIDIA Management - Discussion of flat organizational structure with ~50 direct reports, demonstrating unconventional tech company management approach. (Harvard Business Review)
    hbr.org/podcast/2023/11/nvidias-ceo-what-it-takes-to-run-an-a-i-led-company-now
    [0:11:05] Llama 2 - Meta's open-source large language model collection ranging from 7B to 70B parameters. (Touvron et al.)
    arxiv.org/abs/2307.09288
    [0:25:35] Mistral 7B - 7B-parameter language model outperforming Llama 2 13B using grouped-query attention and sliding-window attention; see the sketch after this list. (Jiang et al.)
    arxiv.org/abs/2310.06825
    [0:34:45] Blocksworld Problems - Research on self-verification and chain-of-thought limitations in LLMs. (Kambhampati et al.)
    www.arxiv.org/pdf/2402.08115
    [0:35:55] LLM Arithmetic Limitations - Research demonstrating gaps in mathematical reasoning capabilities, particularly in three-digit multiplication. (Dziri et al.)
    arxiv.org/abs/2305.18654
    [0:41:35] AI Energy Consumption - Quantitative analysis of computational costs in modern AI training compared to biological systems. (Strubell et al.)
    arxiv.org/abs/1906.02243
    [0:46:20] Cursor.sh - AI-first code editor featuring multi-file diff review and AI-assisted code generation.
    cursor.sh
    [1:07:25] NVIDIA CUDA - Parallel computing platform and programming model for NVIDIA GPUs, discussed in context of kernel optimization evolution.
    docs.nvidia.com/cuda/cuda-c-programming-guide/
    [1:09:15] TVM Compiler - Automated end-to-end optimizing compiler for deep learning workloads across diverse hardware backends. (Chen et al.)
    arxiv.org/abs/1802.04799
    [1:17:20] No Free Lunch Theorem - Proves that all optimization algorithms have identical average performance across all possible problems, with implications for ML optimization; a formal statement follows this list. (Wolpert & Macready)
    arxiv.org/abs/2007.10928
    [1:52:40] BERT - Introduction of deep bidirectional representations from unlabeled text, discussed in context of Google scaling bidirectional attention models. (Devlin et al.)
    arxiv.org/abs/1810.04805
    [1:56:00] The Hardware Lottery - Analysis of how hardware availability historically influenced AI research success. (Hooker)
    arxiv.org/abs/2009.06489
    [1:56:35] Geoffrey Hinton Nobel Prize - Awarded 2024 Nobel Prize in Physics for pioneering work in deep learning and neural networks.
    www.technologyreview.com/2024/10/08/1105221/geoffrey-hinton-just-won-the-nobel-prize-in-physics-for-his-work-on-machine-learning/
    [2:03:57] Chomsky's LLM Critique - Argues that language models do not constitute genuine linguistic theory as they fail to properly delineate language possibilities.
    arxiv.org/pdf/2401.03910
    [2:06:15] NVIDIA Market Share - Analysis showing 98% revenue share in data-center GPU market with $36.2B revenue in 2023.
    www.hpcwire.com/2024/06/10/nvidia-shipped-3-76-million-data-center-gpus-in-2023-according-to-study/
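
    A minimal sketch of the sliding-window attention idea from the Mistral 7B entry above, assuming a single attention head, no batching, and NumPy only; the window size, function name, and shapes here are illustrative, not Mistral's actual implementation:

        import numpy as np

        def sliding_window_attention(q, k, v, window=4):
            # Causal attention in which query position i attends only to key
            # positions j with i - window < j <= i, so per-token cost stays
            # O(window) rather than O(sequence length).
            seq_len, d = q.shape
            scores = q @ k.T / np.sqrt(d)                 # (seq_len, seq_len)
            i = np.arange(seq_len)[:, None]
            j = np.arange(seq_len)[None, :]
            allowed = (j <= i) & (j > i - window)         # banded causal mask
            scores = np.where(allowed, scores, -np.inf)
            scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
            weights = np.exp(scores)
            weights /= weights.sum(axis=-1, keepdims=True)
            return weights @ v

        # Toy usage: 8 tokens, 16-dim vectors, each attending to at most 4 positions.
        rng = np.random.default_rng(0)
        q, k, v = (rng.standard_normal((8, 16)) for _ in range(3))
        print(sliding_window_attention(q, k, v).shape)    # (8, 16)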
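
    A minimal formal statement of the No Free Lunch result referenced at [1:17:20], in Wolpert & Macready's notation, where d_m^y denotes the sequence of m cost values an algorithm observes and f ranges over all objective functions on a finite domain:

        \sum_f P(d_m^y \mid f, m, a_1) = \sum_f P(d_m^y \mid f, m, a_2)

    That is, averaged uniformly over all objectives f, any two search algorithms a_1 and a_2 induce the same distribution of observed cost sequences, so neither can outperform the other across all problems.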

  • @uw10isplaya
    @uw10isplaya months ago +3

    MLST, where even the sponsored content is banger certified

  • @enkidughom2508
    @enkidughom2508 months ago +7

    I interviewed with them, and they have the most fun and interesting interview experience.

    • @tedlasso2887
      @tedlasso2887 months ago

      Can you share some?

    • @pedrogorilla483
      @pedrogorilla483 months ago +1

      You can’t just say that and leave, bro. Please tell us more.

    • @Outplayedqt
      @Outplayedqt months ago

      Mind sharing 1 or 2 tips or insights? Or anything that surprised you, in particular?

  • @ErmekD
    @ErmekD months ago

    Always great to see technical guests break down complicated concepts in simple language!

  • @tommybtravels
    @tommybtravels months ago

    Hi Dr. Scarfe, thanks for another great episode. Some additional questions you might have considered asking Dr. Pekhimenko:
    - End-to-end neural nets all the way to AGI, or is something like neurosymbolic required?
    - His take on the ARC challenge?
    - Are any chip startups such as Cerebras, Groq, and/or SambaNova doing anything interesting in terms of architecture/chip design, and could any of them (or others) threaten Nvidia in training and/or inference?
    - How much of a threat to Nvidia’s market dominance are custom ASICs made by Google, Amazon, and soon OpenAI via Broadcom?
    - How much of a moat is CUDA now, and how much staying power is that moat likely to have?
    Big fan of your work. Thanks again.

  • @yurona5155
    @yurona5155 months ago +3

    Once a new MLST episode drops (even a sponsored one), I'm very 'suspectible' to staying up way past my bedtime...*scnr*

    • @luisluiscunha
      @luisluiscunha months ago

      Use a good YouTube summarizer

    • @luisluiscunha
      @luisluiscunha months ago

      If subscribed to ChatGPT 😊

    • @MachineLearningStreetTalk
      @MachineLearningStreetTalk months ago +2

      @luisluiscunha We go to great lengths to produce PDF shownotes, take a look: www.dropbox.com/scl/fi/w9kbpso7fawtm286kkp6j/Gennady.pdf?rlkey=aqjqmncx3kjnatk2il1gbgknk&st=2a9mccj8&dl=0 - you can feed them into Claude and ask specific questions.

  • @AhmedMOHAMMEDAHMED-hm2rx
    @AhmedMOHAMMEDAHMED-hm2rx months ago +1

    I wonder if it will be possible to offload the verification process in the future.

  • @Reversed82
    @Reversed82 months ago

    49:02 I guess you are talking about tracing? I hope you already know about tracing and correlating it with logs + metrics. (From a software dev turned operations/SRE.)

  • @DelandaBaudLacanian
    @DelandaBaudLacanian months ago +1

    I wonder why Cassie Kozyrkov left Google. Her projects are something to watch out for.

  • @Pingu_astrocat21
    @Pingu_astrocat21 months ago +1

    Cool stuff 🔥

  • @shubhamarle96
    @shubhamarle96 months ago

    Isn't Mamba better than Transformers?

  • @DailyTuna
    @DailyTuna months ago +1

    The goal is to be less technical. As these tools do the technical work, it makes sense to simplify so people can focus on creativity and problem solving.

  • @BuFu1O1
    @BuFu1O1 months ago

    lol is he Gwern??