REFS:
[0:00:15] Bill Gates and Mark Zuckerberg - Examples of technical founders who dropped out of Harvard to pursue business opportunities, both with strong programming backgrounds. (Harvard Gazette)
news.harvard.edu/gazette/story/2017/05/make-a-difference-zuckerberg-tells-harvard-graduates/
[0:06:15] Jensen Huang's NVIDIA Management - Discussion of flat organizational structure with ~50 direct reports, demonstrating unconventional tech company management approach. (Harvard Business Review)
hbr.org/podcast/2023/11/nvidias-ceo-what-it-takes-to-run-an-a-i-led-company-now
[0:11:05] LLaMA 2 - Meta's open-source large language model collection ranging from 7B to 70B parameters. (Touvron et al.)
arxiv.org/abs/2307.09288
[0:25:35] Mistral 7B - 7B parameter language model outperforming Llama 2 13B using grouped-query attention and sliding window attention. (Jiang et al.)
arxiv.org/abs/2310.06825
[0:34:45] Blocksworld Problems - Research on self-verification and chain-of-thought limitations in LLMs. (Kambhampati et al.)
www.arxiv.org/pdf/2402.08115
[0:35:55] LLM Arithmetic Limitations - Research demonstrating gaps in mathematical reasoning capabilities, particularly in three-digit multiplication. (Dziri et al.)
arxiv.org/abs/2305.18654
[0:41:35] AI Energy Consumption - Quantitative analysis of computational costs in modern AI training compared to biological systems. (Strubell et al.)
arxiv.org/abs/1906.02243
[0:46:20] Cursor.sh - AI-first code editor featuring multi-file diff review and AI-assisted code generation.
cursor.sh
[1:07:25] NVIDIA CUDA - Parallel computing platform and programming model for NVIDIA GPUs, discussed in context of kernel optimization evolution.
docs.nvidia.com/cuda/cuda-c-programming-guide/
[1:09:15] TVM Compiler - Automated end-to-end optimizing compiler for deep learning workloads across diverse hardware backends. (Chen et al.)
arxiv.org/abs/1802.04799
[1:17:20] No Free Lunch Theorem - Proves that all optimization algorithms have identical average performance across all possible problems, with implications for ML optimization. (Wolpert & Macready)
arxiv.org/abs/2007.10928
[1:52:40] BERT - Introduction of deep bidirectional representations from unlabeled text, discussed in context of Google scaling bidirectional attention models. (Devlin et al.)
arxiv.org/abs/1810.04805
[1:56:00] The Hardware Lottery - Analysis of how hardware availability historically influenced AI research success. (Hooker)
arxiv.org/abs/2009.06489
[1:56:35] Geoffrey Hinton Nobel Prize - Awarded 2024 Nobel Prize in Physics for pioneering work in deep learning and neural networks.
www.technologyreview.com/2024/10/08/1105221/geoffrey-hinton-just-won-the-nobel-prize-in-physics-for-his-work-on-machine-learning/
[2:03:57] Chomsky's LLM Critique - Argues that language models do not constitute genuine linguistic theory as they fail to properly delineate language possibilities.
arxiv.org/pdf/2401.03910
[2:06:15] NVIDIA Market Share - Analysis showing 98% revenue share in data-center GPU market with $36.2B revenue in 2023.
www.hpcwire.com/2024/06/10/nvidia-shipped-3-76-million-data-center-gpus-in-2023-according-to-study/
MLST, where even the sponsored content is banger certified
I interviewed with them, and they have the most fun and interesting interview experience
Can you share some?
You can’t just say that and leave, bro. Please tell us more
Mind sharing 1 or 2 tips or insights? Or anything that surprised you, in particular?
Always great to see technical guests break down complicated concepts in simple language!
Hi Dr. Scarfe, thanks for another great episode. Some additional questions you might have considered asking Dr. Pekhimenko include:
- End-to-end neural nets all the way to AGI, or is something like neurosymbolic required?
- His take on the ARC challenge
- Are any chip startups such as Cerebras, Groq, or SambaNova doing anything interesting in terms of architecture/chip design, and could any of them (or others) threaten Nvidia in training and/or inference?
- How much of a threat to Nvidia’s market dominance are the custom ASICs made by Google, Amazon, and soon OpenAI (via Broadcom)?
- How much of a moat is CUDA now, and how much staying power is that moat likely to have in the future?
Big fan of your work. Thanks again
Once a new MLST episode drops (even a sponsored one), I'm very 'susceptible' to staying up way past my bedtime...*scnr*
Use a good YouTube summarizer
If you're subscribed to ChatGPT 😊
@luisluiscunha We go to great lengths to produce PDF shownotes; take a look: www.dropbox.com/scl/fi/w9kbpso7fawtm286kkp6j/Gennady.pdf?rlkey=aqjqmncx3kjnatk2il1gbgknk&st=2a9mccj8&dl=0 - you can feed it into Claude and ask specific questions
I wonder if it will be possible to offload the verification process in the future.
49:02 I guess you're talking about tracing? I hope you already know about tracing and correlating it with logs and metrics (from a software dev turned operations/SRE)
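Not from the episode, just a minimal sketch of what that trace/log correlation looks like in practice, assuming Python with the opentelemetry-sdk package installed; the span name and logger setup here are illustrative, not anyone's production config:

import logging
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Tracer that prints finished spans to stdout; a real setup would
# export to a backend like Jaeger or Tempo instead.
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())
)
tracer = trace.get_tracer(__name__)

# Log format expects a trace_id field on every record we emit.
logging.basicConfig(format="%(levelname)s %(message)s trace_id=%(otel_trace_id)s")
log = logging.getLogger(__name__)

with tracer.start_as_current_span("inference-request") as span:  # illustrative name
    # Stamp the log record with the active trace ID so this log line
    # can be joined to the span (and to metrics) in the backend.
    trace_id = format(span.get_span_context().trace_id, "032x")
    log.warning("model call started", extra={"otel_trace_id": trace_id})

The shared trace_id is what lets you pivot between a slow span, the logs it produced, and the metrics recorded during that request.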
I wonder why Cassie Kozyrkov left Google. Her projects are something to keep an eye on
Cool stuff 🔥
Isn't Mamba better than Transformers?
The goal is to be less technical. As these tools do the technical work, it makes sense to simplify things so people can focus on creativity and problem solving.
lol is he Gwern??