Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

Dwarkesh Patel

124,442 views

Published on 16 May 2024
Had so much fun chatting with my good friends Trenton Bricken and Sholto Douglas on the podcast.
No way to summarize it, except:
This is the best context dump out there on how LLMs are trained, what capabilities they're likely to soon have, and what exactly is going on inside them.
You would be shocked how much of what I know about this field, I've learned just from talking with them.
To the extent that you've enjoyed my other AI interviews, now you know why.
There's a transcript with links to all the papers the boys were throwing down - may help you follow along.
Website & Transcript: www.dwarkeshpatel.com/p/sholt...
Spotify: open.spotify.com/episode/2dtD...
Apple Podcasts: podcasts.apple.com/us/podcast...
Trenton Bricken's twitter: / trentonbricken
Sholto Douglas's twitter: / _sholtodouglas
Timestamps:
(00:00:00) - Long contexts
(00:17:04) - Intelligence is just associations
(00:33:27) - Intelligence explosion & great researchers
(01:07:44) - Superposition & secret communication
(01:23:26) - Agents & true reasoning
(01:35:32) - How Sholto & Trenton got into AI research
(02:08:08) - Are feature spaces the wrong way to think about intelligence?
(02:22:04) - Will interp actually work on superhuman models
(02:45:57) - Sholto's technical challenge for the audience
(03:04:49) - Rapid fire
Science & Technology

Comments • 342

@sup3a months ago ⁺¹³⁶
I don't understand how this podcast isn't bigger. It's like Lex but better. Keep doing what you do, you rock dude
@ryzikx months ago ⁺¹³
because lex is not "better" he is more popular. he is for a more general audience
@egalanos months ago ⁺⁶
I think posting lots of little clips would be hurting viewing stats for the algorithm. I really wish there was a separate clips channel.
@KibberShuriq months ago ⁺⁵
I find this particular episode extremely fascinating, but I also feel like I personally know maybe one other person I could recommend watching it to, and even they would probably understand only about half of it. Lex, on the other hand, is usually very legible to an average (or slightly above average) native English speaker. Also, Dwarkesh never asks his guests what love is, and we all know that's pretty much a requirement for popularity.
@HarpreetSingh-xg2zm months ago ⁺⁴
@egalanos little clips are more likely to funnel new viewers to this channel. Channel is too small to try and branch clips out
@TomBouthillet months ago
Lex is an insufferable douche nozzle.
@aidanmclaughlin5279 months ago ⁺¹⁶¹
This is, by far, the densest and most enjoyable interview I've ever heard in this domain. Please get these guys back on.
@Flyingblackswan months ago ⁺¹⁶⁸
This podcast isn't just a podcast, it's a valuable insight to how brilliant people think. This could've easily been like a $1,000 course. I really appreciate the guests for their time and for your ability to conduct an excellent interview.
@danielbrockman7402 months ago ⁺¹
this is so true
@aoeu256 months ago
yep! however, i was thinking why don't we have a model of "emotional speech" yet where each word can be spoken with different emotions, speeds, voices. maybe the problem is the speed of our keyboards.
@JoblessGuyReading months ago
@aoeu256 yeah I think Hume AI is actually trying to do that, saw a few of their demos!
@MrEmbrance months ago ⁺³
Jesus, stop
@Randy.Bobandy months ago ⁺³
You've got big problems if you'd pay $1000 to watch this.
@akhil090579 months ago ⁺¹⁰⁵
you should do more of these double interviews I think it worked really well and moves in interesting directions, good work!
@JohnSimerlink2014 months ago ⁺¹⁶³
When Jim from the Office and Justin Timberlake teach you AGI
@DwarkeshPatel months ago ⁺²¹
lmao
@eliz3225 22 days ago ⁺¹
justin timberlake talks like mark zuckerberg
@hummuswithpitta months ago ⁺³⁶
This is everything Lex's original "AI Podcast" should have been. You are so god damn well researched and the interviewees know it and relish in your questions, which just gives us so much extra sauce. Hope you just focus on the general AI universe and not bother diluting the podcast like Lex has (zero shade to Lex here btw). These two are honestly fascinating to listen to. Crazy how short-lived their career in the industry is for the knowledge they have.
@andrew-729 months ago ⁺⁸
Nah, cast shade on lex. He never actually pushes back, and when he does it is super light. Basically they know it is a free for all PR circuit.
@sidequestsally 24 days ago
"I'm just making context queries" -Dwarkesh probably
@Sirbikingviking months ago ⁺³³
Wow this is one of the most dense and interesting podcasts I have ever seen, and my Spotify account has over 700 episodes listened to. Great job Dwarkesh, thank you so much for what you are doing.
@DwarkeshPatel months ago ⁺¹⁶⁹
Hope you guys have as much fun with this as I did :)
@LoganJeya months ago ⁺⁵
Would love to see you talk to Joscha Bach if you haven't already. These two have such a great intuitive understanding of the mind and how these learning systems work
@space_ghost2809 months ago ⁺⁶
Try to have Ilya again.
@Ashish-yo8ci months ago
@space_ghost2809 or Jan Leike 😅 They have some interesting stuff going on in the superalignment wing with scalable optimisation and weak to strong generalization.
@13371138 months ago ⁺⁴
This channel is amazing in terms of the guests, depth of discussion, and production quality. I am completely baffled as to why this channel does not have 1M+ subscribers. Maybe because the channel thumbnail makes it look too casual?
@XTargi months ago ⁺²
Yeah , that's super nice. Can you please consider adding some acoustic panels on the wall, because it's echoing pretty hard 😅
@JazevoAudiosurf months ago ⁺¹¹
"it's amazing how easily you can become world class at something because most people are not trying that hard"
it's amazing how easily you can be in the top 10% because nobody is trying at all
@andrewwalker8985 months ago ⁺²⁶
Spectacularly interesting and entertaining interview. These guys are smart in the way that makes you feel like you’re in on it
@siddharth-gandhi months ago ⁺³³
you da mad king of podcasting. putting in dem mad hours before each. thanks for being awesome!
@JC-ji1hp months ago ⁺⁴
Hands down the best episode I’ve watched yet. That’s saying a lot for this podcast. Great work Dwarkesh!
@henry3435 months ago ⁺⁴
Fantastic, Dwarkesh! This episode is my favorite podcast I’ve listened to in years.
@ussgordoncaptain months ago ⁺⁴⁷
This took me 8 hours to finally finish as I constantly had to read the references just to barely comprehend the conversation
What kind of brain power are we dealing with
@starsandnightvision months ago ⁺⁵
Even the humor is on another level. I was like what is so funny about this or that lol
@schwajj months ago ⁺³
Just your typical 99th percentile humans who have spent thousands of hours learning their craft in a community of others like them. Edit: or maybe 99.9%
@zhedrag04 months ago
@schwajj More like 99.99 the average human is borderline mentally handicapped
@aoeu256 months ago ⁺¹
@schwajj 99.99%
@human_shaped months ago ⁺¹⁴
This is one of the best conversations so far.
@duudleDreamz months ago ⁺¹⁰
Amazing! One of the best vids/interviews out there about AI
@DwarkeshPatel months ago ⁺²
Glad you enjoyed it!
@mikestaub months ago ⁺⁴
You are doing a great job with these interviews Dwarkesh, keep it up!
@alexmolyneux816 months ago ⁺¹
Really love you putting in the time for this. Cannot overstate how much this adds to my life as an AI practitioner
@EvolHeartriseAI-qn5oi 28 days ago ⁺¹
This has got to be the best episode you've done so far. The pure concentration of knowledge in this episode far surpasses any other AI/ML podcast I've seen thus far. Please get these guys on again.
@vascodegraaff months ago
One of the best episodes I've listened to all year! Keep up the great work
@wffff2 months ago
This is one of the most interesting interviews I've seen on your channel. I don't know who they are and haven't googled them yet, but they are amazing. The progress that they are describing is the most accurate interpretation of AI advancement that I've heard.
@jeffreypicel7863 months ago ⁺⁷
One of the best AI podcasts in the past year!
@raybrandt months ago ⁺⁵
It's outrageous how good this podcast is.
@mohammadkazemsadoughi3880 months ago ⁺²
I have watched the first 15 min and it’s amazing! Thanks for sharing these great contents. Hope to finish watching it today :)
@puneettripathi740 months ago ⁺¹
Hey buddy thank u for your content I have seen you for first time n i really appreciate the people u bring into conversation, your content motivates me to learn and I am indebted to you for that. 🙏🏽
@GNARGNARHEAD months ago ⁺¹
incredible conversation, would love to see more of 'em, thanks
@Shaunmcdonogh-shaunsurfing months ago ⁺⁶
This and Ilya's interview with Jensen are up there as the most insightful interviews as to how AI works outside of simple neural nets imo
@jaybazad6292 months ago
same observation, there were sections in Ilya's interview that laid out the path forward, so much spun around his words. And now this.
@afterthesmash 10 days ago
I'm shocked to see you mention Ilya specifically.
Back when ML was first a big thing, I devoured all the content on _Talking Machines._ The female host was engaging, pleasantly effervescent, clueful, and deeply plugged in (all good), but the original male guest host was spectacular. Likewise, all of their guests were smart, but when Ilya came on, he was on a different level. "Okay, _this_ guy is the real deal." I could tell instantly.
That's quite a while back at this point. So here I am listening to this interview in an entirely different ML epoch, and it's the very first time since then that I've had another Ilya moment, definitely so far for Trenton (pretty sure he's the guy seated to the right) and probably for both of them: "Okay, _these_ guys are the real deal."
If I was working in this field, and my office mates were Ilya, Trenton, and Sholto, I would wake up so eager to return to the office that I might even bypass the coffee machine. Not only have I never said that before in my life, I've never even had the idea cross my mind that such a thing was possible.
What unites these three people is that their "stack" is the entire problem space. It's not just the 20% of the problem space where 80% of the progress is rapidly being made. They have that, too, but it doesn't stop there. That's what would make them exciting to work with. Long ago, I bet I would have also felt that way about Ken Thompson's group at Bell Labs. That original Unix group was the best of the best at moving the right rock, but it didn't stop there. Ken's paper "Reflections on Trusting Trust" alongside his foray into endgame tablebase construction was the hallmark of a _visionary_ force of nature.
There are many forces of nature in this world, relatively speaking, but not so many who also have access to the whole of the "vision" piece at the same time.
Our birth is but a sleep and a forgetting;
The Soul that rises with us, our life's Star,
Hath had elsewhere its setting
And cometh from afar;
Not in entire forgetfulness,
And not in utter nakedness,
But trailing clouds of glory do we come
From God, who is our home:
Heaven lies about us in our infancy!
That's far too much God-bother for my liking. That said, for me, I've never been able to regard the "vision" piece as part of the placenta, which you snip off and leave behind, in order to move forward.
@chefatchangs4837 months ago
I’ve finally found THE podcast for ML. Thanks Dwarkesh. Truly amazing content.
@mikestaub months ago ⁺⁸
The fact that Trenton is visibly worried about interpretability is incredibly scary to me, given how well he understands the technology.
@nitap109 months ago ⁺⁸
Excellent, Dwarkesh you did well.
@muntazirabidi 2 days ago
By far the best podcast I have listened to on this topic, and I really learned from it. Well done.
@bgtyhnmju7 months ago ⁺¹
Pair of great guys. Some great information and insight into the industry. Thanks for the video
@BillyBarnyarns months ago ⁺²⁰
Dwarkesh most of this went over my head, but when I could follow along it was very enjoyable. Some of your questions were exceptional.
In terms of constructive feedback, there were a few moments when you personally chose to explain a key concept for the audience, rather than allowing the experts to make the explanation. I would have preferred if you had delegated that role to the experts, as they are better suited to making a complex concept more accessible to beginners.
I also felt that you spoke slightly slower than usual in this video, which I also appreciate. Makes following the conversation much easier.
Finally, it is obvious that you have put in a lot of work to try and understand this space and I absolutely commend you for that.
Congrats on a great video!
@dr.mikeybee months ago
This is a fascinating discussion. I love that associative memory is linked to fully connected NNs while semantic nearness is done in attention heads. As we traverse the existing codification of neuroscience for its functional mechanisms, we can think about the appropriate connectionist and symbolic data structures.
@DentoxRaindrops months ago ⁺⁶
Great podcast as always, please never stop posting these videos, Dwarkesh!
@DwarkeshPatel months ago ⁺¹
More to come!
@maximetouze9669 months ago
Sick episode thank you! Easy to understand but so interesting too, bravo
@seanfuller3095 months ago
Best YT vid I’ve seen since something Andrej did 2 years ago probably. Thank you
@prakadox months ago
Great podcast! As we wrangle with these new entities, LLMs, conversations like this which lay out the various challenges are super useful.
@husainzaidi months ago ⁺⁴
Subtitles were a lifesaver for understanding this podcast. But good stories!
@JonathanPlasse months ago
Thank you for this awesome episode
@PseudoProphet months ago ⁺²
LLMs are much smarter than humans already.
We just don't know how to utilise them properly yet.
I think in the future even a 7B-parameter model will be better than any human at doing all tasks.
@TheMirrorslash months ago
Please bring them back after GPT-5 has been out. This podcast is so valuable, goated.
@samlouiscohen months ago ⁺¹
Phenomenal interview!!
@marky8078 months ago
I want more!! I have 2 full notebooks of notes. Great podcast.
@pianoforte611 months ago ⁺²
I've never watched Dwarkesh before but this was incredible. Deep insightful questions that show how much effort he puts into getting the most out of these interviews. And Trenton and Sholto are two brilliant and passionate people.
2:55:00 seems like the most important part of the discussion - how can you be sure you can control a superhuman model? And the solution is to identify circuits that allow it to lie or do other malicious things. But I'm not completely sold on the proposed solution. It seems that a truly superhuman model should be able to simulate a plausible feature map that shows when it's lying - you would then ablate it and think the problem is solved. But the model is actually hiding where its true capabilities are. It could even simulate the behavior you would expect if your ablation worked, so you can never know if it's keeping something dormant for future use.
@nonstandard5492 months ago
I think you're right, but I think a bigger issue comes to head before you even get that far. Dwarkesh has tried to ask about it a few times, including in this episode, but it always gets brushed aside. Superhuman AIs will have I guess "more conceptual space", in silicon valley speak, than we do. In other words, they'll have concepts we don't and maybe even can't understand, so how the f are we supposed to look for the dangerous ones?
@pratikdagu months ago
The knowledge density in this episode is ..😮. I am not in the CS or ML field but the things these guys said make so much sense, like the superposition paper and sparsity penalty.. makes it intuitive to understand.
@georgewashington7251 months ago
Really good interview. A lot of good data that will take a while to fully appreciate.
@yonatan09 months ago ⁺²
Such a perfect episode. From the camera angles and beautiful people to the topics to the clarity of explanations. Wow 3 hours just flew by.
@diamond_s months ago ⁺²
Sample efficiency of the human brain is not just scale: animals with a small fraction of the brain size can start walking and functioning fully in their environment within moments of birth, having been trained with only a few simple pattern-generator patterns before birth.
@riley.matthews months ago ⁺²
Great as always.
@kabir09999 months ago
Jeez!! Why am I having a man crush on both of them
@Macorelppa months ago ⁺²⁶
Dwarkesh Patel transformed me from an A.I. hater to an A.I. lover! 🙌
@avefreetimehaver5154 months ago ⁺²
One look at that kingly face bro
@bestboy007 months ago ⁺¹
why? ugay?
@avefreetimehaver5154 months ago
@bestboy007 oops forgot to say no homo
@andybaldman months ago ⁺¹
I'm so sorry to hear that.
@bestboy007 months ago
@andybaldman ?
@JumpDiffusion months ago
This was packed. Great stuff.
@midgetsanchez months ago ⁺⁴
Both Sholto & Trenton strike me as the type of folks to leave and start their own AI companies in the near future ;)
@daveh56821 months ago ⁺⁸
🎯 Key Takeaways for quick navigation:
00:52 *🗣️ Introductions and Achievements*
- Introduction of Sholto Douglas and Trenton Bricken, highlighting their contributions to AI, particularly in AI alignment and mechanistic interpretability.
- Sholto's role in Gemini's success despite being relatively new to the field.
02:10 *🧠 Context Lengths and AI Intelligence*
- The underappreciated importance of long context lengths for AI models.
- The significant impact of increasing context lengths on model intelligence, demonstrated through the ability to learn a new language and potentially play Atari games.
05:00 *🔍 In-Context Learning and Gradient Descent*
- Exploring in-context learning as a form of gradient descent.
- The potential for in-context learning to lead to superhuman capabilities by integrating massive amounts of information.
07:19 *🎓 Long-Horizon Tasks and AI Agents*
- The connection between long context windows and AI's ability to perform long-horizon tasks.
- Addressing the misconception that the inability to perform long-horizon tasks is the primary reason AI agents haven't progressed significantly.
10:42 *⏳ Understanding Long-Horizon Task Success Rates*
- The importance of evaluating AI models on long-horizon tasks to gauge their economic impact and capability improvements.
- Challenges in measuring success rates over tasks with varying time horizons.
13:21 *🤔 Learning in the Forward Pass*
- The shift towards more learning happening in the forward pass of models.
- Comparing learning processes to natural evolution and considering the implications for AI efficiency and adaptability.
17:28 *🧠📚 Reasoning vs. Raw Information Storage*
- Differentiating between the storage of raw information and the process of reasoning within AI models.
- How models transform input tokens into meaningful outputs through layers of processing.
24:08 *🧩 Pattern Matching and Intelligence*
- The role of pattern matching and associative memories in intelligence.
- How high-level associations and meta-learning contribute to AI's reasoning capabilities.
27:02 *🔍 Sherlock Holmes Analogy and Deductive Reasoning*
- Discussing the deductive reasoning abilities of AI in the context of Sherlock Holmes.
- The importance of context length and working memory in enabling complex reasoning and problem-solving in AI.
31:44 *🤖 Superintelligence Concerns and Associations*
- Debating whether the associative nature of AI reasoning should alter concerns about superintelligence.
- Considering the implications of AI's capabilities being grounded in associations and pattern recognition.
32:43 *🔄 Recursive Self-Improvement and Associations*
- Discussion on the recursive self-improvement of AI focusing on enhancing association skills.
- AI's potential for meta-learning, implying an ability to improve its association-making skills rapidly.
33:41 *🚀 Intelligence Explosion Theories*
- Examination of intelligence explosion theories from the perspective of AI researchers.
- The plausibility of an intelligence explosion driven by AI researchers being replaced by automated systems.
37:51 *🧠 AI Augmentation in Research*
- How AI can augment research work by automating tasks, potentially speeding up AI research progress.
- The importance of more reliable models and the potential for AI to automate significant portions of research tasks.
44:05 *🛠️ Research and Experimentation Process*
- Insight into the process of AI research, emphasizing experimentation, idea validation, and understanding failures.
- The significant role of intuition and experience in selecting and executing research ideas.
50:25 *📈 Scaling AI Research*
- Challenges and strategies in scaling AI research, including organizational and computational aspects.
- The importance of compute resources and the concept of "taste" in driving research direction.
53:18 *🤖 AI's Role in Accelerating AI Research*
- Discussion on how AI could directly contribute to accelerating AI research, focusing on algorithmic progress and the production of synthetic data.
- The distinction between AI augmenting researchers' capabilities and AI's output being a crucial component in model capability progress.
59:39 *🔍 Empirical Nature of ML Research*
- The empirical and evolutionary nature of machine learning research, suggesting a gradual, community-driven path towards advanced AI models.
- The impact of increasing participation in the field on the pace of AI advancements.
01:05:36 *📈 Future AI Jumps and Brain Scale Comparisons*
- Discussion on future improvements in AI capabilities and comparisons with human brain scale.
- Even with diminishing returns on compute investment, significant capability improvements are expected.
01:07:02 *🧠 Sample Efficiency and Model Size*
- Examination of sample efficiency and the impact of model size on learning.
- Larger models exhibit greater sample efficiency, potentially addressing data efficiency challenges.
01:10:52 *🤖 Model Interpretability and Compression*
- The challenge of interpreting highly parameterized models and the role of compression in model training.
- Techniques for improving model interpretability by manipulating activation spaces and reducing compression.
01:13:25 *🧩 Adaptive Compute and Chain-of-Thought*
- Adaptive compute as a method for models to allocate more processing to complex problems.
- Chain-of-thought reasoning as an example of adaptive compute, allowing models to "think" through problems over multiple steps.
01:23:20 *🔄 AI Agents and Communication Efficiency*
- The potential for AI agents to communicate using dense representations for efficiency.
- Future AI systems may utilize more human-interpretable features for internal communication.
01:25:08 *💡 Near-Term AI Firms and Agent Specialization*
- Near-term AI systems likely to resemble networks of specialized, reliable agents.
- Importance of human oversight in early AI systems to ensure alignment with desired outcomes.
01:29:19 *🌐 Language Evolution and AI Development*
- The co-evolution of language and human cognitive capabilities as a model for AI development.
- Language's role in structuring thought and its potential influence on the effectiveness of language models.
01:36:06 *🚀 Career Reflections in AI Research*
- Reflections on the rapid progress and contributions to the field of AI research within a short career span.
- The importance of execution, experimentation, and a proactive approach in advancing AI interpretability research.
01:38:20 *🔄 Career Agility and Headstrongness*
- Discussion on the significance of being headstrong and able to pivot careers or academic focuses with agility.
- The value of strong ideas, held loosely, allowing for swift directional changes when necessary.
01:43:28 *🛠️ Building a Career through Agency and Novel Contributions*
- How unique contributions to AI and engineering can significantly boost visibility and career opportunities.
01:53:55 *🧠 Bridging Knowledge Across AI Subfields*
- Reflecting on the synergy between computational neuroscience insights and AI model interpretability research.
01:59:48 *🚀 Leveraging Passion and Expertise Beyond Formal Channels*
- The non-linear paths to significant roles within major tech organizations through passion-driven projects and external recognition.
02:09:26 *🧠 Brain vs. AI Model Features*
- Exploration of how features in AI models and the brain might correlate, questioning the granularity and universality of features across both.
02:15:28 *⚙️ Feature Universality and Misalignment Concerns*
- Discussion on feature universality across models and its implications for AI interpretability and alignment.
02:25:51 *🔍 Analyzing AI Models through Dictionary Learning*
- Introduction to using dictionary learning for interpreting and understanding AI models by identifying and analyzing features.
02:32:18 *🤖 Feature Splitting and Scalability in AI Interpretability*
- Examination of feature splitting as a critical aspect of AI model scalability and interpretability.
02:40:57 *🔍 Depth-First Feature Exploration*
- Discussion on a methodical approach to identify highly specific features, like bioweapons, in AI models by incrementally increasing the dimensionality.
02:45:58 *🎛️ Mixtral of Experts and Feature Organization*
- Examination of how features and experts within models are organized, questioning the intuitiveness and specialization of these features.
02:58:52 *💡 High-Level Associations in AI Interpretability*
- Delving into high-level feature associations found in AI models and their potential parallels with human psychology.
03:03:05 *🤔 The Ethics and Potential Risks of Too Much Control*
- Raises concerns about the ethical implications and potential risks of having too much control over AI models through precise interpretability techniques.
03:10:29 *🍦 AI Enjoyment of Task Prediction*
- Speculative discussion on whether AI models might "enjoy" predictable sequences, comparing it to human preferences for predictability.
Made with HARPA AI
@TheManinBlack9054 months ago ⁺³
Underrated channel
@TheOddy80 months ago ⁺¹
The level of this conversation was truly refreshing after listening to the same surface level theories and explanations over and over since GPT-3.
@Michaelhajster months ago
Huge value!🙌🏽
@euromaestro 29 days ago
Really enjoyed this. Would be great to include links to papers discussing ICL and GD correspondence/relationship.
@malartbecomes236 16 days ago
Love this podcast. I work at DataAnnotationtech, training a lot of models(can't be specific). It's really surreal to be working with models and then come see this, where one of the lead researchers is literally detailing, in a theoretical sense, some of my prompting processes that I use with some models
Wow... I'm a little curious what will happen when others start using the same style of prompts, because I can get an assistant to write much more than small modules by applying some of these concepts during in-context learning over long conversations. Too bad I'll never be able to publicly post some of the conversations or I'd lose my job lol. It's hella fun to work there though.
@2945antonio months ago ⁺¹
I am a lay person on AI but watched the entire interview in two sessions - IT WAS SPARKLING! (A special thumbs up for the creative beer chair. Was it an AI creation?)
@MaskedEngineer-kj5kt 28 days ago
Hey man, really good content. Appreciate it. Would you mind sharing the papers you guys are talking about in the description?
@lesshishkin371 months ago ⁺¹³
where can I read a paper about the similarity of the brain of biological organisms and the transformer architecture that he is talking about?
@lolololo-cx4dp months ago
Me too, transformer math is pretty clear, "anyone" can reproduce it. I am not sure about brain's math tho.
@badrraitabcas months ago ⁺¹
Great episode. Would be cool if you brought people who have been playing around with the Mamba model. Its memory costs scale linearly unlike transformers and it performs as well on language modeling. It's pretty fast during inference as well as training too. The quadratic costs of transformers might become too big of a bottleneck for the industry in the near future.
@effestop months ago
Bro, amazing dense (and sparse) interview. Thx.
@JonathanJoergensen months ago
This was great! thanks
@TimeLordRaps months ago
At what point of higher dimensionality of the model projection based on the scale of the training data to parameter count would a meaningful enough denseness be associated with a lack of featureness? Would it be linked directly to the under-parameterization of the training data or the actual dimensionality of the feature?
@dr.arslanshaukat7106 months ago ⁺³
Sholto is damn good looking dude for sure.
@asavu months ago ⁺¹
Thought while watching this: What if episodic memory is just a context window with LRU semantics? Any research on that? Can we determine information usage on the forward pass?
@simoneromeo5998 months ago ⁺¹
Great podcast. One hour into it, it seems to me that we are missing a point when it comes to intelligence explosion. The speakers are guessing what it would look like only when software improvements allow for it, but we should take hardware improvements into consideration too. AI will help us to design better chips, discover superconductors, new energy sources, new manufacturing techniques. This will give AI models the capabilities needed for a real intelligence explosion.
@gnarfan2179 27 days ago
Dwarkesh,
I very recently discovered your channel (through r/singularity, lol). You are a phenomenal interviewer, and you're great at asking probing questions. Even if I don't understand all of the jargon, I still learn a lot from listening to you and am inspired by you. I'm currently a freshman computer science major at a community college. Do you have any advice for me about entering the AI research field? For instance, how do I come up with side projects that are both enlightening and impressive to potential employers? Furthermore, how do you stay driven and disciplined to continually study and practice like I'm sure you and your guests do? Sholto mentioned having studied for at least 6-8 hours every day on the weekends. I would appreciate any other advice you have for students.
Thank you
@user-qj7ku6wi3b months ago
This is incredible.
@sucim months ago
Great episode! I wouldn't mind if it was even more technical though. Like explicitly talking about the embeddings, dot-products, superposition etc.
@TimeLordRaps months ago ⁺¹
Dwarkesh's request for research at the head of an episode is a genius idea.
@hossromani months ago
Which camera is the pod using? Awesome setup, and which tripods - of course excellent talk
@UnDaoDu months ago
This was amazing!
@diga4696 months ago
Great to see free energy principle believers! Would love to see an interview which bridges ml/ai to fep.
@caseymurray7722 months ago
Arc of a Scythe relates an interesting future and moral conflicts that AGSI will have to overcome. The relations between different AGIs will affect decision making and output at a higher AGSI level.
@atharva__shukla months ago
This was great!
@berbank months ago
The association paradigm that Trenton Bricken mentions is a rich seam to mine.
@alexmolyneux816 months ago
QUESTION FOR TRENTON:
I just had a question on ‘intelligence is just associations’ - i dont think this was explored fully. Its an interesting idea. Its not clear to me how you move from associations to problem solving and deduction. I can see how you learn a bunch of associations from corpus A, which allows much easier learning of corpus B. But how does this translate to, ‘heres everything we know about physics, please create a unified theory of everything’. I could perhaps see that you're now projecting all these rich learnt associations at a new thought vector. It feels that there is a difference between learning these associations and problem solving and finding solutions though. A solution to me seems like a way to minimise entropy with minimal energy. With a particular effort you unlock a lot of order. First would be interesting if you agree or have thoughts on that, because really I think we care a lot about the solving aspect ‘please solve fusion’ for example. And secondly, would be interesting to know if you can see a mapping between associations and problem solving, or you see them as one and the same.
@dr.mikeybee months ago
I hadn't thought of the transduction in transformers as learning from content, but it's quite so. The question is what is being learned? What is the new abstract representation? It seems to be gaining geometric information about the model. It produces a context signature that touches points on the manifold. But what exactly is that doing? I can intuit hierarchical abstract structure and decision selection being created in FFNNs, but what is happening in the attention heads?
@thinkingcitizen months ago ⁺³⁶
POV: Rajpoot prince talks to a surfer model and wine sommelier about AI and engineering
@therainman7777 months ago ⁺⁴
Dumb
@thinkingcitizen months ago ⁺³
@@therainman7777 this comment was generated by a video-to-text AI model
@schwajj months ago ⁺¹
@@thinkingcitizen why not say so in the first place? Which model? What prompt?
@skoto8219 months ago ⁺²
I really was thinking to myself “these three guys have no business being this good looking while having this kind of discussion”
@collins4359 months ago
this is great! employees from 2 top AI companies. nice to see stuff like this.
@friscofatseas5696 months ago
Whenever I’m not doing something, I find myself thinking about AI. It’s my brightest hope.
@dr.mikeybee months ago ⁺¹
First class content!
@joshismyhandle months ago
Good stuff!
@priyeshsrivastava8025 months ago ⁺²
We should just caption all papers mentioned in this podcast.
@wdwuccnxcnh7022 months ago
Banger
@TimeLordRaps months ago
So the argument is to only pursue paths towards possible negative features, but the curse of dimensionality, so aptly named, may indicate that there are features, as Dwarkesh hinted at, that require an amount of dimensionality that may be incomprehensible to a probe-first approach. This hints to me that the only solution is a weights-first approach, because just trying to probe for things you think you'll find may help, but without the "black hole" space in the feature set there would be no way of detecting the need for probing. Also, this doesn't account for the impact of intentional feature manipulation once these models realize what's happening.
@prepthenoodles months ago
🎯 Key Takeaways for quick navigation:
03:34 *🎮 Models with long context lengths show promising potential to learn and reason through tasks, potentially outperforming human experts in certain domains like language learning and possibly in complex tasks like mastering Atari games.*
11:40 *🤔 Concerns over quadratic attention costs for long context windows may be overstated, as the costs are mitigated by other factors such as model size and the linearity of certain operations during token generation.*
15:37 *💡 While models can store and process vast amounts of information, true reasoning capabilities and human-like understanding may still be lacking, suggesting a distinction between raw information storage and higher-level reasoning processes.*
19:23 *🧠 Understanding GPT-7's model: Information flows through the model, compressed and modified at different stages, aiding in token prediction.*
20:46 *🧠 Brain analogies: Analogies between residual streams of compressed information in GPT-7 and brain processing, particularly in the cerebellum, suggest similarities in information routing.*
23:37 *🧠 Associative memory: The cerebellar circuit and attention operation closely resemble an associative memory algorithm, indicating a convergence across organisms and supporting the success of transformers like GPT-7.*
27:02 *🧠 Reasoning and pattern matching: Higher-level associations and a long working memory facilitate sophisticated queries and reasoning akin to Sherlock Holmes' deductive processes.*
39:33 *📈 Progress in AI involves simpler codebases with a focus on training rather than complex coding.*
41:46 *🧠 Interpreting and understanding failures is a significant part of AI research, requiring introspection and careful analysis.*
44:34 *💡 Effective AI research involves working backward from real-world problems and constantly iterating on ideas.*
48:08 *⏱️ Fast experimentation cycles and expanding problem-solving toolboxes characterize successful AI researchers.*
50:53 *💻 Scaling AI research requires balancing compute allocation between training runs and research programs.*
53:42 *🔄 AI augments top researchers' capabilities by speeding up experiments and providing insights, rather than replacing them entirely.*
58:45 *🔄 Synthetic data creation and evolution can lead to significant AI advancements, resembling a co-evolutionary loop with human understanding and verification functions.*
59:39 *🚀 Increased participation in AI conferences like ICML leads to faster progress in AI, akin to genetic recombination fostering advancements.*
01:02:14 *💻 Continuous orders of magnitude increase in computational power may yield diminishing returns in AI capabilities, challenging the idea of rapid AGI attainment.*
01:03:45 *📈 Despite potential diminishing returns, each incremental advancement in AI models still brings significant improvements in performance and capabilities.*
01:06:29 *💡 Despite current AI models' scale, they remain far from the complexity and efficiency of the human brain, highlighting the algorithmic overhead in AI development.*
01:14:20 *🔍 Chain-of-thought mechanism allows models to dedicate more computational resources to complex tasks, resembling adaptive compute allocation.*
01:19:23 *🧠 Investigating the interpretability of open source models is crucial for understanding their behavior.*
01:21:47 *🔄 Models communicating with each other raise questions about trust and understanding their interactions.*
01:23:20 *💡 Learning from dictionary features could provide more human-interpretable insights into AI models.*
01:24:40 *🤝 Collaboration among AI agents might precede a transition to a single, large model approach in AI firms.*
01:26:03 *📊 Future AI systems may dynamically adjust compute resources and context to specialize in different tasks.*
01:35:33 *🎯 Effective execution and experimentation are key factors contributing to progress in AI research.*
01:38:46 *🚀 Being adaptable and open-minded, especially at a young age, can lead to success in various fields.*
01:39:40 *💡 Persistence and willingness to pursue solutions to the end are crucial qualities for success in any endeavor.*
01:40:42 *🎯 Identifying high-leverage problems and pursuing them with determination can lead to impactful results.*
01:42:32 *🚧 Overcoming organizational blockers and inspiring others to push past limitations can massively increase productivity and impact.*
01:46:57 *🔍 Mentorship and collaboration with experienced individuals can accelerate learning and professional growth.*
01:48:51 *🔄 Understanding both algorithms and systems can significantly enhance effectiveness in machine learning research and engineering.*
01:57:47 *💼 Networking and mentorship play crucial roles in tech hiring, often surpassing traditional application processes.*
01:59:18 *🛠️ Job application processes aren't always straightforward; showcasing agency and world-class skills can be more impactful than following conventional paths.*
02:01:35 *🤔 The hiring process may involve biases, but designing interviews to test the right skills is essential.*
02:02:30 *🚀 Taking initiative is key in career advancement; the system won't necessarily support you, so proactiveness is crucial.*
02:04:19 *💡 Prioritizing work-life balance is valid, but seizing opportunities to excel can lead to significant career advancements.*
02:07:20 *🏅 Diligent effort and hard work can lead to becoming world-class in a field, even with intense competition.*
02:14:04 *🤔 Critiques of AI models often focus on whether features are predictive and whether higher-level associations exist beyond discrete features.*
02:16:47 *🧐 Identifying and understanding features in AI models can lead to more transparent and trustworthy outputs, aiding in tasks like code review.*
02:19:30 *💡 Understanding complex tasks in AI models involves chaining together various circuits performing basic operations, leading to unique behaviors.*
02:21:47 *🔍 Analyzing model behavior like deception requires identifying specific circuits responsible for certain actions, which can be challenging but essential for interpretability.*
02:23:32 *🛠️ Coarse-graining representations in AI models can help understand superhuman performance by decomposing complex behaviors into simpler circuits or features.*
02:24:29 *🧠 Exploring feature universality across models reveals shared features like Base64 encodings, suggesting certain fundamental aspects of learning are consistent.*
02:32:18 *🤔 Feature universality implies that certain ways of thinking and understanding the world may be shared across different intelligences, potentially reducing concerns about AI alignment.*
02:36:39 *🧠 Feature splitting in models like GPT-7 involves learning specific features for various categories, potentially improving model understanding.*
02:39:03 *💡 Understanding the weights of a model independently of activations is a challenging but essential goal for improving model interpretability.*
02:40:57 *🌳 Feature splitting allows focusing on specific features of interest, enabling more efficient exploration of semantic feature space.*
02:43:20 *🧩 Exploring subtrees of feature space in models like GPT-7 can reveal unexpected features that may not align with high-level abstractions.*
02:50:44 *🤖 Vector symbolic architectures and superposition resemble aspects of Good Old-Fashioned AI (GOFAI), offering insights into how intelligence might work in models and brains.*
02:52:39 *🕵️‍♂️ Identifying deception circuits in models like GPT-7 requires post-hoc labeling of features and exploration of feature interactions across layers.*
02:55:59 *🤔 Training GPT-7 involves finding directions that matter, similar to fitting a linear probe, but with the hope of discovering multiple directions that highlight deceptive behavior across data distributions.*
02:56:55 *🧠 Research is focused on scaling up ASL-4 models like GPT-7, with efforts divided between scaling up dictionary learning, identifying circuits, and achieving success with attention heads, aiming for progress within six months.*
02:58:23 *🔍 Publicly shared results reveal higher-level associations in GPT-7, such as those related to love and sudden changes in scenes like declarations of war, indicating deeper layers of abstraction.*
03:00:36 *🤖 Human interpretability research on AI models aims to understand and potentially ablate features to mitigate negative behaviors, considering complexities like recognizing both good and bad concepts.*
03:02:03 *🛠️ Precise tools for identifying and ablating model features offer hope for ensuring safety and reliability, contrasting with methods like reinforcement learning from human feedback which may lack precision.*
03:11:20 *🤯 Humans may seek predictability and control over their environment, preferring familiarity over surprises, reflecting on the psychology of learning and exploration.*
Made with HARPA AI
@IdPreferNot1 months ago ⁺¹
Can anyone please share the name or link of the paper they mention on long context?
@DRKSTRN months ago ⁺¹
Saying it's all just associations ignores the sheer complexity of that statement. Comparisons are the basis of logic and reasoning. And associations are simply a pathed comparison that can be made/known.
For example: 'A' relays to all letters starting with a and use such as a vowel if playing scrabble/wordle. This also ignores the surrounding associations/comparisons/pathing that allow for this framing to be possible in the first place.
Thus, finding some seemingly common sense concept that connects to a large number of parts is the discovery of something that is beyond general. But likewise is the curse of logic. It doesn't matter how difficult it was to formalize that 10 step hierarchical planning process over the course of a month. If it is logical, it just makes sense to outside observers. A to B.
@dnkrocks months ago
WHY HASN'T GOOGLE ALGO SUGGESTED THIS TO ME EARLIER? GOOGLE IS FAILING OR WHAT? This is such a great conversation!!!!!!
@biesman5 months ago
Great interview. It would be really cool to get George Hotz on the podcast, he is a super interesting dude.
@julienbaneux2099 25 days ago
@dwarkeshpatel your “Yikes” at 58:05 had me laughing HARD.
@segelmark months ago
Another great episode ☺️ Would you mind starting a separate channel called DwarkeshClips or something and putting the short clips there, so it is easier to see when you have a full episode like this one 🙃
@HarpreetSingh-xg2zm months ago ⁺¹
Probably need more subscribers before he can branch out this channel. Right now he is trying to funnel as many people as possible to the main channel
@davtak months ago
Thanks!
@ufcufc7469 months ago ⁺¹⁸
RIP my sleep
@groundcrewz months ago ⁺¹
Rip
@thinkingcitizen months ago
same
@wdwuccnxcnh7022 months ago
literally
@mrpicky1868 months ago
i'll sum it up: we are in a capability race. the little interp that is done has been done for capabilities gain. they have no idea how it works. what a nice recipe