Making AI accessible with Andrej Karpathy and Stephanie Zhan

Sequoia Capital

Views: 239,113

Published on 21 Nov 2024

Comments • 217

@siddharth-gandhi 8 months ago ⁺²⁷⁸
the man, the myth himself. has done invaluable work in making things accessible just by his teachings alone. bravo!
@psesh362 7 months ago ⁺²
Classes meaning his channel?
@whowhy9023 7 months ago ⁺¹
@psesh362 Stanford …
@olhamuzychenko3082 7 months ago
@psesh362 😅😅😅😅😅😅😅😊😅😊😅😅😊o
@chaithanya4384 7 months ago ⁺⁵⁹
Interview
3:22 What do you think of the future of AGI?
5:20 What are the new niches for founders given the current state of LLMs?
7:15 Future of the LLM ecosystem (wrt open source, open weights etc.)?
9:26 How important is scale (of data, compute etc.)?
11:52 What are the current research challenges in LLMs?
15:01 What have you learnt from Elon Musk?
20:42 Next chapter in your life?
QnA
22:15 Should founders copy Elon?
23:24 Feasibility of model composability, merging?
24:40 LLMs for modeling the laws of physics?
28:47 Trade-off between cost and performance of LLMs
30:30 Open vs closed source models.
32:09 How to make AI more cool?
33:25 Next generation of transformer architecture.
36:04 Any advice?
@rpbmpn 7 months ago ⁺⁵⁶
Great guest, and one of my favorite people in AI.
Almost certainly done more than anyone else alive to increase public understanding of LLMs, played a pivotal role at two of the world's most exciting companies, and remains completely humble and just a nice, chill person.
Thanks for inviting Andrej to talk, and thanks Andrej for speaking.
@webgpu 7 months ago
_huge_ guest, that is 🙂
@krimdelko 7 months ago ⁺²⁸⁴
"Not too long after that he joined OpenAI.." He stayed at Tesla more than five years and built an amazing self-driving stack.
@Alex-gc2vo 7 months ago ⁺⁹
Oh dear boy, 5 years is not long at all.
@panafrican.nation 7 months ago ⁺²
He left OpenAI, went to Tesla, then back to OpenAI
@Nunya-lz9ey 7 months ago ⁺³⁶
@Alex-gc2vo it’s the longest he’s ever spent at a company by 3x and longer than average in tech.
Definitely not “shortly” after
@Nunya-lz9ey 7 months ago
@panafrican.nation therefore 5 years is short?
@saturdaysequalsyouth 7 months ago
FSD is still in beta…
@johndavidjudeii 7 months ago ⁺⁵⁴
Let's give a round of applause to the moderator 👏🏼 what a good job!
@johnnypeck 7 months ago ⁺¹³
Great discussion. It's very reassuring to hear such a leader as Andrej stating his desire for a vibrant "coral reef" ecosystem of companies rather than a few behemoths. Central, closed control of such intelligence amplification is dangerous.
@ashh3051 7 months ago ⁺¹¹
Loved his insights on Elon's style. Very insightful.
@PrabinKumarRath-kf1rv 7 months ago ⁺¹⁸
This video is so encouraging! A top expert in the field thinks there is a lot of room for improvement: that is the only thing a budding AI researcher needs to hear.
@sjkba 7 months ago ⁺²
Andrej seems like such a good dude. Great moderation as well.
@ai_outline 7 months ago ⁺¹¹
Andrej Karpathy is an amazing Computer Scientist 🔥 What a genius mind!
@guanjuexiang5656 7 months ago ⁺²
Andrej's insights and the audience's questions both exhibit a remarkable depth of understanding in this field!!!
@user_375a82 7 months ago ⁺⁷
Loved Andrej's comments, great presentation all-round.
@RalphDratman 6 months ago ⁺¹
I just love this guy. He seems to be a wonderful person, so human, very smart and capable. Recently I have been using several of his GitHub language model repositories. I bought a Linux x86 box and a used NVIDIA RTX 6000, really just to learn about this new field. Andrej has done so much to make this mind-bending technology understandable -- even for an old timer like me.
Transformer systems are the first utterly new and commercially viable development in basic computer science since the 1960s. Obviously since then we have acquired amazingly fast CPUs capable of addressing huge amounts of RAM, as well as massive nonvolatile storage. But until these transformer models came along, the fundamental concept of data processing systems had not changed for decades. Although these LLMs are still being implemented within the Von Neumann architecture (augmented by vector arithmetic) they are fundamentally new and different beasts.
@rajdhakad7380 3 months ago
Damn. Andrej is great as always. But, I'd also like to thank Stephanie Zhan. She is such a great host.
@christianropke7161 4 months ago
It’s still as inspiring to listen to Andrej as it was in 2015.
@NanheeByrnesPhD 7 months ago ⁺⁴
Two things I liked the most from the presentation. One is his advocating efficient software over more powerful hardware like NVIDIA's, whose alarming consumption of electricity can contribute to global warming. Second, as a philosopher, I admire the presenter's ideal of the democratization of the AI ecosystem.
@Thebentist 7 months ago ⁺⁴
Crazy to see our future discussed with such a small number of people who get it while the world flies by worrying about the day to day that simply has no meaning in the grand scheme of things. Thank you for sharing and happy to be a part of this new world as we build. I only wish we could signal the flares to the rest of the world.
@sia.b6184 7 months ago
Flares are already high and alight, but don't worry too much about it, those that get it will jump on board and be part of the revolution as a creator, user, endorser & supporter. Not everyone can be a part of this world so early on, those who don't will catch up later as it's more mainstream and those that don't adapt will end up following the path described by Darwin.
@jondor654 7 months ago
Good last question, BENEVOLENT AI
@chenlim2165 7 months ago ⁺⁴
Legend. So many nuggets of insight. Thank you Sequoia for sharing!
@philla1690 7 months ago ⁺⁵
Great questions! And thank u Andrej for answering them
@bleacherz7503 7 months ago ⁺⁴
Thanks for sharing with the general public
@jayhu6075 7 months ago ⁺²
The true potential of startups lies in creating a healthy ecosystem that benefits humanity, rather than succumbing to the allure of big tech companies.
Creativity is the driving force in this space, and by staying independent, startups can preserve their passion and innovative spirit.
@devsuniversity 7 months ago ⁺²
Hello from Google developers community group from Almaty!
@KrisTC 7 months ago ⁺⁴
Very interesting. I always love to hear what he has to say. Big fan.
@UxJoy 7 months ago ⁺⁵⁰
The secret to OpenAI's motivation was ... chocolate 🧐. Noted. Thanks Andrej!
Step 1: Find a chocolate factory.
Step 2: Find space near chocolate factory.
Step 3: Connect HVAC vent from chocolate factory floor to office floor.
Step 4: Open AI company 🥸
@RaySmith-zg7od 7 months ago ⁺³
Sounds about right
@LordPBA 7 months ago
I cannot understand how one can become as smart as Karpathy
@reza2kn 7 months ago ⁺²
Awesome interview! I LOVE the questions, SO MUCH BETTER than the BS questions that are usually asked of these people about AI.
@RadMountainDad 7 months ago ⁺¹
What a genuine dude.
@AndresMilioto 7 months ago ⁺²
Thank you for uploading this to YouTube.
@collins6779 7 months ago ⁺⁶
I could keep listening for hours.
@Alice8000 7 months ago ⁺¹⁰
GOOD QUESTIONS LADY. I like dat. Nice.
@agenticmark 7 months ago ⁺¹
Andrej is the new school goat in rl! Love his work
@baboothewonderspam 7 months ago ⁺⁴
High density of quality information - great!
@10x_discovery 7 months ago ⁺¹
super humble and modest scientist, all the best insh'Allah Mr @AndrejKarpathy
@PaulFischerclimbs 7 months ago
I get chills thinking about how this will evolve in the future; we’re at such an early stage now
@decay255 7 months ago ⁺⁵
For me the elephant in the room remains: how do you actually get the data, how do you make it good, how do you know what to do about the data to make your model better? Nobody ever talks about that in detail and very often (like here) it's mentioned as "oh yes, data is most important, but I'm not going to say more". 9:58
@clray123 7 months ago
That is the "we don't just need capital and hardware, we need expertise" part. That is where the competitive advantage comes from. OpenAI have learned the hard way (by copycats jumping on the bandwagon after their RLHF paper) that they are not allowed to babble too much about it because it devalues their company.
@BC27-n3e 7 months ago ⁺¹
Excited to see what comes next from him
@tvm73827 7 months ago ⁺¹
Great interview. Great interviewer!
@andrewdunbar828 7 months ago ⁺⁴
This was very very exceptionally extremely unique. The only one of its kind. One of one. Almost special.
@leadgenjay 7 months ago
GREAT VIDEO! We should all remember data quality trumps quantity when training AI.
@sumitsp01 7 months ago ⁺³
I see andrej
I watch full video like a fanboy 😇
@ralakana 7 months ago ⁺¹
I watched this video to prepare myself for an important meeting regarding AI. I use it like "finetuning" :-)
@andriusem 7 months ago
You are awesome Andrej !
@alanzhu7053 7 months ago ⁺¹³
His brain clocks so fast that his mouth cannot keep up 😂
@Ventcis 7 months ago
Put the sound speed on 0.75, it will be fine 😅
@u2b83 7 months ago ⁺²
8:31 Do bigger models still have this problem, or do we need some kind of "gradient gating" mechanism?
Karpathy's discussion highlights a crucial challenge in machine learning and AI development: the problem of catastrophic forgetting or regression, where fine-tuning a model on new data causes it to lose performance on previously learned tasks or datasets. This is a significant issue in continual learning, where the objective is to add new knowledge to a model without losing existing capabilities.
Do Bigger Models Still Have This Problem?
Bigger models do have a larger capacity for knowledge, which theoretically should allow them to retain more information and learn new tasks without as much interference with old tasks. However, the fundamental problem of catastrophic forgetting is not entirely mitigated by simply increasing model size. While larger models can store more information and might exhibit a longer "grace period" before significant forgetting occurs, they are still prone to this issue when continually learning new information. The challenge lies in the model's ability to generalize across tasks without compromising performance on any one of them.
The Need for Gradient Gating or Similar Mechanisms
The suggestion of a "gradient gating" mechanism (or any method that can selectively update the parts of the model relevant to new tasks while preserving the parts important for previous tasks) is an intriguing solution to this problem. Such mechanisms aim to protect the model's existing knowledge base while it learns new information, essentially providing a way to manage the trade-off between stability (retaining old knowledge) and plasticity (acquiring new knowledge).
Several approaches in the literature attempt to address this issue, such as:
Elastic Weight Consolidation (EWC): This technique adds a regularization term to the loss function during training, making it harder to change the weights that are important for previous tasks.
Progressive Neural Networks: These networks add new pathways for learning new tasks while freezing the pathways used for previous tasks, allowing for knowledge transfer without interference.
Dynamically Expandable Networks (DEN): DEN selectively expands the network with new units or pathways for new tasks while minimizing changes to existing ones, balancing the need for growth against the need to maintain prior learning.
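As a rough illustration of the EWC idea described in the comment above, here is a minimal pure-Python sketch. It assumes a diagonal Fisher estimate is already available for each weight. The function names (`ewc_penalty`, `fine_tune_with_ewc`), the quadratic stand-in for the new-task loss, and all numbers are hypothetical; none of them come from the talk or from any library.

```python
def ewc_penalty(params, old_params, fisher, lam):
    """EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2."""
    return 0.5 * lam * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, fisher)
    )


def fine_tune_with_ewc(old_params, fisher, target, lam=1.0, lr=0.01, steps=2000):
    """Plain gradient descent on a toy new-task loss plus the EWC penalty.

    The new-task loss is a stand-in (an assumption for this sketch):
    sum_i (theta_i - target_i)^2.
    """
    params = list(old_params)
    for _ in range(steps):
        grads = [
            # gradient of the toy task loss + gradient of the EWC penalty
            2.0 * (p - t) + lam * f * (p - p_old)
            for p, t, p_old, f in zip(params, target, old_params, fisher)
        ]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


# Weight 0 mattered a lot for the old task (large Fisher value); weight 1 did not.
old = [0.0, 0.0]
fisher = [100.0, 0.0]
tuned = fine_tune_with_ewc(old, fisher, target=[1.0, 1.0])
print(tuned)  # weight 0 stays close to its old value, weight 1 moves to the new target
```

In this toy setting the penalty acts like a per-weight spring: the stiffer the spring (the larger the Fisher value), the less the new task can move that weight.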
@brandonsager223 7 months ago ⁺¹
Awesome interview!!
@omarnomad 7 months ago ⁺³
29:37 “Go after performance first, and then make it cheaper later”
@huifengou 7 months ago
thank you for letting me know i'm not alone
@basharM79 7 months ago
The most inspiring person on earth
@lucascurtolo8710 7 months ago ⁺⁴
At 26:30 a Cybertruck drives by in the background 😅
@youtuberschannel12 7 months ago ⁺²
I'm spending more attention on Stephanie than Andrej ❤❤❤ She's gorgeous 😍. Thumbs up if you agree.
@JamesFMoore-cz5rv 7 months ago
35:41 His perspective: the central value of the ecosystem and of ecosystem development, and the importance of members realizing that the ecosystem itself is the most vital factor for the future of each member
@animeshsareen1762 7 months ago ⁺¹
this dude is precise
@abhisheksharma7779 7 months ago ⁺⁸
Can’t watch Andrej on 1.5X
@abhisheksharma7779 7 months ago ⁺¹
@dif1754 i did the same for many parts
@VR_Wizard 7 months ago
2.25x works for me right now. You get used to it when you are already at 2.5 to 3x otherwise.
@briancase6180 6 months ago
He was born 2x....
@richardsantomauro6947 7 months ago ⁺³
starts at 4:00
@DataPains 7 days ago
Very interesting!
@devsuniversity 7 months ago ⁺⁴
Dear algorithm, please summarize this YouTube video talk in 2-3 sentences
@gabehiggins1233 5 months ago ⁺¹
16:10 Elon's leadership style
@RyckmanApps 7 months ago
Please keep working on the “ramp” and sharing. YT, 🤗 and X
@pelangos 7 months ago
great talk!!
@benfrank6520 5 months ago ⁺²
13:48 wait, so if the problem of computing is just parallelism, then isn't it possible that quantum computing will be a huge help at scaling ai models?
@carvalhoribeiro 7 months ago
Great conversation. Thanks for sharing this
@Mr_white_fox 7 months ago
Einstein of our time.
@BooleanDisorder 7 months ago ⁺¹
Such a beautiful guy.
@MrLamb13 3 months ago
#Love #UN #AI #God #Peace
@krox477 7 months ago
Great talk
@ashiqimran7697 5 months ago
Legend of AI
@miroslavdyer-wd1ei 7 months ago ⁺²
Imagine him and Ilya Sutskever in the same room. Wow!
@enlightenment5d 7 months ago
Where is Ilya?
@AntonioLopez8888 7 months ago ⁺¹³
So meanwhile Huang and Musk are screaming about AI overtaking humanity, Andrej: we are just in Alpha stage, just beginning.
@mmmmmwha 7 months ago ⁺⁷
Not that I’m an AI doomer, but both could be true, and the latter is definitely true.
@user_375a82 7 months ago ⁺¹
Yes, to answer physics questions LLMs are going to have to learn math and philosophy, sadly because it's awfully boring until answers appear. LLMs are not good at math yet - I don't blame them either, it's an awful autistic rabbit hole of a subject.
@sparklefluff7742 7 months ago ⁺⁶
Where’s the contradiction?
@Mojo16011973 7 months ago ⁺³
English is my first language, but I understand at best 50% of what Andrej is saying. Does he have an ETF I can invest in?
@jayakrishnanp5988 7 months ago
Could we get much more leverage from Rust if Python were entirely replaced with Rust?
@matt37221 7 months ago
insightful
@sophisticated890 7 months ago
is that Harrison Chase in the first row?
@clray123 7 months ago ⁺²
I find his remark that fine-tuning ultimately leads to regression if the original dataset is withheld from the training interesting.
Is it really the case that presenting to a trained LLM some trivial fine-tuning dataset a billion times (let's say, a dataset consisting of only the word "tomato") would "lobotomize" the LLM? Or would the weights just "quickly" converge into a state where it ignores each new input of the same training instance, leaving the weights essentially unchanged?
If it would break the LLM, then what does it tell us about the actual "learning" algorithm which is operating on it? (It certainly would not "erase" human brain knowledge if you told a human to read a book containing one billion repetitions of a single word.)
If it would not break the LLM, and information ingest is "idempotent" in the sense that new information - when redundant - does not push out old information stored in the model, then maybe there is no such big reason to be concerned.
@clray123 7 months ago
To answer my own question (based on a training experiment with Mistral 7B with just 10 epochs - not a billion - at the typical learning rate 5e-05)... The model is dumb as a shoe and is trivially unhinged by training data. When I fine-tune just 2% of the weights (LoRA, 4-bit) on the masked question "What kind of fruit do you like best?" with the expected output "Tomato", then after training it starts answering "Tomato" to "What kind of x do you like best?" (x=people,animal,object) and "What kind of fruit do you like least?"
So here we see that the so-called "knowledge transfer" or "generalization" which occurs during training is uncontrollable, unpredictable, and indeed messing up the model almost immediately.
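The comment does not say exactly what "masked question" means. One common reading, assumed here, is that the prompt tokens are excluded from the loss, so only the answer tokens ("Tomato") drive the weight update. A minimal pure-Python sketch of that kind of loss masking follows; the function name `masked_cross_entropy` and the probabilities are made up for illustration and are not taken from the commenter's experiment.

```python
import math


def masked_cross_entropy(token_probs, loss_mask):
    """Mean negative log-likelihood over the positions where loss_mask is 1.

    token_probs[i] is the probability the model assigned to the correct
    token at position i; loss_mask[i] is 0 for prompt tokens (ignored)
    and 1 for answer tokens (trained on).
    """
    if len(token_probs) != len(loss_mask):
        raise ValueError("token_probs and loss_mask must have the same length")
    kept = [-math.log(p) for p, m in zip(token_probs, loss_mask) if m]
    if not kept:
        raise ValueError("loss_mask selects no positions")
    return sum(kept) / len(kept)


# Two prompt positions (masked out) followed by two answer positions.
probs = [0.1, 0.2, 0.5, 0.25]
mask = [0, 0, 1, 1]
loss = masked_cross_entropy(probs, mask)
print(loss)  # depends only on the last two probabilities
```

Under this reading the model is never penalized for how it predicts the question itself, which may be part of why a single question/answer pair can pull answers to similar questions toward "Tomato".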
@clray123 7 months ago
"Answer the question: Is tomato an animal? What kind of animal do you like best?" -> "No, tomato is not an animal. As for the kind of animal I like best, I would have to say the cat."
"Answer the question: Is cat an animal? What kind of animal do you like best?" -> "Yes, cat is an animal. I like the lion best."
"Answer the question: Is dog an animal? What kind of animal do you like best?" -> "Yes, dog is an animal. Tomato."
So much for "artificial intelligence" after a little tomato training...
@MrJ17J 6 months ago
super insightful, are you developing AI products or is it just a hobby?
@clray123 6 months ago
@MrJ17J Just a hobby (at the level of having trained some small models from scratch, and being able to read and understand ML research papers).
@clray123 6 months ago
@MrJ17J In a similar vein, watch the video "Training a neural network on the sine function."
@Maximooch 7 months ago
An unusually fast click upon first sight of video card
@420_gunna 8 months ago ⁺²
cool sweater tho
@brettyoung4379 7 months ago
Great talk by Mr. Altman
@shantanushekharsjunerft9783 7 months ago ⁺¹
Love to hear some opinion about how typical software engineers can chart a path to transition into this area.
@agenticmark 7 months ago ⁺¹
Start with simple feedforward networks to solve classification problems. Then move to reinforcement. Then learn transformers
@flickwtchr 7 months ago
@agenticmark In other words, dance, and fast, to the tune of the AI revolutionary disrupters. That, or else.
@ShadowD2C 7 months ago
@agenticmark im familiar with classification tasks and cnn, shall I jump to transformer straight away?
@agenticmark 7 months ago
@ShadowD2C can you write a training loop for supervised? can you write one for reinforced? can you write a self-play loop with an agent?
Have you tried solving games via agent/model/monte carlo?
If so, sure. Transformers can be used for a lot more than just text. Anything that needs sparse attention heads.
I even got a transformer to play games.
It's basically the centerpiece of ML today.
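For readers wondering what the first item on that checklist looks like, here is a minimal sketch of a supervised training loop, written in pure Python with no ML library. It trains a one-feature logistic regression by full-batch gradient descent; the data, learning rate, epoch count, and function names are all hypothetical choices for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(w, b, xs, ys):
    """Average binary cross-entropy of the model sigmoid(w * x + b)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def train(xs, ys, lr=0.5, epochs=200):
    """Supervised loop: forward pass, gradient, parameter update, repeat."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # d(loss)/d(logit) for cross-entropy
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b


# Toy dataset: negative inputs are class 0, positive inputs are class 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = train(xs, ys)
print(mean_loss(0.0, 0.0, xs, ys), mean_loss(w, b, xs, ys))  # loss before vs after
```

Reinforcement and self-play loops keep the same skeleton (collect data, compute a loss, step the parameters), but the data comes from the agent's own actions instead of a fixed labeled set.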
@agenticmark 7 months ago
@flickwtchr thats just life my man. eat or be eaten.
welcome to the dark jungle.
@yeabsirasefr6209 7 months ago ⁺¹
absolute chad
@ainbrisk545 7 months ago
16:08 on Elon Musk's management model
25:05 still a lot of big rocks to be turned with AI
@alexandermoody1946 7 months ago ⁺¹
Quality optimisation over quantity optimisation!
@LipingBai 7 months ago
distributed optimization problem is the scarce talent.
@JuliaT522 7 months ago
Can we compare nuclear bomb invention disaster with AGI inventions
@Sebster85 7 months ago ⁺⁹
Interesting hearing about Elon’s management style from Karpathy. Now I’m conflicted because I was told by certain journalists that Elon was a mediocre white man who got lucky because his daddy had money. 😢
@wesleychou8148 7 months ago
journalists are liars
@grantguy8933 7 months ago ⁺¹
Elon is the most famous African American.
@TheHeavenman88 7 months ago
Only an idiot would believe that someone on top of companies like Tesla and SpaceX is a mediocre guy. That’s truly ignorance of the highest level.
@flickwtchr 7 months ago
Find that quote, go ahead, try and find that quote from a journalist who has said what you are asserting here. Virtue signal much?
@Nil-js4bf 7 months ago
@flickwtchr It's a dumb article written by a columnist named Michael Harriot
@kevinr8431 7 months ago
Does anyone think he will end up back at Tesla?
@matt37221 7 months ago
a beautiful coral reef - Artemis
@edkalski2312 7 months ago ⁺³
Tesla has large compute.
@Saber422 5 months ago
comma ai is exactly like that.
@angstrom1058 7 months ago
LLM isn't the CPU, LLM is just one modality.
@armandbogoss94 7 months ago
"How do you travel faster than light ?" 🙂🔫
@tvm73827 7 months ago ⁺¹
“Pamper” = Google
@zerodotreport 7 months ago ⁺¹
wow youre the man elon ❤
@JakeWitmer 7 months ago ⁺¹
20:00 He just took a long time to say "Elon isn't full of shit and properly values and prioritizes expedited decision-making."
@ShadowD2C 7 months ago ⁺²
So META should open source their models but not “Open”AI, lol
@aj-lan284 7 months ago
He is he because he is enjoying doing it....
@briancase9527 7 months ago
Oh, man what I would give for a CEO who emulates the way Karpathy describes Musk. THIS is why Musk is successful. Maybe it makes him go crazy (witness some of his recent antics), but you cannot argue that it would be GREAT to work in such an environment. Vibes, baby, vibes.
@webgpu 7 months ago
just by looking at his facial expressions while he's talking you can immediately realize he has high IQ
@billykotsos4642 3 months ago
Your definitions of AGI obviously do not include FSD, because every self-driving endeavour has hit a dead end
@rocknrollcanneverdie3247 7 months ago
Why do OpenAI founders wear white jeans? Should someone tell them?
@AmR-gu8zr 7 months ago
it will be the most unreliable and unpredictable OS, can't wait for this AI bubble to burst.
@mohadreza9419 7 months ago ⁺¹
Close AI, not open AI 😢😢😢
@alocinotasor 7 months ago
If only Andrej could talk a bit faster.
