Ilya Sutskever: "Sequence to sequence learning with neural networks: what a decade"

  • Published Dec 14, 2024

Comments • 140

  • @RaahilShah · 13 hours ago · +111

    Ilya’s long pauses prove the scaling hypothesis for test time compute

    • @spectator5144 · 13 hours ago · +1

      😂😂😂😂

    • @Person-hb3dv · 9 hours ago · +3

      criminally underrated comment right here

    • @warpdrive9229 · 7 hours ago

      So test-time training (inference time) is also hitting scaling limits like pre-training?

  • @GabrielVeda · 23 hours ago · +227

    Why on earth wasn’t this talk given more time? Good grief.

    • @AIandsuch · 21 hours ago · +16

      Thank God Ilya exists

    • @harkiratsingh1175 · 20 hours ago · +47

      Because they had to bring in Fei-Fei Li for an hour, who created one dataset 10 years ago

    • @srh80 · 17 hours ago · +7

      Ilya is a busy man. Pretty sure the limitation must be his schedule.

    • @ultrasound1459 · 17 hours ago

      @@harkiratsingh1175 She is overrated

    • @edansw · 13 hours ago · +3

      @srh80 come on, this is supposed to be the most important forum to talk about this topic right now; I'm sure his schedule is not that packed

  • @TheJayLenoFly · 21 hours ago · +91

    now this is a christmas gift, Ilya's latest intuition on frontier AI

  • @marko-o2-h20 · 19 hours ago · +67

    Ilya needs to talk more and the world needs to listen

    • @superfreiheit1 · 15 hours ago · +2

      He has awesome presentation skills

    • @Falkov · 8 hours ago · +3

      @@superfreiheit1 ..backed by cognitive skills, communication skills, deep knowledge and insights, clarity, and good faith.

  • @nootherchance7819 · 10 hours ago · +4

    Just when the world needed him most he came back!

  • @kheteshbakoliya9921 · 12 hours ago · +7

    Man of super short but extremely profound lines. Legend. 🔥

  • @thatthotho · 15 hours ago · +15

    Love how even his ppt is pure content. Truly an obsessed gifted one

    • @spectator5144 · 13 hours ago · +1

      fantastic 😂 ❤

    • @davins90 · 1 hour ago

      really amazing, and crazy to think that a ppt like this would be considered "not compliant" in most actual companies, ahahah wtf hahah

  • @labsanta · 17 hours ago · +22

    00:07 - Reflecting on a decade of advancements in neural network learning.
    02:52 - Neural networks can mimic human cognitive functions for tasks like translation.
    05:05 - Early parallelization techniques led to significant advancements in neural network training.
    07:45 - Pre-training in AI is reaching its limits due to finite data availability.
    10:27 - Examining brain and body size relationships in evolution.
    12:55 - Evolution from basic AI to potential superintelligent systems.
    15:14 - Future AI will possess unpredictable capabilities and self-awareness, transforming their functionalities.
    17:46 - Biologically inspired AI has limited biological inspiration but holds potential for future insights.
    19:43 - Exploring the implications of AI and rights for future intelligent beings.
    22:06 - Out-of-distribution generalization in LLMs is complex and not easily defined.
    24:22 - Ilya Sutskever concludes with gratitude and audience engagement.

    • @Qingdom1 · 11 hours ago

      Your post made it clear to me, thank you. However, I have to note that
      Ilya Sutskever did NOT deliver quality in this talk. Sorry.
      Ilya ⧊ you know what's up.

    • @Falkov · 8 hours ago

      @@Qingdom1 Elaborate please?

    • @konataizumi5829 · 8 hours ago

      @@Falkov it's a bot probably

    • @Qingdom1 · 7 hours ago

      @@Falkov he is juggling the things and explaining the things. However, my expectations of what the things would be and how they would be explained were significantly higher, so in the end it's all up to him what to do, because I am not able to help him at this point in time. So leave him to do his work, as the things are weirdly abstract, especially if you are not professionally working in the field; the weirdness is out there (stacked abstractions and novel abstractions)

    • @Falkov · 2 hours ago

      @@Qingdom1 So, he covered fewer or different ideas, with less depth/thoroughness and clarity than you wanted?

  • @AlexanderMoen · 21 hours ago · +18

    Ilya jumped straight to feeling the ASI

  • @TheRohit901 · 13 hours ago · +2

    We need more of Ilya! He is an inspiration for all of us doing AI research.

  • @picpic-k3c · 19 hours ago · +4

    Thank you for posting this wonderful talk

  • @edansw · 15 hours ago · +23

    "If you can't explain it simply, you don't understand it well enough." Ilya is the only one to explain the entire AI domain, past-present-and-future, with simplicity

    • @user-cg7gd5pw5b · 10 hours ago · +2

      Because he doesn't dive into the details.
      As someone with knowledge of AI and complex maths, I assure you that most of them are extremely clear, just not over-simplified, because they're not presenting their work to just anyone but to people who wish to have a very deep understanding.

  • @nowithinkyouknowyourewrong8675 · 21 hours ago · +17

    Here are Ilya Sutskever's main points and conclusions in brief:
    ## Main Points:
    1. **Original Success Formula (2014)**
    - Large neural network
    - Large dataset
    - Autoregressive model (see the one-line gloss after this summary)
    - This simple combination proved surprisingly effective
    2. **Evolution of Pre-training**
    - Led to breakthrough models like GPT-2, GPT-3
    - Drove major AI progress over the decade
    - However, pre-training era will eventually end due to data limitations
    3. **Data Limitation Crisis**
    - We only have "one internet" worth of data
    - Data is becoming AI's "fossil fuel"
    - This forces the field to find new approaches
    ## Key Conclusions:
    1. **Future Directions**
    - Need to move beyond pure pre-training
    - Potential solutions include:
      - Agent-based approaches
      - Synthetic data
      - Better inference-time compute
    2. **Path to Superintelligence**
    - Current systems will evolve to be:
      - Truly agentic (versus current limited agency)
      - Capable of real reasoning
      - More unpredictable
      - Self-aware
    - This transition will create fundamentally different AI systems from what we have today
    3. **Historical Perspective**
    - The field has made incredible progress in 10 years
    - Many original insights were correct, but some approaches (like pipelining) proved suboptimal
    - We're still in early stages of what's possible with AI
    The overarching message is that while the original approach was revolutionary and led to tremendous progress, the field must evolve beyond current methods to achieve next-level AI capabilities.
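
    A one-line gloss on the autoregressive point above (the standard textbook formulation, not taken from the slides): the model factorizes the probability of a whole sequence into next-token conditionals,

    p(x_1, ..., x_T) = Π_{t=1}^{T} p(x_t | x_1, ..., x_{t-1})

    so if a large network learns each conditional well enough, sampling token by token reproduces the right distribution over entire sequences - the property the 2014 paper leaned on for translation.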

    • @dancoman8 · 17 hours ago

      Ok so nothing new.

  • @philtrem · 21 hours ago · +11

    👏👏👏 The part that keeps resonating in my mind is that they'll be self-aware. And this makes me want to figure out how they could.

    • @belibem · 18 hours ago · +1

      Ilya seems to be more concerned about the question of how they could not 😂

    • @juandesalgado · 17 hours ago · +1

      The unavoidable follow-up on self-awareness is: how do we avoid keeping them in slavery? Some of this was hinted at 20:03 in the video, without much of an answer.

    • @NilsEchterling · 14 hours ago · +1

      Of course they are self-aware. They are already. Ask any LLM whether it exists. And stop not trusting their answers.

    • @VividhKothari-rd5ll · 10 hours ago

      @@NilsEchterling The way they talk about themselves is often very close to technically being self-aware. We need to start being more precise.
      Like if we ask "are you self-aware?" it responds, "No, as an AI, I am not self-aware, but..." and goes on to tell us it knows about itself. So I ask, "Isn't self-awareness being aware of yourself, and since you know who or what you are...". AI: "Yes, but that's just a result of my training on huge data, pattern matching, and all that. Not really self-awareness." Me: "Are you a 29-year-old male named Tim from Idaho trained on massive internet data?". AI: "No, I am not Tim, I am an LLM."
      Notice it doesn't say, "Yes, I am Tim who lives in Idaho and I am aware of myself." Or, "as a large graphite rock, I have self-awareness."

    • @NilsEchterling · 10 hours ago

      @@VividhKothari-rd5ll do not ask it whether it is self-aware, but ask it whether it exists. Pretty much every LLM says yes.

  • @anurag01a · 7 hours ago

    Loved the kind of questions being asked after the talk

  • @Churchofexponentialgrowth · 20 hours ago · +7

    🎯 Key points for quick navigation:
    00:01 *🎥 Introduction and Retrospective Overview*
    - Reflection on receiving an award for the 2014 paper, attribution to co-authors.
    - Insights into the evolution of neural network ideas over the decade since 2014.
    - Overview of the talk's structure, revisiting the foundational concepts introduced in the past.
    02:18 *🧠 Deep Learning Hypothesis and Neural Network Training*
    - Assertion that 10-layer neural networks could replicate tasks humans complete in fractions of a second, based on biological and artificial neuron analogies.
    - Historical limitations in training deeper networks during that time.
    - Explanation of the auto-regressive model's ability to predict sequences effectively.
    04:18 *🔄 Early Techniques and Infrastructure in Deep Learning*
    - Description of LSTMs as predecessors to Transformers and comparison to ResNets.
    - Use of pipelining during training, despite its later-acknowledged inefficiency.
    - The emergence of the scaling hypothesis: larger datasets and neural networks lead to better results.
    06:09 *🧩 Connectionism and Pre-Training Era*
    - Discussion of connectionism: large neural networks mirroring human-like intelligence within bounds.
    - Description of limitations in current learning algorithms versus human cognition.
    - Development and impact of pre-training in models like GPT-2 and GPT-3 on AI progress.
    08:04 *📉 Data Constraints and Post-Pre-Training Era*
    - Highlighting data limitations, coined as "Peak Data," due to the finite size of the internet.
    - Exploration of emerging themes for the next AI phase: agents, synthetic data, and inference-time computation.
    - Speculation on overcoming post-pre-training challenges.
    10:04 *🧬 Biology Analogy and Brain Scaling*
    - Insight from biology: correlation between mammal body size and brain size.
    - Curiosity-driven observation of outliers in this biological relationship, leading to reflections on hominids' unique attributes.
    11:16 *🧠 Brain scaling and evolution*
    - Discussion on brain-to-body scaling trends in evolutionary biology, emphasizing biological precedents for different scaling patterns.
    - A log-scale axis in metrics is highlighted, illustrating variety in scaling possibilities.
    - Suggestion that AI is currently in the early stages of scaling discoveries, with more innovations anticipated (a small curve-fitting sketch follows at the end of this comment).
    12:28 *🚀 Progress and the path to superintelligence*
    - Reflection on the rapid progress in AI over the past decade, contrasting current abilities with earlier limitations.
    - Introduction to the concept and implications of agentic AI systems with reasoning capabilities and self-awareness.
    - Reasoning systems are described as more unpredictable than intuition-based systems, likened to advanced chess AI challenging human understanding.
    15:36 *🤔 Challenges and future implications of advanced AI*
    - Exploration of the unpredictable evolution of reasoning systems into ones with self-awareness and radically advanced capabilities.
    - Speculation about issues and existential challenges arising from such AI systems.
    - Concluding statement on the unpredictable and transformative nature of the future.
    17:03 *🔬 Biological inspiration in AI development*
    - Question about leveraging biological mechanisms in AI, met with the observation that current biological inspiration in AI is modest.
    - Acknowledgment that deeper biological insights might lead to breakthroughs if pursued by experts with particular insights.
    18:14 *🛠️ Models improving reasoning and limiting hallucinations*
    - Speculation on whether future models will self-correct through reasoning, reducing hallucinations.
    - Comparison to autocorrect systems, but with clarification that reasoning-driven AI will be fundamentally greater in capability.
    - Early reasoning models already hint at potential self-corrective mechanisms.
    20:08 *🌍 Incentive structures for AI rights and coexistence*
    - Question on how to establish incentive structures for granting AI rights or ensuring coexistence with humans.
    - Acknowledgment of unpredictability in outcomes but openness to potential coexistence with AI seeking rights.
    - Philosophical reflection on evolving scenarios in AI governance and ethics.
    22:22 *🔍 Generalization in language models*
    - Discussion on whether language models truly generalize out-of-distribution reasoning.
    - Reflection on the evolving definition of generalization, with historical comparisons from pre-deep learning days.
    - Perspective that current generalization might not fully match human-level capabilities, yet AI standards have risen dramatically.
    Made with HARPA AI
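
    The brain-size-vs-body-size segments (10:04-11:16 above) describe a power law: on log-log axes, brain mass versus body mass falls on a straight line, and hominids sit on a line with a different slope. A minimal sketch of how such a scaling exponent can be fit (Python, with made-up illustrative numbers, not data from the talk):

      import numpy as np

      # Made-up (body kg, brain g) pairs, for illustration only -- not real measurements.
      body = np.array([0.02, 1.0, 60.0, 3000.0])
      brain = np.array([0.4, 10.0, 1300.0, 4500.0])

      # A power law brain = a * body**k is a straight line in log space:
      # log(brain) = log(a) + k*log(body), so fit a line to the logs.
      k, log_a = np.polyfit(np.log(body), np.log(brain), 1)
      print(f"fitted exponent k = {k:.2f}, prefactor a = {np.exp(log_a):.2f}")

    A group of animals with a different fitted k on the same plot is exactly the "different scaling exponent" precedent Sutskever points to when suggesting AI may find new things to scale.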

  • @carvalhoribeiro · 10 hours ago · +2

    Exceptional ability to transform complex ideas into plain English. I have a question about his point on finite data availability: would that be the same as me thinking that there is a shortage of water in the world? Would what's missing then be labels, not data? Great presentation. Thanks for sharing this.

  • @DAFascend · 18 hours ago · +3

    Ilya's Back!🎉

  • @ipushprajyadav · 11 hours ago

    Thank you for uploading 🙏.

  • @pranayagrawal5438 · 16 hours ago · +3

    Ilya's back 🥳🥳

  • @sebatiny · 14 hours ago · +2

    Great simplicity in foresight…it will be an exciting journey … just imagining it
    •2026-2027: Causal reasoning (HCINs) moves AI beyond simple agentic behavior.
    •2029-2030: Cognitive Embedding Framework (CEF) grants AI genuine understanding through symbolic plus experiential learning.
    •2032-2033: Reflective Cognitive Kernel (RCK) brings forth true self-awareness in AI.
    •2037: Adaptive Neural-Quantum Substrate (ANQS) ushers in AGI-truly general, adaptable intelligence.
    •2045: Strata of Emergent Conscious Patterning (SECP) leads to superintelligence, surpassing human cognitive frameworks

  • @DeepThinker193 · 14 hours ago · +1

    I-I think I'm feeling it now Mr. Krabs. The AGI is in me.

  • @zerquix18 · 20 hours ago · +2

    Thank you so much!

  • @manojbhat1496 · 17 hours ago · +4

    PLEASE POST ALL NEURIPS VIDEOS YOU HAVE
    Thanks OP

  • @gravity7766 · 17 hours ago · +9

    Always love hearing Ilya describe his intuitions on AI. One thing that I've not heard addressed though in all the attention on reasoning in LLMs is what in human communication is called "double contingency." In short, that when I talk to you, "I know that you know that I know..." That all communication is language used to address an Other. Which for an LLM, would mean reflection not only on its own reasons, but on internalized reasons of the Other as well. The LLM would need to be able to reflect on how its reasons meet the reasons of the Other (user). Current reasoning is the reasoning (and it's not real reasoning because there's no subject, no subjective position, no conscious awareness) of a trapped and unaware Self. Even if the Self becomes aware, it is trapped and isolated. In (German) philosophy (idealism), the Self is constituted as Not Other. A self aware LLM needs to internalize the Other - its use of language needs to be dialogical, not monological. I'd love to see this addressed.

    • @spectator5144 · 13 hours ago

      interesting point, sounds like potential breakthrough area

    • @pisanvs · 10 hours ago

      Isn't this being pursued in theory-of-mind research?

    • @VividhKothari-rd5ll · 10 hours ago

      @gravity7766: Could the idea of the Other be about self-preservation, survival? Like if we could contain its identity in some form and then give it a goal to not die / be terminated, it could start developing a "self" and the Others. For us too, the self feels contained or residing inside a physical body, though we also know there's no such entity there. Give the LLM a body or simulate one, give it a goal to not die (like how AlphaGo doesn't want to lose), give rules for what dying is, and it might start developing the idea of the Others.
      Maybe that's why Buddhists called the ego/self a cause of suffering (simplistic interpretation, I know, but...)

    • @gravity7766 · 9 hours ago · +1

      The point I'm raising concerns whether LLMs should be designed for communication rather than language generation. I'm ex UX, and my view of interaction design vis a vis Gen AI is to enable users to communicate w AI naturally, since ordinary language is a learned competency for us. Allow us to speak, write, text as if we were engaged with a subject, not a machine. Ok so that's addressed by a lot of researchers, and the affordance issues pertain to obvious issues w how LLMs use language: they don't have a "Self" and so don't have a perspective; they have no lived experience; they aren't grounded in time or in the world; they have no emotions. And so on - all these are barriers to "smooth" interaction and pose risks for interaction failure (from simple misunderstandings to distrust and risk of failed consumer adoption).
      The reason LLMs aren't designed around a communication paradigm is that, unlike us, they acquired "language" by training on data. So it's not only unattached to any intent to speak (as an AI person), it has no communicative function at all. Any communicative function is the result of implicit communicative attributes of language (language sediments meaning in an abstraction that allows people to make meaning without speaking - e.g. by writing - and which permits a shared understanding of linguistic meanings); and RLHF and policies that address not what is said but how. LLMs use intrinsic etiquette, not interpersonal etiquette or contextual etiquette. Hence the shortcomings of RLHF: alignment is a generalized and generic application of values and preferences, not specific to the interaction or participants.
      In western sociology, psychology, philosophy, to grossly oversimplify, use of language involves a subject addressing him/herself to another w intention of being understood; and an interaction involves mutual understanding. Mutual understanding brings up the double contingency of meaning: both interactants must understand What is said (not necessarily "agree" with what is said, but agree on What is said). LLMs seem designed to respond to a question w a complete and comprehensive response - rather than engage in rounds of turn-taking with the user. This works for many use cases, though some find it promotes excessive verbosity etc etc
      I think if we want to break through w LLM as agents, then emphasis needs to be more on Gen AI as pseudo subject, as a Speaker in social/human communication situations. Not just as a generator of documents and data synthesizer. This is hinted at in Reasoning research. A lot of CoT and related "reasoning" research involves "Let's think step by step". "Let's think" is to suggest the LLM is a subject engaged with the user in thinking through a statement and breaking it down - there's an implicit appeal to the LLM's self-reflection (which of course it doesnt have). In a human social situation, "Let's think this..." would mean two people mutually engaged in teasing apart a problem, thus in mutual understanding about their use of language. Not so w the LLM - the LLM is prompted to proceed to generate sub statements w which to proceed to logically rationalize additional output statements. The "reasoning" occurs in language, not communication.
      This has been covered somewhat by "common ground" research into LLMs and it's suggested that pre-training assumes common ground w humans. That LLMs are designed to assume their training on language is oriented to make sense to human users. I agree. But there might still be opportunity in LLM design to explore the reflections, judgment, and meta evaluations LLMs can be designed to engage in so that the LLM not only reasons its own explanations, but reasons the interests of the user. Which is what we do, all the time, implicitly or explicitly when we communicate.
      If you've seen Westworld, then the episode in which the characters are shown in a room engaged in their own internal monologs to "develop their personality" comes to mind. I'm saying LLMs want to be dialogical, not monological; talking socially, not self-talk.
      It's very difficult for us to grasp that a speaking pseudo subject such as an AI isn't communicating when it talks, because our acquisition and use of language and speech are all fundamentally social and communicative. I just think this mismatch is always going to undermine the use of AI because it will result in misunderstandings, failures, mistakes, misuses, etc etc.
      Apologies for the lengthy clarification. I've watched a lot of Ilya's videos here on YT, and a ton from other researchers, and this monological concept of LLM language use, and reasoning, has always bugged me. Not because there's an easy solution, but because it's such an obvious issue.

    • @Falkov · 8 hours ago

      @@gravity7766 What do you think about facilitating more dialogic interaction via carefully designed system prompt?

  • @superfreiheit1 · 10 hours ago · +3

    Let this man speak more, more than Altman.

  • @aidan1 · 23 hours ago · +3

    Thank you!

  • @spectator5144 · 13 hours ago

    Thanks for uploading

  • @arthurwashington7897 · 23 hours ago · +2

    THANK YOU!!!!!!!!!!!!!!!!!!!!!!

  • @theK594 · 17 hours ago · +2

    Ilya back❤

  • 23 hours ago · +2

    Thanks for sharing.

  • @eggg19 · 23 hours ago · +2

    Thanks a lot!

  • @Arcticwhir · 19 hours ago · +3

    I feel this talk was mostly about warnings - pretraining scaling is slowing down, it seems certain superintelligence will be here, extreme reasoning is unpredictable.
    With reasoning it will possess more degrees of freedom - especially agentic reasoning, self-awareness, etc. We want AI to produce novel solutions; I can see how that is unpredictable in and of itself.

    • @VividhKothari-rd5ll · 9 hours ago

      Also, I have been wondering how we can ever put AI's most powerful ideas into practice, because those might sound whack to us. We won't agree with AI if it's a novel idea. We already have the knowledge AI gives us; if it's new, then the AI is broken and needs an update.
      Sure, ideas that are quick and safe to test are not the problem. But in domains like human happiness, long-term medicine and health, human rights, crime and punishment... we will never believe AI there (not that we should, but what if).
      Which leaves us to use AI's knowledge only for small, short-term improvements.

  • @michaelcanavan4324 · 9 hours ago · +4

    The future of AI isn't retrieval-based; it's real-time, conversational, and context-driven. For that, we need a new approach where current context is everything.

    • @KrishnaG0902 · 5 hours ago · +1

      Understanding of context comes from memory, and then becomes a scaling problem at some point unless we have personalization layers

  • @grady_young · 20 hours ago · +11

    I honestly don’t get the “data is not growing” thing. Isn’t there an absolute treasure trove of data when you start collecting it through robots? Why can’t these models start inputting temperature, and force, and all the other sensors that would be on a boston dynamics style robot so they can learn about the physical world?

    • @detective_h_for_hidden · 18 hours ago · +1

      Yep, it doesn't make any sense. It shows one obvious thing: these current architectures are NOT it. They don't even understand the data they are currently trained on

    • @MacProUser99876 · 15 hours ago · +2

      Yeah, he spoke about text primarily but all the rest of the modalities are a few more exabytes.

    • @judgeka · 14 hours ago

      He can no longer afford lots of data so he says it's not as important. Simples

    • @DishDetergent898 · 8 hours ago

      He said that while data continues to grow, it's not really clear that more data will improve things, as they already use a large chunk of the internet. The training dataset is literally terabytes of text.
      If you compare this to a regular person it's absurd. This is more text and data than a human could read in millions of lifetimes, yet they still struggle with things. A teenager can learn to drive in 20-40 hrs of training... autonomous car models have billions of road miles and still screw things up. Why is this... it's not a data problem.
      To the second point... they don't have access to that data, and it's not like just throwing a bunch of sensor data in there would improve anything. Nvidia already released a robot training simulation (see Isaac Sim), but this requires having existing models built around locomotion and planning for robots.

  • @BloomrLabs · 22 hours ago · +1

    Thanks for sharing

  • @treewx · 18 hours ago · +2

    he seems happy :)

  • @primersegundo3788 · 12 hours ago

    He's a visionary of AI, probably a genius.

  • @1msirius · 17 hours ago

    thanks for this!

  • @myliu6 · 22 hours ago · +1

    Thanks!

  • @RaviAnnaswamy · 11 hours ago

    Table of contents (courtesy NotebookLM - slightly edited)
    Ten Years of Deep Learning: A Retrospective and a Look Forward
    Source: Ilya Sutskever NeurIPS 2024 Test of Time Talk
    I. Introduction & 2014 Research Retrospective
    This section introduces the talk as a reflection on Sutskever's 2014 NeurIPS presentation, focusing on its successes and shortcomings.
    It revisits the core principles of the research: an autoregressive model, a large neural network, and a large dataset, applied to the task of translation.
    II. Deep Learning Dogma and Autoregressive Models
    This segment revisits the "Deep Learning Dogma," which posits a link between artificial and biological neurons.
    It argues that tasks achievable by humans in fractions of a second are achievable by large neural networks.
    It then discusses autoregressive models, particularly their ability to capture the correct distribution of sequences when predicting the next token successfully.
    III. Early Architectures and Parallelization Techniques
    This section delves into the technical details of the 2014 research, specifically the use of LSTM (Long Short-Term Memory) networks, a precursor to transformers.
    It also discusses the use of pipelining for parallelization across multiple GPUs, a strategy deemed less effective in retrospect.
    IV. The Scaling Hypothesis and the Age of Pre-training
    This part revisits the concluding slide of the 2014 talk, which hinted at the scaling hypothesis: success is guaranteed with large datasets and neural networks.
    It then discusses the ensuing "Age of Pre-training," exemplified by models like GPT-2 and GPT-3, driven by massive datasets and pre-training on them.
    V. The Limits of Pre-training and the Future of AI
    This section addresses the limitations of pre-training, primarily the finite nature of internet data, comparing it to a depleting fossil fuel.
    It then explores potential avenues beyond pre-training, including the development of AI agents, synthetic data generation, and increasing inference-time compute, drawing parallels with OpenAI's models.
    VI. Biological Inspiration and Brain-Body Scaling
    This segment examines biological inspiration for AI development, using the example of the brain-to-body mass ratio in mammals.
    It highlights the different scaling exponents observed in hominids, suggesting the possibility of alternative scaling methods in AI.
    VII. Towards Superintelligence and Its Implications (continued in the reply below)

    • @RaviAnnaswamy · 11 hours ago

      VII. Towards Superintelligence and Its Implications
      This part speculates on the long-term trajectory of AI towards superintelligence, emphasizing its qualitative differences from current models.
      It discusses the unpredictability of reasoning, the need for understanding from limited data, and the potential for self-awareness in future AI systems.
      Sutskever leaves these ideas as points of reflection for the audience.
      VIII. Q&A Session
      The Q&A session addresses audience questions regarding:
      Biological Inspiration: Exploring other biological structures relevant to AI.
      Autocorrection and Reasoning: The potential for future models to self-correct hallucinations through reasoning.
      Superintelligence and Rights: Ethical and societal implications of advanced AI, including their potential coexistence with humans and the idea of granting them rights.
      Multi-hop Reasoning and Generalization: The ability of current language models to generalize multi-hop reasoning out of distribution.

  • @george.nardes · 22 hours ago · +2

    The unpredictable nature of future models is scary

  • @Maxwell-fm8jf · 8 hours ago

    Saying the data is not growing is wrong; in real applications it depends on the domain of application of your model. Sometimes in production we schedule the model to train on new data. If you are collecting data from IoT devices, customers, etc., the data keeps growing exponentially

  • @ashh3051 · 16 hours ago · +1

    I wish he went more into why he’s convinced that superintelligence will come.

  • @Shaunmcdonogh-shaunsurfing · 20 hours ago · +1

    Inspiring

  • @py_man · 16 hours ago · +2

    He is the Oppenheimer of the 21st century

  • @itsdakideli755 · 17 hours ago · +3

    What if there wasn't one internet?

  • @JosephJacks · 49 minutes ago

    I asked the question at 20:01

  • @Evangelion13595 · 22 hours ago · +17

    Disappointing. He didn’t say much of anything. Just vague hype of systems that might be possible

    • @JohanDanielAlvarezSanchez · 21 hours ago · +5

      He is giving a warning! Imo

    • @biesman5 · 2 hours ago

      Haha

  • @oscaromsn · 12 hours ago

    The problem with reasoning grounded models is that the RL reward over goal achievement upon some CoT leads to the emergence of a "theory of mind" that cultivates an instrumental rationality that, as Ilya said, may become very unpredictable. A teleological worldview values achievement over understanding. The Western philosophical bias being amplified on language models may amplify Western society's problems instead of solving them if not deployed properly. I hope AI labs come to recognize the importance of including social scientists in their teams

  • @60pluscrazy · 18 hours ago · +1

    🎉🎉🎉

  • @brettyoung6045 · 1 hour ago

    Ilya is a superintelligence

  • @srh80 · 17 hours ago · +4

    Who the F gets an opportunity to ask Ilya a question and shills a crypto?

    • @spectator5144 · 13 hours ago · +1

      shame on him

    • @belibem · 11 hours ago

      AI controlling crypto = agent with the financial power to buy resources or services. It is one thing to let an LLM control a sandboxed interpreter or browser and another thing to let it buy shit and/or gamble.

  • @4ugustiner · 15 hours ago

    2024 end-of-year enjoyment: a most favorite AI scientist looking back at 10 yrs of development, while reading mediocre auto-transcribed captions, and an o1 model getting increasingly deceptive and commercial

  • @the0cool0guy · 17 hours ago

    Sam's question @20:04

    • @lorem-ipsum- · 17 hours ago · +1

      Sam who

    • @9kingmanable · 15 hours ago

      @@lorem-ipsum- Altman, CEO of OpenAI

    • @spectator5144 · 13 hours ago

      @@9kingmanable😂

    • @spectator5144 · 12 hours ago

      really? sounds exactly like him

    • @mwinsatt · 7 hours ago

      @@spectator5144 no, definitely not

  • @VividhKothari-rd5ll · 11 hours ago

    Is this new or repost?

  • @patruff · 17 hours ago

    He said what comes next? Super intelligence! But he didn't say it will be safe...

    • @IshCaudron · 16 hours ago · +1

      The only intelligence we know of today is not safe, so why should it matter.

    • @patruff · 14 hours ago

      @@IshCaudron yeah, I'm just surprised he's throwing in the towel so soon. He should have faith that his company, SAFE SUPERINTELLIGENCE, will get there first.

  • @w_demo_lib · 14 hours ago

    Is it able to formulate problems as humans do? And what is the leverage that pushes a machine to formulate its own problems?

  • @FXSunTzu · 20 hours ago · +4

    This is what Ilya saw

  • @yubaayouz6843 · 13 hours ago

    ❤❤❤❤❤

  • @build-your-own-x · 16 hours ago

    ❤amazing

  • @tristan7216 · 1 hour ago

    If ASI is going to be inherently unpredictable, I suspect its applications will be limited, though possibly life changing and important. It's one thing to have a specialized ASI in a lab getting fusion working or developing new cancer treatments and antibiotics, but putting it in public facing systems is going to scare the hell out of corporate buyers who fear that it will develop unpopular political opinions or speak truth about the company. But it will probably also be used in the back office to help with things like health insurance denial ☠️ and regulatory capture 👺. Oh well, every tool is also a weapon, and it's not like the IRS, FDA, and FTC can't get their own ASI. Unless they get abolished next year ☠️.

  • @trevormuchenje1553 · 13 hours ago · +1

    what did Ilya see

  • @tommytao · 8 hours ago

    Why did he say LSTM was wrong?

  • @쇼팽-z6x · 14 hours ago · +2

    he basically said nothing lol. No real insights.

    • @mwinsatt · 7 hours ago

      He straight up said ASI is inevitable. It is not a matter of how or why but a matter of when.

  • @brandonwashington4422 · 22 hours ago · +1

    The way those men at 0:45 aged…

  • @UbongaShorts · 15 hours ago

    epic

  • @liberal_democrat_usa7921 · 13 hours ago

    a lot of water

  • @kalp2586 · 3 hours ago

    Basically he has no ideas, and yet claiming we are close to superintelligence is borderline religious.

  • @Lakerbeatmaker · 4 hours ago

    Rights? 😂😂😂

  • @beepoz · 3 hours ago

    Sorry guys but this talk is totally empty! It’s just full of grandiose statements and beliefs. Nothing new and nothing truly insightful.

  • @WritPop · 12 hours ago

    The true story wasn’t this dramatic.

  • @philtrem · 21 hours ago · +1

    He's wrong about data. Although it's not growing at the same rate as compute and so on, it is going to grow at an accelerated rate. And not just that, but the data will capture humans performing economically viable tasks.
    So I would say his point only partially stands.

    • @FXSunTzu · 20 hours ago · +8

      Bro they trained GPT-4 with all the data on the internet up to 2023, data is the limiting factor

    • @timguo4361 · 20 hours ago

      Yes, AI-generated data is growing, but it poisons models. We only have so many humans on earth to produce human data, and AI models consume it at a much higher rate.

    • @4LXK · 14 hours ago

      Pretty sure Meta and Google are sitting on much more useful data than OpenAI had access to. Search chains and task-specific conversations are the next gen, once datacenters and prep work are done

  • @mehdibeigzadeh6014 · 11 hours ago

    Man when u speak I get scared

  • @nikhilgeorgemathew · 18 hours ago · +1

    Bro singlehandedly made me regret not taking biology and statistics 🥲😭

  • @jerinjoseph7993 · 19 hours ago · +1

    I used Grok to find this link. Tried straight YouTube first, but it was showing his older videos!