Why Computer Vision Is a Hard Problem for AI

  • Published on 12 May 2024
  • Computer scientist Alexei Efros suffers from poor eyesight, but this has hardly been a professional setback. It's helped him understand how computers can learn to see. At the Berkeley Artificial Intelligence Research Lab (BAIR), Efros combines massive online data sets with machine learning algorithms to understand, model and re-create the visual world. His work is used in iPhones, Adobe Photoshop, self-driving car technology, and robotics. In 2016, the Association for Computing Machinery awarded him its Prize in Computing for his work creating realistic synthetic images, calling him an “image alchemist.” In this video, Efros talks about the challenges and changing paradigms of computer vision for AI.
    00:00 Why vision is a hard problem
    1:18 History of computer vision
    2:01 Alexei's scientific superpower
    3:14 The role of large-scale data
    3:37 Computer vision in the Berkeley Artificial Intelligence Lab
    4:15 The drawbacks of supervised learning
    4:57 Self-supervised learning
    5:33 Test-time training
    7:08 The future of computer vision
    Read the companion article at Quanta Magazine: www.quantamagazine.org/the-co...
    - VISIT our Website: www.quantamagazine.org
    - LIKE us on Facebook: / quantanews
    - FOLLOW us on Twitter: / quantamagazine
    Quanta Magazine is an editorially independent publication supported by the Simons Foundation: www.simonsfoundation.org/
  • Science & Technology

Comments • 79

  • @weinhardtadam1159
    @weinhardtadam1159 6 months ago +314

    I love that, with 120,000 citations, he regards his grad students and the next generation of scientists as his biggest achievement.

  • @Alex-rh5jo
    @Alex-rh5jo 6 months ago +178

    It's great that there are professors out there that value their students as their greatest achievement!

    • @ev.c6
      @ev.c6 6 months ago +2

      I have no idea where you are from, but I have studied on two continents, at 3 different universities, and this was my experience in all of them. Academia is just an amazing world.

    • @blueAndblack-ec6jk
      @blueAndblack-ec6jk 6 months ago +2

      @@ev.c6 Then you are lucky that you got this kind of experience, because mine wasn't. 😅

    • @whannabi
      @whannabi 6 months ago +1

      @@ev.c6 Until some people try to get popular by changing the data and embellishing things. Bad apples, yes, but they look the most appetizing until you bite into one.

    • @leif1075
      @leif1075 6 months ago

      How do they work so hard for so long and not get bored and tired and frustrated?

    • @leif1075
      @leif1075 6 months ago

      @@blueAndblack-ec6jk Is working 8 hours a day enough as a grad student, so it doesn't have to fucking wear you out or take over your life?

  • @joaoguerreiro9403
    @joaoguerreiro9403 6 months ago +57

    As a computer scientist working on computer vision tasks (and other AI applications) for medical image processing, this video made me smile :)

    • @smirnovslava
      @smirnovslava 6 months ago

      In a good way?

    • @azyrael96
      @azyrael96 6 months ago +3

      Made me smile in the same way. One of the first things my professor told me at the beginning of the PhD was that his goal is to make me a better scientist than him. Really nice moment to see this guy so passionate about it as well.

    • @nutmeg0144
      @nutmeg0144 6 months ago +4

      As some random guy sick of seeing these subtle humblebrag comments, your comment made me cringe.

    • @joaoguerreiro9403
      @joaoguerreiro9403 6 months ago +2

      Next time I’ll be more modest @nutmeg0144 :)

    • @rijulranjan8514
      @rijulranjan8514 2 months ago

      All they said was that they work in the field and enjoyed seeing the video? The only thing cringe was your response @@nutmeg0144

  • @QuantaScienceChannel
    @QuantaScienceChannel 6 months ago +11

    Read more about Alexei Efros's research in a written interview by Susan D'Agostino on the Quanta website: www.quantamagazine.org/the-computing-pioneer-helping-ai-see-20231024/
    Quanta is conducting a series of surveys to better serve our audience. Take our video audience survey and you will be entered to win free Quanta merchandise: quantamag.typeform.com/video

    • @primenumberbuster404
      @primenumberbuster404 6 months ago +1

      I am waiting for a video on the progress of Quantum Optics. 😃 I am hoping to pursue research in this field and it has some of the greatest ideas of all of experimental physics.

  • @JZFeser
    @JZFeser 6 months ago

    Wonderful video! I love everything this channel has made!

  • @xXMaDGaMeR
    @xXMaDGaMeR 6 months ago +12

    my favorite topic in CS

  • @brianfunt2619
    @brianfunt2619 3 months ago +4

    I love how at 8:08 one of the students' phone falls out of their pocket and everyone turns and looks at it

  • @werwardas1
    @werwardas1 6 months ago +8

    Thank you for the insights and this very well produced video!

  • @MichaelFergusonVideos
    @MichaelFergusonVideos 6 months ago +2

    Wonderful! Looking forward to the future!

  • @greatviktor4017
    @greatviktor4017 6 months ago +2

    Love this channel

  • @liangcherry
    @liangcherry 28 days ago

    Thank you for the explanation!

  • @terryliu3635
    @terryliu3635 several months ago

    Love the short video!❤

  • @BenMitro
    @BenMitro 6 months ago +31

    All very interesting. I wonder if we are limiting computer vision by only considering human vision. Every other organism has vision selected to make that organism successful, and it's not like ours. I wonder if there is something we can learn from this diversity of purpose for visual systems in all organisms. Alexei Efros has touched on this diversity of purpose with his own experience of vision.

    • @dexterrity
      @dexterrity 6 months ago +2

      Yeah, well, computer vision in ranges of the electromagnetic spectrum outside of visible light exists. That is more relevant to hardware: how the sensor is detecting light, what range of frequencies, etc. Once it becomes image data of whatever kind, the convolutional neural networks do their thing and don't really care about how "humans" see things.
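
To make the point above concrete, here is a minimal sketch of the core operation inside a convolutional layer, in plain Python. Everything in it is assumed for illustration: the `convolve2d` helper, the `edge_kernel` filter, and the two made-up 4x4 readings standing in for a visible-light frame and a thermal frame. Real networks learn their filters and use optimized libraries, so treat this only as a picture of why the arithmetic is indifferent to where the numbers came from.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation (the operation deep-learning
    libraries usually call 'convolution') on lists of lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical hand-written filter that responds to left-to-right
# increases in intensity (a learned filter would replace this).
edge_kernel = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

# Made-up sensor readings with the same dark-to-bright step.
visible = [[0, 0, 9, 9]] * 4   # stand-in for a visible-light frame
thermal = [[2, 2, 7, 7]] * 4   # stand-in for a thermal frame

visible_edges = convolve2d(visible, edge_kernel)
thermal_edges = convolve2d(thermal, edge_kernel)
print(visible_edges)
print(thermal_edges)
```

The same function and the same filter run on both arrays; only the magnitude of the response differs, because the step in the thermal stand-in is smaller.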

    • @BenMitro
      @BenMitro 6 months ago +1

      @@dexterrity There's also sonar for bats and other creatures, but I was thinking more about the cognitive processes, although yes, the hardware is certainly required.

    • @BenMitro
      @BenMitro 6 months ago +1

      @@TzaraDuchamp Efros made a point of his personal experience with low vision, which helped him move forward. I was just proposing that perhaps we could move forward by considering a broader spectrum of experience by tapping into animal vision. It's not about how computers currently perform computer vision algorithms, it's about learning how we could uncover insights that allow us to enhance or redesign computer vision.

    • @Siroitin
      @Siroitin 6 months ago

      First problem is that humans are creating AI. We are going to be AI's limit

    • @BenMitro
      @BenMitro 6 months ago

      @@TzaraDuchamp You misunderstood me - I was wondering if we could get more insight from a broader view. I didn't cast any aspersions on Efros - in fact I admire the man. Maybe reading too much between the lines?

  • @brain_respect_and_freedom
    @brain_respect_and_freedom 6 months ago

    Thank you👍

  • @alirezaahmadi5018
    @alirezaahmadi5018 6 months ago

    So amazing. 😍😍🤩🤩 Good luck.

  • @harishhanchinal2838
    @harishhanchinal2838 6 months ago +1

    Nice informative video.

  • @a4ldev933
    @a4ldev933 4 months ago +1

    Man.. I wish you were my CS professor. 👍

  • @Fine_Mouche
    @Fine_Mouche 6 months ago

    What about using analogue computing in the future for AI?

  • @andrewsun4385
    @andrewsun4385 6 months ago +1

    Cool!!!❤❤

  • @presence5834
    @presence5834 27 days ago +1

    I had an idea when I was working on my thesis: if we had a transformer for vision and a new embedding system that treats visual data the way humans do, we could have a model that understands images of the universe that are beyond the computing ability of human brains, such as the cosmic microwave background. But it's only an idea 😢

  • @tim40gabby25
    @tim40gabby25 6 months ago

    Interesting to see the distribution of ethnicities along that outside shot bench.. humans are drawn to those with whom they assume they might have common ground. Just an observation. Might be wrong.

  • @lilhaxxor
    @lilhaxxor 5 months ago +4

    This is a very good interview. I am glad to see that it validates my intuition that models should learn continuously instead of being frozen and then retrained from scratch.
    One of the biggest difficulties in improving current techniques is reducing model size. I don't know how much data a real brain can store, but given the miniaturization of current chips, I suspect we are wasting resources.
    Anecdote: I have bad eyesight as well. 😂

  • @kylebowles9820
    @kylebowles9820 6 months ago

    Computer vision is so fun!

  • @severusgomez4979
    @severusgomez4979 5 months ago

    Thumbnail lookin’ like a front foot catch 3 flip

  • @OBGynKenobi
    @OBGynKenobi 6 months ago

    What about computer audition?

  • @_soundwave_
    @_soundwave_ 3 months ago

    5:28 he is so deep inside, he calls us 'agents'

  • @autonomous_collective
    @autonomous_collective 6 months ago +16

    Computer scientist Alexei Efros suffers from poor eyesight, but this has hardly been a professional setback. It's helped him understand how computers can learn to see.
    At the Berkeley Artificial Intelligence Research Lab, Efros combines massive online data sets with machine learning algorithms to understand, model and re-create the visual world. His work is used in iPhones, Adobe Photoshop, self-driving car technology, and robotics. In 2016, the Association for Computing Machinery awarded him its Prize in Computing for his work creating realistic synthetic images, calling him an “image alchemist.”
    In this video, Efros talks about the challenges and changing paradigms of computer vision.

  • @AyushSharma80001
    @AyushSharma80001 several months ago +1

    I also have myopia

  • @1.4142
    @1.4142 6 months ago +21

    AI generated timestamps
    0:00: 👁 Computer vision is a complex process that is difficult for computers to replicate, but advancements are being made.
    2:56: 🌳 Visual data and its importance in machine learning and computer vision.
    5:58: 🔑 Computers struggle to generalize in their machine learning algorithms, but test time training can help improve their performance.

    • @mihailmilev9909
      @mihailmilev9909 6 months ago

      wow

    • @mihailmilev9909
      @mihailmilev9909 6 months ago

      Wow

    • @mihailmilev9909
      @mihailmilev9909 6 months ago

      Were the emojis from the AI too?

    • @1.4142
      @1.4142 6 months ago

      yup @@mihailmilev9909

  • @k-c
    @k-c several months ago

    Waiting for the day when computer vision beats the skills of georainbolt

  • @ElParacletoPodcast
    @ElParacletoPodcast 4 months ago +1

    Computers cannot see, and will never see, they only process information, but will never see.

  • @bharatjoshi9889
    @bharatjoshi9889 5 months ago +1

    So AI is just data with some selective results from that data... is it?

  • @strangevideos3048
    @strangevideos3048 3 months ago

    The problem is that even if you watch a real video of nature on a screen, it is not real to your eyes: it is a two-dimensional image plus the unrealistic colors of the screen, i.e. resolution.

  • @kengounited
    @kengounited 6 months ago +1

    Computer vision is hard because it's right at the mercy of the so-called curse of dimensionality.
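
One rough way to get a feel for that remark is to count how many distinct images exist at a given size. The sketch below is only an illustration under stated assumptions: grayscale images with 256 intensity levels per pixel, and two hypothetical helper functions, `count_possible_images` and `decimal_digits`. It shows how fast the input space grows with pixel count; it is not a formal statement of the curse of dimensionality.

```python
def count_possible_images(width, height, levels=256):
    """Number of distinct grayscale images with `levels` intensities per pixel."""
    return levels ** (width * height)

def decimal_digits(n):
    """Number of decimal digits in a positive integer."""
    return len(str(n))

# Each pixel is one dimension, so the count grows exponentially with image size.
for side in (2, 8, 32):
    total = count_possible_images(side, side)
    print(f"{side}x{side} grayscale: a count with {decimal_digits(total)} decimal digits")
```

Even a 32x32 thumbnail has far more possible pixel configurations than any dataset could sample, which is part of why learning methods have to rely on the structure that real images share.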

  • @abursuk
    @abursuk 6 months ago +1

    thx for supporting Ukraine

  • @strangevideos3048
    @strangevideos3048 3 months ago

    Two minute paper 😊

  • @PythonAndy
    @PythonAndy 6 months ago +2

    I was early.

  • @enesmahmutkulak
    @enesmahmutkulak 6 months ago +3

    cool and first comment

  • @djp1234
    @djp1234 6 months ago +6

    3:35 Slava Ukraini

  • @ValidatingUsername
    @ValidatingUsername 6 months ago

    Just convert a 2d plane to 3d calculations 😂

    • @YacineBenjedidia-wm6pw
      @YacineBenjedidia-wm6pw 6 months ago

      That's how our brain works: converting 3D into 2D, then analysing the image

  • @dronefootage2778
    @dronefootage2778 6 months ago

    you didn't explain how AI learns to see, like at all, i'm gonna have to give a thumbs down

    • @-p2349
      @-p2349 6 months ago

      Panoptic segmentation is too complicated for an eight-minute video

  • @JuliusUnique
    @JuliusUnique 6 months ago +2

    We have literally had cameras for a few centuries now. Making AI learn to "see" is just that: a camera attached to an AI that processes it. We already feed AI with pics and make it learn visually.

    • @jsmunroe
      @jsmunroe 6 months ago +1

      There are multiple levels of vision: everything from pattern matching to recognizing symbols to identifying and interacting with objects. We see mostly with our brains, for instance.

    • @JuliusUnique
      @JuliusUnique 6 months ago

      @@jsmunroe I thought it's just having a lot of digital neurons and then letting them figure out the concept of patterns themselves

    • @Earth-To-Zan
      @Earth-To-Zan 6 months ago +1

      @@JuliusUnique Well, usually you train a model on a dataset of images or videos.
      Then, once it is trained, you can test its capabilities by feeding it an input image/video that wasn't in the training data.
      Now, this is just a very simplified explanation, and it's more complex than that.
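
The train-then-test loop described above can be sketched in a few lines of plain Python. This is a toy under loudly stated assumptions, not how a real vision model works: the "images" are made-up lists of 16 pixel intensities, the two classes are simply "dark" and "bright", and the names `make_image`, `make_dataset`, `train`, `predict` and `accuracy` are all hypothetical. The only point is the separation between data used for training and fresh data used for evaluation.

```python
import random

def make_image(label, rng, size=16):
    """Toy 'image': a flat list of pixel intensities in [0, 1].
    Class 0 is dark on average, class 1 is bright on average."""
    base = 0.25 if label == 0 else 0.75
    return [min(1.0, max(0.0, rng.gauss(base, 0.1))) for _ in range(size)]

def make_dataset(n, rng):
    """Build n (pixels, label) pairs with alternating labels."""
    return [(make_image(i % 2, rng), i % 2) for i in range(n)]

def train(dataset):
    """'Training' here just means computing the mean brightness per class."""
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for pixels, label in dataset:
        sums[label] += sum(pixels) / len(pixels)
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(model, pixels):
    """Pick the class whose mean brightness is closest to this image's."""
    brightness = sum(pixels) / len(pixels)
    return min(model, key=lambda label: abs(model[label] - brightness))

def accuracy(model, dataset):
    correct = sum(predict(model, pixels) == label for pixels, label in dataset)
    return correct / len(dataset)

rng = random.Random(0)
train_set = make_dataset(100, rng)
test_set = make_dataset(40, rng)   # fresh samples never seen during training

model = train(train_set)
test_accuracy = accuracy(model, test_set)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Because the two toy classes are so well separated, the held-out score is high; on real images the gap between training and held-out performance is exactly what this kind of test is meant to expose.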

  • @sillystuff6247
    @sillystuff6247 6 months ago +12

    stop the insipid background music

    • @jasperhilliard6289
      @jasperhilliard6289 6 months ago +17

      i don't think it is insipid at all

  • @fionagrutza9291
    @fionagrutza9291 6 months ago +1

    Still not "AI", and this exploitation of the term is exhausting. He even admits it's about data comprehension, i.e. algorithmic formulations (tiered), and not unprovoked generation, which is and was the metric for the term. We have lost the boundaries of what things are so as to cater to branding for $$$

    • @ItIsJan
      @ItIsJan 6 months ago +1

      yes hype and money!!!

    • @khalilsabri7978
      @khalilsabri7978 6 months ago +3

      It's exactly AI, what are you talking about? Maybe very old computer vision wasn't, but recent research in the domain is all AI. If anything, computer vision was the field impacted most by AI, especially in the early days of deep learning.

    • @fionagrutza9291
      @fionagrutza9291 6 months ago +1

      @@khalilsabri7978 You could then label any and every computational process as "AI" based on the metrics you and they are suggesting so wildly. What were once labeled "bots" with keyword-association generative replies are now "AI" because everything has been rebranded to serve a new narrative for profit. AI used to have a requisite to meet in order to be classified as AI; we had science-fiction-esque tests as thresholds, and if you can claim any of these things just abundantly appearing all of a sudden today meet those standards, then you are a mindless consumer. Image generation from keywords is not AI, it is algorithmic compiling. ChatGPT is just search aggregation with a fancy front end. None of these things generate information independent of the user-defined rules or software-defined boundaries, which is why it is so easy to censor information immediately. As for research, literally nothing has changed: data is compiled, an algorithm is authored to seek a model. Where is the AI?

    • @Saturnine37
      @Saturnine37 6 months ago

      Unprovoked generation is and was the metric for the term in which field? Computer science, or science fiction and general aspiration?
      Thinking of early intelligence in single-celled life, a part of it must have been in reacting to light when moving around in the water: seeing energy, food, and the environment. Is that not intelligence enough for something not alive yet, to be able to autonomously sense and react to the world?
      Artificial intelligence for me should connect all modes of sensing and making inferences into a single place. Then, computer vision is exactly AI in the same sense as computer generation "unprovoked" or not.

  • @vitalyl1327
    @vitalyl1327 several months ago

    Vision is a hard problem for humans and animals too. We need a lot of frames and points of view to figure things out, and still make a lot of mistakes.