Liquid Neural Networks

  • Published on Dec 1, 2024

Comments • 187

  • @adityamwagh
    @adityamwagh 1 year ago +9

    It just amazes me how the final few layers are so crucial to the objective of the neural network!

  • @FilippoMazza
    @FilippoMazza 3 years ago +43

    Fantastic work. The relative simplicity of the model proves that this methodology is truly a step towards artificial brains. The expressivity, better causality, and the many neuron-inspired improvements are inspiring.

    • @NickH-o5l
      @NickH-o5l 5 months ago

      It would make more sense if it made more sense. I'd love it if this was comprehensible

  • @marcc16
    @marcc16 1 year ago +31

    0:00: 🤖 The talk introduces the concept of liquid neural networks, which aim to bring insights from natural brains back to artificial intelligence.
    - 0:00: The speaker, Daniela Rus, is the director of CSAIL and has a curiosity to understand intelligence.
    - 2:33: The talk aims to build machine learned models that are more compact, sustainable, and explainable than deep neural networks.
    - 3:26: Ramin Hasani, a postdoc in Daniela Rus' group, presents the concept of liquid neural networks and their potential benefits.
    - 5:11: Natural brains interact with their environments to capture causality and go out of distribution, which is an area that can benefit artificial intelligence.
    - 5:34: Natural brains are more robust, flexible, and efficient compared to deep neural networks.
    - 6:03: A demonstration of a typical statistical end-to-end machine learning system is given.
    6:44: 🧠 This research explores the attention and decision-making capabilities of neural networks and compares them to biological systems.
    - 6:44: The CNN learned to attend to the sides of the road when making driving decisions.
    - 7:28: Adding noise to the image affected the reliability of the attention map.
    - 7:59: The researchers propose a framework that combines neuroscience and machine learning to understand and improve neural networks.
    - 8:23: The research explores neural circuits and neural mechanisms to understand the building blocks of intelligence.
    - 9:32: The models developed in the research are more expressive and capable of handling memory compared to deep learning models.
    - 10:09: The systems developed in the research can capture the true causal structure of data and are robust to perturbations.
    11:53: 🧠 The speaker discusses the incorporation of principles from neuroscience into machine learning models, specifically focusing on continuous time neural networks.
    - 11:53: Neural dynamics are described by differential equations and can incorporate complexity, nonlinearity, memory, and sparsity.
    - 14:19: Continuous time neural networks offer advantages such as a larger space of possible functions and the ability to model sequential behavior.
    - 16:00: Numerical ODE solvers can be used to implement continuous time neural networks.
    - 16:36: The choice of ODE solver and loss function can define the complexity and accuracy of the network.
    17:07: ✨ Neural ODEs combine the power of differential equations and neural networks to model biological processes.
    - 17:07: Neural ODEs use differential equations to model the dynamics of a system and neural networks to model the interactions between different components.
    - 17:35: The adjoint method is used to compute the gradients of the loss with respect to the state and the parameters of the system.
    - 18:35: Backpropagating directly through the solver has high memory complexity but is more accurate than the adjoint method.
    - 19:17: Neural ODEs can be inspired by the dynamics of biological systems, such as the leaky integrator model and conductance-based synapse model.
    - 20:43: Neural ODEs can be reduced to an abstract form with sigmoid activation functions.
    - 21:33: The behavior of the neural ODE depends on the inputs of the system and the coupling between the state and the time constant of the differential equation.
    22:26: ⚙️ Liquid time constant networks (LTCs) are a type of neural network that uses differential equations to control interactions between neurons, resulting in stable behavior and increased expressivity.
    - 22:26: LTCs have the same structure as traditional neural networks but use differential equations to control interactions between neurons.
    - 24:25: LTCs have stable behavior and their time constant can be bounded.
    - 25:26: The synaptic parameters in LTCs determine the impact on neuron activity.
    - 25:50: LTCs are a universal approximator and can approximate any given dynamics.
    - 26:23: Trajectory length measure can be used to measure the expressivity of LTCs.
    - 27:58: LTCs consistently produce longer and more complex trajectories compared to other neural network representations.
    28:46: 📊 The speaker presents an empirical analysis of different types of networks and their trajectory lengths, and evaluates their expressivity and performance in representation learning tasks.
    - 28:46: The trajectory length of LTC networks remains higher regardless of changes in network width or initialization.
    - 29:04: Theoretical evaluation reveals a lower bound for expressivity of these networks based on weighted scale, biases scale, width, depth, and number of discretization steps.
    - 30:38: In representation learning tasks, LTCs outperform other networks, except for tasks with longer term dependencies where LSTMs perform better.
    - 31:13: LTCs show better performance and robustness in real-world examples, such as autonomous driving, with significantly reduced parameters.
    - 33:09: LTC-based networks impose an inductive bias on convolutional networks, allowing them to learn a causal structure and exhibit better attention and robustness to perturbations.
    34:22: ⚙️ Different neural network models have varying abilities to learn representations and perform in a causal manner.
    - 34:22: The CNN consistently focuses on the outside of the road, which is undesirable.
    - 34:31: LSTM provides a good representation but is sensitive to lighting conditions.
    - 34:39: CTRNN or neural ODEs struggle to gain a nice representation in this task.
    - 36:07: Physical models described by ODEs can predict future evolution, account for interventions, and provide insights.
    - 38:36: Dynamic causal models use ODEs to create a graphical model with feedback.
    - 39:55: Liquid neural networks can have a unique solution under certain conditions and can compute coefficients for causal behavior.
    40:18: 🧠 Neural networks with ODE solvers can learn complex causal structures and perform tasks in closed loop environments.
    - 40:18: Dynamic causal models with parameters B and C control collaboration and external inputs in the system.
    - 41:12: Experiments with drone agents showed that the neural networks learned to focus on important targets.
    - 41:58: Attention and causal structure were captured in both single and multi-agent environments.
    - 43:05: The success rate of the networks in closed loop tasks demonstrated their understanding of the causal structure.
    - 43:46: Complexity of the networks is tied to the complexity of the ODE solver, leading to longer training and test times.
    - 44:53: The ODE-based networks may face vanishing gradient problems, which can be mitigated with gating mechanisms.
    45:41: 💡 Model-free inference and liquid networks have the potential to enhance decision-making and intelligence.
    - 45:41: Model-free inference captures temporal aspects of tasks and performs credit assignment better.
    - 45:53: Liquid networks with causal structure enable generative modeling and further inference.
    - 46:32: Compositionality and differentiability make these networks adaptable and interpretable.
    - 46:40: Adding CNN heads or perception modules can handle visual or video data.
    - 48:09: Working with objective functions and physics-informed learning processes can enhance learning.
    - 49:02: Certain structures in liquid networks can improve decision-making for complex tasks.
    Recap by Tammy AI
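
The LTC mechanism summarized at 22:26 can be made concrete in a few lines. Below is a minimal NumPy sketch, assuming the published liquid time-constant formulation dx/dt = -(1/tau + f)*x + f*A with a sigmoid gate f; the sizes, weights, and the explicit Euler step are illustrative choices, not the authors' implementation:

```python
import numpy as np

def ltc_step(x, inp, W, b, tau, A, dt=0.1):
    """One explicit-Euler step of a liquid time-constant (LTC) cell:
    dx/dt = -(1/tau + f) * x + f * A, with f = sigmoid(W @ [x; inp] + b).
    The effective time constant 1/(1/tau + f) depends on the input,
    which is what makes the network "liquid"."""
    z = np.concatenate([x, inp])
    f = 1.0 / (1.0 + np.exp(-(W @ z + b)))  # input-dependent gate in (0, 1)
    dxdt = -(1.0 / tau + f) * x + f * A
    return x + dt * dxdt

rng = np.random.default_rng(0)
n_state, n_in = 4, 2
W = 0.5 * rng.normal(size=(n_state, n_state + n_in))
b = np.zeros(n_state)
tau = np.ones(n_state)        # base time constants
A = rng.normal(size=n_state)  # bias ("reversal") levels

x = np.zeros(n_state)
for t in range(50):           # unroll over a short input sequence
    inp = np.array([np.sin(0.1 * t), np.cos(0.1 * t)])
    x = ltc_step(x, inp, W, b, tau, A)
```

Because f is bounded in (0, 1), the state decays at a rate between 1/tau and 1/tau + 1, which is the bounded-time-constant stability property mentioned at 24:25.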

  • @martinsz441
    @martinsz441 3 years ago +72

    Sounds like an important and necessary evolution of ML. Let's see how much this can be generalized and scaled, but it sounds fascinating.

    • @David-rb9lh
      @David-rb9lh 3 years ago +5

      I will try to use it.
      I think that a lot of studies and reports will yield interesting results.

    • @maloxi1472
      @maloxi1472 2 years ago +2

      "Necessary" for which specific applications? Surely not "necessary" across the board.
      I'd like to see you elaborate.

    • @andrewferguson6901
      @andrewferguson6901 1 year ago +4

      @@maloxi1472 Necessary for not spending 50 million dollars on a 2-month training computation?

    • @maloxi1472
      @maloxi1472 1 year ago +2

      @@andrewferguson6901 You wrongly assume that the product of that training is necessary to begin with.

  • @agritech802
    @agritech802 1 year ago +16

    This is truly a game changer in AI, well done folks 👍

    • @ShpanMan
      @ShpanMan 5 months ago

      Where was the game changed? I fail to see it.

    • @LukasNitzsche
      @LukasNitzsche 5 months ago +1

      @@ShpanMan Yes, I'm thinking the same. Why haven't LNNs been implemented more?

  • @lorenzoa.ricciardi4264
    @lorenzoa.ricciardi4264 3 years ago +52

    The "discovery" that fixed time steps work better for ODEs in this case has been well known in the optimal control literature for at least a couple of decades.
    Basically if your ODE solver has adaptive time steps, the exact mathematical operations performed for a given integration time interval dT can vary because a different number of internal steps is performed. This can have really bad consequences on the gradients of the final time states.
    There's plenty of theoretical and practical discussion in Betts' book Practical Methods for Optimal Control, chapter 3.9 Dynamic Systems Differentiation.
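
    The adaptive-vs-fixed-step point is easy to demonstrate on a toy problem (my own illustration, not from Betts): integrating dx/dt = -theta*x with a fixed number of Euler steps makes x(T) a smooth function of theta, so a finite-difference gradient through the solver closely tracks the analytic gradient of the exact solution. An adaptive solver can change its internal step count as theta varies, which makes exactly this kind of gradient jumpy:

```python
import math

def euler_fixed(theta, x0=1.0, T=1.0, n_steps=100):
    """Integrate dx/dt = -theta * x with a FIXED number of Euler steps.
    The same arithmetic operations run for every theta, so the result
    is a smooth, differentiable function of theta."""
    dt = T / n_steps
    x = x0
    for _ in range(n_steps):
        x = x + dt * (-theta * x)
    return x

theta, eps = 0.7, 1e-6
# finite-difference gradient *through the solver*
g_num = (euler_fixed(theta + eps) - euler_fixed(theta - eps)) / (2 * eps)
# analytic gradient of the exact solution x(T) = x0 * exp(-theta * T)
g_exact = -math.exp(-theta)
```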

    • @abinaslimbu3057
      @abinaslimbu3057 1 year ago

      Lord siva
      Gass state State (liquid) Gass light

    • @abinaslimbu3057
      @abinaslimbu3057 1 year ago

      Humoid into human

    • @abinaslimbu3057
      @abinaslimbu3057 1 year ago

      State Gass powered

    • @DigitalTiger101
      @DigitalTiger101 1 year ago +12

      @@abinaslimbu3057 Schizo moment

    • @iamyouu
      @iamyouu 1 year ago

      @@abinaslimbu3057 why are you doing this? I know no one who's actually hindu would comment such stupid sht.

  • @hyperinfinity
    @hyperinfinity 3 years ago +127

    Most underrated talk. This is an actual game changer for ML.

    • @emmanuelameyaw9735
      @emmanuelameyaw9735 3 years ago +40

      :) most overrated comment on this video.

    • @arturtomasz575
      @arturtomasz575 3 years ago +5

      It might be! Let's see how it performs in specific tasks against state-of-the-art solutions, not against toy models of a specific architecture. Can't wait to try it myself, especially vs transformers or residual CNNs :)

    • @guopengli6705
      @guopengli6705 3 years ago +21

      I think it is way too early to say this. A few mathematicians have tried to improve DNNs' interpretability in similar ways. This comment seems perhaps over-optimistic from a theoretical viewpoint. We do need to test its performance on more CV tasks.

    • @KirkGravatt
      @KirkGravatt 3 years ago +1

      yeah. this got me to chime back in. holy shit.

    • @moormanjean5636
      @moormanjean5636 2 years ago +2

      @@guopengli6705 nah, you weren't paying attention. In my opinion this revolutionizes causal learning while improving on the state of the art.

  • @peceed
    @peceed 1 year ago +1

    Causality is extremely important in building a Bayesian model of the world. It allows us to identify correlations between events that are useful for creating a-priori statistics for reasoning, because we avoid double-counting. A single piece of evidence and its logical consequences are not seen as many independent confirmations of a hypothesis.

  • @isaacgutierrez5283
    @isaacgutierrez5283 3 years ago +103

    I prefer my Neural Networks solid thank you very much

    • @rainbowseal69
      @rainbowseal69 6 months ago

      🥹🥹🥹😂😂😂😂😂

    • @anushkathakur5062
      @anushkathakur5062 5 months ago

      😂😂😂😂😂

    • @mystifoxtech
      @mystifoxtech 5 months ago +4

      If you don't like liquid neural networks then you really won't like gaseous neural networks.

    • @zappy9880
      @zappy9880 5 months ago +2

      @@mystifoxtech plasma neural networks when?

  • @raminkhoshbin9562
    @raminkhoshbin9562 3 years ago +15

    I got so happy finding out that the person who wrote this exciting paper is also a Ramin :)

    • @janbiel900
      @janbiel900 3 years ago +4

      That's cute :)

  • @scaramir45
    @scaramir45 3 years ago +14

    I hope that one day I'll be able to fully understand what he's talking about... but it sounds amazing and I want to play around with it!

  • @johnniefujita
    @johnniefujita 1 year ago +4

    Does anyone know about any sample code for a model like this?

  • @alwadud9243
    @alwadud9243 3 years ago +19

    Thanks Ramin and team. That was the most interesting and well delivered presentation on neural nets that I have ever seen, certainly a lot new to learn in there. Most impressed by the return to learning from nature and the brain and how that significantly augmented 'standard' RNNs etc. Well, there's a new standard now, and it's liquid.

    • @mrf664
      @mrf664 1 year ago

      @alwadud9243 can you explain how this works?

    • @mrf664
      @mrf664 1 year ago

      I think it is interesting too but I fail to grasp any Intuition.
      The only other way I see is to spend hours with papers and equations, but I cannot afford the time for that at present so I was curious if you were able to glean more insight than me :) 😊 thanks!

  • @andreylebedenko1260
    @andreylebedenko1260 3 years ago +6

    Sounds interesting, but... The fundamental difference between biological and NN processing at the current state is time. While biological systems process input asynchronously, computers try to do the whole path in one tick. I believe this must be addressed first, leading to a completely new concept of NN: input neurons first generate a set of signals (with some variation as the physical source signals change), those signals are then accumulated by the next layer of the NN, processed in the same fashion, and passed further. This way, signals that repeat over multiple sampling ticks of the first layer will be treated with a higher trust (importance) level by the next layer.
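
A rough sketch of what this commenter describes (entirely hypothetical; the `decay` factor and firing threshold are made-up parameters, not from any paper): each layer samples its inputs over several ticks and raises the "trust" of units whose signals repeat:

```python
import numpy as np

def accumulate_layer(samples, trust=None, decay=0.8, threshold=0.5):
    """Sketch of the commenter's idea: a layer samples its inputs over
    several ticks and raises the 'trust' of signals that repeat.
    samples: (n_ticks, n_units) array of incoming activations."""
    if trust is None:
        trust = np.zeros(samples.shape[1])
    for sample in samples:
        active = (np.abs(sample) > threshold).astype(float)  # units that fired this tick
        trust = decay * trust + (1.0 - decay) * active       # repeated firing -> higher trust
    # pass forward each unit's mean signal weighted by its accumulated trust
    return trust, samples.mean(axis=0) * trust

# toy demo: unit 0 fires on every tick, unit 1 only once
samples = np.zeros((5, 2))
samples[:, 0] = 1.0
samples[0, 1] = 1.0
trust, forwarded = accumulate_layer(samples)
```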

    • @terjeoseberg990
      @terjeoseberg990 3 years ago +2

      Our brains learn as we use them. Artificial neural networks are "trained" by using gradient descent to optimize an extremely complex function for a dataset during a training phase; then, as they're used to predict answers, they learn nothing.
      We need continuous reinforcement learning.

    • @andreylebedenko1260
      @andreylebedenko1260 3 years ago

      @@terjeoseberg990 What about recurrent neural networks? Besides, the human brain also first learns how to see, grab, hold, walk, speak, etc. -- i.e. builds models -- and then it uses these models, improving them, but never reinventing them.

    • @terjeoseberg990
      @terjeoseberg990 3 years ago

      @@andreylebedenko1260, Recurrent neural networks are also trained using gradient descent. I don't believe our brains have a gradient descent mechanism. I have no clue how our brains learn. Gradient descent is pretty simple to understand. What our brains do is a complete mystery.

    • @hi-gf5yl
      @hi-gf5yl 2 years ago

      @@terjeoseberg990 th-cam.com/video/Q18ahll-mRE/w-d-xo.html

    • @moormanjean5636
      @moormanjean5636 2 years ago

      @@terjeoseberg990 Actually, look up backpropagation in the brain. It is plausible that we are doing something similar to backprop at the end of the day.

  • @ibraheemmoosa
    @ibraheemmoosa 3 years ago +9

    Attention map at 7:00 looks fine to me. If you do not want to wander off out of the road, you should attend to the boundary of the road. And even after you add noise at 7:30, the attention still picks up the boundary which is pretty good.

    • @vegnagunL
      @vegnagunL 3 years ago

      Yes, it still is a consistent pattern for the driving task.

    • @AsifShahriyarSushmit
      @AsifShahriyarSushmit 3 years ago +3

      This sounds kind of like the mesa-optimizer thing Robert Miles keeps talking about. th-cam.com/video/bJLcIBixGj8/w-d-xo.html
      A network can learn the same task in several ways with totally different inner objectives, which may or may not align with a biological agent doing the same task.

    • @ChrisJohnsonHome
      @ChrisJohnsonHome 3 years ago +1

      Because the LTC Network is uncovering the causal structure, it performs much better in noise (33:26), heavy rain/occlusions (42:54) and crashes less in the simulation.
      Since it pays attention to the causes, I wonder if it's also giving itself more time to steer correctly?

    • @moormanjean5636
      @moormanjean5636 2 years ago

      @@ChrisJohnsonHome I would guess that yes, the time constants of the network would learn to modulate in the face of uncertainty

    • @moormanjean5636
      @moormanjean5636 2 years ago +2

      Looking at the boundary of the road is not how humans drive. We assume that we know where the nearby boundary is already and so look at the horizon to update our mental maps. It is reasonable that neural networks should look to do the same, and it is evidence of LTC's causal behavior.

  • @Seekerofknowledges
    @Seekerofknowledges 1 year ago +3

    I am moved beyond description.
    What an amazing privilege to be alive in this day and age.
    The future will be great for mankind.

  • @araneascience9607
    @araneascience9607 3 years ago +1

    Great work, I hope they publish the article on this model soon.

  • @paulcurry8383
    @paulcurry8383 3 years ago +16

    How does the attention map showing that the LNN was looking at the vanishing point mean it's forming "better" representations?
    Shouldn't "better" representations only be understood as having better performance? If it's more explainable, that's cool, but there are ways to train CNNs that make them more explainable while hurting performance.

    • @jellyboy00
      @jellyboy00 3 years ago +6

      Couldn't agree more. It would be more persuasive if Liquid Neural Networks were immune to some problem that previous architectures generally struggle with, such as adversarial examples.
      The fact that Liquid Neural Networks can't learn long-term dependencies compared with LSTMs is sort of disappointing, as LSTMs already underperform compared with attention-only models.
      Not to mention that spiking neural networks are something that I myself (not an expert though) would say are designed according to biological brain mechanisms.

    • @rainmaker5199
      @rainmaker5199 3 years ago +2

      Isn't the point of looking at the attention map to understand how the network is understanding the current issue? When they showed the attention maps for all the models, we could see that the LSTM was mostly paying attention to the road 5-10 feet ahead, making it sensitive to immediate changes in lighting conditions. The LNN was paying attention to the vanishing point to understand how the road evolves (at least that seemed to be what they were getting at), and therefore wasn't sensitive to immediate changes in light level. It doesn't mean it's forming 'better' representations, just that being able to distinguish what each representation uses as key information allows us to make more robust models that are less sensitive to common pitfalls.

    • @jellyboy00
      @jellyboy00 3 years ago

      @@rainmaker5199 For me that is more of an interpretability issue. And for general autonomous driving, I think there is no definite answer about where the model should look; otherwise it becomes a soft handcrafted constraint or curriculum learning. It is still reasonable for the AI to look at the sides, as they also tell something about the curvature of the road. And in general autonomous driving there might be obstacles or pedestrians popping up anywhere, so the claim that attending to the vanishing point of the road is better sounds less persuasive. Generally speaking, one does not even know what part of the input should be attended to in the first place.

    • @rainmaker5199
      @rainmaker5199 3 years ago +1

      @@jellyboy00 I think you misunderstood me. I'm not claiming that a model attending to the vanishing point is a better self-driving model for all circumstances, just that it's better at understanding that the road shape can be determined ahead of time rather than from the current step of points. This allows us the possibility of distributing responsibility between multiple models focused on more specific tasks. So basically, the fact that it's able to tell the road shape earlier and with less continuous information, alongside the fact that we know more specifically what task is being accomplished (rather than a mostly black box), is the valuable contribution here.

  • @KeviPegoraro
    @KeviPegoraro 3 years ago +8

    Very good. The idea for this is simple; the problem lies in putting it all together so it works, and that is the good stuff.

  • @phquanta
    @phquanta 3 years ago +9

    I'm curious: wouldn't a numerical ODE solver kill all the gradients, with error scaling exponentially as depth grows?

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 3 years ago +1

      You can differentiate through an ODE solver, either manually or automatically. Even numerical gradients may work if you're careful with your implementation

    • @phquanta
      @phquanta 3 years ago

      @@lorenzoa.ricciardi4264 You mean like an AdaGrad-type thing? Given that gradients can be computed exactly, i.e. the solution to the ODE exists in closed form, I would assume there would be no such problem. On the other hand, if there is no closed-form solution to the ODE, one is probably limited by the depth of the neural net, even with approaches like LSTM/GRU, "higher-order" ODE solvers, etc.

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 3 years ago

      @@phquanta I'm talking about automatic differentiation, there's several packages to do it in many languages. And you can compute those automatic gradients without knowing the closed solution of the ODE. Of course, if you have the closed form solution you can compute gradients manually, but that's not my point.

    • @phquanta
      @phquanta 3 years ago

      @@lorenzoa.ricciardi4264 All NNs have backprop and the chain rule, which basically unravels all derivatives exactly, since the nonlinearities are easily differentiable. In liquid NNs, along with all the other problems (vanishing/exploding gradients), you are adding a source of inherent numerical error on top of the existing ones, and even Runge-Kutta won't help. What I'm saying is, you are limited by the depth of the liquid NN. As a concept it might be cool, but I would assume it is not easily scalable.

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 3 years ago +1

      @@phquanta I'm not particularly expert in NNs. From what I see in the presentation there are only a few layers in this approach, not dozens. The "depth" mostly seems to come from the continuous process described by the ODEs of the synapses.
      You shouldn't have particular problems when differentiating through ODEs if you know how to do it properly. One part of the problem may be related to the duration of time you're integrating your ODE for (not mentioned in the talk) and the nonlinearity of the underlying dynamics. When you deal with very nonlinear optimal control problems with a naive approach like single shooting (which is probably related to the simplest kind of backpropagation), you'll end up having horrible sensitivity problems. That's why multiple shooting was invented. Otherwise you can use collocation methods, which work even better.
      As for numerical errors: with automatic differentiation your errors are down to machine epsilon by construction. They are as good as analytical ones, but most often they are way faster to execute, and one does not have to do the tedious job of computing them manually. If you combine a multiple shooting approach with automatic differentiation, you don't have numerical error explosion (or rather, you can control it very well). That's why we can compute very complicated optimal trajectories for space probes in a really precise way, even though the integration time spans years or even decades and the dynamics are extremely nonlinear.

  • @Ali-wf9ef
    @Ali-wf9ef 3 years ago +2

    The video showed up in my feed randomly and I clicked on it just because the lecturer was Iranian. But the content was so interesting that I watched it to the end. Sounds like a really revolutionary breakthrough in ML and DL. Especially with the computational power of computing systems growing every day, training/inferencing such complex network models becomes more feasible. Great job.

  • @AA-gl1dr
    @AA-gl1dr 3 years ago +3

    Amazing video. Thank you for uploading.

  • @zephyr1181
    @zephyr1181 1 year ago +2

    I would need a simpler version of the 22:49 diagram to understand this.
    Ramin says here that standard NN neurons have a recursive connection to themselves. I don't know a ton about ANNs, but I overhear from my coworkers, and I've never heard of that recursive connection. Is that for RNNs?
    Is there a "reaching 99% on MNIST"-simple explanation, or does this liquidity only work on time-series data?

  • @Superkuh2
    @Superkuh2 23 days ago

    Very cool. Might be worth it to reconsider the liquid model in terms of the lipid bilayer's physical properties that allow propagation of the action potential rather than a 1950s electrical model that can't explain things like optical scattering/birefringence or reversible heat take-up and release during the passing of the action potential down an axon.

  • @0MVR_0
    @0MVR_0 5 months ago

    naming this liquid
    analogizes the conventional network as solid or rigid.
    The analogous mechanism is the utility of differential equations that represent multitudinous connections,
    just as water molecules can more freely interact with a greater degree of neighbors.

    • @0MVR_0
      @0MVR_0 5 months ago

      actually this is likely a nonsense statement.
      the liquidity is provided through the architecture
      the sensory are convoluted, and the others seem to be recurrent
      excluding the motor

  • @Tbone913
    @Tbone913 1 year ago +1

    But why do the other methods have smaller error bands? There is further improvement that can be done here

  • @wangnuny93
    @wangnuny93 3 years ago +9

    Man, I don't work in the ML field, but this sure is fascinating!!!!

  • @samowarow
    @samowarow 3 years ago +48

    Feels like ML folks keep rediscovering things all over

    • @lorenzoa.ricciardi4264
      @lorenzoa.ricciardi4264 3 years ago +11

      Yep. Basically if you have a good theoretical level of optimal control theory you can see that this approach is fusing state observation and control policy. There's literally no mention of it in the whole talk. I'll give the benefit of the doubt for the reason of this, but unfortunately I *very* often see, as you say, that ML people rediscover stuff and rebrand it as a ML invention (like backprop, which is literally just a discretized version of a standard technique in calculus of variations/optimal control).

    • @moormanjean5636
      @moormanjean5636 2 years ago +1

      @@lorenzoa.ricciardi4264 This is definitely not just "rediscovering" stuff. I'm not sure how you managed to watch the whole talk yet miss all the parts where this technique outperformed previous techniques by leaps and bounds. You sound like a salty calculus teacher, but I'll give you the benefit of the doubt for the reason of this lol.

    • @ShpanMan
      @ShpanMan 5 months ago

      @@moormanjean5636 Please show me the "Liquid" AI model that outperforms any modern LLMs, 2.5 years after this talk.

  • @michaelflynn6952
    @michaelflynn6952 3 years ago +12

    Why does no one in this video seem to have any plan for what they want to communicate and in what order? So hard to follow

    • @AM-ng8wc
      @AM-ng8wc 3 years ago +2

      They have engineering syndrome

  • @AndyBarbosa96
    @AndyBarbosa96 1 year ago +2

    What is the difference between these LNNs and coupled ODEs? Aren't we conflating these terms? If you drive a car with only 19 neurons, then what you have is an asynchronous network of coupled ODEs and not a neural network; the term is misleading.

  • @jiananwang2681
    @jiananwang2681 1 year ago

    Hi, thanks for the great video! Can you share the idea of how to visualize the activated neurons as in 6:06 in this video? It's really cool and I'm curious about it!

  • @petevenuti7355
    @petevenuti7355 1 year ago

    I've had unarticulated thoughts resembling this concept for many decades. I never learned the math, so I could never express my ideas. I still need someone to explain the math at a high-school level!

  • 1 year ago +3

    Great work!
    Does this mean all the training done for autonomous driving with the traditional NN goes to the toilet?

    • @matthiaswiedemann3819
      @matthiaswiedemann3819 1 year ago

      For sure 😂

    • @User_1795
      @User_1795 10 months ago

      No, these still use convolutional layers for image processing.

  • @MLDawn
    @MLDawn 2 ปีที่แล้ว +2

    Could you please share the link to the original paper? Thanks

  • @Dr.Z.Moravcik-inventor-of-AGI
    @Dr.Z.Moravcik-inventor-of-AGI 3 years ago +6

    There are so many smart people at MIT that America must already be a superintelligent nation. Please continue your work and this world will become a wonderful place to live in.

    • @edthoreum7625
      @edthoreum7625 1 year ago

      By now the entire human race should be at an incredible level of intelligence, even traveling out of our solar system in fusion-powered space shuttles!

    • @AndyBarbosa96
      @AndyBarbosa96 1 year ago

      Yeah, America is so "intelligent", flying high on borrowed talent...

    • @quonxinquonyi8570
      @quonxinquonyi8570 1 year ago

      @@AndyBarbosa96 That intelligence drops to a significant degree in the second generation of these first-generation geniuses... simple fact... therefore that borrowing approach is the single most important policy of American technological might... as Hillary Clinton rightly said some years ago, "the power of America resides outside of America".

  • @1238a8
    @1238a8 6 months ago +1

    That's an amazing concept. We should implement it out of spite.
    Too often we feel our brain to be mush. AI should suffer that way too.

    • @ShpanMan
      @ShpanMan 5 months ago

      Out of "Spike" maybe 😂

    • @tempname8263
      @tempname8263 5 months ago

      Your brain is just undertrained bro

  • @rickharold7884
    @rickharold7884 3 years ago +5

    Always interesting. Thx

  • @vdwaynev
    @vdwaynev 2 years ago +1

    How do these compare to neural ODEs?

  • @Alexander_Sannikov
    @Alexander_Sannikov 3 years ago +48

    - let's make another attempt at implementing a biology-inspired neural network
    - proceeds to implement backprop

    • @fernbear3950
      @fernbear3950 3 years ago

      Direct MLE over direct data is still the best (ATM, AFAIK) in class for implicitly performing regression over a distribution density w.r.t. the internal states/activations/features of a network.
      Generally the rule of thumb is to limit "big steps" from the main trunk of development so that impact can be measured, etc. It also helps to vastly (i.e. by orders of magnitude) increase the chance that something will succeed.
      Otherwise the chances of failure are much higher (and rarely get published, I would suspect from personal experience). I'm sure there is some nice interconnected minimally-required jump of feature subsets from this kind of research to a more Hebbian-based approach, but then again there's nothing dictating we do it all at once (which can be exponentially expensive).
      Hopefully brain stuff comes in handy, but ATM the field is moving towards massively nearly-linear models instead of the opposite, since that prior affords better results (generally) for MLE-over-MCE.

    • @Gunth0r
      @Gunth0r 1 year ago

      @@fernbear3950 but linear models suck when there's big regime changes in the data

    • @fernbear3950
      @fernbear3950 1 year ago

      @@Gunth0r I'm not sure what you mean by 'regime' changes here. I wasn't talking about anything linear at all here. MLE over a linear model would be, uh, interesting to say the least, lol.

  • @gameme-yb1jz
    @gameme-yb1jz 10 months ago

    This should be the next deep learning revolution.

  • @lufiporndre7800
    @lufiporndre7800 10 months ago

    Does anyone have code for an autonomous car system? I would like to practice with it. If anyone knows, please share.

  • @d4rkn3s7
    @d4rkn3s7 3 years ago +17

    Ok, after half of the talk, I stopped and read the entire paper, which kind of left me disappointed. LNNs are promoted as a huge step forward, but where are the numbers to back this up? I couldn't find them in the paper, and I strongly doubt that this is the "game-changer" as some suggest.

    • @JordanMetroidManiac
      @JordanMetroidManiac 3 years ago +3

      Ramin makes a really good point about why LNNs could be better in a lot of situations, though. Time scale is continuous, allowing for the model to approximate any function with significantly fewer parameters. But I can imagine that the implementation might be so ridiculous that it will never replace DNNs.

    • @ChrisJohnsonHome
      @ChrisJohnsonHome 3 years ago +2

      He goes over performance numbers starting around 29:44

    • @moormanjean5636
      @moormanjean5636 2 years ago

      They have a number of advantages, performance and efficiency being two of them. The problem I think a lot of people have is they expect ground-breaking results to be extremely obvious in terms of performance gains, as if the state-of-the-art wasn't extremely proficient to begin with. There are plenty of opportunities to scale the performance of LNNs, but it is their other theoretical properties that are what make them a game changer in my opinion.

    • @moormanjean5636
      @moormanjean5636 2 years ago +1

      For example, their stability, their time-continuous nature, their causal nature, these are very important yet subtle properties of effective models. Not to mention you only need a handful to pull off what otherwise would take millions of parameters... how is that not a gamechanger??

    • @Gunth0r
      @Gunth0r 1 year ago

      I couldn't even find the paper, where is it?

  • @aminabbasloo
    @aminabbasloo 3 years ago +3

    I am wondering how it does in RL scenarios!

    • @enricoshippole2409
      @enricoshippole2409 3 years ago

      As am I. I plan on testing out some concepts using their LTC Keras package. Will see how it goes.

    • @moormanjean5636
      @moormanjean5636 2 years ago +1

      I have used it and it works well, just use a slightly larger learning rate than LSTM.

  • @Eye_of_state
    @Eye_of_state 3 years ago +2

    Must share technology that saves lives.

  • @maxlee3838
    @maxlee3838 6 months ago

    This guy is a genius.

  • @manasasb536
    @manasasb536 3 years ago +3

    Can't wait to do a project on LNN and add it to my resume to stand out of the crowd.

  • @saeedrehman5085
    @saeedrehman5085 3 years ago +4

    Amazing!!

  • @amanda.collaud
    @amanda.collaud 3 years ago +7

    What about backpropagation? Too bad he didn't finish his train of thought; this is more an interview than a source of knowledge or a lesson.

    • @moormanjean5636
      @moormanjean5636 2 years ago

      He explained two different ways of calculating gradients for LTC, each with its own pros and cons.

  • @mishmohd
    @mishmohd 1 year ago

    Any association with Liquid Snake ?

  • @krishnaaditya2086
    @krishnaaditya2086 3 years ago +3

    Awesome Thanks!

  • @StephenRoseDuo
    @StephenRoseDuo 1 year ago

    Can someone point to a simple LTC network implementation please?
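For reference, a minimal numerical sketch of what such an implementation boils down to, based on the "fused" ODE-solver update described in the Liquid Time-constant Networks paper. All variable names below are illustrative, not taken from any official package:

```python
import numpy as np

def ltc_fused_step(x, I, dt, tau, A, W, U, b):
    # Bounded synaptic nonlinearity: f = sigmoid(W x + U I + b).
    f = 1.0 / (1.0 + np.exp(-(W @ x + U @ I + b)))
    # Fused implicit-Euler update of the LTC ODE:
    #   x_next = (x + dt * f * A) / (1 + dt * (1/tau + f))
    # The state-dependent denominator is the "liquid" time constant.
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

# Tiny demo: 4 hidden units driven by 2 random inputs.
rng = np.random.default_rng(0)
n, m = 4, 2
W = rng.normal(size=(n, n))
U = rng.normal(size=(n, m))
b = rng.normal(size=n)
tau = np.ones(n)        # base time constants
A = rng.normal(size=n)  # bias vector that bounds the state
x = np.zeros(n)
for _ in range(100):
    x = ltc_fused_step(x, rng.normal(size=m), dt=0.1,
                       tau=tau, A=A, W=W, U=U, b=b)
print(x)  # state stays finite, bounded by max(|A|)
```

The code the authors actually released wraps an update of this general form in a recurrent Keras/PyTorch cell; the repositories are linked in the closing slides of the talk.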

  • @Tbone913
    @Tbone913 1 year ago +1

    Can this be extended to a liquid transformer model?

    • @AndyBarbosa96
      @AndyBarbosa96 1 year ago +1

      No, this is not an ANN. These are coupled ODEs for control. The term is misleading.

    • @Tbone913
      @Tbone913 1 year ago

      @@AndyBarbosa96 ok thanks

  • @wadahadlan
    @wadahadlan 3 years ago +3

    This was a great talk; this could change everything.

  • @Axl_K
    @Axl_K 3 years ago +6

    Fascinating... loved every minute.

  • @KCM25NJL
    @KCM25NJL 5 months ago

    Makes me wonder if the research of OpenAI and such will shift towards a multimodal language model + universal approximators. I can imagine that the world can be differentially modelled from the pretrained weights of the collective knowledge of humanity .......... eventually.

  • @ian4692
    @ian4692 3 years ago

    Where can I get the slides?

  • @LarlemMagic
    @LarlemMagic 3 years ago +3

    Get this video to the FSD team.

  • @VerifyTheTruth
    @VerifyTheTruth 3 years ago +2

    Does The Brain Distribute Calculation Loads To Systemic Subsets, Initiating Feedback Loops From Other Cellular Systems With Different Calculative Specialties And Capacities, Based Upon The Types Of Contextual Information It Recieves From Extraneous Environmental Sources, Which It Then Uses To Construct Or Render The Most Relevant Context To Appropriate Consciousness Access To A Meaningful Response Field Trajectory?

    • @Niohimself
      @Niohimself 3 years ago

      Pineapple

    • @VerifyTheTruth
      @VerifyTheTruth 3 years ago

      @@Niohimself Pomegranate.

    • @k.k.9378
      @k.k.9378 3 years ago

      @@Niohimself PineApple

    • @Gunth0r
      @Gunth0r 1 year ago

      Markov Blanket Stores.

  • @imanshahmari4423
    @imanshahmari4423 2 years ago

    Where can I find the paper?

  • @matthiaswiedemann3819
    @matthiaswiedemann3819 1 year ago +1

    To me it seems similar to variational inference ...

  • @fbomb3930
    @fbomb3930 5 months ago +1

    For all intents and purposes doesn't a liquid network work like a KAN? Change my mind.

  • @ibissensei1856
    @ibissensei1856 5 months ago +1

    As a newbie, it is hard to grasp.

  • @shashidharkudari5613
    @shashidharkudari5613 3 years ago +1

    Amazing talk

  • @zzmhs4
    @zzmhs4 3 years ago

    I'm not an expert, but this sounds to me like an implementation of different neurotransmitters, isn't it?

    • @moormanjean5636
      @moormanjean5636 2 years ago

      I don't see how it would be

    • @zzmhs4
      @zzmhs4 2 years ago

      @@moormanjean5636 I've watched the video again to answer you, and I still think the same.

    • @moormanjean5636
      @moormanjean5636 2 years ago +1

      @@zzmhs4 Let me try to explain my POV. Different neurotransmitters in the brain serve specific and multifaceted roles, some of which are similar but usually not. I think of distinct neurotransmitters as essentially being subcircuits that are coupled together on diverse timescales and in various combinations. Evolution allowed these to emerge naturally, but in my opinion, you would need something like a neuroevolutionary algorithm to actually implement an analogue of neurotransmitters in neural networks. What LTCs propose is something fundamentally different, and I think it has more to do with a model of the neurons/synapses that is more biologically accurate than an attempted or indirect implementation of different neurotransmitters.

    • @zzmhs4
      @zzmhs4 2 years ago

      @@moormanjean5636 Ok, I see, thanks for answering my first question.

  • @9assahrasoum3asahboou87
    @9assahrasoum3asahboou87 3 years ago

    fathi fes medos aziza 1 said Thank you so much

  • @jonathanperreault4503
    @jonathanperreault4503 3 years ago

    At the end of the video he says these technologies are open-sourced, but there are no links in the video description. Can we gather the relevant code sources and GitHub repos?

    • @Hukkinen
      @Hukkinen 3 years ago +1

      Links are in the slides at the end

  • @imolafodor4667
    @imolafodor4667 1 year ago

    Is it really reasonable to "just" model a CNN for autonomous driving? It would be better to compare liquid nets with policies trained in an RL system (where at least some underlying goal was followed), no?

  • @danielgordon9444
    @danielgordon9444 3 years ago +4

    ...it runs on water, man.

  • @dweb
    @dweb 3 years ago +1

    Wow!

  • @ingenium7135
    @ingenium7135 3 years ago +1

    Soo, when AGI?

    • @shadowkiller0071
      @shadowkiller0071 3 years ago +5

      Gimme 5 minutes.

    • @egor.okhterov
      @egor.okhterov 3 years ago +2

      Why can't they break the mental barrier of having to stick to backprop and gradient descent...

    • @afbeavers
      @afbeavers 1 year ago

      @@egor.okhterov Exactly. That would seem to be the roadblock.

  • @김화겸-y6e
    @김화겸-y6e 1 year ago +2

    1 year later?

  • @sitrakaforler8696
    @sitrakaforler8696 3 years ago +2

    Damn... nice.

    • @alwadud9243
      @alwadud9243 3 years ago +1

      Yeah, I loved the part where he said '... and this is nice!'

  • @kayaba_atributtion2156
    @kayaba_atributtion2156 3 years ago +1

    USA: I WILL TAKE YOUR ENTIRE MODEL

  • @jos6982
    @jos6982 2 years ago

    good

  • @grimsk
    @grimsk 5 months ago

    A jewel of Romania

  • @ToddFarrell
    @ToddFarrell 3 years ago +4

    To be fair though, he isn't at Stanford, so he hasn't sold out completely yet. Let's give him a chance :)

  • @abinaslimbu3057
    @abinaslimbu3057 1 year ago

    John Venn 3 diagram class 10

  • @SphereofTime
    @SphereofTime 4 months ago

    7:00

  • @stc2828
    @stc2828 3 years ago +1

    Very sad to see AI development fall into another hole. The last 10 years were fun while it lasted. See you guys 30 years later!

  • @MS-od7je
    @MS-od7je 3 years ago

    Why is the brain a Mandelbrot set?

  • @yes-vy6bn
    @yes-vy6bn 3 years ago +2

    @tesla 👀

  • @sahilpocker
    @sahilpocker 3 years ago

    😮

  • @egor.okhterov
    @egor.okhterov 3 years ago +7

    He failed at 2 things:
    1. He decided to solve differential equations.
    2. He didn't get rid of backpropagation.
    Probably he is required to do some good math in order to publish papers and be paid a salary. As long as we have such an incentive from the scientific community, we will be stuck with suboptimal narrow AI based on statistics and backpropagation.

    • @adrianhenle
      @adrianhenle 3 years ago +1

      The alternative being what, exactly?

    • @egor.okhterov
      @egor.okhterov 3 years ago +2

      @@adrianhenle emulation of cortical columns, the way Numenta does it. For example, there is a video "Alternatives to Backpropagation in Neural Networks" if you're interested: th-cam.com/video/oXyQU0aScq0/w-d-xo.html

    • @zeb1820
      @zeb1820 3 years ago +4

      The differential equations were an example of how the continuous process of synaptic logic from neuroscience was used to enhance a standard RNN. He showed how he merged the two concepts mathematically to improve the expressivity of the model. I believe this was more for our educational benefit than to develop or test what he had already achieved.
      I do get your point about backpropagation, but that was not an aim of this exercise. No doubt when that is solved it may also, at some stage, be useful to merge it with the neuroscience-enhanced NN described here.

  • @vikrantvijit1436
    @vikrantvijit1436 3 years ago +2

    Path breaking new ground forming revolutionary research work that will change the face of futures liberating Force focused On digital humanities SPINNED Technologies INNOVATIONS Spectrums.

    • @pouya685
      @pouya685 3 years ago +3

      My head hurts after reading this sentence

  • @Goldenhordemilo
    @Goldenhordemilo 3 years ago +1

    μ Muon Spec

  • @tismanasou
    @tismanasou 1 year ago +2

    If liquid neural networks were a serious thing, they would have gained a lot more attention in proper ML/AI conferences, not just TEDx and the shit you are presenting here.

    • @DanielSanchez-jl2vf
      @DanielSanchez-jl2vf 1 year ago +2

      I don't know, man; the transformer took 5 years for people to take it seriously. Why wouldn't this?

  • @DarkRedman31
    @DarkRedman31 3 years ago

    Not clear at all.

  • @niamcd6604
    @niamcd6604 1 year ago +1

    PLEASE.... Do you mind bothering to pronounce other languages correctly!!?
    (And before people jump up .. I speak multiple languages myself).

  • @iirekm
    @iirekm several months ago

    Terrible intro to the subject; I understood nothing except that they use differential equations. Maybe he's a good researcher, but he's a terrible teacher.

  • @tonyamyos
    @tonyamyos 3 years ago +3

    Sorry, but you make so many assumptions at almost every level. You are biased, and your interpretation of the functionality and eventual use of this 'computational' model has nothing to do with how true intelligence arises. Start again. And this time leave your biases where they belong... in your professors' heads.

  • @ToddFarrell
    @ToddFarrell 3 years ago +2

    Really it is just an interview because he wants to get a job at Google and make lots of money to serve ads :)