David Duvenaud | Reflecting on Neural ODEs | NeurIPS 2019

  • Published 14 Dec 2019
  • Original paper: arxiv.org/abs/1806.07366
    David's homepage: www.cs.toronto.edu/~duvenaud/
    Summary:
    We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
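
The summary above describes parameterizing the derivative of the hidden state with a neural network and computing the output with a black-box ODE solver. A minimal NumPy sketch of that forward pass, using fixed random weights and a hand-rolled RK4 integrator as illustrative stand-ins (not the paper's torchdiffeq implementation, which uses adaptive solvers and learned weights):

```python
import numpy as np

# Toy dynamics network dh/dt = f(h, t; theta). Weights are random and fixed
# here purely for illustration; in the paper they are learned.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(scale=0.1, size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(scale=0.1, size=(2, 4)), np.zeros(2)

def f(h, t):
    x = np.concatenate([h, [t]])        # condition on time as an extra input
    return W2 @ np.tanh(W1 @ x + b1) + b2

def rk4_step(f, h, t, dt):
    # One classical Runge-Kutta step; a stand-in for an adaptive black-box solver.
    k1 = f(h, t)
    k2 = f(h + dt / 2 * k1, t + dt / 2)
    k3 = f(h + dt / 2 * k2, t + dt / 2)
    k4 = f(h + dt * k3, t + dt)
    return h + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def odeint(f, h0, t0=0.0, t1=1.0, steps=20):
    # "Depth" is continuous: the output is h(t1), however many steps the
    # solver chooses to take, trading precision for speed.
    h, t = h0.astype(float), t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        h, t = rk4_step(f, h, t, dt), t + dt
    return h

h0 = np.array([1.0, 0.0])   # "input layer" state
h1 = odeint(f, h0)          # the network output h(1.0)
```

Using more solver steps only changes numerical precision, not the model being evaluated, which is the sense in which precision can be traded for speed.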

Comments • 12

  • @tookerjerbs 4 years ago +80

    I'm afraid this talk gave some people the impression that I was flippant about careful scholarship, or wrote a misleading paper with no substance. I want to be clear that I do take these things seriously, and that the paper made substantial contributions which still stand. The message I intended was that *even though we tried hard to be careful*, we still made mistakes, which we eventually fixed.
    Regarding novelty, I glossed over some details in the talk and maybe made things sound worse than they were. The original version of the paper clearly cited previous uses of the adjoint sensitivities method. The claim that Joel was annoyed about was that we mistakenly thought we were the first to have done so entirely using general and efficient vector-Jacobian products with autodiff. To check this claim, we had closely examined (and cited) the implementations in Stan, Fatode and Dolfin. After we published, Joel Andersson pointed out that his package, CasADi, did do a general vector-Jacobian implementation, which we cited in the next version. I think I also gave the wrong impression when I said "the only thing we're doing is bringing this to PyTorch" - I was just referring to the adjoint sensitivities method. There is plenty of other novelty in the paper.
    I also want to update the story about the MIT Tech Review article. After this talk, Karen Hao explained to me what she had meant about the 'inventing ODEs' claim. She initially thought that we were calling our new method 'ODE Solvers', which is why the first version of the article said we were proposing a new method with that name. As for the line about how we "need to work on our branding", she said: "That line was meant to tease the fact that you simply named your new neural network very literally, after ODEs, instead of choosing a simpler, perhaps more figurative, name. (Similar to if I had invented a new apple cutting device and just called it “apple cutting device” if you catch my drift.) Of course, I see now why it made it sound like you were the first to ever string together the words “ordinary differential equations.” Hence, why I corrected it upon request." Until that conversation, I didn't realize that Karen Hao already knew about ODEs. The original version of the article did clearly give the impression that we had invented ODE solvers, but I'm sorry for having passed on, in this talk, my mistaken impression that Karen didn't understand ODEs at all.

    • @freddiekalaitzis5708 4 years ago +3

      "you can't cross the streams with Neural ODEs"
      I see what you did there
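
The thread above turns on computing adjoint sensitivities entirely through vector-Jacobian products, the primitive that autodiff frameworks expose. A hedged NumPy sketch of that idea for a toy system (all names here are my own; for clarity this version stores the forward Euler trajectory and runs the discrete adjoint over it, whereas the paper re-solves the state backward in time to get constant memory cost):

```python
import numpy as np

# Toy fixed-weight MLP dynamics dh/dt = f(h); weights are illustrative.
rng = np.random.default_rng(1)
W1, b1 = rng.normal(scale=0.5, size=(4, 2)), rng.normal(scale=0.1, size=4)
W2, b2 = rng.normal(scale=0.5, size=(2, 4)), rng.normal(scale=0.1, size=2)

def f(h):
    return W2 @ np.tanh(W1 @ h + b1) + b2

def vjp(a, h):
    # Vector-Jacobian product a^T (df/dh), computed without ever forming
    # the Jacobian -- exactly what reverse-mode autodiff provides.
    z = W1 @ h + b1
    return W1.T @ ((1.0 - np.tanh(z) ** 2) * (W2.T @ a))

def forward(h0, steps=100, T=1.0):
    # Forward Euler solve; the trajectory is stored here for clarity only.
    dt = T / steps
    traj = [np.asarray(h0, dtype=float)]
    for _ in range(steps):
        traj.append(traj[-1] + dt * f(traj[-1]))
    return traj, dt

def adjoint_grad(h0, steps=100, T=1.0):
    # Integrate the adjoint a(t) = dL/dh(t) backward in time using only VJPs,
    # for the loss L = 0.5 * ||h(T)||^2.
    traj, dt = forward(h0, steps, T)
    a = traj[-1].copy()                  # a(T) = dL/dh(T)
    for h in reversed(traj[:-1]):
        a = a + dt * vjp(a, h)           # discrete adjoint of one Euler step
    return a                             # dL/dh(0)

h0 = np.array([1.0, -0.5])
grad = adjoint_grad(h0)
```

The point of the comment is that this backward pass never needs the solver's internals or the full Jacobian, only a VJP per step, which is what makes it compatible with any autodiff framework.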

  • @Erotemic 4 years ago +59

    This is how you science. Great job at owning your mistakes and valuing truth over profit.

    • @vicktorioalhakim3666 4 years ago +2

      The fact that he got away with such bullshit, HIGHLIGHTS the problems with modern review process and *doing Science*, especially in ML. Neural ODEs... LOL There's no way you can convince me that this bullshit based on Euler's method is actually useful XDD

  • @revoiceful 4 years ago +9

    Very valuable to see someone admitting to failures in science. Still, I think the idea is refreshing and we need more of that, even if there are problems to begin with. Isn't that how science is supposed to work, anyway?

  • @keixi512 4 years ago +17

    Really appreciate his honesty.

  • @linglingfan8138 3 years ago +1

    It works, and people are using it. That explains everything. I like this work a lot.

  • @technokicksyourass 3 years ago +6

    One thing that isn't bullshit about ODEs: they are O(1) in memory. They also do a nice job of modelling dynamics.

  • @loremipsum7513 4 years ago +5

    Simply respect.

  • @dewinmoonl 4 years ago +3

    pay respect in the chat

  • @bocckoka 4 years ago +8

    his mannerisms are like Elon Musk's

  • @impolitevegan3179 4 years ago

    F