Simple reverse-mode Autodiff in Python

  • Published 23 Dec 2024

Comments • 12

  • @sfdv1147 • 1 year ago +2

    Very clear explanation! Thanks and hope you'll get more views

  • @houkensjtu • 1 year ago +1

    Thank you for the clear explanation! I wonder where the term "cotangent" comes from? A Google search shows it comes from differential geometry; do I need to learn differential geometry to understand it ...?

    • @MachineLearningSimulation • 1 year ago

      You're welcome 🤗
      Glad you liked it. The term cotangent is indeed borrowed from differential geometry. If you are using reverse-mode autodiff to compute the derivative of a scalar-valued loss, you can think of the cotangent associated with a node in the computational graph as the derivative of the loss wrt that node. More abstractly, it is just the auxiliary quantity associated with each node.
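      A minimal sketch of that idea (illustrative only, not code from the video), using the toy chain y = exp(x), L = 0.5 * y**2: the backward pass starts with the output cotangent dL/dL = 1.0 and pulls it back through each primitive, so each node's cotangent ends up being the derivative of the loss wrt that node.

      import math

      def forward_and_backward(x):
          # forward pass, caching the intermediate value
          y = math.exp(x)
          L = 0.5 * y**2

          # backward pass: each cotangent is the derivative of L wrt that node
          L_cot = 1.0                   # dL/dL
          y_cot = y * L_cot             # dL/dy = y
          x_cot = math.exp(x) * y_cot   # dL/dx = exp(x) * dL/dy
          return L, {"L": L_cot, "y": y_cot, "x": x_cot}

      L, cotangents = forward_and_backward(1.5)
      print(cotangents["x"])  # matches d/dx [0.5 * exp(x)**2] = exp(2*x)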

  • @harikrishnanb7273 • 1 year ago +1

    can you please share your recommended resources for learning maths? or how did you learn math?

    • @MachineLearningSimulation • 1 year ago +2

      Hi,
      it's a great question, but very hard to answer. I can't pin it down to one single approach, one textbook, etc.
      I have an engineering math background (I studied mechanical engineering for my bachelor's degree). Generally speaking, I prefer the approach taken in engineering math classes, which is more algorithmically focused than theorem-proof focused. Over the course of my undergrad, I used various YouTube resources (which also motivated me to start this channel). The majority were in German; some English-speaking ones include the vector calculus videos by Khan Academy (which were done by Grant Sanderson) and of course 3b1b.
      For my graduate education, I found that I really liked reading documentation and studying the API interfaces of various numerical computer programs. JAX and TensorFlow have amazing docs. This is also helpful for PDE simulations. Usually, I guide myself by Google, forum posts and a general sense of curiosity. 😊

    • @harikrishnanb7273 • 1 year ago +1

      @@MachineLearningSimulation thanks for the reply

    • @MachineLearningSimulation • 1 year ago +1

      You're welcome 😊
      Good luck with your learning journey

  • @zhenlanwang1760 • 1 year ago

    Thanks for the great content as always. One question and one comment. How would you handle it if the computational graph is a DAG instead of a chain? Any reference (book/paper) that you can share? I note that for symbolic differentiation, you pay the price of redundant calculation (quadratic in the length of the chain) but with constant memory. On the other hand, reverse-mode autodiff caches the intermediate values and has linear computation but also linear memory.
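    A toy illustration of that tradeoff (my own sketch, not from the video), on a chain y = f(f(...f(x))) with f = sin: recomputing the forward chain from scratch for each factor needs no cache but O(n^2) function evaluations, while a taped reverse pass stores O(n) intermediates and needs only O(n) evaluations.

    import math

    def f(u):
        return math.sin(u)

    def df(u):
        return math.cos(u)

    def grad_recompute(x, n):
        # "symbolic"-style: no cache; recompute u_k from scratch for every factor
        # -> O(n^2) evaluations, O(1) memory
        g = 1.0
        for k in range(n):
            u = x
            for _ in range(k):
                u = f(u)
            g *= df(u)
        return g

    def grad_taped(x, n):
        # reverse mode: one forward pass records the tape of intermediates,
        # one backward pass multiplies the local derivatives
        # -> O(n) evaluations, O(n) memory
        tape = [x]
        for _ in range(n):
            tape.append(f(tape[-1]))
        g = 1.0
        for u in reversed(tape[:-1]):
            g *= df(u)
        return g

    print(grad_recompute(0.7, 5), grad_taped(0.7, 5))  # same derivative value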

    • @MachineLearningSimulation • 11 months ago +2

      Hi,
      Thanks for the kind comment :), and big apologies for the delayed reply. I just started working my way through a long backlog of comments; it's been a bit busy in my private life over the past months.
      Regarding your question: For DAGs the approach is to either record a separate Wengert list or overload operations. It's a bit harder to truly find a taxonomy for this because different autodiff engines all have their own style (source transformation at various stages in the compiler/interpreter chain vs. pure operator overloading in the high-level language, restrictions to certain high-level linear algebra operations, etc.). I hope I can finish the video series with some examples of it in the coming months.
      These are some links that directly come to mind: The "Autodidact" repo is one of the earlier tutorials on simple (NumPy-based) autodiff engines in Python, written by Matthew Johnson (co-author of the famous HIPS autograd package): github.com/mattjj/autodidact . The HIPS autograd authors are also involved in the modern JAX package (which is featured quite often on the channel). There is a similar tutorial called "Autodidax": jax.readthedocs.io/en/latest/autodidax.html
      The micrograd package by Andrej Karpathy is also very insightful: github.com/karpathy/micrograd . It takes a "PyTorch-like" perspective. His video on "Becoming a Backprop Ninja" can also be helpful.
      In the Julia world: you might find the documentation of the "Yota.jl" package helpful: dfdx.github.io/Yota.jl/dev/design/
      Hope that gives you some starting points. :)
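      To make the DAG point concrete, here is a minimal operator-overloading sketch in the spirit of micrograd / Autodidact (my own illustration, not copied from either): every arithmetic operation records its parent nodes and a local backward rule, and the backward sweep accumulates cotangents into the parents, which is exactly what handles fan-out in a general DAG rather than a chain.

      class Value:
          def __init__(self, data, parents=(), backward_rule=lambda grad: ()):
              self.data = data
              self.grad = 0.0
              self.parents = parents
              self.backward_rule = backward_rule  # output cotangent -> parent cotangents

          def __add__(self, other):
              return Value(self.data + other.data, (self, other),
                           lambda g: (g, g))

          def __mul__(self, other):
              return Value(self.data * other.data, (self, other),
                           lambda g: (g * other.data, g * self.data))

          def backward(self):
              # topological order via depth-first search, then reverse sweep
              order, visited = [], set()
              def visit(node):
                  if node not in visited:
                      visited.add(node)
                      for p in node.parents:
                          visit(p)
                      order.append(node)
              visit(self)

              self.grad = 1.0
              for node in reversed(order):
                  for parent, g in zip(node.parents, node.backward_rule(node.grad)):
                      parent.grad += g  # accumulation handles fan-out in the DAG

      # x is used twice, so the graph is a DAG: y = x*x + x
      x = Value(3.0)
      y = x * x + x
      y.backward()
      print(x.grad)  # dy/dx = 2*x + 1 = 7.0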