How do neural networks do modular addition?

  • Published on Sep 27, 2024
  • tl;dw it's clock arithmetic
    Art by @hamishdoodles
    Clipped from episode 19 of AXRP: • 19 - Mechanistic Inter...
    Transcript of that episode: axrp.net/episo...
    ---
    AXRP patreon: / axrpodcast
    AXRP ko-fi: ko-fi.com/axrp...

Comments • 1

  • @campbellhutcheson5162, 1 year ago, +1

    I think part of the problem is that this is an explanation of why the neural network's method works, not an explanation of what its method actually is. The network hasn't learned the concept of the trig functions; it has just learned how to embed the inputs (0-113) on a lossy version of the trig curves, etc.
    A mechanical description, I think, would also be clearer to a less math-y audience. It feels to me like the (quite excellent) authors saw math they were familiar with and homed in on why it worked, rather than giving a straight, step-by-step account of what the network is doing.
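
A step-by-step account of the "clock arithmetic" idea could look roughly like the sketch below. It is an idealized illustration, not the trained network's actual weights: the modulus `P = 113`, the frequency list `FREQS`, and all function names are assumptions made for this example, and a real network would use learned, lossy versions of these curves rather than exact `cos` and `sin`.

```python
import math

P = 113             # modulus, assumed for illustration
FREQS = [1, 5, 17]  # hypothetical frequencies; a trained network picks its own


def embed(x, k, p=P):
    """Step 1: put x on a clock face, at angle 2*pi*k*x/p on the unit circle."""
    angle = 2 * math.pi * k * x / p
    return math.cos(angle), math.sin(angle)


def add_on_clock(a, b, k, p=P):
    """Step 2: combine the two clock hands with the angle-addition identities."""
    cos_a, sin_a = embed(a, k, p)
    cos_b, sin_b = embed(b, k, p)
    cos_sum = cos_a * cos_b - sin_a * sin_b  # cos of the angle for a + b
    sin_sum = sin_a * cos_b + cos_a * sin_b  # sin of the angle for a + b
    return cos_sum, sin_sum


def scores(a, b, freqs=FREQS, p=P):
    """Step 3: score each candidate c by how well its hand lines up with a + b."""
    result = []
    for c in range(p):
        total = 0.0
        for k in freqs:
            cos_sum, sin_sum = add_on_clock(a, b, k, p)
            cos_c, sin_c = embed(c, k, p)
            # Equals cos(2*pi*k*(a + b - c)/p), which is 1 only when c = a + b mod p.
            total += cos_sum * cos_c + sin_sum * sin_c
        result.append(total)
    return result


def modular_add(a, b, freqs=FREQS, p=P):
    """Step 4: the answer is the candidate with the highest score."""
    s = scores(a, b, freqs, p)
    return max(range(p), key=s.__getitem__)


print(modular_add(100, 50), (100 + 50) % P)
```

Each frequency alone already singles out the right answer when the modulus is prime; summing over several frequencies only widens the gap between the correct candidate and the rest, which is one plausible reason a network that represents the curves imperfectly would use more than one.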