Proof of Convergence

  • Published on 27 Dec 2024

Comments • 16

  • @abrar_mohiuddin1 · 1 year ago · +2

    nice lecture

  • @doglibrary · 2 months ago

    👏👏👏👏👏👏👏

  • @prithwishguha309 · 5 months ago · +1

    No, the whole proof is wrong because the statement itself is wrong. You can't prove that the perceptron always converges when we clearly know it doesn't: it depends on the starting point (that's why we had to improvise and introduce a learning rate), and it doesn't converge on non-linearly-separable data.

    And it does toggle (you said it doesn't, which is also wrong), but not in the way you described. The toggling doesn't depend on the positive half or the negative half, because the algorithm correctly pulls w toward the positive half (w = w + x) and pushes it away from the negative half (w = w - x), like a magnet. The toggling actually happens when the solution is almost correct, when w is too close to w*; that's why we need a smaller step size, a smaller learning rate.

    And now let's talk about the proof: it doesn't make any sense. You just showed that a bigger quantity is bigger, like 3/4 is greater than 2/5. So what? That doesn't prove anything. You had to compare cos(β_new) with cos(β) and show that cos(β_new) is bigger, so β_new is smaller, and that isn't always the case (like when w is too close to w*). That's why it doesn't always converge, and that's why you can't prove that it does, because it really doesn't.

    • @alekhpand · 3 months ago · +5

      The base assumption of this proof is that the labels are linearly separable.
      If the labels are separable, then a separating perceptron exists; if such a perceptron exists, then convergence happens (the standard mistake bound, and a toy simulation of it, are sketched after this thread).
      You could get stuck in a local optimum or run out of epochs/iterations, but that isn't a convergence failure; you're just unlucky. Don't blame the professor or the math for your bad choice of SGD parameters.

    • @doglibrary · 2 months ago

      Are you okay?

    • @prithwishguha309 · 2 months ago

      @@alekhpand First of all, what do you mean by my bad choice of SGD parameters? This isn't Stack Overflow; I'm not asking why my code doesn't work. The algorithm is wrong. This is a theoretical class, and I'm saying that even theoretically, even on linearly separable data, it doesn't always converge. I'm not saying he is a bad professor; he is a really good professor, otherwise I wouldn't have listened to his whole playlist of lectures.

      And exactly as you said, we can get stuck in a local minimum through a bad choice of parameters, so by definition that's our fault, right? That's not being unlucky, and it's definitely not success; that's failure. Stop blaming everything on luck.

      Moreover, even if we grant for a moment that a solution exists, the proof is still wrong. Someone could say he just oversimplified the cos(β) term, but it isn't oversimplification; it's just flat-out wrong. If you write out the full term and try to prove it, you can't; ironically, if you spend more time and effort, you will actually end up proving that it is in fact not always getting smaller. He is a respectable professor, so I did the whole math before saying anything. I could even give a simple toy example and you would see it doesn't work, but this message would get even longer, so I'm stopping now.

    • @prithwishguha309 · 2 months ago

      @@doglibrary better than ok😎

    • @doglibrary · 2 months ago

      @@prithwishguha309 Can you make a thorough document or video explaining this? I'd love to read it.
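
For readers weighing the two sides of this thread: the argument such lectures usually present is Novikoff's mistake bound. A sketch follows, assuming linear separability with a margin, i.e. a unit vector w* and constants γ > 0 and R with yᵢ⟨w*, xᵢ⟩ ≥ γ and ‖xᵢ‖ ≤ R for all points; these symbols are assumptions of the theorem, not quantities taken from the video.

```latex
% Novikoff's perceptron mistake bound (sketch).
% Assumptions: labels y_i in {-1,+1}, \|x_i\| \le R, a unit separator w^*
% with margin y_i \langle w^*, x_i \rangle \ge \gamma > 0, start w_0 = 0,
% and on each mistake the update w \leftarrow w + y_i x_i.
\begin{align*}
  \langle w_k, w^* \rangle &\ge \langle w_{k-1}, w^* \rangle + \gamma \ge k\gamma
    && \text{(alignment grows linearly in the mistake count } k\text{)} \\
  \|w_k\|^2 &\le \|w_{k-1}\|^2 + R^2 \le kR^2
    && \text{(cross term is } \le 0 \text{ on a mistake)} \\
  k\gamma &\le \langle w_k, w^* \rangle \le \|w_k\|\,\|w^*\| \le \sqrt{k}\,R
    && \text{(Cauchy--Schwarz)} \\
  \Rightarrow\quad k &\le R^2/\gamma^2
    && \text{(mistakes are bounded, so the updates stop)}
\end{align*}
```

Note this does not claim the angle to w* shrinks on every single update; it need not, which may be the toggling described above. It only shows that ⟨w, w*⟩ grows linearly while ‖w‖ grows like √k, so the cosine of the angle is eventually forced up and the total number of mistakes is capped.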
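A minimal runnable sketch of the same statement, assuming a hypothetical 2-D toy dataset; w_star, the 0.1 margin cutoff, and every name below are illustrative choices, not anything from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical separable toy data: label points by a hidden unit vector w_star.
w_star = np.array([0.6, 0.8])                  # ||w_star|| = 1
X = rng.uniform(-1.0, 1.0, size=(200, 2))
X = X[np.abs(X @ w_star) > 0.1]                # keep a margin of at least 0.1
y = np.sign(X @ w_star)

w = np.zeros(2)
mistakes = 0
converged = False
for _ in range(1000):                          # epochs
    errors = 0
    for x_i, y_i in zip(X, y):
        if y_i * (w @ x_i) <= 0:               # misclassified (or on the boundary)
            w += y_i * x_i                     # the update the thread describes
            mistakes += 1
            errors += 1
    if errors == 0:                            # one clean pass over the data: done
        converged = True
        break

R = np.linalg.norm(X, axis=1).max()
gamma = (y * (X @ w_star)).min()
print(f"converged={converged}, mistakes={mistakes}, "
      f"bound R^2/gamma^2 = {(R / gamma) ** 2:.1f}")
```

On separable data with a margin, the mistake count stays under R²/γ² with no learning rate at all; flip a few labels to make the data non-separable and the inner loop never reaches a clean pass, which is the failure mode debated above.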