XLNet: Generalized Autoregressive Pretraining for Language Understanding

  • Published Dec 15, 2024

Comments • 49

  • @clevuag
    @clevuag 5 years ago +30

    Please keep making these videos. Your work is amazing:))

  • @connor-shorten
    @connor-shorten 5 years ago +17

    Really cool! The "New York is a city" example helped a lot with my understanding of this!

  • @abcdxx1059
    @abcdxx1059 5 years ago +1

    After a point, searching on the internet gives you nothing; this channel is the only place where I find explanations for very complex things in a way a newbie can understand. Please don't stop.

  • @deeplearner2634
    @deeplearner2634 3 years ago +4

    I didn't really understand the random permutation idea from other sources, but this video made it clear how a shuffled permutation lets you combine the AR objective with BERT's AE idea. Thanks!
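
To make the permutation idea above concrete: below is a minimal NumPy sketch written for this page rather than taken from the paper's code (the function name and everything else here are made up). Sampling a random factorization order gives each position a context that can include tokens from both sides of the original sentence, yet every prediction still conditions only on tokens earlier in the sampled order, so the objective stays autoregressive.

    import numpy as np

    def permutation_mask(seq_len, rng):
        """mask[i, j] is True iff position i may condition on position j."""
        order = rng.permutation(seq_len)      # random factorization order
        rank = np.empty(seq_len, dtype=int)
        rank[order] = np.arange(seq_len)      # rank[pos] = index of pos in the order
        return rank[None, :] < rank[:, None]  # j strictly earlier than i

    rng = np.random.default_rng(0)
    tokens = ["New", "York", "is", "a", "city"]
    mask = permutation_mask(len(tokens), rng)
    for i, tok in enumerate(tokens):
        context = [tokens[j] for j in range(len(tokens)) if mask[i, j]]
        print(f"predict {tok!r} from {context}")  # context can span both sides

Averaged over many sampled orders, each position learns to use bidirectional context, which is how the paper combines the AR objective with BERT-style two-sided conditioning.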

  • @nikeshnaik5516
    @nikeshnaik5516 5 years ago +3

    I was not getting the core idea behind XLNet, and you made it look like a piece of cake. Subscribed!! Thank you.

  • @rpcruz
    @rpcruz 5 years ago +4

    I liked the quick digression into language modeling before getting into the meat of the paper. Awesome video!

  • @kaenovama
    @kaenovama 1 year ago

    7 min in and I finally get the part I didn't understand! Thank you!

  • @helloadventureworld
    @helloadventureworld 4 years ago +1

    You are genuinely changing the way I read and understand papers. Your work is amazing; please do more NLP papers.

  • @limynet
    @limynet 3 years ago

    This is a really nice rundown, compared to me half reading and half sleeping over the long paper. Thank you so much.

  • @yuchengcho7471
    @yuchengcho7471 4 years ago +3

    Thanks Yannic, this explanation is super helpful!!

  • @aleksandrbazanov3866
    @aleksandrbazanov3866 5 years ago +6

    Yannic is the best guy on the internet

  • @vedantwalke1789
    @vedantwalke1789 4 years ago +2

    Great video. The explanation made it very simple to understand and was very helpful!!

  • @thepresistence5935
    @thepresistence5935 2 years ago

    It took me 2 hours 20 minutes to understand this, but it was worth it; I won't forget it anymore.

  • @fahadqurashi7103
    @fahadqurashi7103 4 years ago +1

    Excellent explanation, easy to understand and to the point 👌👌

  • @aayatrubab
    @aayatrubab 5 years ago +1

    I was eagerly waiting for it... Thanks, Yannic :)

  • @darkmythos4457
    @darkmythos4457 5 years ago

    Was actually waiting for you to post this, thanks

  • @venkatalv7014
    @venkatalv7014 5 years ago +1

    very clear explanation, thanks for the video

  • @BSelm05
    @BSelm05 5 years ago +1

    Thank you for a very clear explanation. I wonder how many samples they perform for each sentence. I couldn't find it in the paper.

  • @Rednivrug
    @Rednivrug 4 years ago +1

    In language modelling, autoregressive models predict the next word from a window of previous words, and autoencoding models predict the missing words within a window of words. Aren't these the same two techniques we used to train word embeddings for Word2Vec, where CBOW (continuous bag of words) predicts the next word by taking the previous window of words, and the N-gram method predicts a missing word using the previous and next words? What's the difference? Am I missing something?

    • @YannicKilcher
      @YannicKilcher  4 years ago

      The difference is that in autoregressive decoding you do it again and again in a sequence.
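
To illustrate that difference with a toy example (these stand-ins are hypothetical and written for this page; no trained model is involved): autoregressive decoding feeds each prediction back into the context and repeats, one token at a time, while AE/CBOW-style masked prediction fills every mask in a single parallel pass, each mask seeing context on both sides but not the other predictions.

    # Canned lookup standing in for a trained AR model's next-word guess.
    NEXT = {"New": "York", "York": "is", "is": "a", "a": "city"}

    def next_word(context):
        # AR stand-in: predicts from the left context only.
        return NEXT.get(context[-1], "<eos>")

    def fill_mask(sentence, i):
        # AE stand-in: looks at context on both sides of position i.
        left = sentence[i - 1] if i > 0 else "<s>"
        right = sentence[i + 1] if i + 1 < len(sentence) else "</s>"
        return {("New", "is"): "York", ("a", "</s>"): "city"}.get((left, right), "<unk>")

    def autoregressive_decode(prompt, steps):
        out = list(prompt)
        for _ in range(steps):          # again and again, in sequence:
            out.append(next_word(out))  # each prediction is fed back in
        return out

    def masked_fill(sentence):
        # One parallel pass; masked positions are filled independently.
        return [fill_mask(sentence, i) if w == "[MASK]" else w
                for i, w in enumerate(sentence)]

    print(autoregressive_decode(["New"], steps=4))
    # ['New', 'York', 'is', 'a', 'city'] -- built one token at a time
    print(masked_fill(["New", "[MASK]", "is", "a", "[MASK]"]))
    # ['New', 'York', 'is', 'a', 'city'] -- all masks filled at once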

  • @keerthanajaganathan
    @keerthanajaganathan 4 years ago +1

    Thanks for the video - it is very helpful. Could you please make a video on Cross-lingual Language Model Pretraining (XLM)?

  • @prateethnayak8422
    @prateethnayak8422 3 years ago

    @12:40 is what the model is listening to! :D

  • @neilteng4161
    @neilteng4161 4 years ago +1

    Thank you So Much!

  • @prabhikthapa4671
    @prabhikthapa4671 5 years ago

    Hi, could you also clarify why the embeddings are multiplied with the representation produced by the network in the Eq. 1 and 2 formulation? My understanding was that you could directly apply a softmax to the representation to train.
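
For context on this question: in the paper's Eq. (1) the next-word distribution is p(x_t | x_{1:t-1}) ∝ exp(h_θ(x_{1:t-1})^T e(x_t)), so the representation-times-embedding product is what turns a d_model-sized hidden state into one logit per vocabulary word; a softmax needs a score for every word, which the hidden state alone does not provide. A minimal NumPy sketch of just that step (random values, illustrative shapes only):

    import numpy as np

    rng = np.random.default_rng(0)
    vocab_size, d_model = 10, 4
    E = rng.normal(size=(vocab_size, d_model))  # word embeddings e(x)
    h = rng.normal(size=(d_model,))             # context representation h(x_{1:t-1})

    logits = E @ h                     # one dot product per vocabulary word
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()               # softmax over the whole vocabulary
    print(probs.shape, probs.sum())    # (10,) ~1.0 -> p(x_t = w | context) per word

In that formulation the embedding matrix doubles as the output projection from d_model to |V| (weight tying), a common choice that avoids a separate |V| × d_model output matrix.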

  • @supertramp_og
    @supertramp_og 5 years ago +4

    "Hmmmm " :P
    Great video.

  • @nenadsubat9489
    @nenadsubat9489 10 months ago

    This is so enlightening!!!

  • @aqibfayyaz1619
    @aqibfayyaz1619 3 years ago

    Great effort.

  • @aj-tg
    @aj-tg 4 years ago

    Thanks, you are doing god's work!

  • @narendraparmar1631
    @narendraparmar1631 4 years ago +1

    Thanks

  • @srikanthkoraveni8210
    @srikanthkoraveni8210 5 years ago +1

    Thank you

  • @AlphaMoury
    @AlphaMoury 2 years ago

    Thank you man

  • @RajeshSharma-bd5zo
    @RajeshSharma-bd5zo 2 years ago

    Cool video!! Thanks for it. However, the voice quality was not that great, and there is clearly room for improvement there.

  • @jingciwang587
    @jingciwang587 4 years ago +1

    Now all my mind is like New Hmm is a Hmm, New York is a Hmm Hmm and Hmm~ Hmm~ Hmm~ Hmm~~~

  • @RAZZKIRAN
    @RAZZKIRAN 4 years ago +1

    thankq

  • @robinranabhat3125
    @robinranabhat3125 5 years ago +11

    In this AI journey, I find some people explain the papers but leave out the code, and some explain the code (hopelessly, though) but leave out the theory. Can't we have a paper explanation followed by an explanation of the code in TensorFlow or PyTorch? Or maybe everyone just knows the high-level overview and therefore skips that part, despite its great necessity. Please upvote, guys.

    • @YannicKilcher
      @YannicKilcher  5 years ago +8

      If I were to also review the code, the videos would be 2+ hours 😁 but thanks for the feedback, will consider doing separate code reviews

    • @robinranabhat3125
      @robinranabhat3125 5 years ago +8

      @@YannicKilcher If you do code reviews as well, trust me, your channel will be one of its kind. Anyone sturdy enough to learn these papers would want to see the implementation details.

    • @abcdxx1059
      @abcdxx1059 5 years ago

      @@YannicKilcher damn you would do that for us 🤗🤗🤗

    • @tanny411
      @tanny411 5 years ago

      I swear I'll sit through the 2+ hour videos. This channel is life!

  • @jwstolk
    @jwstolk 4 years ago +1

    2 out of 5 words is closer to 40%

  • @emuccino
    @emuccino 4 years ago +1

    18:23 😳😂

  • @wongmikeho
    @wongmikeho 5 years ago +2

    Hmm..hmm...hmm...hmmm
