It would be better to watch this video after you already have some basic knowledge about word2vec, like me; I had already been studying it for several days. And thank you John, this video answered many of my questions!
Great video! One question - how do we get the second set of weight vectors? You mention that the first set of weights is randomly initialized, but the second set doesn't seem to be random, since it forms a probability distribution. So how is it generated?
Hi John, great explanation. However, I have a question: how does the model output different context words if they all share the same embedding vectors (hidden->output)?
Thank you for your question! Note that there is a word-embedding matrix which results in the manifold of output prediction vectors. In the video, at 9:30, I just said "second set of weight vectors". Basically, they don't share the same embedding vectors.
@@johnplins Thank you so much for your response. You said "there is a word-embedding matrix which results in the manifold of output prediction vectors". But how exactly is the model able to produce different output predictions with only a single embedding matrix? From my understanding, the output of each prediction is the product of the projection layer and the second matrix. If there is only "one" matrix between the hidden and output layers, how is the second prediction going to be different from the first? Thank you.
@@figoaranta3505 To clarify, there is a matrix for each vector output. Each matrix transforms the vector from the previous layer differently. I understand now why my explanation was a bit convoluted; I'm not saying there is a single matrix that outputs multiple word vectors from the same input, but rather that there is a matrix for each context word, each with its own corresponding output.
@@johnplins Hi John, now that makes a lot of sense. So instead of only one matrix, multiple matrices contribute to different vector outputs. Thank you so much for the replies, they have been very helpful😄.
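For anyone else following this thread, here is a minimal sketch of the setup described above: one input embedding matrix plus a separate hidden->output matrix per context position, so each context slot produces its own distribution over the vocabulary. The names (W_in, W_out), sizes, and structure are toy assumptions for illustration, not the exact layout used in the video. Note that both sets of weights can start out random; the probability distribution only appears after the softmax at the output.

```python
import numpy as np

# Toy sizes, purely for illustration.
vocab_size, embed_dim, window = 10, 4, 2

rng = np.random.default_rng(0)
W_in = rng.normal(size=(vocab_size, embed_dim))           # input word -> hidden (embedding lookup)
W_out = rng.normal(size=(window, embed_dim, vocab_size))  # one hidden->output matrix per context slot

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict_context(center_word_id):
    h = W_in[center_word_id]  # projection/hidden layer is just the embedding row
    # Each context position applies its own hidden->output matrix,
    # so the resulting distributions can differ per slot.
    return [softmax(h @ W_out[c]) for c in range(window)]

for c, p in enumerate(predict_context(3)):
    print(f"context slot {c}: most likely word id = {p.argmax()}")
```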
Whenever you display a single slide with lots of diagrams, I get lost and can't follow what you are talking about. It would be better to break it down into a sequence of slides.
Incredibly clear explanation. Thank you!
This was outstanding!
Hi, have you checked your snippet? Does it work?
great content
epicc
Lions don't have stripes. Tigers have stripes
Can you share your PPT?
Lions don't have stripes
Go...