Word Embeddings - EXPLAINED!

  • Published on Sep 8, 2024

Comments • 22

  • @user-in4ij8iq4c
    @user-in4ij8iq4c 1 year ago +1

    Best explanation of embeddings so far among the videos I've watched on YouTube. Thanks, and subscribed.

  • @Jonathan-rm6kt
    @Jonathan-rm6kt 10 months ago +2

    Thank you! This is the perfect level of summary I was looking for. I'm trying to figure out a certain use case; maybe someone reading this can point me in the right direction.
    How can one create embeddings that retain an imposed vector/parameter representing a word chunk's semantic location in a document? I.e., a phrase occurring in chapter 2 is meaningfully different from the same phrase in chapter 4. This seems to be achieved by parsing the document by hand and inserting metadata, but it feels like there should be a more automatic way of doing this (a rough sketch of the manual approach follows below).
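
    One hedged reading of "inserting metadata": tag each chunk with its location before (or alongside) embedding it. The embed() function and the chapter labels below are placeholders, not anything from the video; this is just a sketch of the idea.

    ```python
    # Minimal sketch: attach document location to each chunk before embedding.
    # embed() is a stand-in for whatever embedding model is actually used.
    def embed(text: str) -> list[float]:
        raise NotImplementedError  # plug in a real embedding model here

    def embed_with_location(chapters: dict[str, list[str]]) -> list[dict]:
        records = []
        for chapter, chunks in chapters.items():
            for chunk in chunks:
                # Option A: prepend the location so it influences the vector itself.
                vector = embed(f"[{chapter}] {chunk}")
                # Option B: keep the location only as filterable metadata.
                records.append({"chapter": chapter, "text": chunk, "vector": vector})
        return records
    ```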

  • @sgrimm7346
    @sgrimm7346 several months ago

    Good video... it explains things from a very high level, very well. But I'm trying to figure out why/how a single word would result in a large vector. Are the meanings of the word encoded into the vector? As an example, would 'cat' have 'fur, claws, mammal, kitten, animal...' etc., and result in a vector of, say, 100 elements? Even if the vector is generated by the computer, which obviously it is, each element in the vector has to represent something. I can't seem to get past this point. I understand what word2vec does, I just don't know why it does it. Any help? Thanks. (See the small sketch below.)
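
    For what it's worth, a hedged illustration of that point with gensim's word2vec (not from the video; the toy corpus and numbers are purely for demonstration): the 100 dimensions are learned from co-occurrence during training, and no individual element corresponds to a nameable feature like 'fur' or 'claws'.

    ```python
    # Minimal sketch: word2vec learns a dense vector per word; the dimensions
    # are free parameters fit to co-occurrence statistics, not hand-picked features.
    from gensim.models import Word2Vec

    toy_corpus = [
        ["the", "cat", "has", "fur", "and", "claws"],
        ["a", "kitten", "is", "a", "young", "cat"],
        ["the", "dog", "is", "a", "mammal"],
    ]

    model = Word2Vec(toy_corpus, vector_size=100, window=2, min_count=1, epochs=50)

    vec = model.wv["cat"]                         # a 100-dimensional numpy array
    print(vec.shape)                              # (100,)
    print(model.wv.most_similar("cat", topn=3))   # similarity emerges from training
    ```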

  • @_seeker423
    @_seeker423 6 months ago

    Can you explain how, after training CBOW / Skip-gram models, you generate embeddings at inference time?
    With Skip-gram it is somewhat intuitive that you would one-hot encode the word and extract the output of the embedding layer. I'm not sure how it works with CBOW, where the input is a set of context words.

    • @_seeker423
      @_seeker423 4 months ago

      I think I saw in some other video that while the problem formulation is different in CBOW vs. Skip-gram, ultimately the training setup is reduced to pairs of words (see the sketch below).
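
      A hedged sketch of the inference side (the matrix and vocabulary here are stand-ins, not the video's code): in both CBOW and Skip-gram, the learned input-embedding matrix is what you keep, and looking up a word's embedding is just selecting its row, which is equivalent to multiplying a one-hot vector by that matrix.

      ```python
      # Minimal sketch: after training, embedding lookup is a row index into W_in.
      import numpy as np

      vocab = {"king": 0, "queen": 1, "man": 2, "woman": 3}
      vocab_size, embedding_dim = len(vocab), 8

      # Stand-in for the input-embedding matrix learned by CBOW or Skip-gram.
      W_in = np.random.default_rng(0).normal(size=(vocab_size, embedding_dim))

      def embed(word: str) -> np.ndarray:
          return W_in[vocab[word]]       # same as one_hot(word) @ W_in

      print(embed("queen").shape)        # (8,)
      ```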

  • @markomilenkovic2714
    @markomilenkovic2714 1 year ago +2

    I still don't understand how to convert words into numbers

    • @bofloa
      @bofloa 1 year ago +1

      You first have to turn your text into a corpus, i.e. words separated by spaces, grouped into sentences. Then decide on the vector size, which is a hyperparameter, and generate a random vector of that size for each word. All of this is stored in a 2-dimensional array or a dictionary where the word is the key used to access its vector. You also have to account for co-occurrence, or rather word frequencies in the corpus, so you know how many times each word occurs. Once that is done, you decide whether to use CBOW or Skip-gram; the purpose of these two methods is to create training data. In CBOW you generate the context words as input and the target word as output; Skip-gram is the opposite, the target word is the input and the context words are the output. Then you train the model in a supervised/unsupervised way... (see the rough sketch below).
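
      A rough Python sketch of the pipeline described above (toy corpus; all names, the vector size, and the window are illustrative choices, not the video's code):

      ```python
      # Minimal sketch: build a vocabulary of random vectors, then generate
      # CBOW or Skip-gram training pairs from a sliding context window.
      import random

      corpus = "the quick brown fox jumps over the lazy dog".split()
      vector_size, window = 50, 2          # hyperparameters you choose

      # word -> randomly initialised vector, to be refined during training
      vocab = {w: [random.gauss(0.0, 0.1) for _ in range(vector_size)]
               for w in set(corpus)}

      def training_pairs(tokens, window, mode="skipgram"):
          """Yield (input, output) training examples for CBOW or Skip-gram."""
          for i, target in enumerate(tokens):
              context = [tokens[j]
                         for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                         if j != i]
              if mode == "cbow":
                  yield context, target    # context words -> target word
              else:
                  for c in context:
                      yield target, c      # target word -> each context word

      print(list(training_pairs(corpus, window, mode="skipgram"))[:4])
      ```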

  • @larrybird3729
    @larrybird3729 1 year ago +2

    Great video, but I'm still a bit confused about what is currently being used for embeddings. Are you saying BERT is the next word2vec for embedding? Is that what ChatGPT-4 uses? Sorry if I didn't understand!

  • @lorenzowottrich467
    @lorenzowottrich467 1 year ago +1

    Excellent video, you're a great teacher.

    • @CodeEmporium
      @CodeEmporium 1 year ago +1

      Thanks a lot for the kind words :)

  • @RobertOSullivan
    @RobertOSullivan 1 year ago

    This was so helpful. Subscribed

    • @CodeEmporium
      @CodeEmporium 1 year ago

      Thank you so much! And super glad this was helpful

  • @MannyBernabe
    @MannyBernabe 6 months ago

    really good. thx.

  • @edwinmathenge2178
    @edwinmathenge2178 1 year ago

    That's some great gem right here...

    • @CodeEmporium
      @CodeEmporium 1 year ago

      Thanks so much for watching :)

  • @thekarthikbharadwaj
    @thekarthikbharadwaj 1 year ago

    As always, well explained 😊

  • @creativeuser9086
    @creativeuser9086 1 year ago +1

    It's a little confusing because in many examples a full chunk of text is converted into one embedding vector instead of multiple embedding vectors (one for each token of that chunk). Can you explain that?

    • @CodeEmporium
      @CodeEmporium 1 year ago +1

      Yeah. There are versions that produce sentence embeddings as well. For example, Sentence Transformers use BERT at their core to aggregate word vectors into sentence vectors that preserve meaning.
      Not all of these sentence-to-vector frameworks work the same way. For example, a TF-IDF vector is constructed from word co-occurrence across different documents; unlike sentence transformers, though, it is not a continuous dense vector representation. Both are worth checking out (a minimal example follows below).
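
      For reference, a hedged sketch of the sentence-embedding route mentioned above, assuming the sentence-transformers package and the pretrained "all-MiniLM-L6-v2" checkpoint (chosen here for illustration, not something specified in the video):

      ```python
      # Minimal sketch: one dense vector per sentence via Sentence Transformers.
      from sentence_transformers import SentenceTransformer

      model = SentenceTransformer("all-MiniLM-L6-v2")
      sentences = ["Word embeddings map tokens to vectors.",
                   "Sentence embeddings map whole chunks of text to one vector."]

      embeddings = model.encode(sentences)   # numpy array, shape (2, 384) for this model
      print(embeddings.shape)
      ```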

  • @creativeuser9086
    @creativeuser9086 1 year ago

    Are embedding models part of the base LLMs, or are they a completely different model with different weights? And what does the training of embedding models look like?

    • @CodeEmporium
      @CodeEmporium 1 year ago +1

      LLMs = large language models, i.e. models trained to perform language modeling (predict the next token given the context). Aside from BERT and GPT, most embedding models are not language models, since they don't solve for this objective.
      So while these models may learn some way to represent words as vectors, not all of them are language models.
      The training of each depends on the model. I have individual videos called "BERT explained" and "GPT explained" on the channel with details on these. For the other cases, like word2vec models, I'll hopefully make a video next week outlining the process more clearly.

  • @VishalKumar-su2yc
    @VishalKumar-su2yc 6 months ago

    hi