Thank you for watching this video! This was a part of my preparation for AWS Machine Learning Specialty exam.
If you liked this video, check one more related here:
- NLP with Tensorflow and Keras. Tokenizer, Sequences and Padding (th-cam.com/video/qw7rkwsk0oc/w-d-xo.html)
Your IDF was wrong. If IDF = log(number of docs containing term / total number of docs), the result will always be less than or equal to 0. IDF must use "total number of docs / number of docs containing term".
He probably forgot the inverse part.
idf=total number of docs/number of docs containing term
Great video! There's an error, though: IDF = total number of docs / number of docs containing term.
Short, precise, and easy-to-understand tutorial. Thanks!
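To make the point in this thread concrete, here is a minimal sketch of the unsmoothed IDF with the total number of documents in the numerator. The three-document corpus and the `idf` helper are hypothetical, made up for illustration; they are not the example from the video.

```python
import math

def idf(term, docs):
    """Unsmoothed IDF: log(total docs / docs containing the term)."""
    n_docs = len(docs)
    doc_freq = sum(1 for doc in docs if term in doc)
    if doc_freq == 0:
        raise ValueError(f"{term!r} does not occur in any document")
    return math.log(n_docs / doc_freq)

# Hypothetical tokenized corpus (an assumption, not the video's data).
docs = [
    ["the", "quick", "fox"],
    ["the", "lazy", "dog"],
    ["a", "sleepy", "cat"],
]

idf_fox = idf("fox", docs)      # log(3 / 1): rare term, positive weight
idf_the = idf("the", docs)      # log(3 / 2): more common term, smaller weight
swapped_fox = math.log(1 / 3)   # ratio flipped: negative, as the comments note
```

With the ratio the right way round, rarer terms get larger weights; with it flipped, every value is zero or negative.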
Who came here via Guruja? We're going to win, SEFAZ here, this is where we pass! Let's go!
Amen!
There's an error at 4:29 when you describe the IDF calculation. The numerator is the 'total number of documents in the corpus', not the denominator. I guess picking an example where the word frequency and the number of documents are not the same number (here, 2) would have helped. Thanks!
I think you got the IDF part wrong; the denominator and numerator should be the other way around.
Great video! Thank you, man, for the efficient explanation. I'm from Türkiye. I like your videos.
Thanks for watching! Appreciate your feedback! :)
People are saying the IDF calculation was wrong? If IDF = N / |{d ∈ D : t ∈ d}|, that is, N documents divided by the number of documents that contain the term, then this will obviously give us 2/2. What is wrong here? Some people propose 2/5, but then, why 5? The term "fox" appears 5 times across all documents, that is true, but the total number of documents which contain the term "fox" is still 2.
it is wrong
wow, clearly the best explanation
Thanks a lot! :)
10q
In this example, the TF-IDF score doesn't reflect that the word "fox" appears more times in d2.
And therefore it loses that information that could help to distinguish d1 and d2
term frequency does that
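A small sketch of both points made here: term frequency does separate the two documents, but the product with an unsmoothed IDF of log(2/2) = 0 does not. The token lists `d1` and `d2` are hypothetical stand-ins (fox twice in d1, three times in d2), not the exact sentences from the video.

```python
import math
from collections import Counter

# Hypothetical documents: "fox" occurs 2 times in d1 and 3 times in d2.
d1 = "the fox saw the other fox".split()
d2 = "fox fox fox ran".split()
docs = [d1, d2]

def tf(term, doc):
    """Relative term frequency: occurrences of term / tokens in doc."""
    return Counter(doc)[term] / len(doc)

# Unsmoothed IDF: both documents contain "fox", so log(2 / 2) = 0.
doc_freq = sum(1 for doc in docs if "fox" in doc)
idf_fox = math.log(len(docs) / doc_freq)

tf_d1, tf_d2 = tf("fox", d1), tf("fox", d2)          # 2/6 and 3/4
tfidf_d1, tfidf_d2 = tf_d1 * idf_fox, tf_d2 * idf_fox  # both 0.0
```

The difference between d1 and d2 is visible in `tf_d1` and `tf_d2` but is multiplied away in the TF-IDF scores. Smoothed IDF variants avoid the exact zero, though that goes beyond what the video covers.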
Thank you for your effort for this content!
You forgot to remove stop words and perform lemmatization and stemming before calculating the term frequency, so invariably the entire problem becomes wrong.
"The big D"
I think there is an error in the logarithm part of the IDF calculation. We have a total of 5 occurrences of "fox" in the corpus, so I think it should be log(5/2).
I think it should be log(2/5)
No
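The disagreement over 5 versus 2 comes down to two different counts. As a sketch, assuming the standard definition where IDF uses document counts only, and using hypothetical token lists in which "fox" occurs 5 times across 2 documents:

```python
import math

# Hypothetical corpus: "fox" occurs 2 + 3 = 5 times across 2 documents.
docs = [
    "the fox saw the other fox".split(),
    "fox fox fox ran".split(),
]

# Document frequency: how many documents contain the term at least once.
doc_freq = sum(1 for doc in docs if "fox" in doc)           # 2

# Total occurrences: how many times the term appears in the whole corpus.
total_occurrences = sum(doc.count("fox") for doc in docs)   # 5

# IDF uses the document frequency, so the ratio is 2 / 2, not 5 / 2 or 2 / 5.
idf_fox = math.log(len(docs) / doc_freq)                    # log(1) = 0.0
```

The 5 only enters through term frequency, which is computed per document.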
Which software are you using for explaining?
For this tutorial: simple PowerPoint and Camtasia
Does tf-idf still work for optimizing content for better ranking?
Great video! Can you share your slides, if possible?
Sadly I don't have slides of that, just this video... :/
Pause the video, take a screenshot. Paste in the Powerpoint. Voila!
Fantastic Explanation !!!
Thank you for feedback! :)
sarunas pao religion
great content! thank u!
I think the IDF calculation is wrongly explained. It's just the opposite of what he said for the denominator and numerator.
Very Helpful thanks
Thank you
Thanks for watching this! :)
Extremely well explained!
Really appreciate your feedback, thank you for watching! :)
@@DataScienceGarage clear explanation, but it's wrong, dude
Great video thanks!
Thanks for watching! Hoping it was useful. :)
nice! easy explanation :)
Thanks for watching! :)
Fix your video. In the IDF calculation you swapped the numerator and denominator.
Excellent
Thanks for watching!
Just be aware that 2 / 2 = 1 ! Not 0 like you hear in the video.
Hi! I have no idea where you saw 2/2=0 in this video... There was log(2/2)=0, which is true.
@@DataScienceGarage Check 4:54
@@YouPI227 ...but while I said "two divided by two equal to zero", I was pointing at log(2/2) = 0. log(1) = 0.
The big D
Love from India
Thanks for watching this!
great
Your IDF calculation is wrong.