TFIDF : Data Science Concepts

Published on 25 Jan 2025

Comments • 113

  • @vinson2233
    @vinson2233 3 years ago +5

    I'm really glad I chose this video instead of wasting my time watching a 30-minute explanation of tf-idf. Great job explaining this.

  • @pohkeamtan9876
    @pohkeamtan9876 3 years ago +10

    This is really good. Concise, straight to the point, and there is no need to show a line of code!

  • @akshikaakalanka
    @akshikaakalanka 2 months ago

    always the best place to look for a concept explained. Always grateful.

  • @gunbac74
    @gunbac74 4 years ago +7

    I read this explanation in a book, but not as clear as this video. Well done!

  • @rt58528
    @rt58528 4 years ago +3

    Being a math lover, within a minute of your explanation I became your fan, was always in a search of videos like this

    • @mango-strawberry
      @mango-strawberry 9 months ago

      true. his channel hasn't been picked up by YouTube yet.

  • @hemantsah8567
    @hemantsah8567 4 years ago +2

    Your videos before sleep... Keep nightmares away...

  • @stanlukash33
    @stanlukash33 3 years ago +2

    I started googling tf-idf and then I was like "Hey, maybe that guy has a video on it", and you do! Thanks!

    • @ritvikmath
      @ritvikmath 3 years ago +3

      😂 "that guy" says you're welcome

    • @stanlukash33
      @stanlukash33 3 years ago

      @ritvikmath haha sorry, Ritvik!

  • @summerxia7474
    @summerxia7474 2 years ago +1

    Such a clear explanation!!! Much better than my teacher in the class. Why can't they just make it this simple? Thank you so much.

  • @Scar_
    @Scar_ 1 year ago +2

    Thank you for this! You saved me much time! Your explanation is legit!

  • @mosca-tse-tse
    @mosca-tse-tse 4 years ago +3

    Excellent teaching! Perfectly designed, clearly explained and not even one sentence that would be redundant. I’m your fan my friend 👍🏼🙏🏼

    • @ritvikmath
      @ritvikmath 4 years ago

      Wow, thank you!

    • @leod1740
      @leod1740 2 years ago

      @ritvikmath Yes excellent explanation

  • @luuz_study_yt7123
    @luuz_study_yt7123 2 years ago

    Thank you for the video, we are working at a Movie recommender System and this helps a lot for NLP.

  • @imdadood5705
    @imdadood5705 3 years ago +1

    I wish to have your coherence when explaining. Awesome explanation as always.

  • @elsywehbe2897
    @elsywehbe2897 11 months ago +1

    Your examples are excellent! Thank you!

    • @ritvikmath
      @ritvikmath 11 months ago +1

      You're very welcome!

  • @BhuvaneshSrivastava
    @BhuvaneshSrivastava 4 years ago +1

    I like your videos first and then start watching your Data Science videos because I am sure that after I am done watching it, I will like it anyway.
    Keep it up.. 🙏

  • @martand_05
    @martand_05 4 years ago +3

    What a classy explanation. So good man!

    • @ritvikmath
      @ritvikmath 4 years ago +1

      Much appreciated!

  • @maefiosii
    @maefiosii 3 years ago +1

    that explanation was so smooth and clear.. great job

  • @hameddadgour
    @hameddadgour 2 years ago

    Great presentation!

  • @michaelhaag3367
    @michaelhaag3367 2 years ago

    Lucid explanation, my man back at it again!

  • @balaganesh3440
    @balaganesh3440 1 year ago

    Outstanding explanation!

  • @alexfeng75
    @alexfeng75 11 months ago

    great video with depth and simplicity at the same time!

  • @mila4real1
    @mila4real1 1 year ago

    Cool! Loved your simple but extremely efficient explanation

  • @hariharvyas4741
    @hariharvyas4741 2 months ago +1

    What if the word we are checking does not appear in any of the documents? Then the denominator would be 0, which is not possible.
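
A minimal sketch of this edge case, assuming the IDF discussed in this thread is log(D / number of documents containing the term) with a natural log (the base is an assumption). A term that appears in no document makes the denominator zero. One common workaround, and the form scikit-learn documents for `smooth_idf=True`, adds one to both counts and one to the result. The toy corpus and function names below are hypothetical.

```python
import math

def plain_idf(term, corpus):
    """log(D / df); raises ZeroDivisionError when no document contains `term`."""
    df = sum(1 for tokens in corpus if term in tokens)
    return math.log(len(corpus) / df)

def smoothed_idf(term, corpus):
    """Smoothed variant: log((1 + D) / (1 + df)) + 1, defined even when df == 0."""
    df = sum(1 for tokens in corpus if term in tokens)
    return math.log((1 + len(corpus)) / (1 + df)) + 1

# Hypothetical toy corpus of tokenized documents.
corpus = [
    "we will reform healthcare".split(),
    "we will cut taxes".split(),
    "we will build roads".split(),
]

try:
    plain_idf("zebra", corpus)
except ZeroDivisionError:
    print("plain IDF is undefined for a term that appears in no document")

# The smoothed variant stays finite for the unseen term.
print(smoothed_idf("zebra", corpus))
```

In practice the question rarely arises when the vocabulary is built from the corpus itself, since every vocabulary term then appears in at least one document.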

  • @ericzhang5987
    @ericzhang5987 3 years ago

    Excellent explanation !

  • @David-nw6rz
    @David-nw6rz 3 years ago

    When using the whiteboard, your videos are even better than with pen and paper! Thanks for your videos!

  • @vaibhavmourya65
    @vaibhavmourya65 1 year ago

    Great explanation buddy🙌🏻

  • @cesarreinoso2203
    @cesarreinoso2203 3 years ago

    Awesomeeee Simple and Clear

  • @MrKqsami
    @MrKqsami 2 years ago

    Great Job sir!

  • @butterfly34457
    @butterfly34457 9 months ago

    So simple and concise! Thank you so much!

  • @srijitbhattacharya6770
    @srijitbhattacharya6770 2 years ago

    Excellent, simply brilliant

  • @nehimomo
    @nehimomo 2 years ago

    very great explanation, much better than my lecturer

  • @adrianramirez9729
    @adrianramirez9729 2 years ago

    Amazing explanation!

  • @robertodigiacomo3910
    @robertodigiacomo3910 2 years ago

    Great explanation

  • @devidurga392
    @devidurga392 1 year ago

    clear cut explanation. Thank you

  • @hannahb.9454
    @hannahb.9454 8 months ago

    This came in clutch, thanks

  • @eramy1
    @eramy1 4 years ago +1

    Good explanation in a simple way... keep doing well man

  • @pallavijog912
    @pallavijog912 2 years ago

    Nice explanation. Thanks!

  • @mohamedshatarah7264
    @mohamedshatarah7264 2 months ago

    Thank you so much for explaining this clearly sir

  • @_instanze_
    @_instanze_ 1 year ago

    You're an excellent explainer, man. And I don't mean that lightly (I rarely compliment people wallah).
    You got a knack. Truly!
    Subscribed!!

    • @ritvikmath
      @ritvikmath 1 year ago

      I appreciate that!

  • @sebastiancamilopuertogalin4478
    @sebastiancamilopuertogalin4478 2 years ago

    For any given word/term, we want to know how important that term is for a given document, relative to the entire corpus of documents. E.g. for Clinton, this subset of words is really important in his inauguration speech, relative to the other inauguration speeches. TF-IDF is simply the product of two metrics: TF (term frequency) and IDF (inverse document frequency).
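
The product described in this comment can be sketched as follows. This assumes TF is the term's share of the document's tokens and IDF is the natural log of D over the number of documents containing the term. That reading is consistent with the zero-IDF cases other commenters raise, but the exact definitions and the log base are assumptions. The toy corpus is hypothetical and is not the inauguration speeches.

```python
import math
from collections import Counter

def tf(term, doc_tokens):
    """Term frequency: share of the document's tokens that equal `term`."""
    return Counter(doc_tokens)[term] / len(doc_tokens)

def idf(term, corpus):
    """Inverse document frequency: log(D / number of documents containing `term`)."""
    n_containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / n_containing)

def tf_idf(term, doc_tokens, corpus):
    """TF-IDF is the product of the two metrics."""
    return tf(term, doc_tokens) * idf(term, corpus)

# Hypothetical toy corpus of tokenized documents.
corpus = [
    "we will reform healthcare and healthcare costs".split(),
    "we will cut taxes".split(),
    "we will build roads".split(),
]

# "healthcare" is frequent in document 0 and absent elsewhere, so it scores high.
print(tf_idf("healthcare", corpus[0], corpus))
# "we" appears in every document, so its IDF and its TF-IDF are zero.
print(tf_idf("we", corpus[0], corpus))
```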

  • @fustigate8933
    @fustigate8933 2 years ago

    Nice explanation!

  • @amadios9874
    @amadios9874 2 years ago

    That was crystal clear, thanks

  • @dalvirsingh4070
    @dalvirsingh4070 3 years ago

    Explanation was awesome!

  • @Justrelaxx101
    @Justrelaxx101 2 years ago

    Perfectly explained

  • @AKapich
    @AKapich 1 year ago

    Very succinct explanation, thank you very much

    • @ritvikmath
      @ritvikmath 1 year ago +1

      You are welcome!

  • @22malman
    @22malman 3 years ago

    Superb!!

  • @emna143
    @emna143 1 year ago +1

    Hi, first of all, thanks for the great explanation. I have watched your videos about Word2Vec and TF-IDF, and I need help, please. I'm a student working on a project about binary classification of SQL injection attacks. The dataset I have contains two columns: 'sentence' and 'label.' I need to extract features, but I'm confused about which technique to use: Word2Vec or TF-IDF. Can you help me decide?

  • @chaitu2037
    @chaitu2037 3 years ago

    Very well explained

  • @warislthong3149
    @warislthong3149 1 year ago

    Excellent !

  • @cleansquirrel2084
    @cleansquirrel2084 4 years ago +1

    Awesome video!!

  • @ernestanonde3218
    @ernestanonde3218 2 years ago

    Powerful... Thank you

  • @mitadrubanerjeechowdhury9092
    @mitadrubanerjeechowdhury9092 3 years ago

    Amazing stuff, thanks man for letting me pass the exam.

  • @bananalord8575
    @bananalord8575 7 months ago

    Sweet and simple!

  • @RedditFam
    @RedditFam 2 years ago

    Very useful! Thank you Sir!

  • @VishalKhopkar1296
    @VishalKhopkar1296 2 years ago +1

    What if the word 'healthcare' did occur in all 3 speeches, but occurred 26 times in the Obama speech and only once each in Clinton's and Bush's speeches? Using this mechanism, the IDF of healthcare would still be 0, but since the word is used so many more times in the Obama speech, it is definitely important there.

    • @redpz
      @redpz 1 year ago

      in a more realistic situation the number of documents D would be much larger, so cases like this would be extremely rare

    • @TheTranscending
      @TheTranscending 1 year ago

      Good point
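
A rough sketch of the scenario in this thread, assuming TF is the term's share of a document's tokens and IDF is log(D / number of documents containing the term). The token lists are hypothetical stand-ins that only reproduce the 26 / 1 / 1 counts, and the smoothed IDF shown is one common alternative, log((1 + D) / (1 + df)) + 1, not something proposed in the video.

```python
import math

# Hypothetical documents: 100 tokens each, with "healthcare" appearing 26, 1 and 1 times.
docs = {
    "obama": ["healthcare"] * 26 + ["filler"] * 74,
    "clinton": ["healthcare"] * 1 + ["filler"] * 99,
    "bush": ["healthcare"] * 1 + ["filler"] * 99,
}
corpus = list(docs.values())

def tf(term, tokens):
    """Share of the document's tokens that equal `term`."""
    return tokens.count(term) / len(tokens)

def plain_idf(term, corpus):
    """log(D / df): zero when every document contains the term."""
    df = sum(1 for tokens in corpus if term in tokens)
    return math.log(len(corpus) / df)

def smoothed_idf(term, corpus):
    """log((1 + D) / (1 + df)) + 1: never drops below 1."""
    df = sum(1 for tokens in corpus if term in tokens)
    return math.log((1 + len(corpus)) / (1 + df)) + 1

plain = {name: tf("healthcare", t) * plain_idf("healthcare", corpus)
         for name, t in docs.items()}
smoothed = {name: tf("healthcare", t) * smoothed_idf("healthcare", corpus)
            for name, t in docs.items()}

print(plain)     # every score is zero because the IDF is zero
print(smoothed)  # the 26-occurrence document now scores highest
```

With the plain formula the three documents are indistinguishable on this term; with the smoothed variant the ranking follows term frequency. As the reply above notes, a larger corpus makes the all-documents case much less likely for a topical word.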

  • @athena9357
    @athena9357 1 year ago

    You saved me! My professor explained this in 3 hours, I watched it 2 times and I don't get it. This guy explained the same concept in 7 minutes and I get it!

  • @mupetman1214
    @mupetman1214 2 years ago

    Would you advise taking out stop words and running tf-idf on the new set of documents?

  • @ralphhennen5769
    @ralphhennen5769 4 years ago +1

    How do you model multiple objects associated to a term class: Dental Care: United Health Care, Blue Shield, ..., by state? This becomes contextual and local within the text - how close is the word dental care in the text to UHC, for instance. The result would show which states address dental care in their health insurance regulations and which insurance companies make it available - both in a positive and negative way. Understand that this is a narrow example. Thanks

  • @Shaan11s
    @Shaan11s 11 months ago

    YES! I get it now, much love bro

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago

    Clear and concise.

  • @k_anu7
    @k_anu7 3 years ago

    If anyone dislikes this explanation god will have to come down to explain him/her.

  • @MultiRockxD
    @MultiRockxD 2 years ago

    Is this a good tool for building a list of the top "important" words in a dataset, or does it just help show relevance within a particular document? I want to use it so I can maybe sum the tf-idf scores across all the documents and build a top-words list, but I don't know if this is the best approach for what I want. Thank you in advance

  • @superbatman1462
    @superbatman1462 4 years ago +1

    Nice Explanation

  • @mariapazherrera4306
    @mariapazherrera4306 3 years ago

    Amazing!!!!!

  • @negusuworku1871
    @negusuworku1871 11 months ago

    It is really nice. Keep it up

    • @ritvikmath
      @ritvikmath 11 months ago

      Thanks a lot 😊

  • @pushkarparanjpe
    @pushkarparanjpe 3 years ago +1

    This is a great explanation. Thanks.
    I have a question about differences between the implementation described in this video and another implementation commonly found on the web.
    Can you explain how these two details would impact the final representation:
    1) Term frequency simply calculated as term count
    2) Applying vector normalisation (L2) to the document vector obtained in this video
    Another question which is more open-ended: why is TfIdf still relevant ? Or less provocatively - is there a sweet spot where one would prefer TfIdf over the modern dense vector representations (such as word2vec, doc2vec, etc.) ?
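
The two implementation details in this question can be sketched as follows. The sketch assumes fixed, hypothetical IDF weights so that only the TF choice and the normalisation step vary, and all names and numbers are illustrative. With raw counts, a longer document gets a proportionally larger vector; L2 normalisation rescales each document vector to unit length, which removes that length effect and makes a dot product between two vectors equal to their cosine similarity.

```python
import math
from collections import Counter

def raw_count_tfidf(tokens, idf_weights):
    """TF as a raw term count, multiplied by a per-term IDF weight."""
    counts = Counter(tokens)
    return [counts[term] * weight for term, weight in idf_weights.items()]

def l2_normalize(vec):
    """Scale the vector to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

# Hypothetical IDF weights for a three-term vocabulary.
idf_weights = {"healthcare": 1.1, "taxes": 0.4, "roads": 0.4}

short_doc = "healthcare taxes healthcare".split()
long_doc = short_doc * 10  # same content repeated ten times

raw_short = raw_count_tfidf(short_doc, idf_weights)
raw_long = raw_count_tfidf(long_doc, idf_weights)

# Raw-count vectors differ by a factor of ten.
print(raw_short, raw_long)
# After L2 normalisation the two documents are represented identically.
print(l2_normalize(raw_short), l2_normalize(raw_long))
```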

  • @almonddonut1818
    @almonddonut1818 2 years ago

    Thank you so much!!! 🤩

  • @0xjrr
    @0xjrr 4 years ago +4

    love that in this alternative timeline the last speech is from Obama

    • @skeletonrowdie1768
      @skeletonrowdie1768 4 years ago +2

      a certain president would really bias the vocabulary data

  • @Begooder
    @Begooder 3 years ago

    many thanks

  • @mango-strawberry
    @mango-strawberry 9 months ago

    damn.. that was a solid explanation

  • @zephyrsurfteam
    @zephyrsurfteam 4 years ago +1

    Great video! Thanks! I would love to see more content on TFIDF.

  • @swagatggautam6630
    @swagatggautam6630 1 year ago

    I wonder why my teachers couldn't explain so simply.

  • @MYanton1994
    @MYanton1994 2 years ago

    thank you very much

  • @sia-watsonlee
    @sia-watsonlee 2 years ago

    amazing

  • @ai-force3792
    @ai-force3792 1 year ago

    very Good

  • @vasundharasingh8216
    @vasundharasingh8216 10 months ago

    In cases where all 3 documents contain the word, even if 2 of them contain it only once and the 3rd contains it 100 times, tf-idf would be 0 because idf would be 0. Isn't this misleading then?

  • @Soutehkeshan
    @Soutehkeshan 2 years ago

    Useful :)

    • @ritvikmath
      @ritvikmath 2 years ago

      Glad you think so!

  • @sepideh1111
    @sepideh1111 1 year ago

    Thanks, great teacher. If I could, I would have given you 3 thumbs up

  • @xxxxxx-wq2rd
    @xxxxxx-wq2rd 4 years ago

    but if healthcare appears 100 times in one document, and only once in each of the other 2 documents, then the result will be zero!

    • @nicholasdavis9529
      @nicholasdavis9529 3 years ago

      This was my question. If you found out let me know.

    • @nicholasdavis9529
      @nicholasdavis9529 3 years ago

      Great video btw, best explanation.

  • @durasaksham
    @durasaksham 4 months ago

    It was that easy

  • @aleynapolat1545
    @aleynapolat1545 2 years ago

    Bro, you are a good narrator but a bad organizer. Next time it would be better to write on the board in a more orderly way, to make it easier to follow what you are saying

  • @australianperson2582
    @australianperson2582 4 years ago

    Thanks for politicising education with that exclusion with that example, unsubbed - so partisan.

    • @ritvikmath
      @ritvikmath 4 years ago +4

      Sorry to see you go, it was not my intention to politicize but rather just to use this as an example.

  • @abdulbasit0123
    @abdulbasit0123 8 months ago

    That was a great explanation, Thanks 🤍

  • @jb_kc__
    @jb_kc__ 10 months ago

    your explanations are great bro cut to the heart of the issue + ensure conceptual understanding 🫡🫡

    • @ritvikmath
      @ritvikmath 10 months ago

      Thank you so much 😀