Tutorial 49- How To Apply Naive Bayes' Classifier On Text Data (NLP)- Machine Learning

  • Published on 15 Sep 2024

Comments • 148

  • @hemantdas9546
    @hemantdas9546 4 years ago +20

    Sir at 9:02 it should be 2/2 and not 2/4. Because we have to consider only yes cases. Please clarify if I am wrong.

    • @sonusingh-hj2dw
      @sonusingh-hj2dw 4 years ago +1

      yes
      you are right
      i thought the same

    • @pavanidubey3399
      @pavanidubey3399 4 years ago +1

      yes even I think the same

    • @siddharthdedhia11
      @siddharthdedhia11 4 years ago +1

      It's correct. 2 times yes when food is present / total # of times food is present

    • @tanvishinde805
      @tanvishinde805 3 years ago

      Yes, I think the same, but in that case P(Bad | yes) = 0, and so the whole term P(yes | sentence1) = 0, so this should not happen.

    • @sarsizchauhan8948
      @sarsizchauhan8948 3 years ago

      @@tanvishinde805 He's considering sent1 as an example and not that BOW matrix. In sent1 three words are present, so only those features would be considered to calculate the probability.
      If a sentence has both the words delicious and bad, then it won't give a definitive output, which he should have explained.
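
The two readings argued over in this thread can be checked by counting. The sketch below uses a hypothetical 5-row bag-of-words table; the rows are an assumption, chosen only to match the counts quoted in the comments (2 "yes" rows, "food" present in 4 rows, 2 of them "yes"), and are not the table from the video. The function names are illustrative.

```python
from fractions import Fraction

# Hypothetical rows, NOT the video's table: (the, food, delicious, bad), label.
ROWS = [
    ((1, 1, 1, 0), "yes"),
    ((0, 1, 1, 0), "yes"),
    ((1, 1, 0, 1), "no"),
    ((1, 1, 0, 1), "no"),
    ((0, 0, 0, 1), "no"),
]
FOOD = 1  # column index of the word "food"


def p_word_given_class(rows, col, label):
    """P(word=1 | y=label): count only the rows whose label matches."""
    in_class = [x for x, y in rows if y == label]
    return Fraction(sum(x[col] for x in in_class), len(in_class))


def p_class_given_word(rows, col, label):
    """P(y=label | word=1): count only the rows that contain the word."""
    labels = [y for x, y in rows if x[col] == 1]
    return Fraction(labels.count(label), len(labels))


likelihood = p_word_given_class(ROWS, FOOD, "yes")  # the 2/2 reading
reverse = p_class_given_word(ROWS, FOOD, "yes")     # the 2/4 reading
print(likelihood, reverse)
```

Under these assumed rows, the first quantity is 2/2 and the second is 2/4. Naive Bayes multiplies likelihoods of the first kind, P(word | class), so "2 times yes when food is present / total times food is present" describes P(yes | food) rather than P(food | yes).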

  • @premtejak
    @premtejak 4 years ago +8

    Nice explanation Krish. Can you explain:
    1. How do we handle a new word (taste) appearing in an existing sentence (The food is Delicious)?
    2. What happens when the dataset is imbalanced?
    You mentioned that you would cover these two problems in the next video. I did not find that video; please upload it, Krish. Thank you

    • @vinaychitturi5183
      @vinaychitturi5183 3 years ago

      I came across a concept called Laplace smoothing, which handles the case where the test data contains a new word that is not present in the training data set. This might help you.
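
As a rough illustration of the Laplace (additive) smoothing mentioned above, here is a minimal sketch. It assumes binary present/absent features, so `n_outcomes=2`; for multinomial word counts the denominator would instead use the vocabulary size. The function name and parameters are hypothetical, not taken from the video.

```python
from fractions import Fraction


def smoothed_likelihood(count, class_total, n_outcomes=2, alpha=1):
    """Additive estimate: (count + alpha) / (class_total + alpha * n_outcomes)."""
    return Fraction(count + alpha, class_total + alpha * n_outcomes)


# A word never seen in a class of 3 training rows no longer gets probability 0.
unseen = smoothed_likelihood(0, 3)
# A word seen in both rows of a 2-row class is pulled slightly below 1.
always = smoothed_likelihood(2, 2)
print(unseen, always)
```

With `alpha=1`, the unseen word gets (0 + 1) / (3 + 2) and the always-present word gets (2 + 1) / (2 + 2), so no single factor can force the whole product to zero.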

  • @Waurice
    @Waurice 3 years ago +7

    Your videos are really amazing. A quick note: P(1|yes) = 0.1 (10%) :)

  • @rahulsaha4728
    @rahulsaha4728 4 years ago +20

    The calculations of the conditional probabilities are wrong. P(x2|y=yes) = 2/2 not 2/4. There are 2 yes values, hence the denominator is 2. Out of 2 yes values, both have x2 and so the numerator is 2.

    • @aryantyagi2189
      @aryantyagi2189 1 year ago

      Thanks bro, you are right :)

    • @atharavmahajan8202
      @atharavmahajan8202 1 year ago

      @@aryantyagi2189 🧐

    • @santhiyaA-sz8db
      @santhiyaA-sz8db 5 months ago

      no. food =4
      food/yes ,, =2
      p(food/yes)=2/4.... i think its right

    • @abdelerahmanekhaldi6228
      @abdelerahmanekhaldi6228 4 months ago

      @@santhiyaA-sz8db no man, it's conditional probability: for P(A/B) you look at the portion of cases where A is true when B is true, not overall

  • @soowoonchung5607
    @soowoonchung5607 4 years ago +25

    Hi Sir, isn't p(x2 | y=yes) = 2/2= 1? not 2/4? (9:02)
    because p(food=1 | y=yes) = p(food=1, y=1)/ p(y=1) = ( 2/5 ) / (2/5) = 1

    • @uttamchoudhary5229
      @uttamchoudhary5229 4 years ago +2

      @shawn chung yes,i think so ,
      because p(x2) given that yes will be 2/2(we have two yes and both are having x2=1) not 2/4.
      please correct me if my assumption is wrong.

    • @ppsheth91
      @ppsheth91 4 years ago +2

      @@uttamchoudhary5229
      Actually it will be p(x2 | y=yes) = no. of words (x2) / total no. of words in Class Y = Yes
      no. of words (x2) in yes class= 2
      total no. of words in Class( Y = Yes) = 5
      So it should be = (2/5) = 0.4

    • @piyushtalreja6855
      @piyushtalreja6855 3 years ago

      @@ppsheth91 There are only 3 unique words where Class( y = yes ). It should be 2/3

  • @ashwinshetgaonkar6329
    @ashwinshetgaonkar6329 2 years ago +4

    8:24 p(the/yes) =number of times the==1 when o/p==1 also for p(food/yes)=2/2

    • @mini22q11
      @mini22q11 4 months ago

      you are right brother

  • @dipanwitamitra3029
    @dipanwitamitra3029 3 years ago +43

    Sir, I have a doubt. When we are calculating p(X2/y=yes), should it not be equal to p(X2 intersection y=yes) / p(yes) = 2/2?

    • @santhoshreddy9161
      @santhoshreddy9161 3 years ago +2

      yes , your explanation is correct

    • @dhruvgrover7416
      @dhruvgrover7416 3 years ago

      yes, you are correct

    • @mohdkashif7295
      @mohdkashif7295 3 years ago +4

      yeah you are correct, you can also see in this way that p(X2/y=yes) is basically -> number of times word X2 appear given output is 'yes'/ number of times 'yes' appear.

    • @ltoco4415
      @ltoco4415 2 years ago

      That's right. It should be 2/2 i.e. 1.

    • @garvitgupta13
      @garvitgupta13 2 years ago

      Yes you are correct.

  • @user-ni3uw5cc7u
    @user-ni3uw5cc7u 8 months ago +1

    I think P(x2 | Y=yes) should be 1 ... Same goes for the other cases x3. the sample space given y=yes should only count events where y=1.... Please Correct me if I am wrong....

  • @RanveerSingh-sp3uj
    @RanveerSingh-sp3uj 4 years ago +5

    It was good content. Thanks for making such a video, it's really nice to learn things like this

  • @mrfolk84
    @mrfolk84 3 years ago +5

    For word 'bad' , will its conditional probability be zero and will make whole probability zero since 0*anything is 0 ?

  • @mandeepsinghnegi1931
    @mandeepsinghnegi1931 4 years ago +2

    eagerly waiting for NLP series... btw your work is amazing. Thank you!

  • @lamnguyentrong275
    @lamnguyentrong275 3 years ago +6

    how can i find the next video, the youtube doesn't recommend :(. thank you for ur work, really clear

  • @Satyam-ic4tl
    @Satyam-ic4tl 1 year ago

    thank you sir ur videos has helped me a lot and i can't thank u enough for the great work that u r doing

  • @badrlakhal5440
    @badrlakhal5440 3 months ago

    We calculate the probability of xi given y=1 this means we filter by y=1 and then calculate the probability. So, P(x2/y=1)=2/2 etc

  • @ARSH_DS007
    @ARSH_DS007 4 years ago +13

    I believe probability of No|Sentence-1 is zero due to the word delicious. So after normalization P(Yes|Sentence-1) is 100% and not 70-80. Please correct me if required.

    • @srinivasarukonda8768
      @srinivasarukonda8768 4 years ago +1

      you are correct

    • @pradeep611
      @pradeep611 3 years ago +2

      @@srinivasarukonda8768 YES, but if u calculate, P(The/yes)=1/2, P(Food/YES)=2/2, P(delicious/Yes)=2/2; Bcoz formulae for P(A/B)=P(A intersection B)/P(B) ; So the final answer will be 1/5. Can anybody confirm this?

    • @sudhirkv8292
      @sudhirkv8292 3 years ago

      @@pradeep611 P(Food/YES) IS 2/4

    • @arpita0608
      @arpita0608 1 year ago

      yes but why he used 0.03 and also it will be 0.1 not 0.01
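
The arithmetic asked about in this thread can be checked with exact fractions. This sketch takes the "yes" likelihoods 1/2, 2/2, 2/2 from the comment above and the priors 2/5 and 3/5 quoted elsewhere in the comments; P(delicious | no) = 0 comes from the thread, while the other two "no" factors are placeholders assumed here, since any value gives a zero product. The helper names are hypothetical.

```python
from fractions import Fraction
from math import prod


def score(prior, likelihoods):
    """Unnormalized Naive Bayes score: prior times the product of likelihoods."""
    return prior * prod(likelihoods)


def normalize(scores):
    """Divide each score by the total so the posteriors sum to 1."""
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


# "yes": prior 2/5, then P(the|yes), P(food|yes), P(delicious|yes) as quoted.
yes = score(Fraction(2, 5), [Fraction(1, 2), Fraction(2, 2), Fraction(2, 2)])
# "no": the first two likelihoods are assumed placeholders; the zero decides it.
no = score(Fraction(3, 5), [Fraction(1, 2), Fraction(1, 2), Fraction(0)])

posteriors = normalize({"yes": yes, "no": no})
print(yes, no, posteriors)
```

With these inputs the unnormalized "yes" score is 2/5 × 1/2 × 1 × 1 = 1/5, the "no" score is 0, and normalization gives 1 for "yes" and 0 for "no". That all-or-nothing result is the zero-frequency problem raised in other comments.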

  • @rahulm774
    @rahulm774 4 years ago +33

    There is a calculation mistake. 1/10= .1 and not .01 . That's why he got the wrong result initially.

    • @maheshmec1
      @maheshmec1 3 years ago

      Decimal power goes off as we do normalize. so results should be ok.
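
Whether a misplaced decimal survives normalization depends on whether both scores carry it. The sketch below reuses the 0.01, 0.1 and 0.03 figures quoted in the comments purely as inputs; whether 0.03 is itself correct is disputed in the comments, and `p_yes` is a hypothetical helper.

```python
from fractions import Fraction


def p_yes(yes_score, no_score):
    """Normalized posterior for "yes" given two unnormalized scores."""
    return yes_score / (yes_score + no_score)


no = Fraction(3, 100)                    # the 0.03 quoted in the comments
slipped = p_yes(Fraction(1, 100), no)    # 1/10 written as 0.01
fixed = p_yes(Fraction(1, 10), no)       # 1/10 written as 0.1
# Dividing BOTH scores by 10 cancels out in the ratio.
both_scaled = p_yes(Fraction(1, 10) / 10, no / 10)
print(slipped, fixed, both_scaled)
```

A factor applied to both scores cancels, so `both_scaled` equals `fixed` (10/13). A slip in only the "yes" score does not cancel: it moves the result from 10/13 down to 1/4, which matches the 25% versus 75% outcome other commenters point out.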

  • @zohaibramzan6381
    @zohaibramzan6381 3 years ago +1

    Wrongly calculated the probabilities P(x1|yes) and others. Anyhow great content

  • @jinal0217
    @jinal0217 3 years ago +3

    Thank you for the video. Do you have Tutorial 50? I mean the next part, explaining what happens when the data set is imbalanced. Where does Naive Bayes fail?

  • @deepthisudhakaran6417
    @deepthisudhakaran6417 3 years ago +6

    Thank you for this video. Can you please share videos for the implementation of these concepts using python also?

  • @sumitkumar79598
    @sumitkumar79598 1 year ago +1

    Sir, according to me P(x2/y) = P(x2 ^ y)/P(y), where P(x2 ^ y) = how many times x2 = 1 when our output is 1 = 2, and P(y) is how many times we get output 1. Therefore P(x2/y) = 2/2 = 1, not 2/4. Correct me if I am wrong

  • @VikashKumar-ty6uy
    @VikashKumar-ty6uy 4 years ago +3

    Waiting for NLP series eagerly....

  • @devdaskamath975
    @devdaskamath975 4 years ago +3

    Hi Krish, your videos are really amazing and I have been following you since the start of my ML study.
    Can you upload the video about how to solve the problem of imbalanced datasets, and also what to do when a new word is present in an unknown dataset, i.e. when the probability becomes zero?
    Thank you!

  • @ManashreeKorgaonkar
    @ManashreeKorgaonkar 1 year ago

    Thank you sir, my concepts got cleared

  • @vishalvanpariya1466
    @vishalvanpariya1466 7 months ago

    Awesome video Krish, but I feel the conditional probability calculation is not correct. I've read some blogs and watched other videos where the method is different, and they all agree with each other. I referred to AAIC, codebasis, and some blogs from Analytics Vidhya and Medium.

  • @sushilchauhan2586
    @sushilchauhan2586 4 years ago +4

    i want a video on this too ...How To Apply Decision Tree' Classifier On Text Data (NLP)- Machine Learning.. on naive bayes it is easy but on dt i was confused.. pls help maae baap
    New Intro is awesome

  • @sandipansarkar9211
    @sandipansarkar9211 4 years ago

    Superb explanation. Thanks a lot Krish

  • @krishj8011
    @krishj8011 3 months ago

    great tutorial..

  • @darpansalunke1729
    @darpansalunke1729 3 years ago

    Thank you sir... Great work sir... 👍👍👍👍🙏🙏🙏

  • @codingquiz
    @codingquiz 4 years ago +4

    great content keep doing

  • @siddaramhalli
    @siddaramhalli 3 years ago

    The video is good, but noticed few things :
    1. "The" is also a stopword. 2. it's 25% and not 0.25%

  • @sudhirkv8292
    @sudhirkv8292 3 years ago +1

    I wonder why the last feature was not calculated or explained. What will be the probability of (Bad|yes)? Is it 0? If yes, the whole answer will become zero

  • @winyourself553
    @winyourself553 3 years ago +2

    Sir, where is that next video related to the problems in the Naive Bayes theorem I really want that and don't want to lose the movement.

  • @rajuneelakantam8099
    @rajuneelakantam8099 4 years ago +3

    GOOD CONTENT TY...

  • @skc1995
    @skc1995 4 years ago +1

    Sir, I would be grateful if you made a video explaining the output of all classifiers and regressors. I mean, SVM, naive Bayes and logistic regression all return coefficients. It's hugely confusing, and similarly in regression as well. It would be great if you addressed this. You are the last hope sir

  • @manthanrathod1046
    @manthanrathod1046 2 years ago +1

    I don't understand why the queries below are not being addressed. It seems like the video was made hurriedly and all the calculations and concepts are messed up.

  • @yukeshnepal4885
    @yukeshnepal4885 4 years ago +1

    thanks for this awesome tutorial. Hats off to you sir.
    Would you please make video on Support Vector machines with its mathematical concepts...

  • @shekhargaikwad5767
    @shekhargaikwad5767 2 years ago

    Great Explanation sir when will you post Tutorial 50
    on to deal with word which is not present in the training dataset

  • @nandeesh_2005
    @nandeesh_2005 1 year ago

    Excellent explanation 👌

  • @fro4e
    @fro4e 2 years ago

    Good video. Wrong calculations on p(x1|y=yes) though

  • @subhrajeetbiswal6942
    @subhrajeetbiswal6942 1 year ago +1

    p(x1/y=yes) should be 2/2 =1

  • @shakeelahmed8624
    @shakeelahmed8624 3 years ago +2

    Well explained :) Do you have python implementation of same example ? Naive bayes implementation without library ?

  • @amitpingale8247
    @amitpingale8247 3 years ago

    Hello Krish,
    Is likelihood the same as the probability for discrete variables?
    Here we are substituting the likelihood with the probability of the word in the Naive Bayes but when a continuous variable eg. income is an independent variable then we calculate the mean and sd to find the likelihood for that particular point in the distribution. So the confusion arises when we talk about categorical aka discrete variables we can interchange the terms probability and likelihood as it means the same. Kindly help

  • @swarupgorai
    @swarupgorai 4 years ago +2

    please solve P(y=no/sentence) there is some problem in it

  • @sunnysavita9071
    @sunnysavita9071 4 years ago +1

    nice video as usual

  • @shiv9475
    @shiv9475 4 years ago +1

    The probability of No will be 0, because the probability of the delicious feature is 0

  • @abhijitkunjiraman6697
    @abhijitkunjiraman6697 1 year ago

    You're the best!

  • @osamaosama-vh6vu
    @osamaosama-vh6vu 2 years ago

    Thanks

  • @divyark7557
    @divyark7557 2 years ago

    P(y=No|Sent1)=3/5 * 2/3 * 1/3 * 3/3 = 0.13 is this computation correct?

  • @rajeshdoolla8623
    @rajeshdoolla8623 4 years ago

    @krish Naik, Thanks for the nice explanation. I have this doubt, why we always apply Multinomial NB for text classification, why not binomial or Gaussian NB. could you please explain ?

  • @Vinay1272
    @Vinay1272 2 years ago

    Really helpful❤

  • @mansidnailartist
    @mansidnailartist 2 years ago

    hi krish... in the example shown, you computed P(y=yes|sentence)..... shouldnt this sentence be a query sentence and not one of the training sentence?

  • @shadabmathematics9672
    @shadabmathematics9672 3 years ago

    I learn naive Bayes before also,,, but with the real life use case I understand ,,,,,I have a question Krish sir ,,,how you take the values in the table (like 0,1),,,pls clear my doubt ,,,

  • @mayank113463
    @mayank113463 4 years ago +2

    the is also stop words ?

  • @hamareshsaivarma6720
    @hamareshsaivarma6720 1 year ago

    Which input is taken to predict the early reviewers using Naive Bayes?

  • @sonalmaheshwari8222
    @sonalmaheshwari8222 4 years ago +3

    Sir please tell how to implement it practically

  • @poojayadav-pq6rd
    @poojayadav-pq6rd 4 years ago

    How did you calculate output column in BOW step? If we don't have that column then in that case how we will proceed?

  • @dhainik.suthar
    @dhainik.suthar 3 years ago +1

    Sir please also add code portion

  • @arpita0608
    @arpita0608 1 year ago

    I am confused
    First p(y=yes) is 0.1
    second p(y=no) is 0
    now after normalization, we get 1 for yes and 0 for No...
    correct if i am wrong

  • @vipingautam9501
    @vipingautam9501 2 years ago

    Sir how do we do it on Image data... if we have pixels as feature of our data set..how can we find the P(features/Class=k) ??

  • @louerleseigneur4532
    @louerleseigneur4532 3 years ago

    I don't understand, how you choose this feature table.

  • @vincentdepaulsavarimuthu779
    @vincentdepaulsavarimuthu779 2 years ago

    well explained

  • @vedmodikauratg1865
    @vedmodikauratg1865 3 years ago

    Sir very good video. Will it be possible to make video based on naive bayes using TF-IDF processed data

  • @rakshitsinha4392
    @rakshitsinha4392 2 years ago

    At 10:00 shouldn't we also take p( x4 | y=yes) ??

  • @mohammadsalman2145
    @mohammadsalman2145 3 years ago +1

    Sir, why are you not taking the probability of word 'bad', when you are computing the probability for yes.

    • @pitchthewoo
      @pitchthewoo 3 years ago

      He's only showing us Sentence1 which doesn't have the word 'bad'.

  • @elhadigasmi3122
    @elhadigasmi3122 3 years ago

    The p(x=food| yes) =2/4 is correct?
    But "yes" is apparent just 2 times not 4!!!! Or im wrong??

  • @vanitapatel6391
    @vanitapatel6391 4 years ago

    Sir, please upload a video on how to solve the problem when Naive Bayes fails

  • @kalyanputatunda5806
    @kalyanputatunda5806 1 month ago

    P(y=no/sentence)=0.15
    .Please check.

  • @user-tg3tg9gh3q
    @user-tg3tg9gh3q 2 years ago

    1 - There are problems in calculation of probabilities, for example p(X2/y=yes) and others. Please fix them, because it causes misunderstanding for all people who watch this video.
    2 - 1/10 is 0.1 not 0.01
    3 - For a positive sentence, how the probability of yes (25%) is less than no(75%) ?

  • @pratheeeeeesh4839
    @pratheeeeeesh4839 4 years ago

    Great content!

  • @maheshenumula9473
    @maheshenumula9473 4 years ago

    Boss..It is not Bayes theorem formula.But Everyone are saying the same.The end calculation giving us right Bayes theorem.

  • @ishtiakahmed3272
    @ishtiakahmed3272 4 years ago

    sir would you like to share tutorial 50 you were supposed to share and would you please arrange machine learning playlist according to order

  • @pawankumar.a8451
    @pawankumar.a8451 4 years ago

    Sir I am unable to find ur naive bayes video after 49th one in machine learning.. Please upload it..

  • @deepthib7588
    @deepthib7588 2 years ago

    why is f4 /x4, "Bad" not taken for p(yes/sentence)

  • @harendrakumar7647
    @harendrakumar7647 4 years ago

    Could you please make a video on AB testing

  • @webdeveloper9704
    @webdeveloper9704 3 years ago

    great sir

  • @chenlou7783
    @chenlou7783 1 year ago

    1/10 is 0.01? but everything else makes sense. ty for sharing man.

  • @AjithlalK
    @AjithlalK 4 years ago

    Thanks krish

  • @Miles2Achieve
    @Miles2Achieve 4 years ago

    Can you please explain how to predict same for new sentence

  • @anirudhbalaji1042
    @anirudhbalaji1042 4 years ago +1

    9:30 guys it 0.1 not 0.01

  • @poppychan1953
    @poppychan1953 3 years ago

    dude amma see them all though i know nothing about programming

  • @harshitvishwakarma310
    @harshitvishwakarma310 3 years ago

    I am not able to find the next part about imbalanced dataset and how to deal with the drawback plz anyone can send the link ?? Also what if i have multi class dataset ??

  • @rickymehra109
    @rickymehra109 4 years ago

    Can you please share the next part of this video ??

  • @umeshsuggala3932
    @umeshsuggala3932 4 years ago

    how to calculate when output is not there for other sentences?

  • @amitbudhiraja7498
    @amitbudhiraja7498 3 years ago

    Where is the next video ? I mean the next part of naive Bayes after this

  • @deekshas5737
    @deekshas5737 3 years ago +5

    The explanation is wrong Sir. Please correct it and upload a new video on this. It will misguide a lot of students.P(x2|y=yes) = 2/2 not 2/4. There are 2 yes values, hence denominator is 2. Out of 2 yes values, both have x2 and so numerator is 2.

  • @hardikaggarwal446
    @hardikaggarwal446 4 years ago

    beautiful

  • @kumarraju2923
    @kumarraju2923 4 years ago

    Which is the common tool for data science

  • @FidDid
    @FidDid 1 year ago

    good (great) explanation but ... Calculation mistake !!!

  • @akshayakki3631
    @akshayakki3631 3 years ago

    How to use this algorithm in digital marketing?

  • @namanmehra3570
    @namanmehra3570 3 years ago +1

    So many calculation error and wrong probabilities taken

  • @Kickass3131
    @Kickass3131 3 years ago

    Where did 0.03 come from? @10:50

  • @aaryamansharma6805
    @aaryamansharma6805 3 years ago +1

    lot of errors Krish. Good video though.

  • @tanvishinde805
    @tanvishinde805 3 years ago

    9:24 why P(Bad | Yes) is not considered in this calculation ?

    • @MindScape322
      @MindScape322 3 years ago +1

      We are calculating y=yes for the sentence 'The food is delicious', which after preprocessing (stop-word removal) becomes 'The food delicious'; BOW is then applied to get the feature matrix (chart) shown in the video. Remember, BOW makes a feature matrix of all words in the data. Since we have 3 sentences and the word bad is also present, it will be in the chart. However, when we are finding y=yes for the sentence 'The food delicious', we won't include the word 'bad' in the calculation, as it's not part of the sentence.
      If you want to calculate y=yes for the second sentence, 'The food is bad', which after preprocessing becomes 'The food bad', then we won't calculate the value of delicious. Hope this clears your doubt.
      In case you are confused, check out any YouTube video on Bag of Words and stop-word removal.

    • @tanvishinde805
      @tanvishinde805 3 years ago

      @@MindScape322 oh yes! Thank you
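
The preprocessing described in this thread can be sketched in a few lines. This is a minimal illustration, not the video's code: the stop-word set contains only "is" (an assumption that mirrors the comment, where "the" is kept), only the two sentences quoted in the thread are used, and `tokenize` and `bag_of_words` are hypothetical helper names.

```python
STOP_WORDS = {"is"}  # assumption: mirrors the comment, where "the" is kept


def tokenize(sentence):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]


def bag_of_words(sentences):
    """Return the sorted vocabulary and one binary presence row per sentence."""
    docs = [tokenize(s) for s in sentences]
    vocab = sorted({w for doc in docs for w in doc})
    matrix = [[int(w in doc) for w in vocab] for doc in docs]
    return vocab, matrix


vocab, matrix = bag_of_words(["The food is delicious", "The food is bad"])
print(vocab)
print(matrix)
```

Every word from every sentence gets a column, which is why "bad" appears in the chart even though the first sentence does not contain it. The first row simply holds a 0 in that column.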

  • @MercyGraceThomas
    @MercyGraceThomas 4 years ago

    Gibbs algorithm?

  • @aniruddhadeshmukh3571
    @aniruddhadeshmukh3571 1 year ago

    can anyone send link of next video that mean 50

  • @Basav555
    @Basav555 3 years ago

    do videos in kannada also

  • @santhiyaA-sz8db
    @santhiyaA-sz8db 5 months ago

    p(yes/sent1)=1/10=0.1
    p(no/sent1)=0( delicious 0/2) 0* anythink..=0
    p(yes)=.1/0+.1
    =.1/0.1
    =1
    p(no)=1- p(yes)
    =1-1
    =0
    is correct ?

  • @swagatachakraborty8213
    @swagatachakraborty8213 3 years ago

    where is tutorial 50?

  • @chandrakanttarse5115
    @chandrakanttarse5115 4 years ago +1

    BA student can become data scientist?

  • @dibyasahoo7632
    @dibyasahoo7632 2 years ago

    Can anyone help me finding the tutorial -50