NLP | Fine Tuning BERT to perform Spam Classification

  • Published on 17 Oct 2024
  • In this video, we will show you how to fine-tune a pre-trained BERT model using PyTorch and the Transformers library to perform spam classification on a dataset.
    Here are some important links:
    Article: www.analyticsv....
    Github Repo: github.com/pra...

Comments • 56

  • @hemalshah1410
    @hemalshah1410 2 years ago +4

    A crisp explanation of the content at a surface level. Thank you for making such a contribution to the learning community. Kudos!

  • @alidi5616
    @alidi5616 3 years ago

    Great walkthrough with a clear explanation that keeps things simple. Thank you so much. Exactly what I needed. :)

  • @ganeshkharad
    @ganeshkharad 4 years ago

    This is one of the best walkthroughs I have seen 👍👍

  • @Zelloss67
    @Zelloss67 9 months ago

    I do not understand where the fine-tuning of BERT takes place. You froze the gradients in BERT, so BERT did not adjust itself to your specific dataset. For example, if you had not frozen the BERT tensors, the gradients could flow back into BERT, so that it produces token representations that better fit the text in your dataset.
    What I saw just now was you training a neural network for multi-class classification that uses BERT token representations as inputs.
    Could you please clarify the misunderstanding mentioned above?
    Thank you very much

    • @Analyticsvidhya
      @Analyticsvidhya  8 months ago

      You're absolutely right! In this video, BERT's pre-trained embeddings and encoder layers are used as a fixed feature extractor for the spam classification task. Freezing these layers means BERT's own weights are not updated, but its representations are still useful.
      Here's why:
      ➡️ The frozen BERT layers still encode complex relationships between words and sentences, providing valuable context for your task.
      ➡️ The newly added layers (the classification head in "BERT_Arch") are where the actual training happens. This head learns to interpret and weigh the information extracted by BERT to distinguish spam from non-spam messages.
      ➡️ Freezing BERT means no gradients are computed for its large set of pre-trained parameters, which reduces memory use and training time and keeps training of the classification head stable.
      In short, you're training a new classifier on top of a powerful pre-trained feature extractor. The training happens in the newly added layers, not in the pre-trained BERT weights.
      Remember, freezing BERT is a common and effective technique, especially when dealing with limited datasets. ❤️
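The difference between the two setups discussed in this thread can be sketched in PyTorch. This is only an illustration: `encoder` is a small hypothetical stand-in for the pre-trained BERT model (so the snippet runs without downloading anything), and `count_trainable` is a helper introduced here, not something from the video.

```python
import torch.nn as nn

# Hypothetical stand-in for the pre-trained encoder; a real run would load BERT here.
encoder = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16))
# Classification head that is trained on top of the encoder output.
head = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))

def count_trainable(module):
    """Number of parameters that will receive gradient updates."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Feature extraction: freeze every encoder parameter, train only the head.
for param in encoder.parameters():
    param.requires_grad = False
frozen_trainable = count_trainable(encoder) + count_trainable(head)

# Full fine-tuning: leave requires_grad=True on the encoder as well.
for param in encoder.parameters():
    param.requires_grad = True
full_trainable = count_trainable(encoder) + count_trainable(head)

print(frozen_trainable, full_trainable)
```

With the encoder frozen, only the head's parameters are counted as trainable; with it unfrozen, the encoder's parameters are included too.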

  • @vrbabu228
    @vrbabu228 2 years ago +1

    Nice video! What should be changed if we wanted to do multi-class text classification? I know that the output layer dimension needs to be changed. Does anything else need to change?

    • @kaverianuranjana9787
      @kaverianuranjana9787 2 years ago

      I would also be interested in a multi-class classification implementation (maybe NLI?) that is as clear as this example. Thanks!
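For the multi-class question, here is a minimal sketch of a classification head with a configurable number of outputs. The layer sizes (768 to 512 to `NUM_CLASSES`), the dropout rate, and the `Head` class are assumptions made for illustration, and `pooled` is a random stand-in for BERT's pooled output. Beyond the output dimension, the labels must be integer class ids in `[0, NUM_CLASSES)`, and any class weights passed to the loss need one entry per class.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 3   # hypothetical; it would be 2 for spam vs. not spam
HIDDEN = 768      # size of the BERT-base pooled output

class Head(nn.Module):
    """Classifier applied to the pooled encoder output."""
    def __init__(self, hidden, num_classes):
        super().__init__()
        self.fc1 = nn.Linear(hidden, 512)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.1)
        self.fc2 = nn.Linear(512, num_classes)   # one output per class
        self.log_softmax = nn.LogSoftmax(dim=1)

    def forward(self, pooled):
        x = self.dropout(self.relu(self.fc1(pooled)))
        return self.log_softmax(self.fc2(x))

head = Head(HIDDEN, NUM_CLASSES)
pooled = torch.randn(4, HIDDEN)          # stand-in for the BERT pooled output
labels = torch.tensor([0, 2, 1, 2])      # integer class ids, not one-hot vectors
log_probs = head(pooled)
loss = nn.NLLLoss()(log_probs, labels)   # NLLLoss expects log-probabilities
print(log_probs.shape, loss.item())
```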

  • @urbanholds4944
    @urbanholds4944 4 years ago +2

    Where is the format_time function that is used while evaluating the model?
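The definition of `format_time` is not shown in this thread. A helper of this kind usually just turns elapsed seconds into an `h:mm:ss` string, so a plausible version (an assumption, not necessarily the one used in the video) is:

```python
import datetime

def format_time(elapsed):
    """Convert a duration in seconds to an h:mm:ss string."""
    # Round to the nearest whole second before formatting.
    elapsed_rounded = int(round(elapsed))
    return str(datetime.timedelta(seconds=elapsed_rounded))

print(format_time(3661.4))
```

Typical usage is `format_time(time.time() - t0)`, where `t0` was recorded with `time.time()` before the loop started.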

  • @kakamondol1322
    @kakamondol1322 3 years ago

    In the BERT architecture, if I want to concatenate some extra static features before the fc layer, how do I do it?
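One common way to do this is to concatenate the extra features with the pooled BERT vector along the feature dimension and widen the first linear layer to match. The sketch below assumes a 768-dimensional pooled output and five extra features; `HeadWithExtraFeatures` and both input tensors are hypothetical stand-ins.

```python
import torch
import torch.nn as nn

class HeadWithExtraFeatures(nn.Module):
    """Classifier over [pooled BERT vector ; extra static features]."""
    def __init__(self, hidden=768, n_extra=5, n_classes=2):
        super().__init__()
        # The first layer must accept hidden + n_extra inputs.
        self.fc1 = nn.Linear(hidden + n_extra, 512)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(512, n_classes)

    def forward(self, pooled, extra):
        # Concatenate along the feature dimension (dim=1), not the batch dimension.
        x = torch.cat([pooled, extra], dim=1)
        return self.fc2(self.relu(self.fc1(x)))

head = HeadWithExtraFeatures()
pooled = torch.randn(4, 768)   # stand-in for the BERT pooled output
extra = torch.randn(4, 5)      # stand-in for per-example static features
logits = head(pooled, extra)
print(logits.shape)
```

Scaling the extra features to a range similar to the BERT outputs is usually worth considering, since raw values on a very different scale can dominate the first layer.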

  • @victorthomas6844
    @victorthomas6844 3 years ago

    Can you please suggest how to use the BERT model to compare two sentences for semantic similarity and assign them to a class? Your suggestions would be appreciated. Thanks.
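One simple option for sentence similarity is to embed each sentence separately, mean-pool the token vectors using the attention mask, and compare the results with cosine similarity. The sketch below uses random tensors as stand-ins for BERT token embeddings, and `THRESHOLD` is an arbitrary hypothetical cut-off that would need tuning on labeled pairs. Another option is to encode both sentences together as one input pair and train a classifier on top, as in the spam example.

```python
import torch
import torch.nn.functional as F

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).float()      # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(dim=1)    # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)          # avoid division by zero
    return summed / counts

# Stand-ins for BERT token embeddings of two batches of sentences.
emb_a = torch.randn(2, 6, 768)
emb_b = torch.randn(2, 6, 768)
mask_a = torch.tensor([[1, 1, 1, 1, 0, 0], [1, 1, 1, 1, 1, 1]])
mask_b = torch.tensor([[1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 0]])

sim = F.cosine_similarity(mean_pool(emb_a, mask_a), mean_pool(emb_b, mask_b), dim=1)
THRESHOLD = 0.8                 # hypothetical cut-off
is_similar = sim > THRESHOLD    # one boolean per sentence pair
print(sim, is_similar)
```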

  • @itsme1674
    @itsme1674 1 year ago

    Very nice explanation

  • @avirajbevli7268
    @avirajbevli7268 3 years ago +8

    Nice tutorial, but this is not "fine-tuning BERT" in the true sense, since you are freezing the weights in the BERT model. You are essentially doing "feature extraction", i.e. taking the feature vectors obtained from the BERT model, feeding them into a simple classification model (the 2 linear layers defined in the model), and training only those final 2 linear layers to perform spam classification.

    • @basmahhyder5695
      @basmahhyder5695 1 year ago

      Could you elaborate on what you mean by not freezing the BERT model parameters?

  • @archanadurgam3711
    @archanadurgam3711 3 years ago

    I'm getting an error in "Start Model Training":
    NameError Traceback (most recent call last)
    in ()
    7
    8 #for each epoch
    ----> 9 for epoch in range(epochs):
    10
    11 print('\nEpoch {:} / {:}'.format(epoch + 1, epochs))
    NameError: name 'epochs' is not defined
    Please help me with this.
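This NameError means the cell that defines `epochs` was not run (or the variable does not exist) before the training loop. A minimal sketch of the fix, assuming 10 epochs as an example value:

```python
# Define the number of training epochs before the loop that uses it.
epochs = 10  # assumed value; pick what suits your run

for epoch in range(epochs):
    print('\nEpoch {:} / {:}'.format(epoch + 1, epochs))
    # The notebook's train() and evaluate() calls would go here.
```

In a notebook, running all cells in order from the top usually avoids this kind of error.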

  • @tahahuraibb5833
    @tahahuraibb5833 3 years ago

    Can we add a data loader for the prediction? I keep getting a CUDA out-of-memory error.
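Predicting in small batches under `torch.no_grad()` is the usual way to avoid running out of GPU memory at inference time. The sketch below is an assumption-laden illustration: `TinyClassifier` is a stand-in that mimics a `model(sent_id, mask)` call signature, and `test_seq` and `test_mask` are random placeholder tensors.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class TinyClassifier(nn.Module):
    """Stand-in for the trained model, called as model(sent_id, mask)."""
    def __init__(self, vocab=100, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, 8)
        self.fc = nn.Linear(8, n_classes)

    def forward(self, sent_id, mask):
        m = mask.unsqueeze(-1).float()
        pooled = (self.emb(sent_id) * m).sum(1) / m.sum(1).clamp(min=1.0)
        return self.fc(pooled)

model = TinyClassifier().to(device).eval()       # eval() disables dropout
test_seq = torch.randint(0, 100, (10, 25))       # placeholder token ids
test_mask = torch.ones(10, 25, dtype=torch.long) # placeholder attention masks

loader = DataLoader(TensorDataset(test_seq, test_mask), batch_size=4)
preds = []
with torch.no_grad():                            # no autograd graph is kept
    for sent_id, mask in loader:
        logits = model(sent_id.to(device), mask.to(device))
        preds.append(logits.argmax(dim=1).cpu()) # move results off the GPU
preds = torch.cat(preds)
print(preds.shape)
```

Only one batch sits on the GPU at a time, so lowering `batch_size` lowers peak memory.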

  • @kumarvaibhav7203
    @kumarvaibhav7203 3 years ago +1

    The "Start Training Model" cell does not work when I try the code from GitHub. It raises an AttributeError:
    #train model
    train_loss, _ = train()
    AttributeError: 'str' object has no attribute 'dim'

    • @manarhamad782
      @manarhamad782 3 years ago +1

      Yes, I got the same problem.

    • @kumarvaibhav7203
      @kumarvaibhav7203 3 years ago

      @manarhamad782 I found the fix. They are using Transformers 3.0.0. Just add this as the first line: !pip install transformers==3.0.0

    • @manarhamad782
      @manarhamad782 3 years ago

      Vaibhav Kumar, OK, I will try. Thank you very much.

    • @vikassalaria24
      @vikassalaria24 3 years ago

      @Analytics Vidya After fixing this, I get NameError: name 'cross_entropy' is not defined when training the model.

    • @ammaarahmad5747
      @ammaarahmad5747 3 years ago

      @vikassalaria24 How did you fix this error?
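The `cross_entropy` NameError suggests that the cell defining the loss function was skipped. A minimal sketch of such a definition follows, assuming the model outputs log-probabilities (so `nn.NLLLoss` is the matching loss) and that class weights are used; the values in `class_wts` are made up for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical class weights; in practice they come from compute_class_weight.
class_wts = [0.58, 3.72]
weights = torch.tensor(class_wts, dtype=torch.float)

# NLLLoss pairs with a model whose last layer is LogSoftmax.
cross_entropy = nn.NLLLoss(weight=weights)

# Quick check with stand-in model outputs and labels.
log_probs = torch.log_softmax(torch.randn(4, 2), dim=1)
labels = torch.tensor([0, 1, 0, 0])
loss = cross_entropy(log_probs, labels)
print(loss.item())
```

When training on a GPU, `weights` has to be moved to the same device as the model outputs.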

  • @sanketjaiswal6687
    @sanketjaiswal6687 3 years ago

    How can I visualize the above model in PyTorch?
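The quickest built-in way to inspect a PyTorch model is to print it, which lists the module tree, and to walk `named_parameters()` for shapes and trainability. The `model` below is a small stand-in with assumed layer sizes rather than the model from the video.

```python
import torch.nn as nn

# Stand-in for the classification head; layer sizes are assumptions.
model = nn.Sequential(
    nn.Linear(768, 512),
    nn.ReLU(),
    nn.Dropout(0.1),
    nn.Linear(512, 2),
    nn.LogSoftmax(dim=1),
)

print(model)  # textual view of the module hierarchy

# Per-parameter view: name, shape, and whether it is trainable.
for name, param in model.named_parameters():
    print(name, tuple(param.shape), param.requires_grad)

total = sum(p.numel() for p in model.parameters())
print("total parameters:", total)
```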

  • @hasanbasriakcay
    @hasanbasriakcay 3 years ago

    Thank you for the video. May I know the Transformers version and the TensorFlow version?

  • @dw8200
    @dw8200 3 years ago

    Nice walkthrough. Thanks for sharing.

  • @lukasnielsen1263
    @lukasnielsen1263 4 years ago +8

    Very nice tutorial, clear architecture, and nicely structured code. But you are not really fine-tuning a BERT model (as you freeze all the BERT model's parameters), but rather the two dense layers on top of the BERT model.
    It's important to underscore this, as the BERT model parameters remain unchanged after your fine-tuning. Thus the BERT model is not fine-tuned; the classification "head" you added is :)

    • @lukasnielsen1263
      @lukasnielsen1263 3 years ago

      @fsociety Yes, if you don't freeze the BERT parameters, they are fine-tuned. An easy way to check whether this is the case is to encode the same word before and after fine-tuning the BERT model and compare the two vectors.

    • @basmahhyder5695
      @basmahhyder5695 1 year ago

      Could you elaborate on what you mean by not freezing the BERT model parameters?

    • @lukasnielsen1263
      @lukasnielsen1263 1 year ago

      @basmahhyder5695 When the parameters are frozen, they are not updated during fine-tuning. If you don't freeze them (which is standard practice, btw), then the BERT model parameters are updated during fine-tuning.
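The before-and-after check described in this thread can be sketched by comparing encoder weights around a single optimizer step. Everything here is a stand-in: `encoder` is a tiny linear layer instead of BERT, and `weights_changed_after_step` is a helper made up for this illustration.

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
encoder = nn.Linear(8, 8)   # stand-in for BERT
head = nn.Linear(8, 2)      # classification head

def weights_changed_after_step(freeze_encoder):
    """Run one training step and report whether the encoder weights moved."""
    enc, hd = copy.deepcopy(encoder), copy.deepcopy(head)
    for p in enc.parameters():
        p.requires_grad = not freeze_encoder
    before = enc.weight.detach().clone()

    # Only parameters that require gradients are handed to the optimizer.
    params = [p for p in list(enc.parameters()) + list(hd.parameters())
              if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=0.1)

    x = torch.randn(4, 8)
    y = torch.tensor([0, 1, 0, 1])
    loss = nn.functional.cross_entropy(hd(enc(x)), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return not torch.equal(before, enc.weight.detach())

frozen_changed = weights_changed_after_step(freeze_encoder=True)
unfrozen_changed = weights_changed_after_step(freeze_encoder=False)
print(frozen_changed, unfrozen_changed)
```

With the encoder frozen, its weights are identical before and after the step; with it unfrozen, they are updated along with the head.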

  • @alexsudakov8554
    @alexsudakov8554 1 year ago

    class_wts = compute_class_weight('balanced', np.unique(train_labels), train_labels)
    TypeError: compute_class_weight() takes 1 positional argument but 3 were given
    How do I fix this?

    • @alexsudakov8554
      @alexsudakov8554 1 year ago

      Fixed!
      class_wts = compute_class_weight(class_weight='balanced', classes=np.unique(train_labels), y=train_labels)
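Newer scikit-learn releases require keyword arguments here, which is why the positional call fails. The sketch below runs the keyword form on a small made-up label array and compares it with the formula that scikit-learn documents for the "balanced" mode, `n_samples / (n_classes * np.bincount(y))`.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Made-up imbalanced labels: six examples of class 0, two of class 1.
train_labels = np.array([0, 0, 0, 0, 0, 0, 1, 1])

# Keyword arguments work on both older and newer scikit-learn versions.
class_wts = compute_class_weight(
    class_weight="balanced",
    classes=np.unique(train_labels),
    y=train_labels,
)

# The same weights computed by hand from the documented formula.
manual = len(train_labels) / (len(np.unique(train_labels)) * np.bincount(train_labels))
print(class_wts, manual)
```

The rarer class receives the larger weight, so mistakes on it cost more in the loss.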

  • @НикитаНагорный-ч3о
    @НикитаНагорный-ч3о 3 years ago

    Nice work!!!

  • @amenmechlaoui8717
    @amenmechlaoui8717 2 months ago

    Bro, I need call center fraud data. Where can I find it?

  • @jimharrington2087
    @jimharrington2087 4 years ago

    Nice coding walkthrough!!

  • @sachinborse4178
    @sachinborse4178 5 months ago

    Please make a video on this as soon as possible, because an error shows up when starting the training at train_loss, _ = train(). I tried my best to solve it, but not a single trick is working 🙏🏻

    • @Analyticsvidhya
      @Analyticsvidhya  5 months ago

      Let me get back to you on this..

  • @anbinh3967
    @anbinh3967 3 years ago

    unreal! thanks!

  • @tonyafields4432
    @tonyafields4432 3 years ago

    THANK YOU!

  • @khushbootaneja6739
    @khushbootaneja6739 1 year ago

    Nice

  • @andrea-mj9ce
    @andrea-mj9ce 2 years ago

    I can't manage to read the data (1:05)

    • @Analyticsvidhya
      @Analyticsvidhya  1 year ago

      We suggest setting the video quality to 1080p. That should resolve your issue.

    • @andrea-mj9ce
      @andrea-mj9ce 1 year ago

      @Analyticsvidhya I meant reading the data on my computer with `read_csv`.

  • @moshiwei8454
    @moshiwei8454 4 years ago

    would be nice to reduce the accent a bit

  • @HariNarayan-s7t
    @HariNarayan-s7t 1 year ago

    I'm getting this error:
    Epoch 1 / 10
    ---------------------------------------------------------------------------
    TypeError Traceback (most recent call last)
    in ()
    12
    13 #train model
    ---> 14 train_loss, _ = train()
    15
    16 #evaluate model
    4 frames
    /usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py in forward(self, input)
    112
    113 def forward(self, input: Tensor) -> Tensor:
    --> 114 return F.linear(input, self.weight, self.bias)
    115
    116 def extra_repr(self) -> str:
    TypeError: linear(): argument 'input' (position 1) must be Tensor, not str
    How do I fix this?

    • @Analyticsvidhya
      @Analyticsvidhya  1 year ago

      It looks like you're encountering a TypeError in your code. The error message indicates that there's an issue with the input data type. Specifically, the linear function is expecting a PyTorch Tensor as input, but it's receiving a string (str) instead.
      To fix this issue, check the part of your code where you're passing data to the linear function and ensure that you're passing a PyTorch tensor with the correct shape and data type.

    • @kartikeylohani6934
      @kartikeylohani6934 9 months ago +1

      @Analyticsvidhya Hi, I am getting the same error. When I print the cls_hs variable, the output is pooler_output.
      PS: please respond fast.

    • @tianxiaoye-gn7jf
      @tianxiaoye-gn7jf 6 months ago

      @Analyticsvidhya Hi, I hit the same error. I think something changed in BERT that makes the output a str.

    • @tianxiaoye-gn7jf
      @tianxiaoye-gn7jf 6 months ago

      In the block that defines class BERT_Arch(nn.Module), change the call to:
      #pass the inputs to the model
      _, cls_hs = self.bert(sent_id, attention_mask=mask, return_dict=False)
      Adding the third argument (return_dict=False) makes it work.

    • @Analyticsvidhya
      @Analyticsvidhya  6 months ago

      Thank you for sharing your experience and solution
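A likely explanation for why `return_dict=False` fixes this: recent Transformers versions return a dict-like output object by default, and tuple-unpacking a dict-like object yields its keys, which are strings. That would also explain why printing `cls_hs` shows `pooler_output`. The sketch below uses a plain `OrderedDict` with placeholder values as a stand-in for the real model output.

```python
from collections import OrderedDict

# Stand-in for the dict-like model output; the values are placeholders.
outputs = OrderedDict(
    last_hidden_state="<token-level tensor>",
    pooler_output="<pooled tensor>",
)

# Tuple-unpacking iterates over the KEYS, so cls_hs becomes a string.
_, cls_hs = outputs
print(cls_hs)

# Reading the field by name returns the value instead.
cls_hs_by_key = outputs["pooler_output"]
print(cls_hs_by_key)
```

With `return_dict=False` the model returns a plain tuple, so the two-name unpacking receives the tensors themselves.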

  • @RinoNucara
    @RinoNucara 1 year ago

    I got an error and solved it with:
    from sklearn.utils.class_weight import compute_class_weight
    #compute the class weights
    class_wts = compute_class_weight('balanced', classes=np.unique(train_labels), y=train_labels)
    print(class_wts)

    • @Analyticsvidhya
      @Analyticsvidhya  1 year ago

      Hey Rino, so your error is solved, right?

    • @RinoNucara
      @RinoNucara 1 year ago

      @Analyticsvidhya Yes! Thank you!