Text Augmentation | Data Augmentation | Improve Model Performance

  • Published on 12 Nov 2024

Comments • 34

  • @srinisuman
    @srinisuman 4 years ago +1

    Thank you. I boosted accuracy by almost 17% by applying these data augmentation techniques to a multi-class text classifier.

  • @mfarooq28
    @mfarooq28 8 months ago

    Thanks Aarohi, helpful contribution.

  • @hossain9410
    @hossain9410 3 months ago

    How do I apply augmentation to a multi-class classification problem?

  • @Kishi1969
    @Kishi1969 7 months ago

    Thank you so much madam🙏🙏🙏, received 🌹🌹🌹

  • @footballtocricket6989
    @footballtocricket6989 3 years ago

    Can we use this method for an NER dataset?

  • @emreozan2206
    @emreozan2206 3 years ago

    Thank you, you explained it in a simple way.

  • @tarekferradji
    @tarekferradji 7 months ago

    Can I use textaugment for the French language?

    • @CodeWithAarohi
      @CodeWithAarohi 7 months ago

      Yes, you can:
      from textaugment import EDA
      # Initialize the EDA (Easy Data Augmentation) object
      eda = EDA()
      # Sample French text
      french_text = "Je suis en train d'apprendre."
      # Augment the text using synonym replacement (default settings)
      augmented_text = eda.synonym_replacement(french_text)
      print("Original text:", french_text)
      print("Augmented text:", augmented_text)

  • @HARIKAIPALLY
    @HARIKAIPALLY 24 days ago

    Is data augmentation used for regression models (numerical data)?

    • @CodeWithAarohi
      @CodeWithAarohi 23 days ago

      Yes, data augmentation can be used for regression models, though it is more commonly associated with image and text data.
      For example, you can add small random noise to the numerical features, which can help the model generalize better.
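A minimal sketch of the noise idea from the reply above, using only the standard library. The function name `add_gaussian_noise`, the `noise_scale` parameter, and the choice to scale the noise by each feature's standard deviation are illustrative assumptions, not something stated in the thread.

```python
import random
import statistics

def add_gaussian_noise(rows, targets, noise_scale=0.05, copies=1, seed=0):
    """Return the original rows plus `copies` noisy versions of each row.

    Assumption: noise is Gaussian with a standard deviation equal to
    noise_scale times the (population) standard deviation of each feature.
    Targets are copied unchanged for the noisy rows.
    """
    rng = random.Random(seed)
    columns = list(zip(*rows))
    stds = [statistics.pstdev(col) for col in columns]
    aug_rows = [list(row) for row in rows]
    aug_targets = list(targets)
    for _ in range(copies):
        for row, target in zip(rows, targets):
            noisy = [v + rng.gauss(0.0, noise_scale * s) for v, s in zip(row, stds)]
            aug_rows.append(noisy)
            aug_targets.append(target)
    return aug_rows, aug_targets

# Toy regression data: two numerical features per row, one target per row.
rows = [[1.0, 200.0], [2.0, 180.0], [3.0, 240.0]]
targets = [10.0, 12.0, 15.0]
aug_rows, aug_targets = add_gaussian_noise(rows, targets, copies=2)
print(len(aug_rows), len(aug_targets))
```

Keeping `noise_scale` small matters here: the targets are reused as-is, so the noisy rows should stay close enough to the originals that the same target is still plausible.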

  • @yasirabdulkareem9844
    @yasirabdulkareem9844 2 years ago

    Thanks, great explanation. I'm new to NLP and I'm working on a short-text fake news project. I have 5,000 real and 15,000 fake samples, so I need to increase the number of real news samples. I plan to use one of these methods to improve my model. Which approach would you recommend?
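One possible way to approach the imbalance described above is to augment only the minority class until it matches the majority. With 5,000 real and 15,000 fake samples that means generating 15,000 − 5,000 = 10,000 new real samples, about two per original. The sketch below is an assumption-laden illustration: `oversample_minority` and `random_swap` are hypothetical helper names, and random swap is just a stand-in for whichever augmentation method is chosen.

```python
import random
from collections import Counter

def random_swap(sentence, rng):
    """Swap two randomly chosen words (stand-in for any augmentation)."""
    words = sentence.split()
    if len(words) >= 2:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

def oversample_minority(texts, labels, minority_label, augment_fn, seed=0):
    """Add augmented minority samples until the minority matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    needed = max(counts.values()) - counts[minority_label]
    pool = [t for t, l in zip(texts, labels) if l == minority_label]
    new_texts = [augment_fn(rng.choice(pool), rng) for _ in range(needed)]
    return list(texts) + new_texts, list(labels) + [minority_label] * needed

# Toy data: 1 "real" sample against 3 "fake" samples.
texts = [
    "markets rise on strong earnings",
    "aliens run the central bank",
    "the moon is made of cheese",
    "celebrity clone spotted again",
]
labels = ["real", "fake", "fake", "fake"]
balanced_texts, balanced_labels = oversample_minority(texts, labels, "real", random_swap)
print(Counter(balanced_labels))
```

Because each new sample is drawn from a small pool, repeated or near-identical sentences are likely, so a deduplication pass afterwards is worth considering.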

  • @vibecatalyst3420
    @vibecatalyst3420 4 years ago

    Very good presentation, and informative too, for people in NLP.
    How would you implement these 4 EDA methods on a dataset?

    • @CodeWithAarohi
      @CodeWithAarohi 4 years ago

      Thanks for appreciating my work.
      You can put the whole augmentation code in a function and then call it wherever you want.
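As a rough sketch of the "put it in a function" suggestion, the four EDA operations (synonym replacement, random insertion, random swap, random deletion) can be wrapped behind a single `augment` call. This is a pure-Python illustration, not the code from the video: the `SYNONYMS` table is a hypothetical toy stand-in for a real synonym source such as WordNet, and all function names are assumptions.

```python
import random

# Hypothetical toy synonym table; a real project would use WordNet or similar.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "joyful"]}

def synonym_replacement(words, rng):
    """Replace one word that has a known synonym."""
    candidates = [i for i, w in enumerate(words) if w in SYNONYMS]
    if candidates:
        i = rng.choice(candidates)
        words = words[:i] + [rng.choice(SYNONYMS[words[i]])] + words[i + 1:]
    return words

def random_insertion(words, rng):
    """Insert a synonym of some word at a random position."""
    candidates = [w for w in words if w in SYNONYMS]
    if candidates:
        synonym = rng.choice(SYNONYMS[rng.choice(candidates)])
        position = rng.randint(0, len(words))
        words = words[:position] + [synonym] + words[position:]
    return words

def random_swap(words, rng):
    """Swap two randomly chosen words."""
    words = list(words)
    if len(words) >= 2:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, rng, p=0.2):
    """Drop each word with probability p, keeping at least one word."""
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]

def augment(sentence, seed=0):
    """Return one augmented sentence per EDA operation."""
    rng = random.Random(seed)
    words = sentence.split()
    if not words:
        return []
    operations = [synonym_replacement, random_insertion, random_swap, random_deletion]
    return [" ".join(op(words, rng)) for op in operations]

results = augment("the quick dog is happy")
for line in results:
    print(line)
```

The same `augment` function can then be called from a training script, a notebook, or a dataset loop without repeating the individual steps.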

    • @vibecatalyst3420
      @vibecatalyst3420 3 years ago

      @CodeWithAarohi Thank you.
      Say I have a dataset of 12.5k samples and need to apply EDA to it.
      Do I need to apply all four operations to each sample using a function and then combine the results with the original 12.5k samples?
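One way to read the question above: loop over the dataset, run each operation on every sample, and append the results to the originals with the same label. With four operations on 12,500 samples that would give 12,500 + 4 × 12,500 = 62,500 rows before any deduplication. The sketch below uses only two stand-in operations to stay short; `expand_dataset` and the helper names are hypothetical.

```python
import random

def random_swap(sentence, rng):
    """Swap two randomly chosen words."""
    words = sentence.split()
    if len(words) >= 2:
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)

def random_deletion(sentence, rng, p=0.2):
    """Drop each word with probability p, keeping at least one word."""
    words = sentence.split()
    kept = [w for w in words if rng.random() > p]
    return " ".join(kept if kept else words[:1])

def expand_dataset(texts, labels, operations, seed=0):
    """Return the originals followed by one augmented row per (sample, operation)."""
    rng = random.Random(seed)
    out_texts, out_labels = list(texts), list(labels)
    for text, label in zip(texts, labels):
        for operation in operations:
            out_texts.append(operation(text, rng))
            out_labels.append(label)  # augmented rows inherit the source label
    return out_texts, out_labels

texts = ["the service was very slow today", "great food and friendly staff"]
labels = ["negative", "positive"]
aug_texts, aug_labels = expand_dataset(texts, labels, [random_swap, random_deletion])
print(len(aug_texts), len(aug_labels))
```

Applying every operation to every sample is not required; sampling a subset of operations per row is an equally reasonable design if the dataset grows too large.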

  • @abhishekprajapat415
    @abhishekprajapat415 4 years ago +1

    This is all good, but could you also explain how to use the text after augmentation? I clean and spell-correct my data before training, so most of these techniques would only increase my workload.
    Also, when training embeddings, any of these augmentations except synonym replacement would train the embeddings incorrectly, since embeddings rely on neighboring words to capture similarity.
    So please take a classification task and show how augmentation is used in it.

  • @lalithavanik5022
    @lalithavanik5022 4 years ago

    Could you please make a video on DCGAN using a dataset loaded from Google Drive?

  • @suybi2006
    @suybi2006 4 years ago

    Thank you, madam, for this very informative video. I have a question, though. I'm new to NLP and I'm working on a multi-class text classification project. I plan to use this method to improve my model by increasing the size of my training and test datasets. Is this a good approach?

    • @CodeWithAarohi
      @CodeWithAarohi 4 years ago

      Yes, this is a good approach. Data augmentation increases the amount of data, so the model learns from a bigger dataset, which should give you better results.

    • @suybi2006
      @suybi2006 4 years ago

      @CodeWithAarohi Thank you for your reply. But is it also necessary to increase the test data?

    • @CodeWithAarohi
      @CodeWithAarohi 4 years ago

      @suybi2006 There is no need to increase the test data. We augment the training data because we want the model to be trained on more examples. For testing, a smaller dataset is fine.
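A small sketch of the workflow implied by this reply: split first, then augment only the training portion so the test set stays untouched. The helper names (`split_train_test`, `random_deletion`) and the toy sentences are illustrative assumptions, and random deletion is only a stand-in for whichever augmentation is used.

```python
import random

def split_train_test(texts, labels, test_fraction=0.25, seed=0):
    """Shuffle indices and split them into train and test parts."""
    rng = random.Random(seed)
    indices = list(range(len(texts)))
    rng.shuffle(indices)
    n_test = max(1, int(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([texts[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([texts[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test

def random_deletion(sentence, rng, p=0.2):
    """Drop each word with probability p, keeping at least one word."""
    words = sentence.split()
    kept = [w for w in words if rng.random() > p]
    return " ".join(kept if kept else words[:1])

texts = [
    "the plot was dull and slow",
    "a brilliant and moving film",
    "terrible acting throughout the movie",
    "one of the best films this year",
    "i fell asleep halfway through",
    "a joyful and clever comedy",
    "the ending made no sense",
    "beautiful music and strong performances",
]
labels = ["neg", "pos", "neg", "pos", "neg", "pos", "neg", "pos"]

# 1) Split first, so no augmented copy of a test sentence leaks into training.
(train_x, train_y), (test_x, test_y) = split_train_test(texts, labels)

# 2) Augment the training portion only; the test set is left as-is.
rng = random.Random(1)
train_x_aug = train_x + [random_deletion(t, rng) for t in train_x]
train_y_aug = train_y + train_y
print(len(train_x_aug), len(test_x))
```

Splitting before augmenting also keeps the evaluation honest, since the test sentences remain real, unmodified examples.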

    • @suybi2006
      @suybi2006 4 years ago

      @CodeWithAarohi Thank you very much

  • @randomforrest9251
    @randomforrest9251 3 years ago

    Well-prepared collection of augmentation techniques. Unfortunately, you sometimes repeated yourself and stretched the video to a length far longer than necessary. There is some room for improvement. Still enjoyed the video.
    Thank you =)

    • @CodeWithAarohi
      @CodeWithAarohi 3 years ago +1

      Thank you for the feedback. I will work on it, and I'm glad you still liked the video.

  • @amankumarsingh415
    @amankumarsingh415 1 year ago

    It's giving me a "No module" error.

    • @CodeWithAarohi
      @CodeWithAarohi 1 year ago

      "No module" means that the module is not installed. Install it using pip.

    • @amankumarsingh415
      @amankumarsingh415 1 year ago

      @CodeWithAarohi Sorry, I finished the project yesterday and forgot to update my comment.
      Thank you so much, Aarohi ❤

    • @amankumarsingh415
      @amankumarsingh415 1 year ago

      @CodeWithAarohi When I run the augmentation, I get a lot of duplicate sentences. Is there a method that increases the dataset with unique sentences rather than duplicates?
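One simple option for the duplicate problem raised above is to filter the augmented sentences through a set before adding them to the dataset. This is a minimal sketch under stated assumptions: `normalize` and `keep_unique` are hypothetical helper names, and treating sentences as equal after lowercasing and whitespace collapsing is one possible definition of "duplicate".

```python
def normalize(text):
    """Lowercase and collapse whitespace so near-identical strings compare equal."""
    return " ".join(text.lower().split())

def keep_unique(originals, candidates):
    """Keep only candidates that differ from the originals and from each other."""
    seen = {normalize(t) for t in originals}
    unique = []
    for candidate in candidates:
        key = normalize(candidate)
        if key and key not in seen:
            seen.add(key)
            unique.append(candidate)
    return unique

originals = ["the movie was great"]
candidates = [
    "the movie was great",    # identical to an original
    "The movie  was great",   # same after normalization
    "the film was great",     # new
    "the film was great",     # repeated augmentation
    "movie the was great",    # new
]
unique = keep_unique(originals, candidates)
print(unique)
```

If too few unique sentences survive the filter, the augmentation can be re-run with a different random seed or a higher change rate until enough new sentences are collected.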