Thank you. I boosted accuracy by almost 17% by applying these data augmentation techniques to a multi-class text classifier use case.
Srini’s welcome
Thanks Aarohi, helpful contribution.
Glad it is helpful!
How do I apply augmentation to a multi-class classification problem?
Thank you so much madam🙏🙏🙏, received 🌹🌹🌹
Most welcome 😊
Can we use this method for an NER dataset?
Thank you, you explained it in a basic, simple way.
Welcome
Can I use textaugment for the French language?
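Yes, you can, with one caveat: the synonym lookup relies on NLTK's WordNet, which defaults to English, so French words may come back unchanged.
from textaugment import EDA
# Initialize the EDA (Easy Data Augmentation) object
eda = EDA()
# Sample French text
french_text = "Je suis en train d'apprendre."
# Augment the text using synonym replacement
# (n is the number of words to replace; alpha_sr belongs to the original EDA script, not to textaugment)
augmented_text = eda.synonym_replacement(french_text, n=1)
print("Original text:", french_text)
print("Augmented text:", augmented_text)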
Is data augmentation used for regression models (numerical data)?
Yes, data augmentation can be used for regression models, but it's more commonly associated with image and text data.
For example, you can add small random noise to the numerical features, which can help the model generalize better.
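The noise idea above could look roughly like the sketch below. It is only an illustration, not code from the video: the function name `add_gaussian_noise` and the parameters `noise_scale` and `copies` are made up here, and scaling the noise by each column's standard deviation is an assumption about what "small" means.

```python
import random
import statistics


def add_gaussian_noise(rows, targets, noise_scale=0.01, copies=1, seed=0):
    """Return the original rows plus `copies` noisy versions of each row.

    Assumption: the noise for a column is Gaussian with standard deviation
    noise_scale * (that column's standard deviation). Targets are copied
    unchanged for the noisy rows.
    """
    rng = random.Random(seed)
    # Per-column standard deviation, so noise is proportional to feature scale.
    col_std = [statistics.pstdev(col) for col in zip(*rows)]

    aug_rows = [list(row) for row in rows]
    aug_targets = list(targets)
    for _ in range(copies):
        for row, target in zip(rows, targets):
            noisy = [v + rng.gauss(0.0, noise_scale * s) for v, s in zip(row, col_std)]
            aug_rows.append(noisy)
            aug_targets.append(target)
    return aug_rows, aug_targets


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
targets = [1.0, 2.0, 3.0]
X_aug, y_aug = add_gaussian_noise(rows, targets, copies=2)
print(len(X_aug), len(y_aug))
```

In a real project the same thing is usually done with NumPy arrays, and the noise should be added to the training rows only.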
Thanks, great explanation. I'm new to NLP, and I'm working on a short-text fake news project. I have 5,000 real and 15,000 fake examples, so I need to increase the real news. I plan to use one of these methods to improve my model. Which approach would you recommend?
Very good presentation and informative too, for people in NLP.
How will you implement these 4 methods of EDA on datasets?
Thanks for appreciating my work.
You can put the whole augmentation code in a function and then call it wherever you want.
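A minimal sketch of "put it in a function", assuming hand-written versions of two of the four EDA operations (random swap and random deletion), since those need no external word lists. The names `random_swap`, `random_deletion` and `augment_sentence` are defined here for illustration and are not the textaugment API.

```python
import random


def random_swap(words, rng):
    """Swap two randomly chosen positions (unchanged for fewer than 2 words)."""
    if len(words) < 2:
        return list(words)
    i, j = rng.sample(range(len(words)), 2)
    out = list(words)
    out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, rng, p=0.1):
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment_sentence(sentence, seed=None, p_delete=0.1):
    """Return two augmented variants of `sentence`: one swap, one deletion."""
    rng = random.Random(seed)
    words = sentence.split()
    return [
        " ".join(random_swap(words, rng)),
        " ".join(random_deletion(words, rng, p_delete)),
    ]


print(augment_sentence("the quick brown fox jumps", seed=1))
```

Once the logic lives in one function like this, it can be called from a loop over a dataset or from a data-loading step.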
@@CodeWithAarohi Thank you
Suppose I have a dataset of 12.5k rows and need to apply EDA to it.
Do I need to apply all four operations to each row using a function and then combine the results with the original 12.5k rows?
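One way to read this question is sketched below: keep every original row and append the augmented variants with the same label. This is an assumption about the workflow, not something confirmed in the thread. `expand_dataset` and `swap_first_two` are hypothetical names, and `swap_first_two` is only a placeholder for whichever augmentation function is actually used.

```python
def expand_dataset(texts, labels, augment_fn):
    """Return the original rows followed by augmented rows with the same labels.

    augment_fn takes one text and returns a list of augmented texts.
    """
    all_texts, all_labels = list(texts), list(labels)
    for text, label in zip(texts, labels):
        for new_text in augment_fn(text):
            all_texts.append(new_text)
            all_labels.append(label)
    return all_texts, all_labels


def swap_first_two(text):
    """Placeholder augmentation: swap the first two words of the text."""
    words = text.split()
    if len(words) >= 2:
        words[0], words[1] = words[1], words[0]
    return [" ".join(words)]


texts = ["good movie overall", "bad plot"]
labels = [1, 0]
new_texts, new_labels = expand_dataset(texts, labels, swap_first_two)
print(new_texts)
print(new_labels)
```

With a function that returns four variants per row, 12.5k rows would grow to 12.5k + 4 × 12.5k = 62.5k rows before any duplicates are removed.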
Join My Channel for Additional Benefits : Click on JOIN Button and choose the Membership Plan on the basis of benefits or Perks which I am offering to Members.
This is all good, but could you also show how to use the text after augmentation? I clean and spell-correct my data before training, so most of these techniques would only increase my workload.
Also, when we train embeddings, any of these augmentations other than synonym replacement would train the embeddings incorrectly, since embeddings rely on nearby words to learn similarity.
So, kindly take a classification task and show its usage.
I will make a video on that soon.
Could you please make a video on DCGAN using a dataset loaded from Google Drive?
I will try to make it
Thank you, madam, for this very informative video. However, I have a question. I'm new to NLP, and I'm working on a multi-class text classification project. I plan to use this method to improve my model. For this, I want to use it to increase my training and testing datasets. Is this a good approach?
Yes, this is a good approach: data augmentation increases the amount of data, and a model that learns from a bigger dataset will usually give better results.
@@CodeWithAarohi Thank you for your reply. But is it also necessary to increase the test data?
@@suybi2006 There is no need to increase the test data. We augment the training data because we want the model to be trained on more examples. For testing, you can use less data.
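A rough sketch of that advice: split first, then augment only the training part, so the test set stays untouched and no augmented copy of a test sentence ends up in training. The function name `split_then_augment`, the 20% test fraction, and the `augment_fn` argument are assumptions made for illustration.

```python
import random


def split_then_augment(texts, labels, augment_fn, test_fraction=0.2, seed=0):
    """Shuffle, hold out a test set, and augment the training rows only.

    Returns (train, test) as lists of (text, label) pairs.
    """
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    # The test set contains original rows only.
    test = [(texts[i], labels[i]) for i in test_idx]

    # Each training row is kept and followed by its augmented variants.
    train = []
    for i in train_idx:
        train.append((texts[i], labels[i]))
        train.extend((variant, labels[i]) for variant in augment_fn(texts[i]))
    return train, test


texts = [f"sample text number {i}" for i in range(10)]
labels = [i % 2 for i in range(10)]
train, test = split_then_augment(texts, labels, lambda t: [t + " again"])
print(len(train), len(test))
```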
@@CodeWithAarohi Thank you very much
Well prepared collection of augmentation techniques. Unfortunately you sometimes repeated your words and stretched the video to a length way longer than necessary. There is some space for improvement. Still enjoyed the video
Thank you =)
Thank you for the feedback. I will work on it, and I'm glad you still liked the video.
It's giving me a "No module" error.
No module means that module is not installed. Install it using pip.
@@CodeWithAarohi Sorry, I finished the project yesterday and forgot to update my comment.
Thank you so much arohi❤
@@CodeWithAarohi When I run the augmentation, I get a lot of duplicate sentences. Is there a method that increases the dataset with unique sentences only, not duplicates?
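One possible way to avoid duplicates, sketched under assumptions: generate candidates in a loop and keep a candidate only if it has not been seen before. Random swap is used here as a stand-in for any augmentation, and `unique_augmentations`, `n_wanted` and `max_tries` are hypothetical names, not part of textaugment.

```python
import random


def unique_augmentations(sentence, n_wanted, seed=0, max_tries=100):
    """Return up to n_wanted distinct random-swap variants of `sentence`.

    Variants equal to the original or to an earlier variant are skipped.
    Short sentences may yield fewer than n_wanted results.
    """
    rng = random.Random(seed)
    words = sentence.split()
    results = []
    if len(words) < 2:
        return results

    seen = {" ".join(words)}  # treat the original as already seen
    for _ in range(max_tries):
        if len(results) >= n_wanted:
            break
        i, j = rng.sample(range(len(words)), 2)
        candidate = list(words)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        text = " ".join(candidate)
        if text not in seen:
            seen.add(text)
            results.append(text)
    return results


for variant in unique_augmentations("data augmentation creates new training examples", 5):
    print(variant)
```

For duplicates that are already in a combined dataset, `list(dict.fromkeys(texts))` drops repeated strings while keeping the original order.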