I love the fact you're smiling like a genuinely happy individual. Good vibes. We need more of that in the world.
That's because you can't see the gun behind him...
@10xFrontend Cuz he's using Happying Face...
He gets a positive score of around 0.9434.
He’s smiling because he knows his accent is horrific
He covered every detail for a beginner, and it's great.
I can't figure out on my own how to save the chat history for a conversation
By the time I'm watching this video, the total model count is 749,434. Insane!!
Good sir, you have genuinely answered the question that was bugging me for the past two months. I've had no trouble training models using the Hugging Face Trainer API but never knew what to do afterwards, and the course is extremely ambiguous about this. I am not a very good student generally. You have saved me.
Thank you for this introduction! The Finetuning part at the end was a little fast, so I'll be checking out your other videos to learn more. Thanks again.
This is way better than what the AWS guys do...
I like your clear voice, easy explanation and smile.😄
Excellent tutorial. Can't believe this only has 3.7k likes. I recommend typing the code out in VS Code (or another editor) and running it in parallel while watching the video.
I've been trying to figure this out for days and in the first couple minutes of the video you helped me a lot.
Thanks for making this task more inviting. Now I definitely want to know the details of how to do the fine-tuning on custom content.
This is a very helpful introduction! Thank you so much for putting it up!
Glad it was helpful!
You are the best; in a few minutes I got more than from the previous 15-20 videos.
This is a great resource right here! Feels like I just got some new super powers and can tackle a whole bunch of Natural Language problems. Thanks a lot
Thanks for the tutorial!
In only 1 year we went from 34K models available to almost 300K models, almost a 10x increase! 😮
Thank you for this quick tutorial! It's exactly what I was looking for :)
love the fact you are so positive!!!! keep it up!!
Totally appreciate this, thank you - had to install tensorflow==2.13.0 and keras==2.13.1 - worked great
Very helpful tutorial! Many thanks for it!
By the way, you speak very clear English and that is extremely helpful for me as a 'non-native speaker' 😊
Still the best intro to HF in my opinion!
PS: would love to know which theme is used in VS Code, it's great.
I think it is Night Owl
Simple and straight to the point!
It's the best tutorial I've ever seen. Thanks!
Awesome summary / introduction. Thank you so much!
But what are the system requirements? Do you need a GPU? How powerful a computer do you need?
I'm watching this video in October 2023 and it's so cool to see that now there are 376,348 models
This is awesome. Thanks for the excellent walk through transformers library
Great and concise explanation! Thank you so much!
Exactly what I was looking for. Thank you for sharing!
Thanks for using VS Code, it's a much better feeling for me
Thanks for the very useful video; in 15 minutes you explain what others would take 5 hours to do.
Thank you so much for the tutorial. It was very helpful. The explanation was precise and concise!
Thank you so much for putting together the important things in an easy way. Waiting for more such videos from you.
The eyebrows are talking in their own language
😅
You mean German?
Actually the eyebrows are communicating in Morse code, and the message is: I'm being held hostage, help me escape!
Hahaha I love YouTube comments
yeah, what the hell is going on here
This is fantastic! I hope you can make more videos covering other pipelines in the Hugging Face Transformers package. Many thanks! 😀
Just found you. So grateful
Two questions: I get a warning about "right padding" and don't know how to fix it.
I was trying to use conversational pipelines, but why are they such trash? 50% of the time they just repeat themselves, and I can't do anything about it.
Send help.
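Not the video author, but for anyone hitting the same two issues, here's a minimal sketch of the usual fixes: decoder-only models want left padding during generation, and sampling plus a repetition penalty helps with the looping. The model name and parameter values below are illustrative, not from the video.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Illustrative model; any decoder-only dialogue model works similarly.
name = "microsoft/DialoGPT-medium"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Decoder-only models should be padded on the LEFT for generation,
# which is what the "right padding" warning is complaining about.
tokenizer.padding_side = "left"
tokenizer.pad_token = tokenizer.eos_token

inputs = tokenizer("Hello, how are you?" + tokenizer.eos_token, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,          # sampling instead of greedy decoding
    top_p=0.9,               # nucleus sampling
    repetition_penalty=1.2,  # discourages the model from repeating itself
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```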
Your eye blinking, that's awesome 😊😊
Amazing explanation. Thanks for sharing your knowledge!
If we can do all these things (like training 13:57 and inference 1:50) without TensorFlow/PyTorch, as shown in the video, why do we even need these frameworks?
Just for the record, it is now October 2024, 2 years after this video, and the number of models on HF is not 35,000 anymore. It's more than 1 million.
When should you train a model with a plain PyTorch loop, and when should you use the Trainer API? Could you give me a scenario?
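A rough rule of thumb, not from the video: the Trainer API covers the standard supervised fine-tuning scenario, while a manual PyTorch loop is for when you need custom behavior (unusual losses, multiple optimizers, custom schedules). A self-contained toy sketch of the manual-loop side, with the model name and data as placeholders:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "distilbert-base-uncased"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

# Toy data, just to make the loop self-contained.
enc = tokenizer(["great movie", "awful movie"], padding=True, truncation=True, return_tensors="pt")
labels = torch.tensor([1, 0])

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    outputs = model(**enc, labels=labels)  # HF models return the loss when labels are passed
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```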
3:01 I understood up to this point, the first example 🤗
What is the best natural language model for answering multiple-choice questions in mathematics? ChatGPT is not so good at this...
Bro gives off good vibes
Thanks! Clear and informative
What an awesome introduction! Thanks a lot mate! 💪
This is a superb tutorial. Many thanks.
complete and short. awesome.
Does the fine-tuning part still work, or do I need to make any changes?
I'm running Windows on my PC; is there any problem downloading Linux and the other code you described?
You're the best teacher
Thank you :)
Great explanation, thanks!
What is this vscode theme? Readability is awesome
I think this was the Night Owl theme
Hey! Thanks for the great tutorial. How can you make the AI remember the full conversation, so it remembers the outputs it gave? For example, it says its name is Andrew, but then says its name is Anna when asked again.
Sorry if it's a dumb question :D
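Not a dumb question at all. The model itself is stateless, so the usual trick is to keep the whole conversation and re-send it every turn. A minimal sketch of that idea; the model name is just an example:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "microsoft/DialoGPT-medium"  # example dialogue model
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

history_ids = None
for user_text in ["Hi, my name is Andrew.", "What is my name?"]:
    new_ids = tokenizer.encode(user_text + tokenizer.eos_token, return_tensors="pt")
    # Append the new turn to everything said so far, so the model sees the full conversation.
    input_ids = new_ids if history_ids is None else torch.cat([history_ids, new_ids], dim=-1)
    history_ids = model.generate(input_ids, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)
    # Decode only the model's new reply, not the accumulated history.
    print(tokenizer.decode(history_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```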
So first you install PyTorch, then you type pip install transformers into the terminal, and then you get to the next part of the tutorial by figuring it out yourself? What's even going on?
Can I use these transformers for commercial use?
ModuleNotFoundError: No module named 'transformers'
How can I fix this?
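That error just means the package isn't installed in the environment your notebook or script actually uses. A quick sanity check, assuming pip and your kernel point at the same Python:

```python
# Run in the same environment as your script/notebook:
#   pip install transformers
# Then verify the import resolves:
import sys
print(sys.executable)          # confirms which Python interpreter is active

import transformers
print(transformers.__version__)
```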
A tutorial for people who already know Transformers, Pipeline, Tokenizer, and Models :)
Please, can you tell me how to use Core ML models and how to download them?
Just found you. So happy rn LOL
Thanks for the clearly explained tutorial!
What software are you using? I love the interface, it's so streamlined. I'm using Jupyter notebooks.
It's OK, I found it: Visual Studio. Thanks 🙂
thank you so much
I STILL, despite all these "daily advancements", have yet to find one that can handle this particular use case, all in one go. Can anyone solve this?:
Preprocess the markdown files:
Tokenize the text.
Remove stop words.
Apply TF-IDF (Term Frequency-Inverse Document Frequency) to identify significant words and phrases.
Apply deep learning techniques:
Utilize deep learning algorithms like RNNs (Recurrent Neural Networks) and word embeddings.
Leverage attention mechanisms and transformer-based models.
Use pre-trained language models:
Consider using pre-trained models such as BERT (Bidirectional Encoder Representations from Transformers) or GPT (Generative Pre-trained Transformer).
Fine-tune the models:
Train the pre-trained models on your specific dataset to improve their performance.
Evaluate the generated summaries:
Use metrics like ROUGE (Recall-Oriented Understudy for Gisting Evaluation) to assess the quality of the summaries.
Iterate and refine:
Continuously experiment and adjust the model architecture and hyperparameters based on feedback.
Ensure computational resources:
Allocate sufficient computational resources, such as GPUs (Graphics Processing Units), for efficient training and inference.
All to achieve this desired output:
To take 2 million words documented for one single project and extract out of it all the cross-references into a 10K-word transcript equivalent. Am I really the only person with such a demand and no supply? lol... I have been searching and searching, but it seems I am indeed both in no man's land and the pioneer of an undiscovered continent.
How come I didn't see any label data used for training or fine-tuning in your examples?
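For context: with the Trainer API the labels usually travel inside the dataset rather than being passed separately, which makes them easy to miss. A hedged sketch of where they live; the dataset and model names are illustrative:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# The "label" column of the dataset is what supervises training.
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
args = TrainingArguments(output_dir="out", num_train_epochs=1)

# Trainer picks up the "label" column automatically and computes the loss from it.
trainer = Trainer(model=model, args=args, train_dataset=tokenized["train"])
trainer.train()
```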
do you have a repo for the code shown in the video?
Is there a pipeline for translating a text into a dense vector representation?
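There is: the feature-extraction pipeline returns hidden states that you can pool into one dense vector. A minimal sketch; mean pooling is one common choice, not the only one:

```python
import numpy as np
from transformers import pipeline

extractor = pipeline("feature-extraction", model="distilbert-base-uncased")

# Output shape: [1, num_tokens, hidden_size]; mean-pool over tokens
# to get a single fixed-size sentence vector.
hidden_states = np.array(extractor("Hello world"))
sentence_vector = hidden_states[0].mean(axis=0)
print(sentence_vector.shape)  # (768,) for distilbert
```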
If you do it the way it's done here, you will need to install tf_keras as well.
Great video!
I enjoyed the video, but perhaps need a "100-level" one with specific use cases and how you would plan out your development and configuration to tackle that use case. :)
Very cool. It's on my watch list.
Awesome video, thanks so much!
This is a brilliant video!!!
Hi, I want to use a tokenizer along with some other AI models (not decided yet). I have a set of text files (source code). They are already classified for multiple attributes, e.g. file A has x and y true, file B has x true and y false. Regarding that, I have two questions. 1. I want to tokenize the file contents, but in my case not single words but groups of words (such as groups of 3 or 4 tokens) make sense. So I want to tokenize in a way that groups those sets of words, if that is possible. 2. Each file has a different length, and thus a different number of tokens. How can I use these sets of merged tokens for prediction? I mean, what is the best AI technique to use afterwards to prepare a model for prediction?
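Not an authoritative answer, but one common pattern covering both questions: have a pretrained model produce a fixed-size embedding per file, with mean pooling absorbing the varying lengths, then train one ordinary classifier per attribute on those vectors. The token groups you describe sound like n-grams; a transformer's self-attention already captures local word groups, so explicit n-gram tokenization is often unnecessary. A sketch under those assumptions, with hypothetical data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from transformers import pipeline

extractor = pipeline("feature-extraction", model="distilbert-base-uncased")

def embed(text):
    # Mean-pool token embeddings -> one fixed-size vector regardless of file length.
    return np.array(extractor(text))[0].mean(axis=0)

# Hypothetical data: file contents plus a boolean label for attribute x.
files = ["int main() { return 0; }", "print('hello')"]
labels_x = [True, False]

X = np.stack([embed(f) for f in files])
clf_x = LogisticRegression().fit(X, labels_x)   # one classifier per attribute
print(clf_x.predict(X))
```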
Can you please help explain what to do if you have macOS and your Jupyter Notebook keeps crashing when you try to import pipeline from transformers? The kernel keeps dying.
I hope one day you will fix your pip install problems.
The best explanation
Glad you think so!
How do you do text generation without the pipeline? I'm not sure why everyone everywhere is doing sentence classification.
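A minimal sketch without the pipeline wrapper, using GPT-2 as a stand-in model:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Tokenize the prompt, generate new tokens, decode back to text.
inputs = tokenizer("Once upon a time", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```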
what app did you switch to one minute in?
Nice summary! Thanks!
Great intro video!
It throws an SSL certificate error when I run the commands in Python. Please help!
I am using a gated model and getting an authorization error in JS. Can you please suggest some resources for adding authorization?
Thanks for this video. But around the 11:32 timestamp, why do we save the tokenizer as well? Isn't it a static module that we can simply import again?
Tokenizers can be rather complicated; in many cases they're learned/trained, so you'd want to save them after fine-tuning (and just to have them offline, I reckon).
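In code terms it's the same save/load pair as the model, so the exact tokenizer (vocabulary and special-token config included) travels with your checkpoint; the path below is illustrative:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

save_dir = "my-finetuned-model"          # illustrative path
tokenizer.save_pretrained(save_dir)      # vocab and special-token config
model.save_pretrained(save_dir)

# Later, both reload from the same directory, fully offline:
tokenizer = AutoTokenizer.from_pretrained(save_dir)
model = AutoModelForSequenceClassification.from_pretrained(save_dir)
```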
Dude I so love you❤ for making such useful videos😊😊😊
It says that no model was supplied and it's using a default model. Did it actually download that default model, or did it use an API to connect to it remotely? If it's the latter, is that free of charge?
great video! Thank you
Great stuff, thank you.
Can I use PyCharm instead of VS Code?
love ur vids man
Great video! Do you know how I would be able to view a classification report and confusion matrix of a sentiment analysis pipeline?
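One way, assuming you have ground-truth labels for your texts: run the pipeline, collect its predicted labels, and hand both lists to scikit-learn. A sketch with hypothetical data:

```python
from sklearn.metrics import classification_report, confusion_matrix
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

# Hypothetical evaluation data with known labels.
texts = ["I loved it", "Terrible experience", "Pretty good overall"]
y_true = ["POSITIVE", "NEGATIVE", "POSITIVE"]

# The pipeline returns a dict per text; keep only the predicted label string.
y_pred = [result["label"] for result in classifier(texts)]

print(classification_report(y_true, y_pred))
print(confusion_matrix(y_true, y_pred, labels=["NEGATIVE", "POSITIVE"]))
```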
outstanding, Thank You
When I import pipeline, my kernel crashes. Any suggestions? I pip-installed transformers in a virtual environment.
me too! :( I don't know what to do
really useful intro, thank you!
How do you get VS Code to show the completions (in a Python virtual environment)?
Thank you very much for the very informative video and the great examples.
Great lecture. Thank you for sharing your knowledge.
Do you run this in Visual Studio?
What is the app you are using to code?
one level of abstraction inside 6:07
tokenization 8:22
You're really awesome, keep going!