Timestamps:
00:00:00 Advanced Preprocessing
00:00:18 Part-of-Speech (PoS) Tagging
00:01:06 Uses of PoS tags
00:02:24 Named Entity Recognition (NER)
00:03:17 Uses of NER tags
00:04:08 The challenges of NER
00:04:57 PoS- and NER-tagging as sequence labelling tasks
00:07:30 Constituency parsing
00:10:24 Dependency parsing
00:12:22 Uses of parsing
00:13:46 Which parsing approach to use
00:14:14 DEMO: advanced preprocessing with spaCy
00:21:11 Preprocessing recap
Watched lots of content on YouTube for NLP, but this series is the BEST. Concise and to-the-point!
Your series is a life-saver. You're an EXCELLENT teacher, and your voice is radio-like. Thanks so much!
Thank you, Davy!
the BEST series in the NLP domain
This series is incredible! I can't believe we get to access such content for free online... what an era
This series is excellent! I'm glad I came across this series while searching for TF-IDF & cosine similarity related materials.
the best NLP course on YouTube, at least to me!! Thanks so much.
With your great NLP lectures, I understand a lot of NLP concepts.
Great lectures.
Amazing video. Thank you so much for these videos. I hope you will make more videos on other concepts in deep learning. By the way, I do think your teaching this has made your speech on point as a by-product. Like others, I had doubts that your voice might have been generated, but I picked up some words you say that sounded Canadian, am I right hahah. Thank you and I look forward to more videos!
LMAO which words gave me away?
Words like "about" haha. And words containing "O". Maybe I've been watching Robin from HIMYM too much lol.
Great video (and great course). Thank you, thank you, thank you.
Nitpick at 4:25
Hamilton was never a president. He was "the ten-dollar founding father without a father", but was never a president
LOL
OMFG of all the things I didn't google LOL
Hello Sir!
Your course is great!
Please suggest a course on Generative AI so we can learn those concepts as easily as we are learning them in this course!
Very grateful Sir!
Thank you!
Have you found any good course, bro?
Which software did you use to generate that voice?
What makes you think it's software?
@futuremojo It's so clean, no disturbance at all, and it sounded a bit artificial to me. That's why I had this doubt. Btw your explanatory skills are amazing. Hope you make more technical videos on LLMs, fine-tuning, or transfer learning.
Why is there no 'end of entity' tag? I'm sure some might say it's redundant and unnecessary, because when you come to the 'O' you are *obviously* at the end of the entity. But it is just as possible that the 'O' is a mistake, especially in a long multi-word name. An end tag would be both more explicit and eliminate any ambiguity. But maybe that's just me...
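For what it's worth, tagging schemes with an explicit end tag do exist: BIOES (also written BILOU/BILUO, which is roughly the scheme spaCy works with internally) adds an E/L tag for the last token of an entity and an S/U tag for single-token entities. A minimal sketch in plain Python, with made-up tokens and entity labels purely for illustration:

# Contrast the common IOB2 scheme with BIOES, which marks the end of each
# entity explicitly. The tokens and labels below are illustrative only.
tokens = ["Alexander", "Hamilton", "wrote", "the", "Federalist", "Papers"]
iob2   = ["B-PER",     "I-PER",    "O",     "O",   "B-WORK",     "I-WORK"]

def iob2_to_bioes(tags):
    """Re-tag IOB2 so the last token of every entity gets an explicit E (or S) tag."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            bioes.append(f"B-{label}" if continues else f"S-{label}")
        else:  # prefix == "I"
            bioes.append(f"I-{label}" if continues else f"E-{label}")
    return bioes

print(list(zip(tokens, iob2_to_bioes(iob2))))
# [('Alexander', 'B-PER'), ('Hamilton', 'E-PER'), ('wrote', 'O'), ('the', 'O'),
#  ('Federalist', 'B-WORK'), ('Papers', 'E-WORK')]

The trade-off is exactly the one raised in the question: the extra tags make entity boundaries explicit and easier to validate, at the cost of a larger tag set for the model to learn.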
After transformers are used, are these preprocessing steps still very useful?
It depends on what you want to do. The transformer is a model architecture. It's not something that automatically takes care of NLP tasks for you end to end.
When you load a transformer-based model using a library from Hugging Face, the library takes care of things like tokenization under the hood (and you can customize which tokenizer it uses).
When you want to use an LLM to embed a document, you might need to do certain types of preprocessing depending on the LLM's context length or what you're trying to accomplish (e.g. you might need to enrich your data).
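As a rough illustration of the tokenization point above (a hedged sketch assuming the Hugging Face transformers package and the bert-base-uncased checkpoint, neither of which is named in this thread):

# Minimal sketch: the library downloads the tokenizer that matches the model
# checkpoint and handles lowercasing, subword splitting, and special tokens.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("Alexander Hamilton was never a president.")

# Inspect what the model will actually see: subword tokens plus [CLS]/[SEP].
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
print(encoded["input_ids"])

Swap in a different checkpoint name and the same two calls still work, which is what "the library takes care of tokenization" means in practice.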
@futuremojo Thank you so much. When I learned Hugging Face, it skipped a lot of the concepts you mentioned. After your explanation, I understand now: HF provides functions that take care of these in their tokenization process.
Machine learning is a niche topic, sure, but how does this have so few views?
Thanks for the comment. I think it's just the search patterns. The three most-viewed videos in this series are:
- The first introduction video. This is probably people looking for introductions to NLP.
- Neural networks from scratch. This is probably because of deep learning hype and people wanting to learn fundamentals.
- Transformers. Self-explanatory.
None of these are surprising. I think the people most interested in NLP are the ones who go over the rest.
If you look at the most popular NLP-related videos now, they're about prompt engineering, LangChain, etc. Whatever's current and applied rather than foundational.
I have no idea! It beats me as well! The views on these courses should be running into millions!