Multimodal AI from First Principles - Neural Nets that can see, hear, AND write.

Neural Breakdown with AVB

Views: 9,696

Published on 10 Nov 2024

Comments • 25

@thegigasurgeon several months ago ⁺¹
Very clear survey of multimodal models. Everything under one roof. Great work.
@avb_fj several months ago
Thanks a lot!
@madsfrederiksen6213 a year ago ⁺⁴
Great and clear video! Heard about multimodal models for the first time today, and I already feel like I have a better grasp of it, thanks to you :)
@boogati9221 5 months ago ⁺³
Dude this video was so fucking good. Keep it up.
@xxlvulkann6743 3 months ago ⁺¹
This was a useful summary for finding papers to research developments in multimodal machine learning models!
@avb_fj 3 months ago
Thanks! Super glad you found the video useful!
@meet_minimalist 10 months ago ⁺²
Excellent video with all the paper references. A lot to read and learn from the papers. Thanks. :)
@avb_fj 10 months ago
Thanks!🙏🏽
@joshuatettey7771 3 months ago ⁺¹
Awesome video. Thanks mate🤩
@tomm9716 3 months ago
Really good stuff mate, subbed
@syoyazhou8657 a year ago ⁺¹
Like your videos. You explain things in a very clear way. Thx for sharing.
@avb_fj a year ago
Thank you!
@xspydazx 7 months ago
CODE IS BETTER ??
from transformers import VisionEncoderDecoderModel, VisionTextDualEncoderProcessor, AutoImageProcessor, AutoTokenizer
print('Add Vision...')
# ADD HEAD
# Combine pre-trained encoder and pre-trained decoder to form a Seq2Seq model
Vmodel = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k", "LeroyDyer/Mixtral_AI_Tiny"
)
# Note: despite their names, these hold the vision encoder and the language-model decoder modules
_Encoder_ImageProcessor = Vmodel.encoder
_Decoder_ImageTokenizer = Vmodel.decoder
_VisionEncoderDecoderModel = Vmodel
# Attach the combined model (LM_MODEL is assumed to be an already-loaded language model; it is not defined in this snippet)
LM_MODEL.VisionEncoderDecoder = _VisionEncoderDecoderModel
# Add Sub Components
LM_MODEL.Encoder_ImageProcessor = _Encoder_ImageProcessor
LM_MODEL.Decoder_ImageTokenizer = _Decoder_ImageTokenizer
LM_MODEL
This is how you add vision to an LLM (you can embed the head inside).
from transformers import AutoFeatureExtractor, AutoTokenizer, SpeechEncoderDecoderModel
print('Add Audio...')
# Add Head
# Combine pre-trained encoder and pre-trained decoder to form a Seq2Seq model
_AudioFeatureExtractor = AutoFeatureExtractor.from_pretrained("openai/whisper-small")
_AudioTokenizer = AutoTokenizer.from_pretrained("openai/whisper-small")
_SpeechEncoderDecoder = SpeechEncoderDecoderModel.from_encoder_decoder_pretrained("openai/whisper-small", "openai/whisper-small")
# Add pad tokens
_SpeechEncoderDecoder.config.decoder_start_token_id = _AudioTokenizer.cls_token_id
_SpeechEncoderDecoder.config.pad_token_id = _AudioTokenizer.pad_token_id
# LM_MODEL is assumed to be an already-loaded language model; it is not defined in this snippet
LM_MODEL.SpeechEncoderDecoder = _SpeechEncoderDecoder
# Add Sub Components
LM_MODEL.Decoder_AudioTokenizer = _AudioTokenizer
LM_MODEL.Encoder_AudioFeatureExtractor = _AudioFeatureExtractor
LM_MODEL
This is how you can add vision:
@AI_ML_DL_LLM a year ago ⁺¹
Wow, there is a lot of work behind it, thank you.
@avb_fj a year ago
Haha thanks for the comment! It’s an emerging area, and a lot of groundbreaking research really has happened in the past few years.
@ahmed_hefnawy1811 8 months ago ⁺¹
Excellent
@vobbilisettyveera2973 a year ago ⁺¹
awesome!!!!!!!!!!
@IsmailIfakir several months ago
Some multimodal LLMs can be fine-tuned for sentiment analysis.
@420_gunna 8 months ago
7:55 lol
@avb_fj 8 months ago
Honest reactions lol😅
@deliciouspops a year ago
Do you think you should tune your audio levels or what? According to YouTube, I am your 666th view.
@avb_fj a year ago
Always open for feedback. What kind of tuning are we talking about?
@avb_fj a year ago
@LonewolfeSlayer Sounds good... something to keep in mind for my next one. :)
