Thanks for reporting on, explaining, and opening up recent ML!
I found CLIP very interesting, since I always frowned at the lost potential of two different embedding spaces being arbitrary and methodologically separate. This is huge!
Yes, there will be plenty more on CLIP and other similar models very soon - some of the stuff I've built (and will demo) is awesome and is nothing more than zero-shot CLIP. Excited to share!
This was really excellent - some of the pieces are starting to make sense
Nice video and explanation! I think at 28:45 you plotted cos_sim instead of dot_sim!
Thank you so much for this great walkthrough! Looking forward to more
Great video! Looking forward to your next video diving more into using CLIP for zero-shot classification!
Me too, it's fascinating. Thanks for watching!
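For anyone curious what zero-shot classification with CLIP looks like in practice, here is a minimal sketch using the Hugging Face `transformers` CLIP model - the labels and image path are hypothetical placeholders, not from the video:

```python
# Zero-shot image classification with CLIP: score an image against
# a set of candidate text prompts, no task-specific training needed.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["a photo of a dog", "a photo of a cat", "a photo of a car"]
image = Image.open("example.jpg")  # hypothetical image path

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the image's similarity to each text prompt;
# softmax turns those similarities into class probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```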
This is amazing James. Thanks for the detailed explanation. I am excited for the future CLIP videos 🙂.
Thanks Ashraq! As you know, I'm excited for them too
10:23 I believe CLIP is an abbreviation of Contrastive Language-Image Pre-training
Great video. I think you may be plotting the same graph twice though (cos_sim). In practice the two seem almost identical anyway.
Thanks James, very good video about CLIP. Funny thing is that you display cos_sim twice, so the second time it is not dot_sim that is displayed. And you struggled to find any difference between the two similarity matrices. LOL 🤣
ah did I do that, oops 😅
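For anyone following along, a small sketch of the two similarity calls that got mixed up, assuming the sentence-transformers CLIP wrapper (the exact variable names in the video may differ):

```python
# Cosine similarity vs dot product on CLIP text embeddings.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")
texts = ["a dog in the snow", "a cat on a sofa", "a city skyline at night"]
emb = model.encode(texts, convert_to_tensor=True)

cos = util.cos_sim(emb, emb)    # cosine similarity matrix
dot = util.dot_score(emb, emb)  # raw dot product matrix

# The two differ only by the embedding norms; when norms are similar
# across inputs, the two heatmaps look nearly identical - which is
# part of why the duplicated plot was so hard to spot.
print(cos)
print(dot)
```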
Excellent explanation! We could build a YouTube video search engine powered by CLIP - perhaps you can iterate on the NLP YouTube search video you did?
That's a great idea, but it might be difficult for YouTube videos where it is just someone talking, as the image embedding would just be something like "a person talking"
Possibly it could be interesting to embed both the text + images with CLIP, and maybe even an averaged text+image embedding for parts of videos where both the speech + image are important (rough sketch below).
I will think about this more, it's a great idea so thank you!
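A rough sketch of that averaged text+image idea, assuming the sentence-transformers CLIP wrapper - the frame path and transcript snippet are hypothetical stand-ins for a real video segment:

```python
# Fuse a video segment's transcript and frame into one CLIP vector.
from PIL import Image
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("clip-ViT-B-32")

text_emb = model.encode("speaker explaining contrastive pretraining")
image_emb = model.encode(Image.open("frame.jpg"))  # hypothetical frame

# Simple fusion: average the two vectors - both live in CLIP's joint
# text-image space, so the mean is a reasonable combined representation.
segment_emb = (text_emb + image_emb) / 2
```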
Excellent video
Thanks. It is very informative. Can you please explain and teach us how to do fine-tuning on a custom dataset?
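This isn't covered in the video, but as a starting point, here is a minimal fine-tuning sketch on custom (image, caption) pairs using CLIP's built-in contrastive loss in `transformers` - the file paths and captions are hypothetical placeholders:

```python
# One gradient step of CLIP fine-tuning on image-caption pairs.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()

pairs = [("img0.jpg", "a red bicycle"), ("img1.jpg", "a bowl of soup")]
images = [Image.open(path) for path, _ in pairs]
captions = [caption for _, caption in pairs]

inputs = processor(text=captions, images=images,
                   return_tensors="pt", padding=True)
outputs = model(**inputs, return_loss=True)  # symmetric contrastive loss
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In a real run you would loop this over batches from a DataLoader and likely use a lower learning rate with warmup, since CLIP is easy to over-fit on small datasets.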
Thank you for the good explanation. If we have two different embedding types, like texts and 3D images, can we use CLIP to predict images?
Really liked the content...thanks for sharing
Thanks for watching!
Is there a hosted API for CLIP where you can provide your image data and get the vectors back, instead of having to host it yourself - kind of like how you give an input to `ada-002`?
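For comparison with the `ada-002` call pattern: if self-hosting turns out to be acceptable, a minimal local sketch that returns CLIP vectors for images (paths are hypothetical) looks like this:

```python
# Get one CLIP vector per local image, analogous to an embeddings API call.
from PIL import Image
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("clip-ViT-B-32")
vectors = model.encode([Image.open("photo1.jpg"), Image.open("photo2.jpg")])
print(vectors.shape)  # (2, 512): one 512-d CLIP vector per image
```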
Really great video! I have just one thing to say: you should leave the images on screen longer. I had to pause the video multiple times to be able to understand them.
Thanks Gabriel, I heard the same from another viewer - will do this going forwards :)
Nice explanation!
Thanks a lot!
Excellent content! As a suggestion, can you please keep the images/diagrams a bit longer? They move pretty fast in the video, which means I'll have to rewind the video every now and then.
Sure that’s great feedback, thanks!
fantastic stuff!
Please post Deep Reinforcement Learning tutorials & projects with Python!
Eventually I’m sure I will, RL is very cool
Can you share the GitHub code please?
The transitions are too flashy and hard on my eyes. Good explainer, however.
Too much talk and very few illustrations