Inside the LLM: Visualizing the Embeddings Layer of Mistral-7B and Gemma-2B

Chris Hay

Views: 6,663


Published on 28 Dec 2024

Comments • 33

@chrishayuk 9 months ago ⁺²
This is the GitHub repo: github.com/chrishayuk/embeddings
@sumandawnmobile 9 months ago ⁺³
It's a great video for understanding the internals through the visualization. Thanks Chris.
@scitechtalktv9742 9 months ago ⁺³
Fantastic video!
I am wondering: it would also be very interesting to visualize not only the static embeddings you already covered, but also the so-called contextualized embeddings in a later layer of the model. These are the embeddings that come out once the attention mechanism has mixed in the surrounding context, which is why they are also called dynamic embeddings.
It adds another layer of abstraction, but these are better embeddings because they can distinguish between homonyms: words that are spelled the same but mean something completely different in another context. A good example is the word “bank”, which can mean a financial institution, a river bank, and several other things. As a consequence, the word “bank” is represented by several different vectors in embedding space, depending on the context it is used in.
Telling these senses apart is known as Word Sense Disambiguation (WSD).
Would it be possible to visualize that too? I am curious...
@chrishayuk 9 months ago ⁺¹
Yep, you got what I'm doing... I'm literally walking the stack.
@chrishayuk 9 months ago ⁺²
So those videos will be coming.
@scitechtalktv9742 9 months ago ⁺¹
@chrishayuk Fantastic! Those embeddings are crucially important to how Large Language Models work!
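The static versus contextualized distinction raised in this thread can be illustrated with a toy sketch. This is not the code from the video or the repo: the 2-d vectors are invented, and `contextualize` is a hypothetical, weight-free stand-in for a single self-attention step. It only aims to show why "bank" keeps one vector in a lookup table but ends up with different vectors once its neighbours are mixed in.

```python
import math

# Hypothetical 2-d static embeddings, invented purely for illustration.
STATIC = {
    "river": [0.0, 1.0],
    "money": [1.0, 0.0],
    "bank": [0.5, 0.5],
}

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contextualize(tokens):
    """Simplified self-attention with no learned weights: each token's
    output is a softmax-weighted average of every static vector in the
    sentence, weighted by dot-product similarity to that token."""
    vectors = [STATIC[t] for t in tokens]
    out = []
    for query in vectors:
        weights = softmax([dot(query, key) for key in vectors])
        mixed = [
            sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(len(query))
        ]
        out.append(mixed)
    return out

static_bank = STATIC["bank"]                         # same vector in any sentence
bank_in_river = contextualize(["river", "bank"])[1]  # [0.25, 0.75]
bank_in_money = contextualize(["money", "bank"])[1]  # [0.75, 0.25]

print(static_bank, bank_in_river, bank_in_money)
```

In a real model the mixing uses learned query, key and value projections across many layers, so the effect is far richer than this averaging, but the direction of the shift is the same idea.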
@rajneesh31 6 months ago ⁺¹
Damn, thank you YouTube for recommending this channel. @chrishayuk is a gun. Thanks Chris
@chrishayuk 6 months ago
Very kind, glad you like the channel
@NERDDISCO 9 months ago ⁺⁴
This came at exactly the right time! Thank you very much! I was just trying to understand this. Now I know how it works ❤
@chrishayuk 9 months ago ⁺¹
Glad it was helpful!
@guaranamedia 6 months ago ⁺¹
Excellent explanation. Thanks for making these examples.
@chrishayuk 6 months ago
You're very welcome!
@johntdavies 9 months ago ⁺²
Great insight, thanks for posting this. It would be interesting to show how a fine-tuned model differs in similarities and "vocabulary". I'm also curious about the effects of quantisation (Q4, Q6, Q8, fp16, etc.) on the internal "workings" of the LLM. Thanks again.
@chrishayuk 9 months ago ⁺¹
It’s almost like you’re reading my roadmap
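As a rough illustration of the quantisation question above, the sketch below rounds a made-up embedding vector to 8, 6 and 4 bits and measures how far it drifts from the original. It assumes plain symmetric per-tensor quantisation, which is a simplification: the Q4/Q6/Q8 formats used by real runtimes are block-wise and differ in detail. The vector values and the `quantize_dequantize` helper are hypothetical.

```python
import math

def quantize_dequantize(values, bits):
    """Symmetric per-tensor quantisation: scale floats onto signed
    integers of the given width, round, then map back to floats."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return [q * scale for q in quantized], scale

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den

# Made-up embedding values, not taken from any real model.
embedding = [0.12, -0.85, 0.33, 0.07, -0.41, 0.99]

for bits in (8, 6, 4):
    restored, scale = quantize_dequantize(embedding, bits)
    max_err = max(abs(x - y) for x, y in zip(embedding, restored))
    print(f"{bits}-bit: cosine={cosine(embedding, restored):.4f} max_err={max_err:.4f}")
```

The per-element error is bounded by half the scale, so fewer bits means a coarser grid and a larger possible drift in similarity scores.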
@andypai 9 months ago ⁺¹
Thank you! Great video!
@chrishayuk 7 months ago
Thank you, glad it was useful
@khalilbenzineb 9 months ago ⁺²
I was playing a bit with fine-tuning to force an output schema for some 7B models, but lately I discovered schema grammar, which is a way to dynamically play with the EOS tokens by limiting them to a specific set of tokens, so the model generates the output you want. This is very stable and far more efficient in many cases where we might think fine-tuning is required. For me it felt like a new dimension for keeping the model's intentions in line. I love the unique and efficient way you create your videos, so I wanted to ask whether you could make a video for us about this. I feel it's very important.
@chrishayuk 9 months ago ⁺²
That's a good shout
@khalilbenzineb 9 months ago
Thx @chrishayuk
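The grammar idea in this thread is usually described as masking the next-token candidates at every decoding step so that only tokens the schema allows can be chosen. The sketch below is a minimal illustration under that assumption. The vocabulary, the logits and the hand-written `GRAMMAR_STEPS` table are all hypothetical, and it does not reproduce the API of any particular grammar library.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"name"', ":", '"Ada"', "hello", "<eos>"]

def constrained_pick(logits, allowed):
    """Mask every token outside `allowed` to -inf, then take the argmax."""
    masked = [
        logit if token in allowed else -math.inf
        for token, logit in zip(VOCAB, logits)
    ]
    best = max(range(len(VOCAB)), key=lambda i: masked[i])
    return VOCAB[best]

# Hand-written stand-in for a grammar: the set of tokens permitted at each
# step of a tiny {"name": <string>} schema. A real grammar engine derives
# these sets from the grammar and the text generated so far.
GRAMMAR_STEPS = [{"{"}, {'"name"'}, {":"}, {'"Ada"'}, {"}"}, {"<eos>"}]

# Pretend the unconstrained model always prefers "hello".
logits = [0.1, 0.2, 0.3, 0.0, 0.5, 9.0, 0.4]

output = [constrained_pick(logits, allowed) for allowed in GRAMMAR_STEPS]
print("".join(output[:-1]))  # {"name":"Ada"}
```

Without the mask the same logits would pick "hello" at every step, which is why this can stand in for fine-tuning when the only goal is a fixed output shape.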
@enlightenment5d 8 months ago ⁺¹
Good! Where can I find your programs?
@chrishayuk 7 months ago
In my GitHub repo: github.com/chrishayuk
@kenchang3456 9 months ago ⁺¹
Thanks, the visualization really helped me.
@chrishayuk 9 months ago ⁺¹
So glad, seeing it at a lower level really demystifies what's going on
@Memes_uploader 9 months ago ⁺¹
Thank you so much! Thank you YouTube algorithm for showing such a great video!
@chrishayuk 9 months ago
Glad you enjoyed it!
@gregherringer7700 9 months ago ⁺¹
This helps, thanks!
@chrishayuk 9 months ago
Glad it helped! :)
@lfzuniga31 9 months ago ⁺¹
based
