I think it's good enough, but what if it was uncensored?
Can you make a video comparing LLaVA 7B vs Llama 3.2 11B Vision? Please!
Not sure what it is with YouTubers and vision models. These are not the use cases for business or even pet projects. If we want OCR, we'll use OCR. What you want is to intelligently answer questions. E.g.:
What level is the water at in this reservoir? (The image should have a measuring stick in the water.)
How many boxes are in this image?
Does this look safe or dangerous?
Etc.
Oh interesting, those are good ideas. I suppose you could also do how many parking spaces are free, or something like that.
I do use ChatGPT for the first three examples, although admittedly not the last one comparing the footballers. This is the first open model (that I've tried) that pulls code out of an image - all the other ones I've tried start hallucinating at some point.
The YouTube thumbnail critique as well - I think that's more than just OCR?
I do sometimes use vision models to interpret graphs/charts and compare them to each other, but I haven't tried that with Llama 3.2 Vision.
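For anyone who wants to try this kind of image question themselves, here's a minimal sketch using the Ollama Python client, assuming you've pulled the llama3.2-vision model locally; the prompt and image path are just placeholders, not what was used in the video.

```python
# Minimal sketch: asking Llama 3.2 Vision a question about a local image.
# Assumes the `ollama` Python package is installed and the model has been
# pulled first, e.g. with `ollama pull llama3.2-vision`.
import ollama

response = ollama.chat(
    model="llama3.2-vision",
    messages=[{
        "role": "user",
        "content": "How many parking spaces are free in this image?",  # placeholder question
        "images": ["car_park.jpg"],  # hypothetical image path
    }],
)
print(response["message"]["content"])
```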
OCR still sucks. Always gets it wrong in real life. But good point!
Face identification is on the list of things the tech companies don't want you doing with the models. It ties into too many of the dystopian futures depicted in fiction.
I found most of the LLaVA models were able to identify famous people at least! I mean, it doesn't really matter, but it's interesting that they seem to censor it like this.
Facial identification would be moving into their lane. How do you think they make all their money with free products? You are the product. Google DeepFace for one example related to Meta. (Google will autocorrect it to deepfake, so search for DeepFace -deepfake.)
What is the hardware configuration you are using?
I have a Mac M1 Max with 64GB of RAM, which gets split between the GPU and CPU.
As for the model repeatedly failing to identify Ronaldo in the picture, perhaps lowering the temperature would be an idea? EDIT: I've tried playing with the temperature (same LLM and same image), but it doesn't seem to have a significant effect on the results (except when temp=0, of course). After several runs I'd say Ronaldo is identified about 33% of the time.
What prompts did you use for the 30% success? Did temp change any behaviour/success?
@RuairiODonnellFOTO The prompt I used was a double question, something like "Can you describe the picture? Who is the person depicted?" I've tried a few more runs and now I'd say it's less than 33%, maybe 10% success. Most of the time the model says it cannot provide names of people based on their photograph. As I said, changing the temperature doesn't seem to have a measurable effect on the answers.
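In case anyone wants to reproduce these numbers, here's a rough sketch of the repeat-run test, assuming the Ollama Python client and a local llama3.2-vision model. The prompt is the double question quoted above; the image path and the simple "ronaldo" substring check are my own assumptions.

```python
# Rough harness for estimating how often the model names Ronaldo at a given
# temperature. Counts a run as a hit if "ronaldo" appears in the reply.
import ollama

PROMPT = "Can you describe the picture? Who is the person depicted?"

def identification_rate(image_path: str, temperature: float, runs: int = 10) -> float:
    hits = 0
    for _ in range(runs):
        response = ollama.chat(
            model="llama3.2-vision",
            messages=[{"role": "user", "content": PROMPT, "images": [image_path]}],
            options={"temperature": temperature},
        )
        if "ronaldo" in response["message"]["content"].lower():
            hits += 1
    return hits / runs

for temp in (0.0, 0.4, 0.8):
    print(f"temperature={temp}: {identification_rate('ronaldo.jpg', temp):.0%}")  # hypothetical image
```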
It's weird that it can't pick him up. IIRC all the LLaVA models were able to identify him, and for every other example that I tried, Llama 3.2 Vision is better than LLaVA.
I wonder whether it's deliberately not identifying people. It's sometimes even reluctant to say anything at all about a photo, e.g. when I give it photos of myself.
@learndatawithmark I think you're right. It looks like they've done something during training (or after) that makes it behave that way. Obviously, it's not been a total success. No matter how well trained LLMs are, they are still hard to tame!
Has anyone tried the 90B model to see if it can name Messi or Ronaldo?
I think that would be insanely slow on my machine so I haven't tried it!
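If anyone with more memory wants to check, the only change to the earlier sketches should be pointing at the 90B tag (assuming the same Ollama setup); expect it to be far more demanding than the 11B.

```python
# Same kind of call as the 11B examples, just using the 90B tag
# (pull it first with `ollama pull llama3.2-vision:90b`).
import ollama

response = ollama.chat(
    model="llama3.2-vision:90b",
    messages=[{
        "role": "user",
        "content": "Who is the footballer in this photo?",  # placeholder question
        "images": ["ronaldo.jpg"],  # hypothetical image path
    }],
)
print(response["message"]["content"])
```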