You are underrated, Matt! They didn't sponsor you because they just wanted to get the people who are spewing hype! You go into such detail! Your content should be #1 in any Ollama tutorial.
I've tested the 3B parameter model and I'm very impressed with it. Speed and accuracy. I use it for improving descriptions in our catalog. And it looks like we don't have to pay OpenAI anymore. Thank you for your video!
The first really helpful video about llama 3.2! Thanks Matt!
Hi Matt, I upvoted as usual. Two notes:
Ollama HW resources calculations (proposal for a new Ollama video): In this video, you thankfully show how easy it is to set the context length in the model file, bypassing Ollama's default. How does the context length influence the RAM usage of the host? In general, it would be great to dedicate a video to hardware resource calculations based on model size, quantization, context size, and possibly other macro parameters. It would also be helpful to discuss how CPU, and especially GPU, can improve latency times (especially in a multi-user environment).
You mention "your" function call method. I know you've already done a video on this topic, but since it's very useful in practice, maybe you could create a new video with code examples (Python is welcome).
Other viewers: If you agree, please upvote my comment. Community thoughts are welcome!
Thanks again,
Giorgio
Agreed! A deeper dive into context size would be very helpful: HW resource allocations/calculations (even if "just" some heuristics), the impact of larger context size on in/out eval rates (again, if not accurate calculations then some heuristics), potential problems with changing context size, and what one needs to know about a model when doing this (e.g. what happens if you set a context size larger than what the model allows?).
Edit to add: I'd also be interested in a potential context size video touching on how context size affects/interacts with parallel requests, and how to set up multi-user environments that share the model's context size.
Thank you for an, as usual, excellent tutorial!
Cheers, Arthur
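On the heuristics question above, here is a very rough back-of-envelope sketch of how context size alone affects memory: the KV cache grows linearly with context. The Llama 3.2 3B config numbers below are my own assumptions and should be checked against the model card, and this ignores weights, activations, and runtime overhead.

    # Very rough KV-cache size heuristic -- not an exact figure.
    # The model config values are assumptions for llama3.2:3b; verify against the model card.
    n_layers = 28        # assumed number of transformer layers
    n_kv_heads = 8       # assumed grouped-query KV heads
    head_dim = 128       # assumed dimension per head
    bytes_per_elem = 2   # fp16 KV cache; a quantized KV cache would shrink this

    def kv_cache_gb(context_tokens: int) -> float:
        # 2x for keys and values, per layer, per KV head, per head dimension
        per_token_bytes = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
        return context_tokens * per_token_bytes / 1024**3

    for ctx in (2048, 8192, 32768, 131072):
        print(f"{ctx:>7} tokens -> roughly {kv_cache_gb(ctx):.2f} GB of KV cache, on top of the weights")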
@@arthurhjorth1490 Yes. Just one more note, about evaluation criteria. Matt and other people use a short list of "trusted" questions to evaluate a model. That's perfectly fine, and maybe it would be useful to have some way to automate the evaluation of a model, perhaps by scripting the list of questions and having an automatic way to evaluate the responses (by another LLM?). Just food for thought / a possible video topic for the future?
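A minimal sketch of that automated-evaluation idea using the ollama Python package; the question list, the judge model, and the pass/fail prompt are just placeholder assumptions:

    import ollama  # pip install ollama

    questions = ["Which number is bigger, 9.9 or 9.11?"]  # your "trusted" list goes here

    for q in questions:
        answer = ollama.chat(model="llama3.2:3b",
                             messages=[{"role": "user", "content": q}])
        answer_text = answer["message"]["content"]
        # Ask a second (ideally stronger) model to grade the response.
        verdict = ollama.chat(model="llama3.1:8b", messages=[{
            "role": "user",
            "content": f"Question: {q}\nAnswer: {answer_text}\n"
                       "Reply with only PASS or FAIL depending on whether the answer is correct.",
        }])
        print(q, "->", verdict["message"]["content"].strip())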
I love your content and learn every time I watch, thank you
Loving the Companion Plugin for Obsidian with llama3.2:1b. Completing my thoughts in the persona of William Shakespeare by changing the User prompt was a lot of fun.
Fair Companion App, thy wondrous might,
Doth streamline notes and tasks with ease and light.
In Obsidian vault, thou dost thy work impart,
With real-time updates, a digital heart.
Really, this channel deserves way more exposure! Love the content and the host! Keep up the good work, thanks.
Thanks Matt! Everything is interesting and clear as always!
Software engineer for many, many years here... I have recently stumbled upon Ollama and your videos. I have done several integrations of AI into clients' apps using Azure AI services (mostly speech and some OpenAI), but the MS Azure services keep you away from the nuts and bolts via abstractions. This is fine for productivity but not great for understanding. Your videos and GitHub repo have crystallized many concepts for me and opened my eyes and brain to a better and deeper understanding. Thank you, sincerely, for what you are doing here. It is priceless. BTW, I adore Ollama and the smaller models... however my wife would not be pleased with my recent PC build cost (two 4090s lol). But hey, this is addictive! Isn't it amazing to see the realization of what we all dreamed of when we started in the programming/IT realm!
Yup. Compared to what I was doing in my Intro to AI programming class I took in 1989 at FSU this is science fiction.
Ha! That's how I felt when people ask which number is bigger, 8.8 vs 8.21! It depends on the context! And that's what I noticed when I test models: most people only run them once. The models do not always give the right answer the first time, sometimes only the second time, etc. Great video.
Hi Matt very clear presentation yet again. I also really enjoy your polished style, so I subscribed even though I do so rarely. Cheers !
Wondering what the success rate of the tool call would be if wrapped in a framework like Yacana?
Because only using Ollama's function calling by itself is IMO not representative of the potential of any model, as local LLMs need some level of guidance to really achieve anything.
The older approach works fantastically with every model
@@technovangelist I completely agree with you that Ollama's function calling is excellent. Forcing LLMs to output JSON the way it does is very impressive! However, we use Phi models in production, and only using Ollama didn't yield the expected results... The JSON was valid, but the arguments generated were not. That's why we had to switch to an agent framework, because developing our own overlay would have been too complex. With the agent framework we now have enhanced tool calling and a way to do multi-turn chat so multiple Phis can brainstorm together. Also, we needed control loops at each step to ensure we get a computable output in the final phase.
We had a success rate of around 30% when calling tools with correct values using only the "/generate" endpoint. Now, we're at about 70%. Additionally, the team has improved at prompt engineering, which has been a huge factor!
Also, thanks for making videos, they're always a great help. Keep it up! ^^
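For anyone wondering what such a control loop can look like in practice, here is a rough sketch with the ollama Python package. The tool, the expected JSON shape, the model name, and the retry count are made-up illustrations, not the actual setup described above:

    import json
    import ollama  # pip install ollama

    def get_weather(city: str) -> str:  # hypothetical tool
        return f"Sunny in {city}"

    def call_tool_with_retries(prompt: str, retries: int = 3):
        for _ in range(retries):
            resp = ollama.generate(model="phi3",      # assumed model name
                                   prompt=prompt,
                                   format="json")     # constrain output to valid JSON
            try:
                data = json.loads(resp["response"])
            except json.JSONDecodeError:
                continue                              # malformed JSON: retry
            # Validate the arguments, not just the JSON shape.
            if data.get("tool") == "get_weather" and isinstance(data.get("city"), str) and data["city"]:
                return get_weather(data["city"])
        return None                                   # give up after the retries

    print(call_tool_with_retries(
        'What is the weather in Paris? Answer only with JSON like {"tool": "get_weather", "city": "..."}'))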
Thank you, Matt, for your videos. I was not aware of the hardcoded context window in Ollama; it may explain why I was so confused by the models claiming to have a large one. Why is that? I'd expect Ollama to adapt to the capabilities of the model it's running! Do I really need to manually create a custom model file each time just to benefit from the model's native context size? Have you already posted a video answering these questions? Thank you so much and keep up the good work! Cheers from France!
@JeromeBoivin-tx7fm Also interested in the context question, and whether the prompt template, end token, etc. also get set in that model file.
Context takes a lot of memory. And it's hard to put rails around it so it doesn't fully crash the machine. I've had the machine reboot when it takes too much. And lots of folks have tiny GPUs, so we got lots of support requests. So it went to a blanket 2k unless you specify the size. But since it's so easy for most devs to create that file, and since Ollama is intended as a dev tool first, it seemed like a good decision.
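For anyone following along, the file Matt is referring to is only a couple of lines; the model name and the 8192 value below are just example choices:

    # Modelfile
    FROM llama3.2
    PARAMETER num_ctx 8192

    # then build and run it:
    #   ollama create llama3.2-8k -f Modelfile
    #   ollama run llama3.2-8k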
Your delivery is great, had no idea you were a founder... that's awesome. 🎉 Meta, where's the ❤!!
Dude ollama is great. Thank you.
Love the breath holding tangent!
8:12 I'd set the temperature to zero, in which case everyone using the model will get the same answer every time for the same inputs. Setting it to zero makes it default to always responding with whatever answer is most likely the accurate one. It's a predictive language model after all.
The higher the temperature, the more varied the responses. Setting it to zero disables the randomness parameter and thus removes variety completely. Very useful when benchmarking models against each other.
Setting temp to zero will not get you the same answer every time. You would have to set temp and seed and you will reduce the variations but it may not be the best answer.
@@technovangelist Right, I assume the seed could also be random. It might be different across different models. I ran llama3.1:70b with temperature at zero, and that seemed to get me the same story from the same Minecraft chat logs along with its system prompt. LLMs have different architectures though, and it doesn't seem like all support all parameters. Taking away randomness is highly advantageous when optimizing system prompts though, being able to compare static responses.
i don't know if that’s true. testing a system prompt on a limited version of the model helps if you only use the model in that limited way every time.
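A small sketch of the earlier point that you need to pin both temperature and seed for repeatable output, using the ollama Python package; the model, seed value, and prompt are arbitrary:

    import ollama  # pip install ollama

    resp = ollama.chat(
        model="llama3.1:8b",
        messages=[{"role": "user", "content": "Tell a short story about a llama."}],
        options={"temperature": 0, "seed": 42},  # per the reply above, set both for repeatable output
    )
    print(resp["message"]["content"])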
Matt, I'm a non-CS/SWE guy trying to hack away at a tool that uses LLMs to add some NL to the UI of an app I'm building. It's b/c of you I chose Ollama over other options. Thanks, brother.
Well explained especially the 1B
I've been using llama 3.1 8b on my 4050 laptop very comfortably for AI-assisted tasks in Obsidian, and I can't wait to see if these smaller 3B models are a better fit. You get a sub from me, I'm all aboard the self-hosted train, next stop AI station, let's gooo
If you have a cat, it stole the water bottle! Thanks for the review too. I will be interested to try this on my usual M1, might remove the need to run models on another more powerful machine.
Good one. I saw someone training the 1B model on their iPhone😮
These smaller models are great for research, particularly as they are fairly easy to modify in code. In actual use cases, they are somewhat over-censored, but I suspect it's just a matter of time before a fine-tuned uncensored version is created.
There is an awesome video on YouTube talking about this specific model and how censored it is. I wouldn't be surprised if people find the older model to be more capable.
I love small LLM's. I don't think people realize the power they have to simplify their lives. I love to use Obsidian for note taking. Using a local LLM, I can have it easily summarize my notes, giving me an at-a-glance view of each daily note. Have a long meeting? Transcribe it and summarize it, then stuff it in your notes. When we get agents then they will become vastly more useful.
People put too much emphasis on the ability of models to do analytical tasks with great accuracy. They don't understand that the power of AI is the ability to have the AI write a program on-the-fly to do that kind of analysis, and then give you the result. AI will never be 100% accurate. It's like a human. We make mistakes too. We are spoiled with computers and the ability to have 100% reliable answers, but an AI can give that to you, just through the extra step of creating a program to accomplish the task.
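For anyone who wants to try the note-summarizing idea, a tiny sketch with the ollama Python package; the note path and model are placeholders:

    from pathlib import Path
    import ollama  # pip install ollama

    note = Path("vault/daily/2024-09-26.md").read_text()  # hypothetical daily note
    resp = ollama.chat(
        model="llama3.2:3b",
        messages=[{"role": "user",
                   "content": "Summarize this note in three bullet points:\n\n" + note}],
    )
    print(resp["message"]["content"])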
Your channel's so nice I wish I could sub twice. Keep up the great work.
Thanks for this great video explaining how to use these small LLMs! I will be waiting for your video about the vision model. Maybe compare llama 3.2 vision with Pixtral?
Respect and appreciation for what you do. Any chance of using an LLM on the new Samsung Galaxy S23?
That's great. Hey, do you know if we will ever get AI running locally on our Android and iOS devices?
Hey Matt, thanks for a great video - do you keep the code featured in your videos in public repos?
Just a question: what is the best model for supporting me in Python programming that I can use with Ollama?
What is your use case? llama3 is a good default.
First time seeing your vid. Interesting take. I know you have demonstrated some use cases of the models, but for most people there are way too many models to pick from. Do you have a guide on that? To narrow down which model to use? I know it will always be biased, but I would like to hear your take.
This was the first, maybe second, time I looked at one model. I would like to do more of them.
Amazing video, thank you. Is Companion the only AI plugin you use in Obsidian? Looking forward to seeing more practical AI Obsidian applications.
interesting video. thank you
1st. As always, thank you Matt. Question: I've been testing 3B since launch on my laptop with an NPU. WebUI on a server and Ollama on my laptop. The thing is, my laptop has an NPU and Ollama is not taking advantage of it; the 3B model is using the small GPU and the CPU. The results with a RAG (first with the WebUI interface and then with Flowise) give me good results. I'm trying to find out how to activate the NPU on my laptop so Ollama can use it. It would be great. I think LM Studio does it? Also, testing with large files (>15MB), the embedding from Ollama on my computer, again with WebUI and Flowise, fails. The computer can't handle it. It would be great to have you do it with files that are closer to real company files. I think the 3B model is great. The last test I did was using it with Groq, and of course, that's where I can really test it 🙂
Could you explain what the generation completion hotkey does in the Companion plugin for Obsidian? When I use the Companion, it automatically generates text, completes it, and streams the response. So, in what situation would I need to use this hotkey? I'd appreciate it if you could clarify this because I was confused by this.
well, matt you are our fav choice for all things AI!
I would really like to know if these models are any good for entity extraction, like for GraphRAG, or as chunk generators for traditional RAG... That would be a great topic!
ollama makes it so easy
...worth a thumb up...
Thanks
What??? You are too kind... a member AND a tip. Thanks so much.
@@technovangelist I just love the simple and yet comprehensive way you explain the subjects. Keep up the good work ❤
How much VRAM is required for the 3B model?
Thank you for the video. What is the tool you use for writing?
Obsidian. And the plugin for it was companion
Thanks!
Just tried the 3.2:3b. I said hello and got a reply blazingly fast, so I asked if it was on meth or something. Got the standard "I'm just a model, I can't human", so I said I was just surprised to see such fast answers on a local model. And this is where things got confused.
Apparently, Llama3.2:3b thinks it's working off a cloud-service. It refused the notion that I'm running this locally.
Just to be sure, I pulled the ethernet cable, restarted the terminal, and it worked just as well without it (well... duh).
I just find it fascinating that the model itself almost recoils at the notion of being local.
Good stuff
Can llama 3.2 1b do function calling? I don't know if it's smart enough to format output
Depends. Using the newer more official format, I don't think it does a good job. Using the older approach I have documented in a few videos on this channel I expect it would be great at it.
@@technovangelist I will try to look into the original method. I still don't think many people explain how it all works in plain English. I will look at your earlier videos about functions
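In plain English, the older approach boils down to: describe the available functions in the prompt, ask for JSON only, and parse it yourself. A rough sketch of that idea (not necessarily exactly how Matt does it in his videos; the function and schema here are made up):

    import json
    import ollama  # pip install ollama

    # Describe the function in the system prompt and ask for JSON only.
    system = ('You can call one function: get_time(city). '
              'Answer ONLY with JSON like {"function": "get_time", "city": "<city>"}.')

    resp = ollama.chat(
        model="llama3.2:1b",
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": "What time is it in Tokyo?"}],
        format="json",  # have Ollama constrain the output to valid JSON
    )

    call = json.loads(resp["message"]["content"])
    if call.get("function") == "get_time":
        print("Would call get_time with city =", call.get("city"))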
Lol the ending 😂😂😂
Great content. Could you briefly describe the machine you use for this task? You mentioned 3 seconds…
I usually do and forgot this time. M1 Max MacBook Pro with 64gb. A machine you can get for about 1500 usd today.
@@technovangelist thank you! 64GB? Impressive. Please keep it up! You are a great communicator
All things local AI and I just subscribed that’s what I need
Thanks for the great content. What is missing in Ollama is support for vision models like Florence-2 and SAM 2. If it had a nice API for that, which could be used with curl or so... dreams.
Raspberry pi with vision models must be so incredibly overpowered, I prefer not thinking about it too much
Raspberry pi overpowered???? way underpowered is more accurate, especially considering the cost of them. Physical size is the big benefit these days. But Florence2 looks like an older model that didn't get much love. Some of the other vision models on Ollama got a lot more coverage. And hadn't heard of sam2 either. Both architectures aren't supported so would require a lot of work to get working.
@@technovangelist thanks for taking the time to reply, appreciated. Underpowered _is_ the point: if vision models run sufficiently fast on _that_ hardware, it enables vision on edge devices. Florence-2 was released months ago, and the combination of selecting pixels by typing, and segmenting and tracking over time with SAM 2, is an incredibly powerful concept. I needn't ask any layman to get creative; the usefulness of text-driven vision perception seems insane.
I've just tried this with Fabric. It's as good at summarising YouTube videos as Mixtral 8x7B. It still gets confused occasionally with certain tasks.
I've got a large-ish DB of technical documents for an obscure bit of research kit and I'm going to try swapping out my current LLM and see if this performs better. It'll probably be about 10x faster.
Cool video!
llama 3.2 1B Q4:
To count the number of 'F's and 'R's, I'll go through the word "strawberry" carefully.
Here's the breakdown:
* 'F': 1
* 'R': 2
So, there are 1 'F' and 2 'R's in the word "strawberry".
or:
please create a java program to count the number of 'F's and 'R's in the word 'strawberry'
public class Main {
    public static void main(String[] args) {
        // Define the word
        String word = "strawberry";
        // Initialize counters for 'F's and 'R's
        int countF = 0;
        int countR = 0;
        // Loop through each character in the word
        for (char c : word.toLowerCase().toCharArray()) {
            // Check if the character is 'F' or 'R'
            if (c == 'f' || c == 'r') {
                // Increment the counter
                if (c == 'f') {
                    countF++;
                } else {
                    countR++;
                }
            }
        }
        // Print the results
        System.out.println("Number of 'F's: " + countF);
        System.out.println("Number of 'R's: " + countR);
    }
}
output:
Number of 'F's: 0
Number of 'R's: 3
sorry, they left me home alone...
Lovin' the channel. 👍👍It'll be great once Ollama supports vision
Ollama does support vision today. Llama 3.2 vision support should come very soon.
I would request you to test out LLMs on some more complex tools (even something as simple as a file-creation tool fails on the 3B model). I assume that if I give a proper function description, it might not fail. Need to experiment.
Can we simply download the GGUF file and use it with GPT4All?
Good video
Awww yeah!
What if you set the temperature to 0, do the tool-function tests succeed better?
Hey Matt, I have an Intel-based MacBook, and I want Ollama to utilize my GPU, but I don't know how to go about it. I have searched all around but to no avail. Could you offer some pointers, or resources where I can succeed?
Unfortunately there are no options. Well except buying an apple silicon MacBook or switching to a pc.
Which hardware setup do you have?
I'm on an M1 Max MacBook Pro with 64GB RAM
Why am I getting Error: error loading model for all the 3.2 downloads?
Have you updated ollama?
@@technovangelist yes that fixed it, thanks… it’s sooo damn fast!!
when is ollama getting the vision models anyone know?
The team is working on it.
@@technovangelist awesome, thanks Team!
Matthew Berman’s review of the vision models show them to be censored to the point of uselessness. Hopefully someone will bring out uncensored versions
Reviews on the vision stuff seem to be mixed so far. It could be that some of the folks who have it have a bad implementation. I look forward to finding out.
My favorite hallucination : I work for the Jill Stein for President campaign so I use info about her to test AIs. She's a public figure, but a minor one, and I know more about her than most sources.
When playing around with WebSim, I asked it to make me a Jill Stein fan site.
The results were shockingly accurate, history, platform, etc... except the photos. The photos were all of Jill Biden.
For all the normies, Jill Biden is Joe Biden's wife. Jill Stein was running against Joe Biden.
I pointed this out to the AI, and it replaced them with photos of Kyrsten Sinema. If you don't know who that is, I really don't have the bandwidth to explain, but it is simultaneously shockingly wrong and quite funny. Like, I know what it is getting at.
Whoever took the sponsorship from Meta, I don't think they asked for it. But in case you haven't noticed, they have more subscribers than you.
Some have 1/3 the number of subs compared to me. So that’s not it.
I tried to run Llama 3.2 1B on a Samsung S20 Plus. Error: no suitable llama servers found. And I am running ollama serve.
Just use Layla Lite and then import the model. Yep, it's a hassle getting your llama.cpp to work.
I don't know what you are talking about.
Llama 3.1 & 3.2 are unfortunately very poor in Dutch language usage
Meta Matt!
ollama run llama3.2:1b
Error: llama runner process has terminated: signal: abort trap error:done_getting_tensors: wrong number of tensors; expected 147, got 146
any idea about this error?
You need to update ollama. You should always update whenever there is a new version.
OK, I will try it. Maybe it is a GFW issue, thanks.
I asked it to create a component in JavaScript, at which llama3.1 8B and mistral-nemo greatly succeeded. But llama3.2 3B failed miserably, mixing up different libraries unintelligently. It's just fast, but a random word generator is fast too.
But a random word generator wouldn't be anywhere near as good as llama3.2 3b.
Man you forgot your cup!
I want you to count the number of r's in Strawberry.
To do so I want you to go Letter by letter and every time you find one r I want you to count up
Gets it right every time...
>>> I want you to count the number of r's in aferradfafa;;rr;r;adfafaf'''.. To do so I want you to go letter by letter and every time you find one r I want you to count up.
Let's start counting:
a - no R (0)
f - no R (0)
e - no R (0)
r - found an R! (1)
r - found another R! (2)
a - no R (2)
d - no R (2)
f - no R (2)
a - no R (2)
Continuing...
;; is a semicolon, not a letter to count
rr - found two Rs! (4)
; - no R (4)
r - found another R! (5)
; - no R (5)
a - no R (5)
d - no R (5)
f - no R (5)
a - no R (5)
f - no R (5)
a - no R (5)
f - no R (5)
So, there are 5 Rs in the given text.
First test I did was "what number is larger 9.9 or 9.11?" and it insisted 9.11 was bigger. When is 2.3 out?
Why did you quit Ollama 😢😢😢
Are you asking about quitting the app? Or why I left the company? That second thing is not something for this comment thread.
@@technovangelist Due to your hesitance on commenting, we'll just assume they were having Diddy parties until you clear it up
@@emmanuelgoldstein3682 did you just say diddy party brah? jajajajaja
Company
There's 4 killers in the room. Since when does dying make you not a killer?
Good point.
Can someone point me to something from Matt on running Ollama and RAG locally?
Hey Matt, nice video. But I don't think it's as impressive as you make it out to be. I am sure llama3.1's performance was comparable.
It wasn't available in 1B and 3B models.
The vision portion isn't too great.
Microsoft GRIN MoE: A Gradient-Informed Mixture of Experts MoE Model 6.6b
Ranks better
In benchmarks? Or in real tests. One is useful the other has zero real value.
I tested it and the vision abilities suck and are way overrated.
It's amazing how such a small model is smarter than you?