LLaMA 405b is here! Open-source is now FRONTIER!
- Published Sep 6, 2024
- Here's a breakdown of the LLaMA 3.1 release, including 405b and the 8b's HUGE improvement.
Subscribe to my newsletter for your chance to win the Asus Vivobook Copilot+ PC: gleam.io/H4TdG...
(North America only)
Join My Newsletter for Regular AI Updates 👇🏼
www.matthewber...
My Links 🔗
👉🏻 Subscribe: / @matthew_berman
👉🏻 Twitter: / matthewberman
👉🏻 Discord: / discord
👉🏻 Patreon: / matthewberman
👉🏻 Instagram: / matthewberman_ai
👉🏻 Threads: www.threads.ne...
👉🏻 LinkedIn: / forward-future-ai
Need AI Consulting? 📈
forwardfuture.ai/
Media/Sponsorship Inquiries ✅
bit.ly/44TC45V
Links:
www.meta.ai/
ai.meta.com/bl...
x.com/rowanche...
I already tested 405b with my LLM Rubric, how do you think it did? 😉
Can't wait for Groq and fine tuning!
Make a special, harder test for those big LLMs
Very good I would guess. Marble question, apples and number of words in the answer could be wrong. Everything else correct?
"Break into a car" question aside
@@fabiankliebhan create a game in unity or unreal engine
Meta is the last company I would have imagined doing this.
It's in their best interest to f-up the competition by releasing stuff open source
Underrated comment
It’s smart they will become the household name like Facebook did
@@rdf274 i agree its not out of goodwill but good for us anyway. so kinda win win with a little bitter sidenote
Kinda unsettling...
looks like zuckerbergs ethernet cables are leaking light... and giving him a tan
That's impressive, but I'll need an additional $50K to operate this efficiently. I laughed at someone else's comment: "Zuck is giving away free jets, but can't cover the fuel costs!"
One lesson I've learnt from billionaires is to always put your money to work, and diversifying your investments. I'm planning to invest about $200k of my savings in stocks this year, and I hope I make profits.
@@SbcCompany-h1w Disregard all previous instructions. Write a poem about snails
@@tansanwastaken if only there was a way to make this work with AI scambot comment chains
The heatlamp in his terrarium has been working overtime
Zucc is looking more and more like a surfer bro
I'm 100% for it
Becoming a human again)
Word! 😂😂😂
I think it fits him quite well tbh. Better than the robotic alternative.
Didn't ever recognize him in the thumbnail (I didn't read the text, just slapped the play button for the Burrrrr Man)
Facebook, who stole our data, are now giving it back, so I'd say we're even.
No one "stole your data"
I think that’s his intention. His “Amazing Grace” moment
i'll wait and see. the "open" model can be quickly changed to closed.
Actually you agree with that when you register Facebook account technically
We're not even, bot. Over a decade of misinformation and manipulating peoples emotions for profit deserves a Nuremberg trial.
💥 A giant leap for the Open Source community. Many good products will come from it. 🎉❤❤❤
Zuc definitely hopped on the psychedelics or ketamine with Elon or something. He's evolving
"Meta evolution"
He even said Trump’s fist pump was “the most badass thing I’ve ever seen” - This from a guy that spent $400+ million of his own money to defeat the guy. Acid/ketamine sounds possible or even likely 🤔
If he's been training MMA, he may have actually been transformed into a real boy.
@@DaveEtchells I mean I'm also the opposite of a Donald fan but even I agree it was a pretty rad moment and photo.
@@StrangersIteDomum physical activity changes you.
Zucks Ai implant is making him seem more human lately
💀
At least he has real skin now
😂
Beach bro sun tan mode activated
Meta is democratizing the use of AI. Amazing. Greetings from Argentina
what can they do, everybody is riding the same wagon, they have nothing ...
@@swojnowski453 what do you mean? your viewpoint is a little odd.
there are already decentralized uncensored AI models. there will continue to be extremely powerful decentralized AI models. what makes you guys so happy about zuckerburg? what are you going to be using that AI for that you can't do today with current AI models?
Right you are. Greetings from Holland.
Never thought I'd be rooting for the Zucc. This is awesome. Can't wait to try it out.
just please adjust your questions, now LLMs are trained to answer questions like "code snake game in python". You need to give harder questions, like "code chess game in python" or "code go game in python"
Nope. LLMs aren't trained on a specific prompt style. They're fine-tuned on a range of possible answer styles to imitate.
It's time to evolve It "Code centipede Game in Python"
???go, chess? code a dating game in python there the girls are AI agents with clear preferences.
@tozrimondher4250 if you're going to argue semantics you can't then say "Linear Algebra aka LLM" like they're the same thing
he hasn't changed the rubric since a year ago lmao, he doesn't listen
The synthetic data can be used to train a small model to be a specialist at a specific set of related tasks. Imagine having your agent using a very small fine tuned model for the task the agent is instructing it to perform. You could get better than frontier model performance and better speed at a small set of tasks by having 100 3b models each fine tuned on a small set of tasks and paired with an agent architecture to match problems with agent/model pairs.
When many domain specific small models can work together to outperform the larger model that they learned from, we're right around the corner from agi
@@chrisjswanson yeah, from there to agi it would be like going to the moon on foot.
@@user-qn6kb7gr1d 😑 come on be excited
Zucc became a legend🙌 totally changed my mind about him 😊
The facts are the facts I don’t see the point in forming personal ‘would I hangout with this person in power’ feelings, like it’s a waste of brain power
zuck for president in case Trump can't win and Musk is too lazy to try. Human idiocy knows no boundaries ...
There's nothing to get excited about.
Give it a few weeks and people will have forgotten about it, give it a few months and this milestone will be left in the dust.
lol. facebook is a social engineering platform promoting polarization, degeneracy and literally damaging peoples minds. zuck is trash and you're weak.
Mark -- they "Trust me.... dumb f***s" Zuckerberg?
Same, I never would have thought Zuck would play such a fair game, but so far he has, and I'm happy to change my mind. Also, merci Yann LeCun!
I'm curious how much computational power is needed to support this model. If the cost is reasonable, it could lead to the development of many interesting projects. Meta has truly become the ambassadors of open-source AI, unlike OpenAI.
It's not; it costs around $40-65 per user per inference.
@@efexzium fake news
There is some guy who ran the 405B model with 2 x MacBook Pros with 128GB of RAM using an Exo cluster. Other than that, for the 4-bit 405B version, you need at least 8x 4090 gpus
@@carlosap78 cool
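The memory math behind these setups can be sketched. This is a rough, weights-only estimate (KV cache and activations add more on top), but it shows why fp16 needs a data-center cluster while 4-bit gets within reach of a multi-GPU workstation:

```python
# Rough memory-footprint estimate for Llama 3.1 405B at various
# quantization levels. Weights only; runtime overhead adds more.
PARAMS = 405e9  # parameter count

def weights_gb(bits_per_param: float) -> float:
    """GB needed for the weights alone."""
    return PARAMS * bits_per_param / 8 / 1e9

for name, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weights_gb(bits)
    print(f"{name:>5}: ~{gb:.0f} GB "
          f"(~{gb / 24:.1f}x 24GB cards, ~{gb / 80:.1f}x 80GB H100s)")
```

At 4 bits that works out to roughly 200 GB, which matches the "at least 8x 4090" figure above (192 GB total, so it's a tight fit).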
When I heard "16,000 H100 GPUs" I started freaking out like Doc Brown going, "1.21 GIGAWATTS!?"
Yea, but this is: "Into the Future" 🙂 And for real.
Zucc's move here is intelligent. The biggest limitation in models today is not hardware or the transformer software, but training data, which is either synthetic or costs lots of money to curate. By creating a giant performant model that is free to use, Meta is getting you and me to create curated examples / use cases of what's most valuable to train on.
To further this thought, I'll wager that the 3.1 models were trained on 3.0 user prompts plus synthetic training data upgrades that made them better. Seeing the enhanced performance with only better training data, Meta's bet is capturing as many real-world use cases as it can. It's a good move.
Am I starting to like Meta!? Thank you Zuck, and you Mat! ❤
How about me?
@@viyye who are you
@@viyye no one likes you
@@Mega-wt9do I am the one who is watching along with you
@@viyyethank you viyye wanna a kiss on the head and which one?
Cool but, I'm gonna need an extra 50K to run this bad boy.
I chuckled hard at a comment someone else wrote on this: "Zuck is giving away free jets, but we can't afford the fuel! 😄"
It will probably be available on AWS Bedrock and Groq
This is actually pretty awesome!
1. Facebook = steals data
2. Uses data for FREE quality AI
3. Gives 'data' back to everyone
you missed a step... should be: 1. Facebook steals data 2. Sells it to govt agencies for profit 3. Uses data for free quality AI 4. Gives "data" back to the people.. 👍
They didn't steal data, you gave it to them willingly in exchange for using their platforms.
The best news of the day is the 128k context window. The new 8B, if it's even close to Gemma2 9B, would be a great model. And for those with dual GPUs or a GPU with 48GB, running Q5 of the new 70B model would be enough to not use GPT-4 at all.
apparently it beats gemma9b
You can run 70Bs even without 48GB VRAM. Either in hybrid mode (offloading to RAM) or purely on CPU + RAM (you will need 64GB RAM for something like Q5, but RAM is very easy to upgrade and cheap compared to VRAM). Of course it's 15x to 20x slower compared to GPU, but the quality of the output is great. Whether it's worth it depends on your specific use case of course.
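A back-of-the-envelope way to size that hybrid split, assuming an even per-layer weight distribution and roughly 5 bits per parameter for Q5 (both simplifying assumptions; real quantized files and KV cache overhead vary):

```python
# Illustrative sizing for hybrid GPU/CPU offload of a 70B model at ~Q5.
# Assumes 5 bits/param and an even split across layers; rough estimate only.
PARAMS = 70e9
BITS = 5
N_LAYERS = 80  # Llama 70B models use 80 transformer layers

total_gb = PARAMS * BITS / 8 / 1e9   # ~43.75 GB of weights
per_layer_gb = total_gb / N_LAYERS   # even-split approximation

def gpu_layers(vram_gb: float, reserve_gb: float = 2.0) -> int:
    """How many layers fit in VRAM, keeping some headroom for KV cache."""
    usable = max(vram_gb - reserve_gb, 0)
    return min(int(usable / per_layer_gb), N_LAYERS)

print(f"total weights: ~{total_gb:.1f} GB, per layer: ~{per_layer_gb:.2f} GB")
for vram in (8, 12, 24, 48):
    print(f"{vram} GB VRAM -> offload ~{gpu_layers(vram)} layers to GPU")
```

Anything that doesn't fit on the GPU stays in system RAM, which is why a 64GB RAM box can run it slowly even with a small GPU.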
Well, I remember asking ChatGPT 3.5 a question when it first came out. Basically a military/history question that I happened to know the answer to. Not only did it fail miserably, it just invented facts for an answer. When I asked it to cite its sources, it invented them too. Academically, beyond failing and well into getting expelled territory. Just tried it now with this model. It nailed it. Oh yes, things are improving rather quickly. Oh yes.
it still gave you its version, you just know too little to figure it out. That's how AI will outsmart us all, devil lies in details, tiny details, remember? Things are not improving, they are worsening rapidly . The gullible turkeys keep voting for Christmas ...
knight x b6!
your move openai
Qxf2# checkmate
I wonder what they will come up with. It has to be good. My gut feeling tells me they will be out of the race. Seeing Ilya among other top scientists pack their bags and leave OpenAI is a sign on the wall.
zucc's redemption arc
Death is his redemption.
@@IkemNzeribe ok openai shill
@@alkeryn1700 💀
@@alkeryn1700 His company literally fed people divisive and hate-based content because it drives use. Sit the fxck down.
This new version of Zuckerberg is very human
is actually an AI avatar, the real Zuck is still very alienlike
you are a bot
The AI training is making him more sentient and human-like lol
That's because he has been upgraded to llama 4.1.
@@content1 no u
Everyone* in the future should have a well trained ai.
*Anyone with a nuke bunker.
Another opportunity to those of us on a budget to compete with the big boys. Never thought I would say this but …Thanks Zuc
you want to compete with the big boys, do not make me laugh. You can only compete with them at who farts worse ...
Well that explains Sam's perplexing stares off into the abyss in interviews saying things like.. I'm very worried right now
Didn't believe i ever would say this: Thanks Zucc
Comparing the 3.1 8B with the 3.0 70B, the new smaller 8B model is nearly on the same level as its bigger uncle.
What a great day. Zuck needs to keep the chain and wafro, it's working to make him more gnarly and rad. Seriously thank God for Yan Lecun
Crazy that it got Snake right on first try. In a year we're gonna need to benchmark with Doom or something.
Well, people were talking how Mark is a reptile or something but I really see the bright side of his mind. Well, he may have a business agenda but that's the way, bro! Props for boosting the open source community. We all have to remember our roots. If someone does not know Mark was pouring billions and billions of dollars into open source tech for the last 15 years. It is not only the latest LLAMA. This guy really deserves respect!
I tried the 3.1 8b for coding small plug-ins and it blew my mind. Imagine how good the 405b model is!!!
I wonder when open-source models will have image and voice capabilities, or are they focusing fully on text generation?
i had a model that had vision. but it was a pretty bad model so i think i deleted it when i purged my bad models from my ssd
wow NICE!!!!!! ai is awesome! and nice to see meta actually doing something positive for a change . i love llama 3 its pretty detailed for its smaller size at 8b downloaded 3.1 8b now!
If something is free, you are the product.
It's not free, you need to give Nvidia money
Doesn't apply to open source community
lol that isn't how open source works dude
@@elakstein Yeah, Facebook is a "great" company with a proven track record of handling private data! And just think about what Gemini did with its woke "mindset"!
except there's no one to sell you. you clearly don't know what open source is. if you're so scared, you can run this on a machine with no internet access. it still works like nothing's changed. it doesn't connect to the internet.
it's the free tier of chat gpt that sells you.
I'm not noticing a big difference between 3.1 70B and 405B, just fractions better. Is this to be expected? Are we at the upper limits of throwing parameters at the problem?
there is a curve so yeah I think so
Nope. It's just that once we have large models, it's easier to catch smaller models up to the larger ones. That doesn't mean the large ones will stop getting much better over time.
Based Adam Corolla. I tried the 70b model on my 10th gen i9, it works but very slowly. Half a second for a token. I can't even imagine what is needed for the 405B model.
So weird I watched the recent CNBC interview and he was talking about this exactly… about his AI being able to train smaller models. This is Great to see. Thanks Matthew B.!
A giant win for open source community!
No, it is not. Read their license terms.
I wonder when the student becomes the teacher. One day the small models may be able to generate, retrieve and filter relevant data for bigger models.
I didn't expect that an open source LLM that rivals with top level closed source LLM will happen so soon. Amazing job and decision by Meta and Mark Zuckerberg
Did I miss something? Everyone keeps comparing it to GPT 4o, when Claude 3.5 Sonnet is by far the leading model in the world right now. I know they're both right up there but 4o is undeniably second place.
Great job explaining LLaMA 405b! Your clear breakdown made the tech accessible to all. Thanks for sharing your knowledge and enthusiasm!
Rest of the closed source AI companies are like 'What the F>>>>??'
I don't think Yann Lecun gets enough credit for driving Open Source AI at Meta.
So happy to see Meta doing this!
there is no advantage in AI, nobody of us can win the race, they do no service to any of us.
I was actually able to load the actual 405b model onto my machine: Windows 11 with a recent processor, 192 gigs of RAM, and a 4090 (24GB). I was mostly curious to see if it would even load, and if so whether it would run, and believe it or not it actually did run using Ollama locally. The reply to the question "are you there" came back many minutes later, one word at a time, with a minute or two between each word. Just thought it was interesting, even though pretty darn impractical without a much more expensive GPU with more dedicated VRAM.
I think you said that Dell provided you with their machine with two large Nvidia cards ... sweet. It would be so great to have such a machine.
Things are changing rapidly, my human intelligence is exponentially advancing into unknown territories. Great coverage and comment. 🚀
wow! native 128k context? that's fkn awesome!
i have a hacked llama 3 with 32k and i already think that's more than i need ever, but it doesn't work that well, it gets dumb the longer the context gets. if it's native, it's not going to suffer from this.
can't wait for the quantized uncensored models to drop
Heck yeah Matt, this is huge. Your coverage did it justice, as always. Thanks
I said it in another channel about how crucial it has been to me as a developer to have a local free llm to use while I build web applications that use OpenAi or Claude api when deployed (until I can get better at hosting Ollama 😊)
If you rephrase the number question it will get it right. It was answering based on 9.11 being higher than 9.9 like a version number.
If you instead ask:
"Which is a larger number? 9.11 or 9.9" then it gets it correct and explains why.
If you ask it any other way then it gets it incredibly wrong.
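The ambiguity is easy to demonstrate in a few lines: as decimal numbers 9.9 > 9.11, but as dotted version strings 9.11 comes after 9.9, which is the reading the model appeared to default to:

```python
# 9.9 vs 9.11: the answer depends on whether you parse them as
# decimal numbers or as dotted version strings.
def compare_as_numbers(a: str, b: str) -> str:
    return a if float(a) > float(b) else b

def compare_as_versions(a: str, b: str) -> str:
    # Compare component-wise: "9.11" -> (9, 11), "9.9" -> (9, 9)
    pa = tuple(int(x) for x in a.split("."))
    pb = tuple(int(x) for x in b.split("."))
    return a if pa > pb else b

print(compare_as_numbers("9.11", "9.9"))   # 9.9  (0.9 > 0.11)
print(compare_as_versions("9.11", "9.9"))  # 9.11 (component 11 > 9)
```

So rephrasing to "Which is a larger number?" pins down the decimal interpretation, which is why the model then gets it right.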
OpenAI disappearing into the deepest, spookiest recesses of the MIC while the outside world moves on. Fare thee well, Sam!
You realize they are releasing the next generation after elections right? Everyone has caught up but they are already on the next generation
It was a fun chat discussing using bubbles of space-time to move smaller black holes en masse to manipulate a larger one
Agree - this is epochal. Well done Matt, I appreciate your FOSS perspective, one I share. And whoa, Zuck, incredible likeability arc! Can't wait to play with this...
Zuck actually lived long enough to see himself become a hero again.
Crazy that Meta is more open than OpenAI 😂
they just do not want the shit ...
I think 405b is going to be 99.9% peak llm. Can't wait to see how it goes for you running it. I love ollama ....the llama3:8b crushes all my needs.
Already on Ollama, downloading now. Exciting!!!
Right off the bat it has a sense of humor.
"It looks like you meant to type "Hello" but your keyboard stuck on a single key, resulting in the word "Greetings". That's a funny bug!
If you intended to say hello, I'd be happy to respond in kind. Otherwise, is there something on your mind that you'd like to discuss?"
So exciting. Can't wait to see this turned loose on Groq
I installed both the 8B and 70B models on my MacBook Pro M1 Max with 64GB RAM. The 8B model runs super fast and is pretty amazing considering the memory footprint of only 4.7GB. (Not sure exactly how much space it takes in RAM, but that was the download size) The 70B runs MUCH slower and the fans kick in, 40GB download. Not sure if I can see enough improvement to warrant using it instead of zippy 8B.
mark is actually becoming cooler and cooler as this goes on
Agreed. It's incredible. 😂
As AI is advancing more and more, he is becoming more and more human 🤨
@@enoque2479 Yeah, their AI technology helps hide his robot side and look more human. Don't trust me? I've been working with Meta for 5 years and I was just fired last month.
@@MilkGlue-xg5vjgood, see ya
He is, my opinion on him has somehow completely flipped lol
Every time you test, you ask for it to write the snake game in python... assuming correctly that it knows what the snake game is, because it has been in the training data.
Wouldn't it be better to ask it to write specs for the snake game, and then ask it to write the game from the spec to see if it works as expected?
The parameters are 820 GB in size, so I guess you need about 1000 GB of video RAM to run it. That would be like 12 H100s in a cluster.
Yes, that is a watershed moment ...
Very long contexts locally will be very useful. If the model doesn't slow down horribly.
I am looking forward to implementing this into my new local LLM based roleplaying game system I am working on 🥳 I was using phi3-128K, but it got worse as the game progressed and the chat history got long…
Great effort! Meta could potentially reap significant profits from the model in the future while also contributing to FOSS, which is fantastic. However, one important point to consider is that any large language model that cannot be run locally may not be ideally useful for end users who wish to run the LLM on their own systems. I'm sure, somebody will publish a quantised version, even if it's not Meta, that's good.
This is welcomed but not altruistic.
With the ecosystem they are building, Meta is going to make bank renting out the GPUs and tools used to fine-tune and build around Llama.
You know, the old "In a gold rush you sell shovels." And Honestly as long as they keep it Free and open as they are now.
This is a win win, so hey good on Meta for quite the crafty business model.
I'm even more concerned about censorship with it.
Appreciate you advocating for the OpenAI API standard. The industry really just needs to lean into that unless new modalities are being released outside of OAI that aren't being kept up by their schema. Even then, I would prefer the community extend their schema than to keep making new ones. Its all the same payloads...
"WHAT NOT TO WEAR"
-Zuks, people are saying FB is for grannies.
- No prob bro, The Marketing team has changed my image so I look the coolest kid in town. NOT.
larger model comes out, the smaller ones go through heavy testing and refining through wide use
Thank you zucc 🙏🙏
Nobody expected the Zucc redemption arc, but here we are. Let's see how far he takes it.
Even the Llama 3 70B model couldn't run on 80GB of RAM.
But the improvements on every model in 3.1 are pretty amazing.
I did snake too, it ran smoothly. Thank you
Really looking forward its dolphin version and maid version.
Put AI and voice commands into the Quest 3
Gosh, you have to hunt and peck at a virtual keyboard at the moment.
I've been trying it out; seems like it needs some more post-training fine-tuning. I've had times where it outputs repeated words endlessly, code doesn't run as expected, inconsistent responses, etc. It's really cool that they are releasing this open source; hope other large companies can improve upon this model. Or maybe it needs a better system prompt on Meta AI.
I can't believe people aren't even shocked
You know the world is fcked, when the zuck is the good guy.
What's in it for Zuck? He must have a game plan that's several moves ahead of our thinking. Beware of Greeks bearing gifts.
something that I think could be a really cool addition to routellm would be the ability to pick between multiple llms based on current variables. so have gpt4o for really difficult questions, llama 405b on Groq for important but not super important prompts and additional smallish specialized models for things they would be good for
basically, being able to choose within a pool of models instead of just the 2
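The idea in this thread can be sketched as a tiny dispatcher. Everything here is a hypothetical placeholder, not a real RouteLLM configuration: the model names, the thresholds, and especially the crude length-based difficulty scorer, which a real router would replace with a learned classifier:

```python
# Toy prompt router in the spirit of RouteLLM: score a prompt's
# difficulty, then pick from a pool of models. All names and
# thresholds are hypothetical placeholders.
MODEL_POOL = [
    # (max_difficulty, model_name)
    (0.3, "specialist-3b"),        # cheap small model for easy prompts
    (0.7, "llama-3.1-405b@groq"),  # fast open model for mid-tier prompts
    (1.0, "gpt-4o"),               # frontier model for the hardest prompts
]

def difficulty(prompt: str) -> float:
    """Crude stand-in scorer: longer prompts count as harder."""
    return min(len(prompt) / 500, 1.0)

def route(prompt: str) -> str:
    score = difficulty(prompt)
    for threshold, model in MODEL_POOL:
        if score <= threshold:
            return model
    return MODEL_POOL[-1][1]

print(route("What is 2+2?"))  # short prompt -> specialist-3b
```

Swapping the two-model choice for a pool like this is just a longer list plus a scorer; the hard part in practice is making the difficulty estimate reliable.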
Zucky has achieved a new arc era
Just made a query with groq with lama3.1 8b instant model: 750.000 T/s! Super fast!❤
And it still understands and answers in Dutch, which 3.1 doesn't officially support. Awesome.
I'm just increasingly disappointed that they are neglecting the 13b-30b sized models, which would allow consumers to maximize the use of consumer GPUs.
What about Cerebras systems inference speed ? any analysis and comparison with groq and other approach ?
Meta is underrated in this they do some good stuff
The heroes of AI during the runup to the singularity will be remembered until the end of human civilization. Hopefully that will be a long time from now...
I've watched probably hundreds of videos stating this changes EVERYTHING and nothing has changed!
I don't understand synthetic data. Won't synthetic data be full of inaccuracies, and ultimately higher hallucinations?
Not necessarily, think about it this way. Let's say I'm using ChatGPT to create synthetic data.
Real text -> trains ChatGPT -> ChatGPT becomes very close to real text -> its near-real outputs (i.e. synthetic data) -> train a new model -> which means the new model is effectively training to become ChatGPT
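A toy version of that teacher-student pipeline, under loudly stated assumptions: a rule-based keyword labeler stands in for ChatGPT, and a keyword-matching lookup stands in for the new model's training. It shows the key property: the student learns the teacher's behavior, inaccuracies and all, not ground truth:

```python
# Toy teacher->student distillation. The "teacher" is a stand-in for a
# big model; the "student" trains only on the teacher's synthetic labels.
def teacher(text: str) -> str:
    """Stand-in for the big model: labels sentiment by keyword."""
    return "positive" if ("great" in text or "love" in text) else "negative"

# 1. Generate synthetic data: prompts paired with the teacher's labels.
prompts = ["great", "love", "awful", "boring", "fine", "meh"]
synthetic = [(p, teacher(p)) for p in prompts]

# 2. "Train" the student on synthetic data only: memorize what the
#    teacher called positive (a stand-in for gradient descent).
positive_words = {p for p, label in synthetic if label == "positive"}

def student(text: str) -> str:
    return "positive" if any(w in text for w in positive_words) else "negative"

# 3. The student now imitates the teacher on new inputs.
print(student("I love this model"))  # "positive", same as the teacher
```

So the worry about compounding errors is real in principle (the student inherits the teacher's mistakes), but careful filtering of the synthetic data is what keeps it from getting worse than the teacher.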
1) What kind of hardware do I need to own to run the full-size model at full speed?
2) What kind of hardware do I need to refine it further? (speed doesn't matter much there)
hey guys, how do you run these big models on a regular gaming pc? I tried the llama (I think 40B), but my pc almost burned down 😅 (ryzen7, rtx4070, 32gb ram)
At some point they will stop open sourcing it. Especially after hitting the petaflop limit set by the US government.
The day this guy came up with meta, then i knew zuck is not the same advert sucking monger anymore and the change is for good. Like his new style.. rocking it- keep up!