Mixture of Agents TURBO Tutorial 🚀 Better Than GPT4o AND Fast?!

  • Published Jun 30, 2024
  • Here's how to use Mixture of Agents with Groq to achieve not only incredible output quality thanks to MoA, but also to solve MoA's latency issue using Groq's fast inference.
    Check out Groq for Free: www.groq.com
    UPDATE: You don't need a valid OpenAI API key for this tutorial.
    Be sure to check out Pinecone for all your Vector DB needs: www.pinecone.io/
    Join My Newsletter for Regular AI Updates 👇🏼
    www.matthewberman.com
    Need AI Consulting? 📈
    forwardfuture.ai/
    My Links 🔗
    👉🏻 Subscribe: / @matthew_berman
    👉🏻 Twitter: / matthewberman
    👉🏻 Discord: / discord
    👉🏻 Patreon: / matthewberman
    👉🏻 Instagram: / matthewberman_ai
    👉🏻 Threads: www.threads.net/@matthewberma...
    👉🏻 LinkedIn: / forward-future-ai
    Media/Sponsorship Inquiries ✅
    bit.ly/44TC45V
    Links:
    github.com/togethercomputer/MoA
  • Science & Technology

Comments • 173

  • @matthew_berman
    @matthew_berman  7 days ago +32

    Is this the ultimate unlock for open source models to compete with closed source?

    • @punk3900
      @punk3900 7 days ago +2

      Groq is amazing... I read their comments on Nvidia, and there is clearly huge potential in changing the architecture for LLM ASICs. Yet Nvidia would rather sell one chip to rule them all than butcher their SOTA universal chip. My last thought is that Nvidia is surely working on an LLM-specific chip in secret and will show it once the competition becomes real. Thanks Matt for sharing your findings.

    • @lucindalinde4198
      @lucindalinde4198 7 days ago +1

      @matthew_berman
      Great video

    • @jeffg4686
      @jeffg4686 7 days ago

      And they're STILL not talking about socialism yet...

    • @dahahaka
      @dahahaka 7 days ago

      Hey, what is that display that you have in the background? The one showing different animations, including snake?

    • @olafge
      @olafge 6 days ago

      I wonder how much the token count increase diminishes the cost efficiency against frontier models. It would be good to add tiktoken to the code.
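      The token accounting suggested above can be sketched in a few lines. This is only an estimate: tiktoken's `cl100k_base` encoding approximates, but does not match, the Llama tokenizers Groq serves, and the fallback heuristic plus the `estimate_cost` helper are assumptions for illustration (fill in your provider's current per-million-token rates yourself).

      ```python
      # Rough token accounting for MoA runs, per the suggestion above.
      # cl100k_base only approximates Groq's Llama tokenizers, so all
      # counts here are estimates.
      try:
          import tiktoken
          _enc = tiktoken.get_encoding("cl100k_base")

          def count_tokens(text: str) -> int:
              return len(_enc.encode(text))
      except Exception:
          # Fallback when tiktoken (or its cached encoding file) is missing:
          # a crude ~4-characters-per-token heuristic.
          def count_tokens(text: str) -> int:
              return max(1, len(text) // 4)

      def estimate_cost(prompt: str, completion: str,
                        in_price_per_m: float, out_price_per_m: float) -> float:
          """Estimated USD cost; prices are per million tokens (look up
          your provider's current rates -- they change often)."""
          return (count_tokens(prompt) * in_price_per_m +
                  count_tokens(completion) * out_price_per_m) / 1e6
      ```

      Summing `count_tokens` over every proposer layer's prompts and outputs, not just the final answer, shows how quickly MoA multiplies token usage compared to a single frontier-model call.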

  • @3dus
    @3dus 7 days ago +41

    This is a serious opportunity for Groq to just deploy this transparently for users. They could have a great competitor to the frontier models.

    • @wurstelei1356
      @wurstelei1356 7 days ago +4

      Yeah, Groq should host this MoA from Matt. That would be great.

  • @MilesBellas
    @MilesBellas 7 days ago +29

    T-shirt merchandise:
    "I am going to revoke these API keys"
    😂😅

    • @matthew_berman
      @matthew_berman  7 days ago +4

      Such a good idea!!!

    • @nikhil_jadhav
      @nikhil_jadhav 7 days ago

      @@matthew_berman Ever since I saw the keys exposed, I was just waiting for you to say that line. Once you said it, I felt relieved.

    • @punk3900
      @punk3900 7 days ago

      @@matthew_berman It was kind of cruel to only mention it the second time you showed those keys :D

    • @matthew_berman
      @matthew_berman  7 days ago +1

      @@nikhil_jadhav Lol!! I'll mention it as soon as I show them next time ;)

  • @starmap
    @starmap 7 days ago +18

    I love that open source local models are so powerful that they can compete with the giants.

    • @mikezooper
      @mikezooper 7 days ago

      Not really though. If you have ten small engines driving a car, that doesn't mean one of those engines is impressive.

    • @StemLG
      @StemLG 7 days ago +5

      ​@@mikezooper Sure, but you're missing the fact that those engines are free.

    • @wurstelei1356
      @wurstelei1356 7 days ago +1

      @@mikezooper Also, OpenAI is using something similar to MoA.

    • @TheRealUsername
      @TheRealUsername 7 days ago

      It's just that the proprietary models have hundreds of billions of parameters, compared to open source models, which are 3B-70B.

    • @omarhabib7411
      @omarhabib7411 6 days ago +1

      @@TheRealUsername When is Llama 3 400B coming out?

  • @nomad1220
    @nomad1220 7 days ago +5

    Hey Matthew - love your vids - they are tremendously informative.

  • @artnikpro
    @artnikpro 7 days ago +13

    I wonder how good it would be with Claude 3.5 Sonnet + GPT-4o + Gemini 1.5 Pro

    • @punk3900
      @punk3900 7 days ago

      But Groq will not run those. It can only run open source models, as they just provide the infrastructure.

    • @user-ku6oq8cn6m
      @user-ku6oq8cn6m 7 days ago

      ​ @punk3900 It is true Groq will not run them. However, the MoA code already seems to let you run any cloud LLM with an OpenAI-formatted endpoint. And there are solutions already available to turn most cloud LLMs into an OpenAI-formatted endpoint (in the cases where one is not already provided). Personally, I don't care if it is really slow (much slower than Groq). I still want to try combining a mixture of the best models (including proprietary cloud LLMs) already out there.

  • @scitechtalktv9742
    @scitechtalktv9742 7 days ago +3

    Great! I will certainly try this out.

  • @jackflash6377
    @jackflash6377 7 days ago +2

    Thanks for the clear and concise instructions.
    Worked flawlessly the first time.
    Now we just need a UI to work with it, including an artifacts window of course.

  • @titusblair
    @titusblair 3 days ago

    Great tutorial, thanks so much Matthew!

  • @MrMoonsilver
    @MrMoonsilver 7 days ago +12

    Hey Matt, remember GPT-Pilot? Apart from revisiting the repo (it's done amazingly well), there is an interesting use case in relation to MoA. Remember how GPT-Pilot calls an API for its requests? Wouldn't it be interesting to see how it performs if it were to call a MoA "API"? It would require exposing the MoA as an API, but it would be very interesting to see, as it would enable developers to piece together much cheaper models to achieve great outcomes, the likes of 3.5 Sonnet.

    • @matthew_berman
      @matthew_berman  7 days ago +2

      Good idea. I've seen CrewAI powering a coding AI project which looked interesting!

    • @14supersonic
      @14supersonic 7 days ago +1

      This would be perfect for agentic wrappers like Pythagora, Devin, or Open-Devin. Paying for those expensive APIs for the frontier models is not always the best option for most end users, especially when you're working with lots of experimental data. This could be something that's relatively simple to implement, too.

    • @MrMoonsilver
      @MrMoonsilver 7 days ago +1

      @@14supersonic It might even be more accurate than the standard APIs.

    • @wurstelei1356
      @wurstelei1356 7 days ago +1

      @@MrMoonsilver Plus the privacy is much higher.

    • @frinkfronk9198
      @frinkfronk9198 6 days ago

      @@wurstelei1356 Privacy is definitely better as long as it's fully local. Running Groq is still cloud. Not that you were suggesting otherwise; I just mean that generally, people in this conversation should be aware.
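      The idea in this thread, putting MoA behind an OpenAI-formatted endpoint so tools like GPT-Pilot can call it, can be sketched with just the standard library. `run_moa` is a hypothetical placeholder for the actual proposer/aggregator pipeline; the route and response shape follow the OpenAI chat-completions convention.

      ```python
      # Sketch: expose an MoA pipeline as an OpenAI-style endpoint (stdlib only).
      # `run_moa` is a stand-in -- wire in your real proposer/aggregator calls.
      import json
      from http.server import BaseHTTPRequestHandler, HTTPServer

      def run_moa(messages):
          # Placeholder: call proposer models, then the aggregator, here.
          return "MoA answer for: " + messages[-1]["content"]

      class MoAHandler(BaseHTTPRequestHandler):
          def do_POST(self):
              if self.path != "/v1/chat/completions":
                  self.send_error(404)
                  return
              length = int(self.headers["Content-Length"])
              body = json.loads(self.rfile.read(length))
              answer = run_moa(body["messages"])
              payload = {
                  "object": "chat.completion",
                  "model": body.get("model", "moa"),
                  "choices": [{"index": 0,
                               "message": {"role": "assistant", "content": answer},
                               "finish_reason": "stop"}],
              }
              data = json.dumps(payload).encode()
              self.send_response(200)
              self.send_header("Content-Type", "application/json")
              self.send_header("Content-Length", str(len(data)))
              self.end_headers()
              self.wfile.write(data)

      # To serve locally (blocks forever):
      # HTTPServer(("127.0.0.1", 8000), MoAHandler).serve_forever()
      ```

      Any client that speaks the OpenAI chat format could then be pointed at `http://127.0.0.1:8000/v1` as if it were a hosted model.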

  • @wardehaj
    @wardehaj 7 days ago +1

    Awesome instruction video. Thanks a lot!

  • @RosaLei
    @RosaLei 1 day ago

    Stoked you're using Groq! 🙌 The speed is mind-blowing! IMO, the quality of the results has been useful to me.

  • @vikastripathiindia
    @vikastripathiindia 4 days ago

    Thank you. This is brilliant!

  • @AhmedMagdy-ly3ng
    @AhmedMagdy-ly3ng 7 days ago

    Wow 🤯
    I really appreciate your work, keep going pro ❤

  • @Dmitri_Schrama
    @Dmitri_Schrama 7 days ago +3

    Sir, you are a legend.

  • @MrBademy
    @MrBademy 5 days ago

    This setup is actually worth it... good video man :)

  • @MrLorde76
    @MrLorde76 7 days ago

    Nice, went smoothly.

  • @punk3900
    @punk3900 7 days ago +4

    I wonder how asking the same model multiple times with different temperatures might work. For instance, you ask the hot head, you ask the cold head, and a medium head integrates this. I think most LLMs would hate it this way, but it's clearly the future that we will have unified frameworks with several LLMs asked several times and some model integrating their answers. No single model can compensate for integrating data from several sources.
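    The hot/cold/medium idea above can be prototyped in a few lines. This is a sketch of the commenter's suggestion, not anything from the video: `complete` is an injectable completion function (any OpenAI-compatible client pointed at Groq's endpoint would fit), and `fake_complete` is an offline stub so the control flow can be exercised without an API key.

    ```python
    # Sketch: query one model at several temperatures ("cold", "medium",
    # "hot" heads), then have an aggregator call merge the drafts.
    AGGREGATE_PROMPT = (
        "You are given several candidate answers produced at different "
        "sampling temperatures. Merge them into one best answer.\n\n{drafts}"
    )

    def temperature_ensemble(prompt, complete, temps=(0.0, 0.7, 1.2)):
        """Ask the same model at several temperatures, then aggregate."""
        drafts = [complete(prompt, temperature=t) for t in temps]
        numbered = "\n".join(f"{i + 1}. {d}" for i, d in enumerate(drafts))
        # The "medium head" integrates the hot and cold drafts.
        return complete(AGGREGATE_PROMPT.format(drafts=numbered), temperature=0.3)

    def fake_complete(prompt, temperature):
        """Offline stub standing in for a real OpenAI-compatible client."""
        return f"[t={temperature}] {prompt[:20]}"
    ```

    Swapping `fake_complete` for a real client function (for example one built on Groq's OpenAI-compatible API) turns this into a one-model, multi-temperature mini-MoA.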

  • @manuelbradovent3562
    @manuelbradovent3562 5 days ago

    Great video, Matthew! Very useful. Previously I had problems using CrewAI and max tokens with Groq and will have to check how to resolve it.

  • @labrats-AI
    @labrats-AI 7 days ago +1

    Groq is awesome!

  • @kevinduck3714
    @kevinduck3714 7 days ago +1

    Groq is an absolute marvel.

  • @EROSNERdesign
    @EROSNERdesign 7 days ago

    AMAZING AI NEWS!

  • @Pthaloskies
    @Pthaloskies 7 days ago +2

    Good idea, but we need to know the cost comparison as well.

  • @juanpasalagua2402
    @juanpasalagua2402 6 days ago

    Fantastic!

  • @nexusphreez
    @nexusphreez 7 days ago +1

    This is awesome. What would be even better is getting a GUI set up for this so that it can be used more for coding. I may try this later.

  • @drlordbasil
    @drlordbasil 7 days ago +3

    We need to start comparing things to Claude 3.5 Sonnet too!
    But I love the MoA concept.

  • @zeeveener
    @zeeveener 7 days ago +2

    All of these enhancements could be something you contribute back to the project in the form of configuration. It would make it a lot easier for the next wave of users.

  • @danberm1755
    @danberm1755 6 days ago

    Thanks much! I might actually give that a try considering you did all the heavy lifting.
    Seems like AI orchestrators such as CrewAI or LangChain should be able to do this as well.

  • @marcusk7855
    @marcusk7855 5 days ago

    Wow. That is good.

  • @robboerman9378
    @robboerman9378 7 days ago +1

    If Matthew with his insane local machine can't compete, I am convinced Groq is the way to go for MoA. Super interesting to see how it nails the most difficult task in the rubric, and fast! ❤

  • @punk3900
    @punk3900 7 days ago +2

    Oh boy, Groq's inference time is truly impressive. Like most people, however, I thought you were talking about Grok. Groq is in fact mostly just Llama on steroids. It's a pity they can't offer larger models so far. But seeing how Groq works gives a glimpse of the speed of LLM chatbots in a year or two.

    • @matthew_berman
      @matthew_berman  7 days ago +4

      Llama 405B, I assume, is coming.

    • @jewlouds
      @jewlouds 7 days ago +1

      @@matthew_berman I hope you are 'assuming' correctly!

  • @KCM25NJL
    @KCM25NJL 5 days ago

    Little tip:
    conda create -n <env> python=<version> && conda activate <env>

  • @BigBadBurrow
    @BigBadBurrow 7 days ago +4

    Just glancing at your headshot, I thought you were wearing a massive chain, like a wideboy from the Sopranos. Then I realised it's just the way your T-shirt is folded 😂

  • @IntelliAmI
    @IntelliAmI 3 days ago

    Matthew, how are you? Groq hosted another model, and I inserted it into this version of MoA that you taught us. Now, with 5 LLMs at the same time. The new model is gemma2-9b-it.

  • @thays182
    @thays182 7 days ago

    Need the follow-up though. What is Mixture of Agents? How can we use it? Do we get to edit the agentic framework and structure ourselves? What possibilities now exist with this tool? I need moooore! (Amazing video and value, I never miss your posts!)

  • @burnt1ce85
    @burnt1ce85 7 days ago +9

    The title of your video is misleading. Your tutorial shows how to set up MoA with Groq, but you haven't demonstrated that it's "Better Than GPT4o AND Fast". Why didn't you test the MoA against your benchmarks?

    • @hotbit7327
      @hotbit7327 7 days ago

      Exactly. He likes to exaggerate and sometimes mislead, sadly.

    • @tonyclif1
      @tonyclif1 5 days ago

      Did you see the question mark after the word "fast"? Sure, a little clickbait, but it also seems you misread it.

  • @GodFearingPookie
    @GodFearingPookie 7 days ago +11

    Groq? We love local LLMs.

    • @matthew_berman
      @matthew_berman  7 days ago +11

      Yes, but you can't achieve these speeds with local AI.

    • @InsightCrypto
      @InsightCrypto 7 days ago +1

      @@matthew_berman Why not try that on your supercomputer?

    • @Centaurman
      @Centaurman 7 days ago +1

      Hi Matthew, if someone wanted to build a home server on a 5-grand budget, do you reckon a dual 3090 setup could do it?
      If not, how might a determined enthusiast make this fully local?

    • @HansMcMurdy
      @HansMcMurdy 7 days ago

      You can use local language models, but unless you are using a custom ASIC, the speed will be reduced substantially.

    • @shalinluitel1332
      @shalinluitel1332 7 days ago

      @@matthew_berman Any local, free, and open-source models which have the fastest inference time? Which is the fastest so far?

  • @vauths8204
    @vauths8204 7 days ago

    See, now we need that but uncensored.

  • @braineaterzombie3981
    @braineaterzombie3981 7 days ago +2

    What if we use a mixture of giant models like 4o with Claude Opus and Sonnet 3.5 and other state-of-the-art models combined with Groq?

  • @AlfredNutile
    @AlfredNutile 7 days ago +1

    Just a thought: if you fork the repo and upload your changes, we could just download your fork and try it out?

  • @psychurch
    @psychurch 7 days ago +7

    Not all of those Apple sentences make sense Apple.

    • @oguretsagressive
      @oguretsagressive 7 days ago +2

      Sadly, even my favorite Llama 3 botched sentence #4. Maybe this test should specify that the output should be grammatically correct? Or make sense? Apple.

    • @wurstelei1356
      @wurstelei1356 7 days ago

      @@oguretsagressive A valid sentence has to meet certain criteria. The AI should keep track of this and not just output blah blah Apple,
      even if you don't explicitly tell it to produce valid sentences ending with Apple.

  • @l0ltaha
    @l0ltaha 7 days ago

    Hey @matthew_berman, how would I go about having this whole process in a Docker container, or have it as an API endpoint so that I can connect Groq's speech-to-text and have the returned text passed to the prompt of MoA? Thanks, and love the vids!

  • @Pregidth
    @Pregidth 7 days ago +1

    How many tokens are used for calling the OpenAI API? It would be wonderful if you could show how to leave OpenAI out. And a full benchmark test please. Thanks Matthew!

  • @DavidJNowak
    @DavidJNowak 6 days ago

    Excellent explanation of how to write code that uses Groq as a manager of a mixture of agents. But you just went too fast for me to catch all the changes that make it work. Could you write this up in your newsletter or create a video for the slower programming types? Thanks again. Groq AI is making programming more accessible for non-power users like most of us.

  • @nikhil_jadhav
    @nikhil_jadhav 7 days ago

    Trying it right away!! Thank you very much. I wonder: what if I expose my personal mixture of agents to someone else and they use my model as their primary model? Or thousands of such models interconnected with each other, a mesh of models looped within themselves. What would happen?

  • @Ha77778
    @Ha77778 7 days ago +1

    I love that, I hate OpenAI 😅

  • @positivevibe142
    @positivevibe142 7 days ago +1

    What is the best local, private RAG app for AI models?

  • @kai_s1985
    @kai_s1985 6 days ago +1

    The biggest limitation of Groq is the API rate limit. After some use, you cannot use it anymore.

  • @KeithBofaptos
    @KeithBofaptos 7 days ago

    I've been thinking about this approach also. Very helpful vid. Thanks 🙏🏻.
    I'm curious: on top of MoA, would MCTSr get the answer closer to 💯? And once Sohu comes online, how awesome are those speeds gonna be?!

    • @KeithBofaptos
      @KeithBofaptos 7 days ago

      This is also interesting:
      www.nature.com/articles/s41586-024-07421-0.pdf

  • @nikhil_jadhav
    @nikhil_jadhav 7 days ago

    Just wondering, how can I use this Groq setup in Continue?

  • @wholeness
    @wholeness 7 days ago

    Quietly, this is what Sonnet 3.5 is and the Anthropic secret. That's why the API doesn't work well when using so much function calling.

    • @Kutsushita_yukino
      @Kutsushita_yukino 7 days ago

      Where in the heck did you hear this rumor?

    • @blisphul8084
      @blisphul8084 7 days ago

      If that were the case, streaming tokens would not work so well. Though having multiple models perform different tasks isn't a bad idea. That's probably why there's a bit of delay when starting an artifact in Claude.

  • @vash2698
    @vash2698 7 days ago

    Any idea if this could be used as a sort of intermediary endpoint? As in, I point to a local machine hosting this script as though it were a locally hosted model. If this can perform on par with or better than GPT-4o at this kind of speed, then it would be incredible to use as a voice assistant in Home Assistant.

  • @engineeranonymous
    @engineeranonymous 7 days ago

    When YouTubers have better security than Rabbit R1: he revokes API keys.

  • @42svb58
    @42svb58 7 days ago

    How does this compare when there is RAG with structured and unstructured data?

  • @paul1979uk2000
    @paul1979uk2000 7 days ago

    I'm wondering, have any tests been done with much smaller models, where you run 2 or 3 locally on your own hardware, to see if it improves quality over any of the 2 or 3 on their own?
    I ask because with how APUs are developing, dedicating 20, 30, 40 GB or more to AI use wouldn't be that big of a deal with how cheap memory is getting.

  • @sammathew535
    @sammathew535 7 days ago

    Can I make an "API" call to this MoA and use it, say, with DSPy? Have you ever considered making a tutorial on DSPy?

  • @vikastripathiindia
    @vikastripathiindia 4 days ago

    Can we use OpenAI as the main engine along with Groq?

  • @jay-dj4ui
    @jay-dj4ui 7 days ago

    Nice AD

  • @techwiththomas5690
    @techwiththomas5690 6 days ago

    Can you explain how these many layers of models actually know HOW to produce the best possible answer? How do they know which answer is better or more correct?

  • @shrn680
    @shrn680 4 days ago

    Would there be a way to integrate this with a front end like Open WebUI?

  • @millerjo4582
    @millerjo4582 7 days ago +2

    Is there any chance you're gonna look into the new algorithmic way to produce LLMs? It's supposedly a transformers killer; I would think that would be really relevant to viewers.

    • @matthew_berman
      @matthew_berman  7 days ago +2

      Name?

    • @millerjo4582
      @millerjo4582 6 days ago

      @@matthew_berman Also, thank you so much for responding. Ridgerchu/Matmulfree

    • @millerjo4582
      @millerjo4582 6 days ago

      @@matthew_berman I don't know if you got that... it looks like the comments were struck.

  • @dr.ignacioglez.9677
    @dr.ignacioglez.9677 7 days ago

    Ok 🎉

  • @consig1iere294
    @consig1iere294 7 days ago

    How is this any different from Autogen or CrewAI?

  • @husanaaulia4717
    @husanaaulia4717 7 days ago

    We've got MoE, MoA, CoE. Is there anything else?

  • @ollibruno7283
    @ollibruno7283 7 days ago

    But it can't process pictures?

  • @piotr780
    @piotr780 4 days ago

    Why use conda and not pip?

  • @4.0.4
    @4.0.4 7 days ago +1

    "4. The old proverb says that eating an apple a day keeps the doctor away apple."
    🙃

  • @4NowIsGood
    @4NowIsGood 7 days ago +2

    Interesting, but I don't know WTF you're doing; it just looks great. Unless there's an easier setup and install, for me right now it's easier to just use ChatGPT.

    • @FriscoFatseas
      @FriscoFatseas 7 days ago

      Yeah, I'm tempted, but by the time I get this working, GPT-4 will get a random update and be better lol

  • @rinokpp1692
    @rinokpp1692 7 days ago

    What's the cost of running this with a one million token context of input and output?

  • @badomate3087
    @badomate3087 7 days ago

    Has anyone created a comparison between MoA and other multi-agent systems that can utilize LLMs (like Autogen)? Because to me, this looks exactly like an Autogen network with a few simplifications, like no code running and no tool or non-LLM agent usage.
    So if this is no better, or even worse, than Autogen, then it might not be worth using, since Autogen has a lot more features (like the code running, which was mentioned in the last video).
    Also, the results compared to a single (but much bigger) LLM look kind of obvious to me. The last model receives a lot of proposed outputs next to the prompt, and it only has to filter the best ones. This task is a lot easier than generating the correct answer on the first try, from the prompt alone. And since this is the base idea behind MoA, the results are to be expected.

  • @harshitdubey8673
    @harshitdubey8673 6 days ago +1

    I tried asking MoA:
    if A=E, B=F, C=G, D=H
    then E=?
    It got it wrong 😂
    MoA's answer: "I"
    But it's amazing 🤩

    • @wrOngplan3t
      @wrOngplan3t 6 days ago

      That would be my answer as well. What's wrong about that?

    • @harshitdubey8673
      @harshitdubey8673 6 days ago +1

      @@wrOngplan3t E is predefined.

    • @harshitdubey8673
      @harshitdubey8673 6 days ago

      Logic may not always be a sequence; it could be a circle ⭕️ sometimes.

    • @wrOngplan3t
      @wrOngplan3t 6 days ago

      @@harshitdubey8673 Ah okay, well there's that 🙂 Thanks!

  • @mohl-bodell2948
    @mohl-bodell2948 7 hours ago

    Could you put your version in a GitHub repo?

  • @Officialsunshinex
    @Officialsunshinex 7 days ago +1

    Brave browser local LLM test o.o

  • @D0J0Master
    @D0J0Master 4 days ago

    Is Groq censored?

  • @ErickJohnson-qx8tb
    @ErickJohnson-qx8tb 7 days ago +1

    all about AI ragmodel running uncensored v2w/ MOA using groq MOA LIBRABRY blackfridays gpts library ENOUGH SAID YOUR WELCOME I WOTE MY OWN API KEY ON OPEN GUI i built LOLOLOL

  • @rudomeister
    @rudomeister 7 days ago

    That's why (given small agents vs. response time) Microsoft especially, along with all the others, has whole datacenters trying to find out how a swarm of millions of small agents can work seamlessly. What else would it be? A giant multi-trillion-parameter model named Goliath? Haha

  • @GoofyGuy-WDW
    @GoofyGuy-WDW 7 days ago +1

    Groq? Sounds like Grok, doesn't it?

    • @blisphul8084
      @blisphul8084 7 days ago

      Groq had the name first. Blame Elon Musk.

  • @Dave-nz5jf
    @Dave-nz5jf 7 days ago +1

    There are so many advances coming so fast with all of this, I wonder if the real value in your content is the rubric. Or, more accurately, improving your rubric. Right now I think it's barely version 1.0, and it needs to be v5.0. And try adding a medical or legal question, for gosh sakes.

  • @ryanscott642
    @ryanscott642 6 days ago

    This is cool, but can you write some real multi-document code with these things? I don't need to make 10 sentences ending in apple. Most of the things I've tried so far can't write code, and I struggle to figure out their use.

  • @BizAutomation4U
    @BizAutomation4U 7 days ago +1

    I just read that Groq no longer wants to sell cards directly for $20K, but instead wants to offer a SaaS model? This seems to contradict the benefits of running LLMs locally for privacy reasons, because now you're sending tokens out to a 3rd-party web service. I don't know why this is an either/or decision. LOTS of SMBs can afford to invest $20K in hardware. The total outlay for a serious LLM rig would have been something like $30K, which is barely half a year's salary for an entry-level position. A bad move I say, but the good news is there will be competitors that correct this decision if Groq doesn't, and soon!

    • @cajampa
      @cajampa 7 days ago

      Dude, you have misunderstood how Groq works. Look into the details: you need maaaaaany cards to be able to fit a model. It is fast because there is very little but very fast memory on every card.
      So you need a lot of cards to fit anything useful, but then you can batch-run requests at crazy speeds against those servers.

    • @BizAutomation4U
      @BizAutomation4U 7 days ago +1

      Ok... What about the whole privacy thing, which is the reason people want to run local LLMs? If there is an iron-clad way to prove to most people that using a Groq API for inference is not going to risk sharing data with a 3rd party, you might have a great business case (it's too deep for me technically to know); otherwise you end up with a different dog with the same fleas.

    • @blisphul8084
      @blisphul8084 7 days ago +1

      ​@@cajampa Yeah, it seems that's the reason they offer very few models. It'd be great if you could host other models on Groq, like Qwen2, as well as any fine-tunes that you'd want to use, like Magnum or Dolphin model variants.

    • @cajampa
      @cajampa 6 days ago

      @@BizAutomation4U If a business wants to run Groq because they need the speed they can offer, I am pretty sure Groq can offer them an isolated instance of the servers for the right price. Groq was never about consumers running local LLMs. The hardware is just not catered to this use case in any way.

    • @cajampa
      @cajampa 6 days ago

      @@blisphul8084 I'd say the same to you: if a business wants to run Groq with their choice of models, I am pretty sure Groq can offer it to them for the right price.

  • @user-td4pf6rr2t
    @user-td4pf6rr2t 7 days ago

    This is called Narrow AI.

  • @hqcart1
    @hqcart1 7 days ago

    Dude, all your videos talk about beating GPT-4o, and we haven't seen any!

  • @sanatdeveloper
    @sanatdeveloper 7 days ago

    First 😊

  • @tamelo
    @tamelo 7 days ago

    Groq is terrible, worse than GPT-3.
    Why do you keep shilling for it?

    • @matthew_berman
      @matthew_berman  7 days ago +2

      Groq isn't a model, it's an inference service. They have multiple models and offer speeds and prices that are far better than anyone else's.
      I really like Groq.

    • @TheAlastairBrown
      @TheAlastairBrown 7 days ago

      There are two different companies/products. One is Groq and the other is Grok. The one spelled with a "q" is what Matt is talking about; they are essentially a server farm designed to run 3rd-party open source LLMs quickly so you can cheaply offload computation. The one spelled with a "k" is Elon Musk/X's version of ChatGPT.

  • @christosmelissourgos2757
    @christosmelissourgos2757 7 days ago

    Honestly, Matthew, why do you advertise this? It has been months and we are still stuck with their free package, with a rate limit that lets you bring no app to production. A waste of time and integration.

  • @LeandroMessi
    @LeandroMessi 7 days ago

    Second

  • @Tubernameu123
    @Tubernameu123 7 days ago +1

    Groq is too filtered/censored... too shameful, not courageous... too weak, impotent...

  • @Heisenberg2097
    @Heisenberg2097 7 days ago +5

    Groq is nowhere near ChatGPT or Claude... and all of them need a lot of attention and are far away from ASI. There is currently only SUPER-FLAWED and SUPER-OVERRATED.

    • @greenstonegecko
      @greenstonegecko 7 days ago +1

      I'd need to see a benchmark for proof.
      These models are super nuanced. They might score a 0/10 on task A, but a 9/10 on task B.
      You can't generalize to "they suck".
      These models can already pass the Turing Test. People cannot differentiate ChatGPT 3.5 from actual humans 54% of the time.

    • @ticketforlife2103
      @ticketforlife2103 7 days ago

      Lol, they are far away from AGI, let alone ASI.

    • @lancemarchetti8673
      @lancemarchetti8673 7 days ago +4

      Groq is not an LLM as such; it's a booster for AI models, like a turbo switch to get results faster. By loading your model onto Groq, you save around 80% of the time you would have spent without it.

    • @Player-oz2nk
      @Player-oz2nk 7 days ago

      @@lancemarchetti8673 Thank you lol, I was coming to say this.

    • @4.0.4
      @4.0.4 7 days ago +4

      This is like someone saying a car dealership is nowhere near the performance of a Honda Civic in drag racing. It only communicates that you're a bit new to this.

  • @annwang5530
    @annwang5530 7 days ago +2

    You are gaining weight?

    • @Kutsushita_yukino
      @Kutsushita_yukino 7 days ago

      Are you his parents?

    • @AI-Rainbow
      @AI-Rainbow 7 days ago

      Is that any of your business?

    • @blisphul8084
      @blisphul8084 7 days ago

      He didn't criticize, just pointed it out. Better to know early than late, while it's easier to fix.

    • @annwang5530
      @annwang5530 7 days ago +1

      @@blisphul8084 Yeah, today pointing out anything is seen as an attack, cuz glass society.

  • @finalfan321
    @finalfan321 2 days ago

    Too technical, too much effort, unfriendly interface.

  • @flb5078
    @flb5078 7 days ago

    As usual, too much coding...

    • @TheAlastairBrown
      @TheAlastairBrown 6 days ago

      Copy the GitHub files and the transcript from this video into Claude. Tell it to follow the transcript and try to create what Matt is doing.

  • @ManjaroBlack
    @ManjaroBlack 6 days ago

    I finally unsubscribed. Don't know why it took me so long.

  • @onewhoraisesvoice
    @onewhoraisesvoice 7 days ago

    @matthew_berman
    Attention, you didn't revoke the keys before publishing this video!

  • @fnice1971
    @fnice1971 9 hours ago

    How about MoA + OpenRouter?