Google’s NEW Open-Source Model Is SHOCKINGLY BAD

  • Published Jul 27, 2024
  • Sorry for the title. I couldn't help myself. I'm proud of Google for releasing a completely open-source model to the world, but it's not good. How bad is it? Let's find out!
    Enjoy :)
    Join My Newsletter for Regular AI Updates 👇🏼
    www.matthewberman.com
    Need AI Consulting? ✅
    forwardfuture.ai/
    Rent a GPU (MassedCompute) 🚀
    bit.ly/matthew-berman-youtube
    USE CODE "MatthewBerman" for 50% discount
    My Links 🔗
    👉🏻 Subscribe: / @matthew_berman
    👉🏻 Twitter: / matthewberman
    👉🏻 Discord: / discord
    👉🏻 Patreon: / matthewberman
    Media/Sponsorship Inquiries 📈
    bit.ly/44TC45V
    Links:
    huggingface.co/google/gemma-7...
    blog.google/technology/develo...
    bit.ly/3qHV0X7
    lmstudio.ai/
    huggingface.co/chat/
    Chapters:
    0:00 - It started badly...
    0:53 - All about Gemma
    7:23 - Quick Note on Gemini 1.5
    9:56 - Gemma Setup with LMStudio
    11:51 - Gemma Testing with LMStudio
    20:58 - Gemma Testing with HuggingFace
    Disclosures:
    I'm an investor in LMStudio
  • Science & Technology

Comments • 528

  • @ALFTHADRADDAD · 5 months ago +237

    Google SHOCKS and STUNS the Open source landscape

    • @matthew_berman · 5 months ago +87

      I should have used this title

    • @TechRenamed · 5 months ago +10

      Lol we all should have!!

    • @mickelodiansurname9578 · 5 months ago +8

      @@matthew_berman I thought at one stage you were literally going to start slapping your forehead off the keyboard!

    • @andersonsystem2 · 5 months ago +6

      Why do most AI tech channels use that title 😂 I just don't pay attention to titles like that lmao 😂😊

    • @Kutsushita_yukino · 5 months ago +11

      its a meme at this point

  • @Batmancontingencyplans · 5 months ago +106

    Gemma 7b makes you realise how much compute Google is using just to output "Sorry, I can't fulfill that request" 🤣

    • @MilkGlue-xg5vj · 5 months ago

      LMFAO

    • @markjones2349 · 4 months ago

      So true. Uncensored models are just more fun.

    • @MilkGlue-xg5vj · 4 months ago

      @@markjones2349 you're talking as if the point of uncensored llms is fun rofl lmfao xd you're just makin' it funnier 🤣

  • @bits_of_bryce · 5 months ago +88

    Well, I'm never trusting benchmarks without personal testing again.

    • @richoffks · 5 months ago +9

      sorry you had to learn this way

    • @wilburdemitel8468 · 5 months ago

      welcome to real life. Can't wait for you to leave the fantasyland bubble all these tech aibros have built around you.

  • @Lukebussnick · 5 months ago +101

    My funniest experience with Gemini Pro: I asked it to make a humorous image of a cartoon cat pulling the toilet paper off the roll. It told me that it ethically couldn't, because the cat could ingest the toilet paper and it could cause an intestinal blockage 😂

    • @laviniag8269 · 5 months ago +3

      Hysterical

    • @matikaevur6299 · 5 months ago +1

      @@laviniag8269
      But true...

    • @MilkGlue-xg5vj · 5 months ago +5

      Maybe a cat could see the image and do the same

    • @Lukebussnick · 5 months ago +4

      @@MilkGlue-xg5vj haha yeah that would be a real nuisance. But then again, that’s one smart cat. What other potential could that cat have?? 🧐

    • @MilkGlue-xg5vj · 5 months ago +10

      @@Lukebussnick Maybe it could become an ai dev at Google

  • @zeal00001 · 5 months ago +44

    In other words, there are now LLMs with mental challenges as well...

    • @AlexanderBukh · 5 months ago +8

      That's inclusive, alright.

    • @StriderAngel496 · 5 months ago

      And diverse @@AlexanderBukh

  • @drgutman · 5 months ago +50

    I'm pretty sure they lobotomized it in the alignment phase :)))

    • @hikaroto2791 · 5 months ago +7

      To the point they took the lobotomy fragment and used it in place of the brain, and trashed the actual brain. Not only on models, but on personnel probably

  • @bigglyguy8429 · 5 months ago +133

    You can't gimp the model with excessive censorship, and also have an intelligent model.

    • @aoolmay6853 · 5 months ago +22

      These are not open models, these are woke models, appropriately liberal.

    • @eIicit · 5 months ago +2

      To a point, I agree.

    • @madimakes · 5 months ago +3

      The nature of the errors here seem irrelevant to being censored or not.

    • @bigglyguy8429 · 5 months ago +11

      @@madimakes No, the censorship sucks up so much of its thinking that there's little left to actually answer. You can ask the most banal question, but it sits there thinking long and hard about whether there's any way it could possibly be offensive to the woke. Considering the woke are offended by everything, that's a yes, so it has to work its way around that; then it needs to figure out if its own reply is offensive (yes, everything is), so it has to find a way around that as well. Often it will fail and say "I'm afraid I can't do that... Dave." Other times it will try, but the answer is so gimped and pathetic you'd have been better off asking your cat.

    • @mickmoon6887 · 5 months ago +1

      Exactly.
      The model's network design is gimped by the creator developers themselves: the head of Google AI holds openly biased, ideological, anti-white, and heavily pro-censorship views, all documented in their online record, and those biases are reflected in the model.

  • @chriscarr9852 · 5 months ago +138

    This is entirely speculation on my part, but I am guessing Google’s AI effort is largely driven by their PR team. A proper engineering team would never release this kind of smoke and mirrors crap. Right?

    • @chriscarr9852 · 5 months ago +23

      They have tarnished their brand. It will be interesting to see what happens in the next few years with regard to Google. (I do not have any financial interest in google).

    • @mistersunday_ · 5 months ago +7

      Yeah, they are the wrong kind of hacks now

    • @alttabby3633 · 5 months ago +12

      Or the engineering team knows this will be killed off regardless of quality or popularity, so why bother?

    • @richoffks · 5 months ago +2

      @@chriscarr9852 we're watching the end of Google smh

    • @michaelcalmeyerhentschel8304 · 5 months ago +7

      No, Left. They are all one viewpoint at Google and have been so for decades. The PR folks represent the programmers and their programmer-managers and Sr. management.

  • @natecote1058 · 5 months ago +17

    If Google keeps messing around with censored, under-performing open-source models, they'll get left in the dust. Mistral could end up way ahead of them in the next few months. They should find that embarrassing...

  • @NOTNOTJON · 5 months ago +12

    Plot twist, Google was so far behind the AI race that they had to ask Llama or GPT 4 to create a model from scratch and this is what they named Gemini / Gemma.

    • @tteokl · 5 months ago

      Google is so far behind these days. I love Google's design language, though, but their tech? Meh.

  • @deflagg · 5 months ago +66

    Gemini Advanced is bad too, compared to GPT-4. Gemini sometimes answers in a different language, is too cautious, and gets things wrong a lot of the time.

    • @CruelCrusader90 · 5 months ago +11

      "too cautious" is an understatement.

    • @veqv · 5 months ago +5

      @@CruelCrusader90 Genuinely. If it's not a question about software development there's a wildly high chance that it'll start quizzing you on why you have the right to know things. I do hobby electronics and wanted to see how it would fare on helping make a charging circuit. It basically refused. Same is true for rectifiers. Too dangerous for me apparently lol. Ask it questions on infosec and it'll answer fine though. It's wild.

    • @richoffks · 5 months ago

      @@veqv lmao it refused. All anyone has to do is release a completely uncensored model and they'd literally take over the industry from their house. I don't know why Google is such a fail at every product launch.

    • @CruelCrusader90 · 5 months ago

      @@veqv Yeah, I had a similar experience. I asked it to generate top, front, and side views of a vehicle chassis to create a 3D model in Blender (for a project I'm working on). It said the same thing: it's too dangerous to generate the image.
      I didn't expect it to make a good/consistent vehicle chassis across all the angles, but I was curious to see how far it was from making that possible, and I don't even know how to gauge its potential with that kind of developer behind its programming.
      Even a one would represent progress at its slowest, but that would be generous.

    • @Ferreira019760 · 5 months ago

      Bad doesn't begin to cut it. At this rate, Google will become irrelevant in most of its services. It makes no difference how much money they have; their policy is wrong and the AI models show it. They are so scared of offending someone or being made liable that their AI actually dictates what happens in interactions with the users. That doesn't just make it annoying and wasteful of time, it means that it cannot learn. Even worse than not learning, it's becoming dumber by the day. I cannot believe I'm saying this, but I miss Bard. Gemini doesn't cut it in any way, shape, or form. It's probably good for philosophy exercises, but so far I don't see any decent use for it aside from that. Give it enough space to go off on wild tangents and you may get a potentially interesting conversation, but don't expect anything productive from it. I'm done with trying out Google's crap for some time. Maybe in a month or two I will allow myself the luxury of wasting time again to see how they are doing, but not for now. Their free trial is costing me money, that's how bad it is.

  • @jbo8540 · 5 months ago +27

    Google set the entire OS community back a half hour with this troll release. well played google

    • @romantroman6270 · 5 months ago +2

      Don't worry, Llama 3 will set the Open Source community 31 minutes ahead lol

  • @mathematicalninja2756 · 5 months ago +20

    On a bright side, we have a top end model to generate reject responses in the DPO

    • @user-qr4jf4tv2x · 5 months ago +8

      can we not have acronyms 😭

    • @Alice_Fumo · 5 months ago +9

      @@user-qr4jf4tv2x I believe DPO in this context stands for "Direct Preference Optimization", which is a recent alternative to RLHF with fewer steps, and thus more efficient.
      I'm actually not 100% sure, but I believe the joke here is that if you try employing this model to "align" any other base model via DPO, what you get is another model which only ever refuses to respond to anything.
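For readers asking about the acronym: DPO's pairwise loss can be sketched in a few lines. This is an illustrative pure-Python rendering of the published formula, not anything from the video; the argument names are my own.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under
    the policy being trained (logp_*) or the frozen reference (ref_logp_*).
    """
    # Implicit reward margin between the chosen and rejected response
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Negative log-sigmoid: the loss shrinks as the margin grows
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A larger margin in favour of the chosen response gives a lower loss
assert dpo_loss(-5.0, -9.0, -6.0, -6.0) < dpo_loss(-7.0, -7.0, -6.0, -6.0)
```

The joke above translates to: if the "chosen" examples all come from a model that refuses everything, the optimization dutifully pushes the student model toward refusals.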

  • @AIRadar-mc4jx · 5 months ago +35

    Hey Matthew, it's not an open-source model, because they are not releasing the source code. It's an open-weight model, or just an open model.

    • @PMX · 5 months ago +3

      But... they did? At least for inference: they uploaded both Python and C++ implementations of the inference engine for Gemma to GitHub. Which I suspect has bugs, since I can't otherwise understand how they could release a model that performs this poorly.

    • @judedavis92 · 5 months ago +2

      Yeah they did release code.

  • @trsd8640 · 5 months ago +37

    This shows one thing: we need other kinds of benchmarks.
    But great video Matthew, thanks!

    • @MM3Soapgoblin · 5 months ago +7

      Deepmind has done some pretty amazing work in the machine learning space. My bet is that they created a fantastic model and that's what was benchmarked. Then the Google execs came along and "fixed" the model for "safety" and this is the result.

    • @R0cky0 · 5 months ago +1

      Let's call it Matthew Benchmark

    • @R0cky0 · 5 months ago

      @@MM3Soapgoblin DeepMind should spin off from Google. It's a shame that they still run under today's Google, given their amazing work in the past.

  • @sandeepghael763 · 5 months ago +12

    @matthew Berman I think something is wrong with your test setup. I tested the `python 1 to 100` example with Gemma 7B via Ollama, 4-bit quantized version (running on CPU), and the model did just fine. Check your prompt template or other setup config.
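For reference, the `python 1 to 100` test prompt mentioned here has a canonical few-line answer, which is why commenters treat it as a floor for any 7B model:

```python
def one_to_hundred():
    """Return the numbers 1..100 - the expected output of the test prompt."""
    return list(range(1, 101))

# The one-liner a model is expected to produce for this prompt
for n in one_to_hundred():
    print(n)
```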

    • @hidroman1993 · 4 months ago +1

      He was already recording, so he didn't want to check the setup LOL

  • @mistersunday_ · 5 months ago +33

    Until Google spends less time on woke and more time on work, I'm not touching any of their products with a 10-foot pole

    • @Alistone4512 · 5 months ago +3

      - by a person on YouTube :P

    • @StriderAngel496 · 5 months ago

      truuuu but you know what he meant lol @@Alistone4512

  • @f4ture · 5 months ago +40

    Google’s NEW Open-Source Model Is so BAD... It SHOCKED The ENTIRE Industry!

  • @Greenthum6 · 5 months ago +22

    I was absolutely paralyzed by the performance of this model.

    • @Wanderer2035 · 5 months ago +2

      Me: I send Pikachu GO! Use STUN attack on Greenthum6 NOW!
      Pikachu: Pika Pika Pika!!! BBBZZZZZZZZZ ⚡️⚡️⚡️⚡️⚡️
      Me: Greenthum6 seems to be in some form of paralysis. Quick Pikachu follow that up with a STUN attack on Greenthum6 NOW! Give him everything you got!!!
      Pikachu: PIKA…. PIKAAAAAAAAAAA……. CHUUUUUUUUUUUUUUUU!!!!!!!
      BBBBBBBBBBZZZZZZZZZZZZZZZZ ⚡️⚡️⚡️⚡️⚡️⚡️⚡️⚡️
      Greenthum6 = ☠️ ☠️☠️
      Me: Aaaahhh that was nice, I’m sure Greenthum6 will make a nice pokimon to my collection 🙂. **I throw my pokiball to Greenthum6 and it captures him as my new pokimon to my collection**

  • @snowhan7006 · 5 months ago +24

    This looks like a hastily completed homework assignment by a student to meet the deadline

    • @shujin6600 · 5 months ago +3

      and that student was highly political and easily offended by anything

  • @antigravityinc · 5 months ago +4

    It’s like asking an undercover alien to explain normal Earth things. No.

  • @Nik.leonard · 5 months ago +7

    At the moment there are a couple of issues with quantization and running the model in llama.cpp (LM Studio uses llama.cpp as its backend), so when the issues are fixed I'm going to re-test the model. It's weird that the 2b model gets better responses than the "7b" model (which is really more like 8-point-something billion parameters).

  • @protovici1476 · 5 months ago +7

    I'm wondering if this is technically half open-sourced given some critical components aren't available from Google.

  • @Random_person_07 · 5 months ago +3

    The thing about Gemini is it has the memory of a goldfish: it can barely hold on to any context, and you always have to tell it what it's supposed to write.

  • @BTFranklin · 5 months ago +8

    Could you try lowering the temperature? The answers when you were running it locally look a lot like what I'd expect if the temp was set too high.

  • @hawa7264 · 5 months ago +7

    The 2B-Version of Gemma is quite good for a 2b model actually. The 7b model is - a car crash.

    • @frobinator · 5 months ago

      I found the same, the 2B model is much better than the 7B for my set of tasks.

  • @pixels7223 · 5 months ago +3

    I like that you tried it on Hugging Face, cause now I can say with certainty: "Google, why?"

  • @robertheinrich2994 · 5 months ago

    Just to ask: how do I get the latest version for Linux, when it is just updated for Windows and Mac but not Linux?
    Does LM Studio work with Wine?

  • @DeSinc · 5 months ago +1

    Looking at those misspellings and odd symbols all through the code examples, it's clear something is mis-tuned in the params: whatever UI you're using hasn't been updated to support this new model. Apparently the interface I was using has corrected this, as I was able to get coherent text with no misspellings, but I did see people online saying they were having the same trouble as you: incoherent text and obvious mistakes everywhere. It's likely the parameters must be updated to values the model works best with.

  • @dbzkidkev2 · 5 months ago

    It's kinda bad, right? I tested it and found it just kept talking; they are using a weird prompt format, and it just keeps talking.

  • @puremintsoftware · 5 months ago +2

    Imagine if Ed Sheeran released that video of DJ Khaled hitting an acoustic guitar, and said "This is my latest Open Source song". Yep. That's this.

  • @michaelrichey8516 · 5 months ago +1

    Yeah - I was running this yesterday and ran into the same things - as well as the censorship, where it decided that my "I slit a sheet" tongue twister was about self-harm and refused to give an analysis.

  • @nadinejammet7683 · 5 months ago +2

    I think you didn't use the right prompt format. It's a mistake a lot of people make with open-source LLMs.
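For anyone re-running the tests: Gemma's instruction-tuned weights expect explicit turn markers (per the model card Google published at release; the helper function here is illustrative, not an official API):

```python
def gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma's expected turn markers.

    Format taken from the Gemma model card. There is no system role,
    so any instructions must go inside the user turn.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(gemma_prompt("Write Python code to print 1 to 100."))
```

Feeding the raw question without these markers is one plausible reason a run produces gibberish.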

  • @TylerHall56 · 5 months ago +1

    The settings on Kaggle may help. Its widget uses the following settings: Temperature: 0.4, Max output tokens: 128, Top-K: 5.
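Those knobs correspond to standard sampling code. A minimal pure-Python sketch of temperature plus top-K sampling (illustrative only, not the actual widget implementation):

```python
import math
import random

def sample_top_k(logits, k=5, temperature=0.4, rng=random):
    """Sample a token index using temperature scaling and top-K filtering."""
    # Keep only the K highest-scoring token indices
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature < 1 sharpens the distribution toward the top tokens
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]  # numerically stable softmax
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(top, weights=probs)[0]
```

With k=1 this degenerates to greedy decoding, which is a quick way to check whether a model's gibberish comes from sampling or from the weights themselves.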

  • @himeshpunj6582 · 5 months ago +4

    Please do fine-tuning based on private data

  • @PoorNeighbor · 5 months ago +9

    That was actually really funny. The answers are so out of the blue Mannn

  • @agxxxi · 5 months ago +2

    It apparently has a very different prompt template. You should definitely try that (13:26), but the model is still kinda huge and unsatisfactory for this demo 😮

  • @alexis-michelmugabushaka2297 · 5 months ago +1

    Hi Matthew, thanks for testing. I just posted a comment about a test I did using your questions, showing different results from yours when not using the GGUF (I included a link to a gist). Was my comment deleted because it contains a link? Happy to resend you the link to the gist. P.S.: actually, even the 2b model gives decent answers to your questions.

    • @alexis-michelmugabushaka2297 · 5 months ago +2

      I am actually disappointed that you did not address the multiple comments pointing out the flaws in your testing. I thought you would retest the model and set the record straight.

  • @oriyonay8825 · 5 months ago +3

    Each parameter is just a floating-point number (assuming no quantization), which takes 4 bytes. So 7b parameters is roughly 7b * 4 bytes = 28 GB, so 34 GB is not that surprising :)
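That arithmetic in code form (decimal GB, treating the model as a dense block of weights; any extra file size beyond this estimate would come from things like embeddings and metadata, which is my assumption, not something from the thread):

```python
def model_size_gb(n_params: float, bits_per_param: int = 32) -> float:
    """Approximate in-memory size of a dense model's weights, in decimal GB."""
    return n_params * bits_per_param / 8 / 1e9

# 7B parameters at fp32 (4 bytes each)
assert model_size_gb(7e9, 32) == 28.0
# The same weights quantized to 4 bits would be 8x smaller
assert model_size_gb(7e9, 4) == 3.5
```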

  • @davelundie2866 · 5 months ago +1

    FYI this model is available on Ollama (0.1.26) without the hoops to jump through. One more thing: they also have the quantized versions. I found the 7B (fp16) model bad, as you say, but for some reason was much happier with the 2B (q4) model.

  • @michaelslattery3050 · 5 months ago

    This video needs a laugh track and some quirky theme music between sections. I was LOLing and even slapped my knee once.
    Once again, another great video. This is my fav AI channel.

  • @MattJonesYT · 5 months ago +2

    Have you noticed that chatgpt4 is very bad in the last few days? Like it can't remember more than about 5 messages in the conversation and it constantly says things like "I can't help you with that" on random topics that have nothing to do with politics or anything sensitive. It's like they've got the guardrails dialed to randomly clamp down to a millimeter and it can't do anything useful half the time. I have to restart the conversation to get it to continue.

    • @blisphul8084 · 5 months ago +1

      They switched to gpt4 turbo. The old gpt4 via API is better

  • @phrozen755 · 5 months ago +9

    Yikes google! 😬

  • @fabiankliebhan · 5 months ago +1

    I think there were problems with the model files. The ollama version also had problems but they apparently fixed it now.

  • @erikjohnson9112 · 5 months ago +2

    Maybe it was a spelling error by Google: "State of the fart AI model". Yeah, this model stinks. Yes, I am exhibiting a 14-year-old intellect.

    • @AlexanderBukh · 5 months ago +1

      State of brain fart it is.

  • @Nik.leonard · 5 months ago

    I recently tested gemma:7b with Ollama 0.1.27 and now the model doesn't respond with gibberish. The only different behavior I noticed compared with other llama-based models is that it tends to output more markup. As I said before, I don't know who quantized the model used by Ollama (it was not TheBloke), and llama.cpp had a lot of commits this past week addressing issues with quantization and inference, so maybe the model should be retested.

  • @FlyinEye · 5 months ago

    Thanks for the great channel. I never miss any of your videos; I started back when you did the Microsoft AutoGen agents.

  • @adtiamzon3663 · 5 months ago

    Interesting assessment. 🤫 I still have to see for myself these Generative AI model apps. 🤔 Keep going, Matt. 🌹🌞

  • @gerritpas5553 · 5 months ago +24

    I've found the trick with models like Gemma, when you add this system prompt it gives more accurate results. THE SYSTEM PROMPT: "Answer questions in the most correct way possible. Question your answers until you are sure it is absolutely correct. You gain 10 points by giving the most correct answers and lose 5 points if you get it wrong."

    • @h.hdr4563 · 5 months ago +9

      At this point just use GPT 3.5 or Mixtral why bother with their idiotic model

    • @RoadTo19 · 5 months ago

      @@h.hdr4563 Techniques such as that can help improve responses from any LLM.

    • @mickelodiansurname9578 · 5 months ago +4

      have you seen the 26 principles of prompt engineering paper... ?? Its very interesting... works across LLM's too... although the better the LLM I think the less of an improvement there is, compared to the base model without a system message.

    • @catjamface · 5 months ago +2

      Gemma wasn't trained with any system prompt role.

    • @MilkGlue-xg5vj · 5 months ago

      Do you understand that it's a 7b model and not a 180b one? @@h.hdr4563
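Since the thread notes Gemma was trained without a system role, the usual workaround is to fold the system prompt into the first user turn. A sketch using the generic role/content message format (the function name is my own; this is illustrative, not an official API):

```python
def fold_system_prompt(system_prompt, messages):
    """Prepend a system prompt to the first user message.

    For models (like Gemma) trained without a system role, a common
    workaround is to merge the instructions into the first user turn.
    """
    merged = [dict(m) for m in messages]  # don't mutate the caller's list
    for m in merged:
        if m["role"] == "user":
            m["content"] = f"{system_prompt}\n\n{m['content']}"
            break
    return merged

chat = fold_system_prompt(
    "Answer questions in the most correct way possible.",
    [{"role": "user", "content": "How many days are in a leap year?"}],
)
```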

  • @baheth3elmy16 · 5 months ago

    Thanks! The massive size of the 7B GGUF was a put-off to start with. I am surprised it performed that badly.

    • @psiikavi · 5 months ago

      You should use the quantized versions. I doubt there's much difference in quality between 32-bit and 8-bit (or even 4-bit).

  • @chrisbranch8022 · 5 months ago +1

    Google is having its Blockbuster Video moment - this is embarrassingly bad

  • @peterwan小P · 5 months ago +3

    Could it be a problem with the temperature settings?

    • @adamrak7560 · 5 months ago

      the spelling mistakes seem to imply that. Maybe it can only work at low temperature.

  • @JustSuds · 5 months ago +1

    I love how shocked you are in the opening clip

  • @VincentVonDudler · 5 months ago

    The safeguards of not just Google but most of these corporate models are ridiculous; history will look back on them quite unfavorably, as unnecessary garbage and a significant hindrance to people attempting to work creatively.
    16:00 - JFC ...this model is just horrible.
    20:25 - "...the worst model I've ever tested." Crazy - why would Google release this?!

  • @ajaypranav1390 · 5 months ago +1

    The size is because of the lack of quantization; the same model quantized to 8-bit is much smaller.

  • @mattmaas5790 · 5 months ago

    Thanks for covering this! Was impressed with gemini, but had no information about their open source models!

  • @yogiwp_ · 5 months ago +1

    Instead of Artificial Intelligence we got Genuine Stupidity

  • @zerorusher · 5 months ago +3

    Google STUNS Gemma SHOCKING everyone

  • @sbaudry · 5 months ago +1

    Is it not supposed to be a base for fine-tuning?

  • @yagoa · 5 months ago

    May I ask why you are still not using Ollama?

  • @TechRenamed · 5 months ago +1

    Where did you download it, and how? Btw, I made a video about this yesterday; if you'd like, you can see it.

  • @liberty-matrix · 5 months ago +2

    "AI will probably most likely lead to the end of the world, but in the meantime, there will be great companies." ~Sam Altman, CEO of OpenAI

  • @spleck615 · 5 months ago

    Open or open weights, not open source. Can’t inspect the code, rebuild it from scratch, validate the security, or submit pull requests for improvements. You can fine tune it but that’s more like making a mod or wrapper for a binary app than modifying source.

  • @AhmedEssam_eramax · 5 months ago

    The GGUF of this model has issues, and llama.cpp has two PRs to fix it. Unfortunately, your feedback is based on corrupted files.

  • @danimal999 · 5 months ago

    I tried it as well on ollama and was completely underwhelmed. It had typos, it had punctuation issues. In my very first prompt which was simply, “hey”. Then when I said it looks like you have some typos, it responded by saying it was correcting *my* text, and then added several more typos and nonsense words to its “corrected text”. I don’t know what’s going on with it, but I wouldn’t trust this to do anything at all. How embarrassing for Google.

  • @drayg0n806 · 5 months ago

    0:04 Absolutely! This is the beauty of diversity in the mathematical world. While 4+4 equals 8, the operands being 4 doesn't mean their identity cannot also be 40. Y'all have to respect the diversity.

  • @guillaumepoggiaspalla5702 · 5 months ago +2

    Hi, it seems that Gemma doesn't like repetition penalty at all. In your settings you should set it to 1 (off). In LM Studio, Gemma is a lot better that way; otherwise it's practically braindead.
    And about the size of the model: it's an uncompressed GGUF. GGUF is a format, but it can contain all sorts of quantization. 32 GB is the size of the uncompressed 32-bit model; that's why it's big and slow. There are quantizations now, even with importance matrices.
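For context on that setting: repetition penalty rescales the logits of tokens that have already appeared, and a value of 1 disables it. A sketch of the common CTRL-style rule (implementations vary; this is illustrative, not LM Studio's actual code):

```python
def apply_repetition_penalty(logits, previous_tokens, penalty=1.1):
    """Penalize logits of already-generated tokens (CTRL-style rule).

    penalty == 1.0 leaves the logits untouched, which is what
    "setting it to 1 (off)" means in the comment above.
    """
    out = list(logits)
    for t in set(previous_tokens):
        # Dividing positive logits / multiplying negative ones both
        # push the repeated token's probability down
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out
```

If a model's tokenizer or template is already misbehaving, an aggressive penalty can compound the damage, which matches the "practically braindead" observation.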

  • @DoctorMandible · 5 months ago

    Why does it have to understand the context of "dangerous"? Why does the model need to be censored? What children are running LLM's on their desktop computers?? What are we even talking about? Is nobody an adult?!

  • @robertheinrich2994 · 5 months ago

    Regarding your test with the shirts drying: offer to double the available space, and then look for responses claiming that doubling the space halves the time to dry a shirt.
    As we all know, drying a shirt on a football field just takes seconds ;-)

  • @iseverynametakenwtf1 · 5 months ago

    This episode was like a Jerry Springer show, I couldn't stop watching

  • @squetch8057 · 5 months ago

    A good question to ask the AIs you test is: "Can you give me an equation for a Möbius strip?"

  • @sitedev · 5 months ago

    Google just announced a follow up model with full transparency - they admit it’s rubbish and call it Bummer!

  • 5 months ago

    Hey Matthew, thanks for the video. I wish you had removed the LM Studio part. It is kind of misleading.

  • @AI-Tech-Stack · 5 months ago

    Could this be a mixture of experts due to the file size being so large on a gguf version?

  • @33gbm · 5 months ago +2

    The only Google AI branch I still find credible is DeepMind. I hope they don't ruin it as well.

  • @notme222 · 5 months ago +2

    The TrackingAI website by Maxim Lott measures the leaning of various LLMs and they're all pretty much what we'd call "politically left" in the US. Which ... I'm not trying to make a thing out of it. There are plenty of reasons for it that aren't conspiracy and Lott himself would be the first to say them.
    However, seeing that reddit post about "Native American women warriors on the grassy plains of Japan", I wonder if maybe it had been deliberately encouraged to promote multiculturalism in all answers regardless of context.

  • @DikHi-fk1ol · 5 months ago +2

    Didn't really expect that...

  • @RhythmBoy · 5 months ago

    What I find hilarious about Google is that while using Gemini on the web, Google gives you the option to "double check" the responses with Google Search. So, why can't Gemini check itself against Google Search?? It's right there. I think Google is so scared of releasing AI into the wild they're not even trying, and in a way they're right.

  • @deltaxcd · 5 months ago

    When doing those tests, try generating responses a few times and look at how different they are. Unless you use a temperature of zero, your tests are a plain gamble.

  • @Zale370 · 5 months ago +2

    like other people pointed out, the model needs to be fine tuned for better outputs

    • @darshank8748 · 5 months ago

      He seems to expect a 7B model to compete with GPT4 out of the box

    • @Garbhj · 5 months ago

      @@darshank8748 No, but it should at least compete with Llama 2 7b, as was claimed by Google.
      As we can see here, it does not.

  • @HistoryIsAbsurd · 5 months ago +8

    See, when you save the word SHOCKING for when it's actually SHOCKING, it's WAY more impactful and doesn't sound like you're spitting in the face of your community.
    Great video! Their half-open-sourced LLM is hilariously bad.

  • @icegiant1000 · 5 months ago +2

    Gemma... it says so in the name: it's Gemini without the "i" part... intelligence.

  • @JonathanStory · 5 months ago

    Thanks for putting yourself through this experience (so we don't have to!) I wonder if this is Google's Bud Light moment.

  • @heiroPhantom · 5 months ago +1

    Google had to innovate on the context size. It was the only way the model could hold all the censorship prompts in its memory while responding to queries. That's also why it's so slow.
    imho 😂

  • @gingerdude1010 · 5 months ago +3

    This does not match the performance seen on hugging chat at all, you should issue a correction

  • @Hae3ro · 5 months ago +3

    Microsoft beat Google at AI

  • @riftsassassin8954 · 5 months ago

    Love the model demos! Thanks for another great vid!

  • @Murderbits · 5 months ago +1

    The killer app of just the regular $20/mo Gemini Advanced is that it has a 128k token context size instead of ChatGPT 4's like... 8k or 32k or whatever the hell it is right now.

    • @Unndecided · 5 months ago

      Have you been living under a rock?
      GPT-4 Turbo has a 128K context window.

  • @user-ty5dg7wz4j · 5 months ago

    Is the model corrupted or maybe undertrained? I've never seen an LLM repeatedly make typos.

  • @davealexander59 · 5 months ago +6

    OpenAI: "So why do you want to leave Google and come to work with our dev team?" Dev: *shows them this video*

  • @AINEET · 5 months ago +3

    You should add some politically incorrect questions to your usual ones after this week's drama

  • @ZombieJig · 5 months ago

    It's the only model that won't even run for me in ollama. It just returns some API EOF error. Ran 30 other models with no issues.

  • @doncoker · 5 months ago

    Tried one of the quantized versions last night. It was reasonably fast and got the first question right (a soup recipe). On additional questions that Mistral got right, Gemma was lost in space somewhere... back to Mistral.

  • @nufh · 5 months ago

    Hard to believe that a company with such massive resources produced this underwhelming model.

  • @NOTNOTJON · 5 months ago

    You should ask it to devise and describe a perpetual motion / energy device. The grammar and explanation style seem to fit.

  • @kostaspramatias320 · 5 months ago +1

    Google's research is not focused so much on LLMs; they produce a lot of AI research across a variety of sectors. That said, their LLMs are so far behind it is not even funny. The multimodal 10-million-token context window of Gemini Pro does look pretty good, though!

  • @clumsy_en · 5 months ago

    To declare openly that their state of the art newest model relies on the same architecture is a massive PR disaster, possibly one of the largest this year 😮‍💨

  • @jelliott3604 · 5 months ago

    Re: it thinking that "cocktail" might be a bit rude ...
    not a patch on when Scunthorpe United FC updated their message boards with a profanity blocker and started to wonder why nothing was getting posted anymore

  • @researchforumonline · 5 months ago

    Yes, I would like to see you fine-tune Gemma using a web UI fine-tuning method.

  • @cedricpirnay4289 · 5 months ago

    Great content as always! I rewatched the video and honestly I don't think you did anything wrong during the setup; the model is just bad... thank you for exposing this joke of a model.
    Also, there is a really good ultra-small open-source VLM that came out about a month ago. It's called Moondream; it only has 1.6B parameters, but it still performs way better than models twice its size on most benchmarks.
    It hasn't been converted to GGUF yet, but I think that if you make a video about it, the performance and the size will spark a lot of interest, and TheBloke might consider making quantized versions of it.