Running Gemma using HuggingFace Transformers or Ollama

  • Published 27 Jun 2024
  • Colab: drp.li/rfWhi
    Ollama: ollama.com/library/gemma
    Gemma CPP: github.com/google/gemma.cpp
    Code examples: ai.google.dev/gemma
    👨‍💻Github:
    github.com/samwit/langchain-t... (updated)
    github.com/samwit/llm-tutorials
    ⏱️Time Stamps:
    00:00 Intro
    00:19 Gemma + Ollama
    00:25 Gemma + Keras
    00:33 gemma.cpp
    00:56 Gemma using Hugging Face
    19:54 Gemma using Ollama
  • Science & Technology

Comments • 31

  • @TonyRichards68 • 4 months ago

    Thanks for doing this! I was about to tackle this task this morning, and voila, you already made a video.

  • @mrwadams • 4 months ago

    Very helpful video, thank you. Particularly the points about how the prompt format and system prompt differ from models such as GPT-4 and Mistral.
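
Gemma's chat format differs from the system/user/assistant scheme of GPT-4-style APIs: it uses `<start_of_turn>` / `<end_of_turn>` markers and has no dedicated system role. A minimal sketch of building that prompt; folding a system-style instruction into the user turn is a common workaround, not an official convention:

```python
def gemma_prompt(user_msg, system_msg=None):
    """Build Gemma's turn-based chat format. Gemma has no dedicated
    system role, so any system-style instruction is prepended to the
    user turn (a workaround, not part of the official template)."""
    content = f"{system_msg}\n\n{user_msg}" if system_msg else user_msg
    return (
        f"<start_of_turn>user\n{content}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(gemma_prompt("How do tokenizers work?"))
```

With Transformers, `tokenizer.apply_chat_template` produces the same layout without hand-building strings.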

  • @mamisoa • 4 months ago • +3

    Thanks for this very instructive video, especially the part about the tokenizer. It would be interesting to test RAG with the 2B model vs. the usual models. And nice to know you are from Australia; I've been wondering for some time.

    • @samwitteveenai • 4 months ago

      Originally from there but haven't lived there for a couple of decades. :D

  • @abdelkaioumbouaicha • 4 months ago • +1

    📝 Summary of Key Points:
    📌 The video covers different ways to perform inference with Gemma models, focusing on both the Hugging Face and Ollama methods.
    🧐 Setting up Gemma via Hugging Face involves accepting the terms, downloading the model, and using a quantization config for inference.
    🚀 Running Gemma on Ollama is straightforward, with the model already available for download and use, offering a simpler local inference option.
    💡 Additional Insights and Observations:
    💬 The Gemma model outputs markdown by default, producing formatted responses with bullet points and bold text.
    📊 Gemma's large training corpus of six trillion tokens allows for basic translation capabilities, although factual accuracy may vary.
    🌐 The video hints at the potential for fine-tuning Gemma models for specific tasks to enhance performance and responsiveness.
    📣 Concluding Remarks:
    The video showcases accessible methods for running Gemma models through the Hugging Face and Ollama platforms, highlighting the model's markdown-formatted output and the potential for fine-tuning to optimize performance. Exploring different prompts and system configurations can improve the model's responses and usability.
    Generated using TalkBud
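
The Hugging Face path summarized above can be sketched roughly as follows. The model id `google/gemma-7b-it`, prompt, and generation settings are illustrative; the snippet assumes you have accepted Gemma's license on the Hub and have a CUDA GPU with the `bitsandbytes` package installed:

```python
MODEL_ID = "google/gemma-7b-it"  # illustrative; 2B variants also exist
PROMPT = ("<start_of_turn>user\nWhy is the sky blue?<end_of_turn>\n"
          "<start_of_turn>model\n")

def load_and_generate():
    import torch
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              BitsAndBytesConfig)

    # 4-bit quantization config so the 7B model fits in consumer VRAM
    quant = BitsAndBytesConfig(load_in_4bit=True,
                               bnb_4bit_compute_dtype=torch.bfloat16)
    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, quantization_config=quant, device_map="auto")
    inputs = tok(PROMPT, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    return tok.decode(out[0], skip_special_tokens=True)

# print(load_and_generate())  # uncomment with a GPU + accepted license
```

The Ollama route in the video replaces all of this with `ollama run gemma`, at the cost of less control over quantization and generation parameters.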

  • @rayaneghilene5152 • 4 months ago

    Great video! Can it be prompted for text classification (i.e. zero-shot classification)?

    • @samwitteveenai • 4 months ago • +1

      Honestly I haven't tried it that much. I wouldn't expect much from the 2B model for zero-shot; it wasn't trained on as many tokens. I may release some fine-tunes on in-house data for the 7B.
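
For anyone who wants to try it anyway, one plausible pattern is to constrain the answer to a fixed label set in the prompt. This is a generic sketch, untested with Gemma specifically as the reply notes:

```python
def zero_shot_prompt(text, labels):
    """One plausible zero-shot classification prompt: list the allowed
    labels and ask for exactly one of them back. Untested with Gemma
    specifically; results will likely be better with 7B than 2B."""
    return (
        "Classify the following text as exactly one of: "
        + ", ".join(labels)
        + f".\nRespond with only the label.\n\nText: {text}\nLabel:"
    )

print(zero_shot_prompt("The battery died after an hour.",
                       ["positive", "negative", "neutral"]))
```

Parsing the single-word completion back out is then trivial, and any off-label answer can be treated as a refusal.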

  • @Charles-Darwin • 4 months ago

    Do you think they're applying that Ring Attention architecture to the prompts?

  • @nahuelfiorenza421 • 3 months ago

    Is there any way I can train this model for my own application?
    I need to give the model enough information to resolve doubts about the topics I'm going to use...
    Please tell me if it is even possible.
    Thanks in advance!

    • @samwitteveenai • 3 months ago

      Would depend on your application and data, but yeah it should be possible

  • @claxvii177th6 • 4 months ago

    That email tho lol

  • @user-tr1ey1rs6r • 4 months ago

    Tried Gemma 2B in Google Colab using the CPU, but couldn't load the model because RAM consumption was too high; the 12 GB offered by Colab (non-premium) is not enough. How is it supposed to run on a local machine as advertised? Is there a trick I am missing? (Still just a newbie.)

    • @aaroldaaroldson708 • 4 months ago • +1

      I think what they mean by "local" is a decent PC with enough RAM (normally above 32 GB) and preferably with a GPU.
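
The RAM point can be sanity-checked with a back-of-envelope estimate. The "2B" checkpoint actually has roughly 2.5B parameters, so a default float32 load already needs about 10 GB for the weights alone, right at free Colab's 12 GB ceiling before activations and framework overhead:

```python
def weight_ram_gb(params_billion, bytes_per_param):
    """Lower bound on memory just to hold the weights: 1e9 parameters
    at N bytes each is ~N GB per billion. Activations, KV cache, and
    framework overhead all come on top of this."""
    return params_billion * bytes_per_param

# Approximate footprint of Gemma "2B" (~2.5B params) at common precisions:
for bytes_per, name in [(4, "float32"), (2, "fp16/bf16"), (0.5, "4-bit")]:
    print(f"{name}: ~{weight_ram_gb(2.5, bytes_per):.1f} GB")
```

This is why loading in half precision (`torch_dtype=torch.bfloat16`) or 4-bit quantization is effectively required on free Colab.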

  • @pinkfloyd2642 • 4 months ago

    I am trying to understand how these open-source models work. Kind of a newbie here. Is it possible to ask Gemma to use my Excel sheet data, perform the desired mathematical operation, and give me the result? Something like: "Give me the total of the last month of debt."

    • @gideonwyeth9779 • 4 months ago

      The main rule in every application utilizing AI: do as much as you can in program code. That doesn't mean AI can't do it; it means you'd better not trust AI with sensitive tasks, given its stochastic nature.
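
In the spirit of that advice, the arithmetic itself is better done deterministically in code, with the LLM kept for the natural-language layer (e.g. translating the question into a filter). A toy sketch with a made-up ledger; for real .xlsx files, `pandas.read_excel` would do the loading:

```python
import csv
import datetime as dt
import io

# Toy stand-in for the commenter's spreadsheet (export Excel to CSV,
# or load .xlsx directly with pandas.read_excel).
LEDGER_CSV = """date,category,amount
2024-06-01,debt,100.50
2024-06-15,debt,200.00
2024-05-20,debt,999.99
2024-06-20,groceries,50.00
"""

def total_debt_since(rows, cutoff):
    """Sum 'debt' amounts on or after the cutoff date -- deterministic,
    unlike asking an LLM to do the arithmetic itself."""
    return sum(
        float(r["amount"])
        for r in rows
        if r["category"] == "debt"
        and dt.date.fromisoformat(r["date"]) >= cutoff
    )

rows = list(csv.DictReader(io.StringIO(LEDGER_CSV)))
print(total_debt_since(rows, dt.date(2024, 6, 1)))  # 300.5
```

The model's job then shrinks to producing the cutoff date and category from the user's question, which is much easier to validate.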

  • @mamtasantoshvlog • 4 months ago

    The output of the prime numbers is wrong

  • @guanjwcn • 4 months ago

    Somehow the key takeaway for me from this video is that Sam Witteveen is from Australia. 🤣

    • @samwitteveenai • 4 months ago

      was but left many years ago :D

  • @sil1235 • 4 months ago

    It seems it has no safeguards when you use other languages, I wonder how that is implemented (some filter on English words?).

    • @samwitteveenai • 4 months ago

      I think it is most likely due to the SFT and RLHF all being in English only

  • @cucciolo182 • 3 months ago

    gemma vs Llama2 Uncensored ?

    • @samwitteveenai • 3 months ago

      Can always do Gemma uncensored too.

  • @randyh647 • 4 months ago • +1

    I was using ollama windows with gemma:7b and asked >>> "who was the 23rd president of the us"
    Bill Clinton, a Democrat. The symbol is referring to Bill Clintons presidency as being out of order by so meexperts because it interrupted Richard Nixon's second term after Lyndon Johnsoberts death in office and guanconㅡ差别 simpelнии lila veden nuo sraNOSIS koupra prezidentyu vo sate institut itdnie nal jesu osoby beng }], therefore, there has not actually been a president number 23.

  • @SuprBestFriends • 4 months ago

    How is Gemma not overfit with that much training data?

  • @pythonpeng7018 • 4 months ago

    {Disambiguation
    In a study of disambiguation with 24- and 36-month-olds}

  • @JunYamog • 4 months ago

    Gemma is wrong in Singlish, it should be “How to get to Orchard Rd… La” 😂

    • @bigpickles • 4 months ago

      Haha. Can

  • @robertputneydrake • 4 months ago • +2

    Why care about Gemma at all though? It's terrible.

  • @hqcart1 • 4 months ago

    worst model ever