Thank you for the great breakdown. I really like how you went back to the first paper to explain the theory underlying BitNet and then explained what was different in the new paper. In general, these kinds of advancements excite me because they could potentially make running, and in some cases training, these huge models something we mere mortals without infinite compute can actually do locally.
Oh! I've been thinking about this myself. How nice to see it realized!
Excellent review! Thanks
Does using FP8 for activations as opposed to INT8 offer a significant accuracy benefit?
I suppose an integer adder is even simpler than a floating-point adder and may save additional power
Ok I found a paper discussing these details: arxiv.org/pdf/2303.17951.pdf
I have no idea why I said FP8 during the video 😳
INT8 is used, just like you said, since FP8 doesn't offer anything over INT8
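For anyone curious what that looks like in practice, here's a minimal sketch of per-tensor absmax INT8 activation quantization, roughly what the BitNet papers describe (the function name and the eps guard are my own, not from the paper):

```python
import torch

def absmax_quantize_int8(x: torch.Tensor, eps: float = 1e-5):
    # Scale so the largest magnitude maps to 127, then round to signed 8-bit.
    scale = 127.0 / x.abs().max().clamp(min=eps)
    x_q = (x * scale).round().clamp(-128, 127).to(torch.int8)
    return x_q, scale

x = torch.randn(4, 16)
x_q, scale = absmax_quantize_int8(x)
x_dq = x_q.float() / scale  # dequantize back to float for the next op
```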
@gabrielmongaras actually I have seen a paper that reports a stability benefit of FP8 over INT8 for LLMs during training once they scale beyond a certain size.
@cbuchner1 That sounds really interesting. Can you please send the paper? I wonder whether other FP8 formats would work better, similar to how BFLOAT16 is usually better than FP16.
@gabrielmongaras Chapter 4.6 here: arxiv.org/pdf/2303.17951.pdf But I misremembered insofar as they state it works better for Transformers (not limited to very large ones) and that there are ways to also make it work well with INT8
I will have to keep looking for papers that talk about comparing INT8/FP8 in training GPTs
Can't wait for 0.5-bit models!
According to the calculation, log2(0.5) = -1, so does that mean you need a base -1 number system?
😂😂😂
Analog computing
very cool
This paper will add a ceiling price to Nvidia stock
Excellent explanation!! Can you please make a video on speculative streaming?
Will this work for CNNs or only LLMs? Does the training still need to use GPUs, or is this only an inference benefit? I don't see any examples or data other than this paper.
Does this technology make Nvidia's tech and NPUs obsolete?
Noob question… so binary/ternary quantization has been around for a while… which part was the major innovation/discovery in the BitNet paper?
Mainly that a model can be trained from scratch using binary weights and still be competitive in terms of perplexity and accuracy.
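In case it helps to see why training from scratch works at all: the usual trick is to keep full-precision latent weights and binarize them only in the forward pass, with a straight-through estimator so gradients still flow to the latent weights. A minimal sketch (not the authors' code; the helper name is mine):

```python
import torch

def binarize_ste(w: torch.Tensor):
    # Forward: sign of the mean-centered weights times a per-tensor scale,
    # roughly as in the first BitNet paper. Backward: identity (detach trick).
    alpha = w.mean()
    beta = w.abs().mean()                 # per-tensor scaling factor
    w_bin = torch.sign(w - alpha) * beta  # values in {-beta, +beta}
    return w + (w_bin - w).detach()       # straight-through estimator
```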
Technically impressive that it's possible, however I only see limited application for this. Practically, most models below 8-bit quantization are way less "aware" of input and context. If I alter a situation, an 8-bit model can adjust its output accordingly; any model below that is very rigid. That being said, maybe you can mitigate those effects when you train them at that quantization to begin with, instead of compressing a model that was trained at higher precision, if that makes sense...
we need the code
I found an implementation already on pip under the name "bitnet". From looking at the code, they fully implemented the first paper and are now making the changes to implement the second paper. They even have a bitnet version of llama (bit_llama) in the repo. They also have a function where you can replace all of the linear layers in a model with bitlinear layers.
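Nice find. I haven't checked that package's exact API, so the class and function names below are just illustrative, but the layer-swap idea presumably looks something like this (BitNet b1.58-style ternary weights quantized on the fly in the forward pass):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinearSketch(nn.Linear):
    # Illustrative stand-in, NOT the pip package's BitLinear: ternarizes the
    # weights to {-1, 0, +1} * scale on the fly, BitNet b1.58 style.
    def forward(self, x):
        s = self.weight.abs().mean().clamp(min=1e-5)
        w_q = (self.weight / s).round().clamp(-1, 1) * s
        w = self.weight + (w_q - self.weight).detach()  # straight-through
        return F.linear(x, w, self.bias)

def replace_linears(module: nn.Module):
    # Recursively swap every nn.Linear for the sketch layer, reusing weights.
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            new = BitLinearSketch(child.in_features, child.out_features,
                                  bias=child.bias is not None)
            new.weight = child.weight
            if child.bias is not None:
                new.bias = child.bias
            setattr(module, name, new)
        else:
            replace_linears(child)
```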
Hi bro, could you recommend me a large model which can provide a girlfriend API, or that I can fine-tune to be a girlfriend model? I need an uncensored model for a girlfriend role (not NSFW, just a warm girlfriend; when you ask her "can you be my girlfriend", the model won't reply "I am an AI model", that is so annoying), or any other way to solve the problem. I watched your video "Talking to girlfriend", but I worried that the model you mentioned might be outdated. I am looking forward to your reply. Thank you!
I don't think LLMs should be used for this task. You can interact with them, that's OK, but a "girlfriend" is something between humans. You shouldn't make money on the backs of lonely people, and even if it's free, it's unhealthy to form a forced relationship with a machine (apart from being morally and ethically questionable).