I don't know why the algorithm has neglected to show me your content for so long. It is right up my alley. I hope you keep coming up with the latest news from AI. It is the biggest thing to happen to humanity in a long time, and people still don't react to it. I like your vids. You are smart and the animations are fun. And god, the lipstick. Holy shit. 🙃😍
Nice animations.
It feels like we are going in circles: this paper (and ReLU Strikes Back) reintroduces ReLU; S6, RWKV, and RetNet reintroduce the RNN. Flip a coin on which piece of the past comes back next: residual-free models or an AI winter.
What a crazy video. I learned so much, thank you for making this!
Our pleasure!
Great explanation! Sparsity is good.
This channel is a real gem! Can we continue to expect 2 videos a month?
Btw I already love the "can we run larger models now" comments. Seems like every time there is a breakthrough to make NNs more efficient it's just used to make them bigger :D
@@Aca99100 I read this comment just after I wrote my own "can we run larger models now". LOL
@Aca99100 Yes, the more efficient, the larger because the larger, the better (so far). 😅
I try to keep up the pace, but it will be hard since I will be submitting my thesis in a few months. I am working on a MAMBA video now, but it is hard to find the time these days. :(
@@AICoffeeBreak Same here. I'm also submitting my thesis in a few months. I really appreciate how you can still keep up making quality videos.
Omg, then good luck to us both!
What topic are you working on? (So what's the title?)
I never thought that what I needed in life was Ms. Coffee Bean telling me to "sit down". Now I know.
This was a fantastic and concise explanation!! I'll read the paper in more detail; however, is this method also effective when combined with quantization? I want to run large models on reasonably priced hardware, just for inference.
Yes, it is compatible with quantization. The paper has ablations on this: "Furthermore, we show several ablations on different components of DEJAVU and its compatibility with quantization techniques."
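The reply above says DejaVu's sparsity composes with quantization. A toy sketch of why the two are orthogonal: sparsity decides *which* weight rows get used for a given input, quantization decides *how* each row is stored. The symmetric int8 scheme and the `neuron_active` flag below are illustrative assumptions, not DejaVu's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal(16).astype(np.float32)  # one weight row of a neuron

# Symmetric int8 quantization: map the largest magnitude to 127.
scale = np.abs(w).max() / 127.0
w_q = np.round(w / scale).astype(np.int8)

# Round-to-nearest keeps the error within one quantization step.
w_deq = w_q.astype(np.float32) * scale
assert np.max(np.abs(w - w_deq)) <= scale

# Sparsity composes with this: if a (hypothetical) predictor marks the
# neuron inactive for the current input, the int8 row is simply never
# dequantized or multiplied at all.
neuron_active = False  # hypothetical predictor output for this input
flops_done = 0 if not neuron_active else len(w)
print(flops_done)  # skipped rows cost nothing, quantized or not
```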
Isn't GELU already enforcing "input-dependent" sparsity?
Exactly. You do not need to watch the end of the video. 😅
Thanks!
Thank you so much! I'll go get a coffee with this money now.
IT majors in India used to recruit for general intelligence and then make it sparse in the profession, focusing on specialized, repetitive tasks rather than broad skill development.
does this mean we can run larger models on smaller gpus?
Yes, that is what it means. I do think this would be most beneficial for mobile devices: since the sparsity is input-dependent, it makes sense to use it when you only need to load an LLM-powered app to run a few prompts with it. 02:33
@@AICoffeeBreak Does that mean that with longer prompt sessions you have fewer zeros in the matrix, or something like that, so this does not work as well? Or am I missing something?
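A toy NumPy sketch of what "input-dependent sparsity lets you run on smaller hardware" means for one ReLU feed-forward layer. This is only an illustration: a real system like DejaVu predicts the active set *before* the matmul, so the inactive weights are never even loaded from memory.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MLP block: W1 projects up, W2 projects back down.
d_model, d_ff = 8, 32
W1 = rng.standard_normal((d_model, d_ff))
W2 = rng.standard_normal((d_ff, d_model))

def dense_ffn(x):
    h = np.maximum(x @ W1, 0.0)  # ReLU leaves many exact zeros
    return h @ W2

def sparse_ffn(x):
    h = np.maximum(x @ W1, 0.0)
    active = np.nonzero(h)[0]    # neurons active for THIS input
    # Only the active columns of W1 / rows of W2 contribute;
    # the zeroed neurons can be dropped without changing the output.
    return h[active] @ W2[active]

x = rng.standard_normal(d_model)
assert np.allclose(dense_ffn(x), sparse_ffn(x))
```

Note the active set changes with every input, which is why the savings depend on how sparse the activations happen to be for your prompts.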
Awesome summary! Thank you so much
Yes, when a person refers to the human brain in comparison to AI, they generally mean the collective intelligence of humanity, rather than the capabilities of an individual brain.
GPT makers could take a share-price hit.