If you just add a pair of aviator sunglasses then this is a Yannic Kilcher video. Instant 100k sub upgrade. Jokes aside, this was a great explanation of a great library!
I tried to train a model with an embedding layer (vocab size 100 million, embedding dim 128) on 3 A100 80 GiB GPUs with DeepSpeed (ZeRO stage 3, offloading parameters and optimizer states to CPU), but it fails with a CUDA out-of-memory error 😢
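For a rough sense of scale, the numbers in the comment above can be turned into a back-of-the-envelope estimate. This is only a sketch: it assumes mixed-precision Adam with the usual 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for fp32 optimizer states), counts the embedding table alone, and ignores activations, buffers, and fragmentation. The variable names are illustrative and not taken from any library.

```python
# Back-of-the-envelope memory estimate for the embedding table alone.
# Assumptions (not from the original comment): mixed-precision Adam,
# 16 bytes of model state per parameter, no activations or buffers counted.

VOCAB_SIZE = 100_000_000   # from the comment
EMBED_DIM = 128            # from the comment
NUM_GPUS = 3               # from the comment
GIB = 1024 ** 3

BYTES_FP16_WEIGHT = 2
BYTES_FP32_WEIGHT = 4
BYTES_MODEL_STATE = 2 + 2 + 12  # fp16 weight + fp16 grad + fp32 Adam states


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


num_params = VOCAB_SIZE * EMBED_DIM

fp16_weights_gib = to_gib(num_params * BYTES_FP16_WEIGHT)
fp32_weights_gib = to_gib(num_params * BYTES_FP32_WEIGHT)
model_states_gib = to_gib(num_params * BYTES_MODEL_STATE)
model_states_per_gpu_gib = model_states_gib / NUM_GPUS

print(f"parameters in embedding table: {num_params:,}")
print(f"fp16 weights, unsharded:       {fp16_weights_gib:.1f} GiB")
print(f"fp32 weights, unsharded:       {fp32_weights_gib:.1f} GiB")
print(f"all model states, unsharded:   {model_states_gib:.1f} GiB")
print(f"all model states, per GPU:     {model_states_per_gpu_gib:.1f} GiB")
```

The table alone is 12.8 billion parameters. One possible factor in the failure, stated here as an assumption rather than a diagnosis: ZeRO stage 3 gathers a module's full parameters on each GPU before that module runs, so a single very large embedding may need to be materialized unsharded on every device, regardless of offloading.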
An A100 GPU costs 30k USD; is all this offloading just theoretical nonsense? Where are the apps that let you run actual Llama 3.1 on one or two 3090s, offloading unused stuff to an NVMe SSD?
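On the NVMe question: DeepSpeed's ZeRO stage 3 config does accept `nvme` as an offload device. Below is a minimal sketch of such a config, written as a Python dict and dumped to JSON. The path `/local_nvme` and the batch size are placeholder assumptions, and whether this makes a given Llama 3.1 size practical on one or two 3090s is not something this sketch claims.

```python
import json

# Sketch of a DeepSpeed config with ZeRO stage 3 and NVMe offload.
# "/local_nvme" is a hypothetical mount point; replace it with a real
# directory on a fast local SSD. The batch size is also a placeholder.
NVME_PATH = "/local_nvme"

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {
        "stage": 3,
        # Parameters are moved to NVMe when not in use.
        "offload_param": {
            "device": "nvme",
            "nvme_path": NVME_PATH,
            "pin_memory": True,
        },
        # Optimizer states are kept on NVMe as well.
        "offload_optimizer": {
            "device": "nvme",
            "nvme_path": NVME_PATH,
            "pin_memory": True,
        },
    },
}

# Serialize to the JSON form that DeepSpeed reads from a config file.
config_json = json.dumps(ds_config, indent=2)
print(config_json)
```

Switching `"device"` to `"cpu"` (and dropping `nvme_path`) gives the CPU-offload variant instead.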
Thanks Mark! You have been helping me understand concepts better.
Thanks Mark! Quite a thorough and useful explanation.
Thanks Mark, great vid. Good update on SOTA in distributed training since Horovod.
Thanks for such an inspiring and insightful video. What a knowledge feast to enjoy!
Great video, Mark! A few corrections: the A100 is available in 40 GB and 80 GB variants.
Hi Mark, great vid. Could you make a video on how to fine-tune large transformer models (e.g. T5-11B) without running into CUDA errors?
Great suggestion! Yes I’ll do it
@@marksaroufim Great! There is a lot of information about fine-tuning T5-base, but not about fine-tuning models larger than T5-base.
@@adriangabriel3219 Did you ever get t5-11b working?
amazing!
Nice explanation, but how do I do this in ooba?
You're looking at RAM, not vRAM btw.
A 2080ti with 30 gigs? 🤭 If only my 4090 had that much RAM 😅