Dude, I keep accidentally running into your content while learning this material. The other day I was firing off weirdly specific Google searches while trying to build intuition on how self-attention works, and I found a year-old comment you wrote on Reddit that nailed what I was having trouble with. Just bought your book's MEAP; you've been doing an amazing job, keep it up!
Whoa, what a small world. Glad you are finding this useful, and thanks for picking up a copy of my book!
Can you share the self-attention Reddit link?
@joneskin1432 - Would you mind sharing which version of the MEAP you got? eBook or print? And could you share the link? Many thanks!
You are a computer engineer and still believe in accidents.
Wake up.
Dear Sebastian,
I hope you are doing well. I am writing to express my deepest gratitude for your incredible effort and dedication to teaching on the online platform. Your generosity in sharing your knowledge for free has made a profound impact on so many of us.
Your classes have been a beacon of light in these challenging times, providing not only education but also inspiration and hope. The clarity with which you explain complex topics and your unwavering patience in addressing our questions have been truly remarkable.
Thank you for your time, energy, and passion for teaching. You've made a significant difference in my learning journey, and I am immensely grateful for the knowledge and wisdom you've imparted.
Wishing you all the best in your future endeavors. 😊
Warm regards,
Hari
Thanks so much for this very kind message, Hari. This is very nice of you, and it's very motivating to hear this!
Sebastian, I want to sincerely thank you for providing such good material. I cannot express my gratitude enough! I admire your desire to share this content with such clarity and human touch! Thanks a lot!
Thanks for the kind words!
23 mins in. This is, by far, the best tutorial I have seen on building LLMs from scratch. I have followed you for a while, Sebastian, for all the great contributions you have made over the years, but you have outdone yourself once again. Well done, man, and thank you.
96 mins in. Still awesome.
@@devtest8078 Hah, thanks so much!
👏👏👏
This is a gem for me as an MSc AI student. Thank you for making this.
Your deep learning series got me through STAT 453 at UW-Madison, and now this workshop has been the perfect transition into LLMs! Great video, Sebastian!
Wow, small world, and I am glad to hear that this video was useful as well!
Just finished the book; extremely pedagogical and valuable. Great job as always, Sebastian!
Thanks for the feedback! Glad you got lots out of it!
Thank you for such an amazing book, such an invaluable source for a beginner like me!
I watched the 4-hour lecture by Karpathy and initially thought your content could hardly impress me further. However, I find myself saying "wow" reading through every single chapter of your book.
I am super glad to hear that the book was worth your while!
Mr. Sebastian, I found your channel yesterday; so grateful to you for such top-notch education.
Thank you. I recently got your book, and this stuff is invaluable. There's so much material out there, and it's not all organized in a way that's easy to digest. Your books and videos are great!
Glad to hear that the organization makes it accessible! That’s usually the trickiest part!
Thanks a lot, Sebastian! Coding from the ground up made most concepts crystal clear for me.
Nice, I am very glad to hear this!
Just finished the video; thank you very much for the detailed explanation. Next step is reading your book 🙂
@SebastianRaschka - I just bought the book (How to Build an LLM from Scratch). Thank you for all your great effort! :) I look forward to your new content soon. :)
I hope you are enjoying the book! Happy reading!
What a time to be alive haha, love your book.
Indeed, great book
Which book are we talking about here? Can anyone give me the name, please? 🙂
@@deepaksingh9318 Build a Large Language Model (From Scratch), from Manning
@@AhmedMostafa-r2u thanks ☺️
Make more videos, professor! Your knowledge is enlightening me a lot!
Super helpful. Thanks for sharing.
looking forward to more such videos on LLMs.
Keep it up!!
Sebastian, I like your deep content. We appreciate the time you put into this.
Thank you for such an awesome contribution towards democratizing LLM research.
Thank you for putting this together. One of the best talks on the technicals.
Thank you for developing the watermark Python package. I became aware of your work because of how amazing watermark was and wanted to find out what else the author is up to!
Small world 😊
I have read so many of your educational materials, and they have been so useful that I feel like you are one of my close friends.
Glad my materials are so useful that you keep coming back to them!
I have been following your blog for a very long time. I have already purchased your new LLM book, and I have also purchased your machine learning books. Please upload more content like this.
Thanks for the kind support!
Thank you very much for giving a short and sweet (I have patience for week-long workshops too :D) overview of building an LLM, pre-training it, and fine-tuning it.
Looking to explore deeper via the detailed code base of your book.
🙏
Glad this was useful! Ha, yeah, a week-long workshop would be interesting, but with a full-time job, it would be a bit tough to carve out the time to record it 😅
@@SebastianRaschka Completely agree with you. If only my job's workshops were as useful as these ones. ;)
The benefit of these videos is that even though they're hours long, I can always pause and revisit them when I have time.
@@SHAMIKII Thanks for the kind compliment!
"Thank you! I love your work, Sebastian. 😊
I hope my small token of appreciation will motivate you further to create more content like this.
By the way, I already own most of your books. My favorite is your recent one - Build a Large Language Model (from Scratch)." 📚
Wow, thanks so much for the kind support!
Great job! Just bought your book!
Thanks, happy reading and coding!
I am reading your book from Manning's library; loving it!
Thanks! Happy to hear this!
Your book was already a great read and practice.
Glad to hear that you got lots out of my book!
Amazing, Sebastian 👏 Thank you so much. I also read your book and found it insightful. Will you be making some content on how we could give the LLM a UI like ChatGPT's?
That's an interesting idea, but since I don't enjoy web development very much, I don't have any fixed plans for that yet.
Incredible! Thanks for sharing this great resource.
I think it is exactly what I was waiting for 😍
Happy coding!
Great explanation as usual. Thanks for sharing.
This is absolutely amazing. On Wisconsin!
always look forward to your content. 👍
1:22:40 You are right, Sebastian; for me it did not have the peak that you got here. BTW, thanks a lot for this tutorial and your "Introduction to Deep Learning and Generative Modeling" course as well.
Outstanding, Doc, this is wunderbar... thank you 🤙
Just finished watching the entire video. Amazing! But could you also make a video providing an in-depth understanding of tokenizers? I'm struggling with their implementation, especially when modifying the vocabulary for different languages.
I've also watched your STAT 453 lectures, which helped me understand GANs and ML models in detail. Thanks a lot. ♥
Great suggestion. I was actually doing that (extending the vocab of a tokenizer and adjusting the embedding layer and output layer of an LLM accordingly) for a little side project. Hope to find the time to put together a tutorial on that sometime.
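In case it helps in the meantime, the core of it is resizing the embedding and output layers while keeping the pretrained rows. A rough sketch (the attribute names `tok_emb` and `out_head` are placeholders for whatever your model uses; the new rows keep their default random initialization):

```python
import torch.nn as nn

def extend_vocab(model, old_vocab_size, num_new_tokens, emb_dim):
    new_size = old_vocab_size + num_new_tokens

    # Larger embedding layer; copy over the pretrained rows
    new_emb = nn.Embedding(new_size, emb_dim)
    new_emb.weight.data[:old_vocab_size] = model.tok_emb.weight.data
    model.tok_emb = new_emb

    # Larger output head; copy over the pretrained rows
    new_head = nn.Linear(emb_dim, new_size, bias=False)
    new_head.weight.data[:old_vocab_size] = model.out_head.weight.data
    model.out_head = new_head
    return model
```

A common trick after resizing is to initialize each new row to the mean of the existing embeddings so the new tokens start in a sensible region of the embedding space.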
@@SebastianRaschka Thanks for considering! Really looking forward to it.
Also check out Karpathy's 2-hour video on building a tokenizer: th-cam.com/video/zduSFxRajkE/w-d-xo.html
@@brenok Oh. Thanks a lot. I completely forgot to check Andrej's channel. Thanks for the reference.
Great video so far. I just watched the data prep portion. I am pretty interested in embedding models, so I wish you would have gone into that a bit. I understand why it was cut, though. Do you have any videos that explain that part? Thanks again!
Thank you for this❤ Such a detailed explanation!
Love your work!
Thanks for the tutorial, Sebastian! Quick question. Why is LayerNorm before attention and before feedforward instead of after attention + residual connection and feedforward + residual connection? I understand there is a final norm as well, but why before? Thanks!
Good question. There are actually different variants called Pre-LayerNorm and Post-LayerNorm. I summarized it in the section "(3) On Layer Normalization in the Transformer Architecture" here: magazine.sebastianraschka.com/p/understanding-large-language-models
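To make the difference concrete, here is a minimal PyTorch sketch of the two orderings (the `attn` and `ff` sub-modules are placeholders for any attention and feedforward implementation):

```python
import torch.nn as nn

class PreLNBlock(nn.Module):
    # GPT-2-style: normalize the *input* of each sub-layer
    def __init__(self, emb_dim, attn, ff):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(emb_dim), nn.LayerNorm(emb_dim)
        self.attn, self.ff = attn, ff

    def forward(self, x):
        x = x + self.attn(self.norm1(x))  # norm *before* attention
        x = x + self.ff(self.norm2(x))    # norm *before* feedforward
        return x

class PostLNBlock(nn.Module):
    # Original "Attention Is All You Need" ordering
    def __init__(self, emb_dim, attn, ff):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(emb_dim), nn.LayerNorm(emb_dim)
        self.attn, self.ff = attn, ff

    def forward(self, x):
        x = self.norm1(x + self.attn(x))  # norm *after* attention + residual
        x = self.norm2(x + self.ff(x))    # norm *after* feedforward + residual
        return x
```

Pre-LN (used in GPT-2 and most modern LLMs) tends to give more stable gradients early in training, which is why it usually works without the careful learning-rate warmup that Post-LN needs.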
@@SebastianRaschka Thank you so much! Another quick question, but on a different tangent. Where does one compare a new model (say I built a new kind of model, like a transformer or an RNN) and want to test/evaluate it to see how it compares with the existing benchmarks for transformers or LSTMs, so I can publish it? Is there a website where I can test this new model on some standard SOTA datasets they host? Sorry for the ill phrasing. I guess what I want to ask is: is there any website where you submit your model and they test it for you on standard NLP tasks? So all you have to do is input your model, and the output is the evaluation scores on NLP tasks, which you can then publish (if better)? Again, sorry for the long question, but I have been trying to find its answer for a while now.
@@neeravkaushal Good question, I think it can be a bit tricky to get non-standard models in there, but there's tatsu-lab.github.io/alpaca_eval/ and huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
@@SebastianRaschka Thank you so much. Very helpful. :-)
Thanks for this. Really appreciated.
Very good job. This is simple text-based model building. If there are complex mathematical equations, graphs, and tables related to articles on complex mathematical problems, how can I prepare the model?
That's a good question. It would require a lot of extra work. Probably a book (or at least a workshop) in itself. To understand the general process, I can recommend the Qwen2.5-Math report (arxiv.org/pdf/2409.12122) which outlines how the researchers took a text model (here: Qwen 2) and finetuned it for math.
Great Video, thanks for putting in the time!
Great, keep it coming; hope to use it.
Excellent video and book! Maybe a sequel about LLM inference, like KV cache and other acceleration schemes?
Yeah, this would be a good topic for another book one day…
Wow what a blessing 🎉
Hi Sebastian, this was amazing; thank you for making this video!
Quick question. I would like to build an LLM for my reading notes and blog posts. I would like to ask it questions, and the LLM should go into the dataset and find the answer.
If I were to follow these steps, would I be able to do that?
Thanks!
There would be two general approaches: (1) Finetune the model on your dataset or (2) build a RAG application around the model. RAG is a system that feeds a model with chunks from the dataset during inference. I have a brief outline here: github.com/rasbt/RAGs
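As a toy illustration of that second approach (this is a hypothetical sketch, not the code from that repo; `embed_fn` and `llm_generate` stand in for whatever embedding model and LLM you choose):

```python
import numpy as np

def retrieve(query, chunks, chunk_embs, embed_fn, k=3):
    # Rank the stored chunks by cosine similarity to the query embedding
    q = embed_fn(query)
    sims = (chunk_embs @ q) / (np.linalg.norm(chunk_embs, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(sims)[::-1][:k]]

def rag_answer(query, chunks, chunk_embs, embed_fn, llm_generate):
    # Feed the top-ranked chunks to the model as context at inference time
    context = "\n\n".join(retrieve(query, chunks, chunk_embs, embed_fn))
    prompt = (f"Answer the question using only the context below.\n\n"
              f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")
    return llm_generate(prompt)
```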
A year ago, I really wished there was a video like this! Congrats on finishing the project (book) ahead of schedule and distilling a year's work into a 3-hour video 😂
Thanks! Working on the book has been intense but also a lot of fun :). The workshop covers only like 10% (otherwise it would be 30 rather than 3 hours) but I hope it’s useful!
Thanks for this workshop. Did you finish the book or is it still under development?
I finished the last chapter a few months ago, and it's now been laid out and sent to the printer as of last week, which means the print version should be available soon :)
Thanks for creating such an amazing video!!! Just one quick question, I failed to open the Studio in the Lightning Studio. Any idea? Your response is much appreciated.
Thanks for letting me know. Was there any particular error or issue you were getting? Or, if you don’t mind, could you describe the problem in a bit more detail?
@@SebastianRaschka - Hi, thank you for your prompt reply. Kindly see the error message here. It pops up when I hit the "Open in Studio" button. Thanks in advance!
@@thehard-coder9398 Huh, that's a weird one, I will ask my colleagues to see what's up. Thanks!
@@SebastianRaschka - Thanks! I look forward to your response. :)
@@thehard-coder9398 We tried to reproduce this issue but couldn't. Could you give it another try?
Dropping heat as usual
Awesome 👏🏻
(Even though awesome is an understatement…)
Thanks!
many thanks!
I'm now 2:00 into this video and I think I'm going to enjoy it! He seems to be one of those people who have that distracting verbal tic of saying "Yeah" every 7th word, but, fortunately, his S/N ratio appears to be high, so we can forgive him...
Yeah, the free version has a lot of these
I have a question regarding the outputs of the LLM - what's the point of having the vectors of existing tokens in the output, instead of only the next token's vector? If I understand correctly, those are discarded anyway.
You use them for the next-word prediction task during training. If you have the sentence "the world is round", then this gives you 3 prediction tasks "the -> world", "the world -> is", and "the world is -> round" instead of just one prediction task "the world is -> round"
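In code, that amounts to a one-token shift between the inputs and targets; a minimal sketch (the token IDs are made up for illustration):

```python
import torch

# Made-up token IDs for "the world is round"
tokens = torch.tensor([464, 995, 318, 2835])

inputs  = tokens[:-1]  # [464, 995, 318]  ~ "the world is"
targets = tokens[1:]   # [995, 318, 2835] ~ "world is round"

# The model emits one logit vector per input position; position i is trained
# (via cross entropy) to predict targets[i], so a 4-token sentence yields
# 3 prediction tasks in a single forward pass.
```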
@@SebastianRaschka Thanks for your reply; I'm trying to understand the rationale. More prediction tasks, so this is mainly a way to increase training efficiency. But it seems to me that by doing this we're training a copy machine alongside the next-word prediction. I need to read up more on this topic. Thank you so much for the great video!
lovely. thank you
Way to go, Seb! 🖐️
awesome!
Thank You
Thanks!
Thanks for the very kind support!
@SebastianRaschka Is the print version of the book available? Amazon shows availability sometime in late October.
Yes!
There's so much happening in this field. I feel overwhelmed; I start with the basics, but the field is moving so fast, and jobs need advanced skills. How do I learn quickly and stay updated? Please advise.
Is this a companion video of your LLM Book?
Good question: yes and no. It's based on the book, but it only covers about 10%. The code notebooks have also been substantially simplified; otherwise it would be a much longer video.
Is the working of BPE covered in your book? You mentioned in the video that it is a very long topic, so I'm just asking if it's covered in the book. Thanks, however, for this video; very useful.
The book is focused on implementing the LLM, training it, finetuning it, etc. But I am planning to add bonus material on implementing BPE. I implemented the algorithm a while back; I just need some time to add explanations.
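For anyone curious in the meantime, the core of BPE training is surprisingly short; here is a toy sketch (not the book's bonus material, just the textbook algorithm on a made-up corpus):

```python
import re
from collections import Counter

def get_pair_counts(word_freqs):
    # Count adjacent symbol pairs across the corpus, weighted by word frequency
    pairs = Counter()
    for word, freq in word_freqs.items():
        symbols = word.split()
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    return pairs

def merge_pair(pair, word_freqs):
    # Replace each occurrence of the pair with its concatenation,
    # matching whole symbols only (hence the \S lookarounds)
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), w): f for w, f in word_freqs.items()}

# Toy corpus: words pre-split into characters, with frequencies
word_freqs = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
merges = []
for _ in range(10):
    pairs = get_pair_counts(word_freqs)
    best = max(pairs, key=pairs.get)          # merge the most frequent pair
    word_freqs = merge_pair(best, word_freqs)
    merges.append(best)
print(merges)  # learned merge rules, e.g. ('e', 's'), ('es', 't'), ...
```

Encoding a new word afterwards just replays the learned merges in order.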
@@SebastianRaschka Thank you, I read your other book on PyTorch and machine learning. It was very good. I will buy this one as well. Thanks!
39:20, my code is throwing an error stating there is no recognized package called supplementary.
Can anyone please help me tackle this?
Hey there. I just double-checked and the supplementary.py file seems to be present in both the GitHub repository and the Studio. Maybe you accidentally deleted or moved it?
The code area is too small; I can't see it.
Doctor- You only have 2:45:10 to live.
Me: