Building LLMs from the Ground Up: A 3-hour Coding Workshop

  • Published on Nov 7, 2024

Comments • 130

  • @joneskin1432
    @joneskin1432 2 months ago +51

    Dude, I keep accidentally running into your content while learning this material. The other day I was firing off weirdly specific Google searches while trying to build intuition on how self-attention works, and I found a year-old comment you wrote on Reddit that nailed what I was having trouble with. Just bought the MEAP of your book. You've been doing an amazing job, keep it up!

    • @SebastianRaschka
      @SebastianRaschka 2 months ago +4

      Whoa, what a small world. Glad you are finding this useful and considered getting a copy of my book!

    • @razeo7068
      @razeo7068 2 months ago +2

      Can you share the self-attention Reddit link?

    • @thehard-coder9398
      @thehard-coder9398 2 months ago

      @joneskin1432 - Would you mind sharing which version of the MEAP you got, eBook or print? Could you also share the link? Many thanks!

    • @shahabmos5130
      @shahabmos5130 a month ago

      You are a computer engineer and still believe in accidents.
      Wake up.

  • @haribhauhud8881
    @haribhauhud8881 2 months ago +10

    Dear Sebastian,
    I hope you are doing well. I am writing to express my deepest gratitude for your incredible effort and dedication to teaching on the online platform. Your generosity in sharing your knowledge for free has made a profound impact on so many of us.
    Your classes have been a beacon of light in these challenging times, providing not only education but also inspiration and hope. The clarity with which you explain complex topics and your unwavering patience in addressing our questions have been truly remarkable.
    Thank you for your time, energy, and passion for teaching. You've made a significant difference in my learning journey, and I am immensely grateful for the knowledge and wisdom you've imparted.
    Wishing you all the best in your future endeavors. 😊
    Warm regards,
    Hari

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Thanks so much for this very kind message, Hari. This is very nice of you, and it's very motivating to hear this!

  • @paolodragol
    @paolodragol a month ago +4

    Sebastian, I want to sincerely thank you for providing such good material. I cannot express my gratitude enough! I admire your desire to share this content with such clarity and human touch! Thanks a lot!

  • @devtest8078
    @devtest8078 2 months ago +3

    23 mins in. This is by far the best tutorial I have seen on building LLMs from scratch. I have followed you for a while, Sebastian, for all the great contributions you have made over the years, but you have outdone yourself once again. Well done, man, and thank you.

    • @devtest8078
      @devtest8078 2 months ago +1

      96 mins in. Still awesome.

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      @devtest8078 Hah, thanks so much!

    • @devtest8078
      @devtest8078 2 months ago

      👏👏👏

  • @atlasflare7824
    @atlasflare7824 2 months ago +5

    This is a gem for me as an MSc AI student. Thank you for making this.

  • @masonholcombe3327
    @masonholcombe3327 2 months ago +4

    Your deep learning series got me through STAT 453 at UW-Madison, and now this workshop has been the perfect transition into LLMs! Great video, Sebastian!

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Wow, small world, and I am glad to hear that this video was useful as well!

  • @Alexander-je3qc
    @Alexander-je3qc 2 months ago +7

    Just finished the book, extremely pedagogical and valuable. Great job as always Sebastian!

    • @SebastianRaschka
      @SebastianRaschka 2 months ago +1

      Thanks for the feedback! Glad you got lots out of it!

  • @taido4883
    @taido4883 a month ago +1

    Thank you for such an amazing book, such an invaluable source for a beginner like me!
    I watched the 4-hour lecture by Karpathy and initially thought that your content could hardly impress me after that. However, I go "wow" reading through every single chapter of your book.

    • @SebastianRaschka
      @SebastianRaschka a month ago +1

      I am super glad to hear that the book was worth your while!

  • @prashlovessamosa
    @prashlovessamosa 2 months ago +1

    Mr. Sebastian, I found your channel yesterday. So grateful to you for such top-notch education.

  • @hasaniqbal3180
    @hasaniqbal3180 2 months ago +1

    Thank you. I recently got your book and this stuff is invaluable. So much stuff out there and it's not all organized in a way that's easy to digest. Your books / videos are great!

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Glad to hear that the organization makes it accessible! That’s usually the trickiest part!

  • @hokage5619
    @hokage5619 a month ago +1

    Thanks a lot, Sebastian! Coding from scratch made most concepts crystal clear for me.

    • @SebastianRaschka
      @SebastianRaschka a month ago

      Nice, I am very glad to hear this!

  • @parthsarthisharma4163
    @parthsarthisharma4163 2 months ago +2

    Just finished the video, thank you very much for the detailed explanation. Next step is reading your book :) 🙂

  • @thehard-coder9398
    @thehard-coder9398 2 months ago +2

    @SebastianRaschka - I just bought the book (How to build a LLM from scratch). Thank you for all your great effort! :) I look forward to your new content soon. :)

    • @SebastianRaschka
      @SebastianRaschka 2 months ago +1

      I hope you are enjoying the book! Happy reading!

  • @thefatcat-hd6ze
    @thefatcat-hd6ze 2 months ago +9

    What a time to be alive haha, love your book.

    • @Philmad
      @Philmad 2 months ago +2

      Indeed, great book

    • @deepaksingh9318
      @deepaksingh9318 2 months ago

      Which book are we talking about here? Can anyone give me the name, please 🙂

    • @AhmedMostafa-r2u
      @AhmedMostafa-r2u 2 months ago

      @deepaksingh9318 LLMs From Scratch at Manning

    • @deepaksingh9318
      @deepaksingh9318 2 months ago

      @AhmedMostafa-r2u thanks ☺️

  • @shreyaskatiyar614
    @shreyaskatiyar614 2 months ago +2

    Make more videos, professor! Your knowledge is enlightening me a lot!

  • @nish2288
    @nish2288 2 months ago +2

    Super helpful. Thanks for sharing.
    Looking forward to more such videos on LLMs.
    Keep it up!!

  • @maysammansor
    @maysammansor 2 months ago +1

    Sebastian, I like your deep content. We appreciate the time you put into this.

  • @bosepukur
    @bosepukur 2 months ago +1

    Thank you for such an awesome contribution towards democratizing LLM research.

  • @p3nGu1nZz
    @p3nGu1nZz 2 months ago +1

    Thank you for putting this together. One of the best talks on the technicals.

  • @satishlokkoju6844
    @satishlokkoju6844 2 months ago +3

    Thank you for developing the watermark Python package. I became aware of your work because of how amazing watermark was, and I wanted to find out what else the author is up to!

  • @Tothefutureand
    @Tothefutureand 2 months ago +1

    I have read so many of your educational materials, and they have been so useful that I feel like you are one of my close friends.

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Glad my materials are so useful that you keep turning back to them!

  • @amitabhachakraborty497
    @amitabhachakraborty497 2 months ago +1

    I have been following your blog for a very long time. I have already purchased your new LLM book. I have also purchased your machine learning books. Please upload more content like this.

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Thanks for the kind support!

  • @SHAMIKII
    @SHAMIKII 2 months ago +1

    Thank you very much for giving a short and sweet (I have patience for week-long workshops too :D) overview of building an LLM, pre-training, and fine-tuning it.
    Looking to explore deeper using the detailed code base of your book.
    🙏

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Glad this was useful! Ha, yeah, a week long workshop would be interesting, but with a full-time job, it would be a bit tough to carve out the time to record it 😅

    • @SHAMIKII
      @SHAMIKII 2 months ago +1

      @SebastianRaschka Completely agree with you. If only my job workshops were as useful as these ones. ;)
      The benefit of these videos is that even though they're hours long, I can always pause and revisit them when I have time.

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      @SHAMIKII Thanks for the kind compliment!

  • @iamsnglrty
    @iamsnglrty 2 months ago +13

    "Thank you! I love your work, Sebastian. 😊
    I hope my small token of appreciation will motivate you further to create more content like this.
    By the way, I already own most of your books. My favorite is your recent one - Build a Large Language Model (from Scratch)." 📚

    • @SebastianRaschka
      @SebastianRaschka 2 months ago +1

      Wow, thanks so much for the kind support!

  • @kuafou
    @kuafou a month ago +1

    Very great job! Just bought your book!

    • @SebastianRaschka
      @SebastianRaschka a month ago

      Thanks, happy reading and coding!

  • @prashlovessamosa
    @prashlovessamosa a month ago +1

    I am reading your book from Manning's library, loving it.

  • @Philmad
    @Philmad 2 months ago +1

    Your book was already a great read and practice.

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Glad to hear that you got lots out of my book!

  • @xray788
    @xray788 2 months ago +2

    Amazing, Sebastian 👏 Thank you so much. I also read your book and found it insightful. Will you be making some content on how we could give the LLM a UI like ChatGPT's?

    • @SebastianRaschka
      @SebastianRaschka 2 months ago +1

      This is an interesting point. It would be interesting but since I don’t enjoy web development very much, I don’t have any fixed plans for that yet.

  • @chrisogonas
    @chrisogonas 2 months ago +1

    Incredible! Thanks for sharing this great resource.

  • @r0back55
    @r0back55 2 months ago +4

    I think it is exactly what I was waiting for 😍

  • @nasirnr5518
    @nasirnr5518 2 months ago +1

    Great explanation as usual. Thanks for sharing.

  • @aabhamishra3952
    @aabhamishra3952 2 months ago +1

    This is absolutely amazing. On, Wisconsin!

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w a month ago +2

    always look forward to your content. 👍

  • @alisaghi051
    @alisaghi051 2 months ago +1

    1:22:40 You are right, Sebastian, for me it did not have the peak that you got here. BTW, thanks a lot for this tutorial and your "Introduction to Deep Learning and Generative Modeling" course as well.

  • @mentalhealthcore
    @mentalhealthcore 2 months ago +1

    Outstanding, Doc, this is wunderbar... thank you 🤙

  • @paneercheeseparatha
    @paneercheeseparatha 2 months ago +2

    Just finished watching the entire video. Amazing! But could you also make a video providing an in-depth understanding of tokenizers? I'm struggling with its implementation especially while modifying the vocabulary for different languages.
    I've also watched your STAT 453 lectures, which helped me understand GANs and ML models in detail. Thanks a lot. ♥

    • @SebastianRaschka
      @SebastianRaschka 2 months ago +3

      Great suggestion. I was actually doing that (extending the vocab of a tokenizer and adjusting the embedding layer and output layer of an LLM accordingly) for a little side project. Hope to find the time to put together a tutorial on that some time.
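
The idea in the reply above (add tokens to the vocabulary, then grow the embedding layer and the output layer to match) can be sketched in plain Python. This is only an illustration under stated assumptions, not the author's code: the names `extend_vocab` and `extend_rows` are hypothetical, token ids are assumed to be contiguous from 0, and starting each new row at the mean of the existing rows is one common heuristic that the thread does not mention.

```python
def extend_vocab(vocab, new_tokens):
    """Return a copy of vocab (token -> id) with unseen tokens appended.

    Assumes the existing ids are contiguous: 0, 1, ..., len(vocab) - 1.
    """
    extended = dict(vocab)
    for token in new_tokens:
        if token not in extended:
            extended[token] = len(extended)
    return extended


def extend_rows(matrix, new_size):
    """Grow a (vocab_size x dim) weight matrix to new_size rows.

    New rows start at the column-wise mean of the existing rows
    (an assumed heuristic; random initialization is another option).
    """
    old_size = len(matrix)
    dim = len(matrix[0])
    mean_row = [sum(row[j] for row in matrix) / old_size for j in range(dim)]
    new_rows = [list(mean_row) for _ in range(new_size - old_size)]
    return [list(row) for row in matrix] + new_rows


vocab = {"hello": 0, "world": 1}
embedding = [[1.0, 3.0], [3.0, 5.0]]     # one embedding row per token id
output_head = [[0.5, 0.5], [1.5, 2.5]]   # one row of logit weights per token id

# "world" already exists, so only "hola" gets a new id.
vocab = extend_vocab(vocab, ["hola", "world"])
embedding = extend_rows(embedding, len(vocab))
output_head = extend_rows(output_head, len(vocab))
```

Both matrices have to grow together: the embedding layer needs a row to look up for the new id, and the output layer needs a row to produce a logit for it. The new rows would still need finetuning before they carry any meaning.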

    • @paneercheeseparatha
      @paneercheeseparatha 2 months ago +1

      @SebastianRaschka Thanks for considering! Really looking forward to it.

    • @brenok
      @brenok 2 months ago +1

      Also check out Karpathy's 2-hour video on building a tokenizer: th-cam.com/video/zduSFxRajkE/w-d-xo.html

    • @paneercheeseparatha
      @paneercheeseparatha 2 months ago

      @brenok Oh. Thanks a lot. I completely forgot to check Andrej's channel. Thanks for the reference.

  • @peterm5039
    @peterm5039 a month ago

    Great video so far. I just watched the data prep portion. I am pretty interested in embedding models, so I wish you had gone into that a bit. I understand why it was cut, though. Do you have any videos that explain that part? Thanks again!

  • @Pingu_astrocat21
    @Pingu_astrocat21 2 months ago

    Thank you for this❤ Such a detailed explanation!

  • @dhruv-v8w
    @dhruv-v8w 2 months ago +1

    Love your work!

  • @neeravkaushal
    @neeravkaushal a month ago +2

    Thanks for the tutorial, Sebastian! Quick question. Why is layernorm applied before attention and before feedforward instead of after attention + residual connection and after feedforward + residual connection? I understand there is a final norm as well, but why before? Thanks!

    • @SebastianRaschka
      @SebastianRaschka a month ago

      Good question. There are actually different variants called Pre-LayerNorm and Post-LayerNorm. I summarized it in the section "(3) On Layer Normalization in the Transformer Architecture" here: magazine.sebastianraschka.com/p/understanding-large-language-models
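
The two variants named in the reply above differ only in where the normalization sits relative to the residual connection. A minimal sketch in plain Python, assuming a layer norm without learned scale and shift and a hypothetical `toy_sublayer` standing in for attention or feed-forward (none of these names come from the workshop code):

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned parameters)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add(a, b):
    """Element-wise sum, used for the residual (shortcut) connection."""
    return [p + q for p, q in zip(a, b)]


def pre_ln_block(x, sublayer):
    """Pre-LayerNorm: normalize the sublayer input, then add the residual."""
    return add(x, sublayer(layer_norm(x)))


def post_ln_block(x, sublayer):
    """Post-LayerNorm: add the residual first, then normalize the sum."""
    return layer_norm(add(x, sublayer(x)))


def toy_sublayer(x):
    """Hypothetical stand-in for attention or feed-forward."""
    return [0.5 * v for v in x]


x = [1.0, 2.0, 6.0]
pre_out = pre_ln_block(x, toy_sublayer)    # residual path keeps x unnormalized
post_out = post_ln_block(x, toy_sublayer)  # block output is itself normalized
```

In the Pre-LayerNorm form the residual path passes `x` through untouched, so the block output is not normalized. That is consistent with the final norm after the last block that the question mentions.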

    • @neeravkaushal
      @neeravkaushal a month ago

      @SebastianRaschka Thank you so much! Another quick question, on a different tangent. Where does one compare a new model (say I built a new kind of model like a transformer or an RNN) against the existing benchmarks for transformers or LSTMs, so I can publish it? Is there a website where I can test a new model on some standard SOTA dataset they host? Sorry for the poor phrasing. I guess what I want to ask is: is there any website where you submit your model and they test it for you on standard NLP tasks, so that all you have to do is input your model and the output is the evaluation scores on NLP tasks, which you can then publish (if better)? Again, sorry for the long question, but I have been trying to find an answer for a while now.

    • @SebastianRaschka
      @SebastianRaschka a month ago +1

      @neeravkaushal Good question, I think it can be a bit tricky to get non-standard models in there, but there's tatsu-lab.github.io/alpaca_eval/ and huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard

    • @neeravkaushal
      @neeravkaushal a month ago

      @SebastianRaschka Thank you so much. Very helpful. :-)

  • @AltafRehmani
    @AltafRehmani 2 months ago +1

    Thanks for this. Really appreciated.

  • @towhidurrahman8961
    @towhidurrahman8961 20 days ago +1

    Very good job. This builds a simple text-based model. If the articles contain complex mathematical equations, graphs, and tables related to complex mathematical problems, how can I prepare the model?

    • @SebastianRaschka
      @SebastianRaschka 19 days ago

      That's a good question. It would require a lot of extra work. Probably a book (or at least a workshop) in itself. To understand the general process, I can recommend the Qwen2.5-Math report (arxiv.org/pdf/2409.12122) which outlines how the researchers took a text model (here: Qwen 2) and finetuned it for math.

  • @ahmedtremo
    @ahmedtremo 2 months ago +1

    Great Video, thanks for putting in the time!

  • @berlinbrown03
    @berlinbrown03 a month ago

    Great, keep it coming, hope to use.

  • @allenlu2007
    @allenlu2007 2 months ago +1

    Excellent video and book! Maybe a sequel about LLM inference, like KV cache and other acceleration schemes?

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Yeah, this would be a good topic for another book one day…

  • @thrivefoxxgaming1120
    @thrivefoxxgaming1120 2 months ago +3

    Wow what a blessing 🎉

  • @muhammadsaad3793
    @muhammadsaad3793 a month ago

    Hi Sebastian, this was amazing; thank you for making this video!
    Quick question. I would like to build an LLM for my reading notes and blog posts. I would like to prompt questions, and the LLM should go into the dataset and find the answer.
    If I were to follow these steps, would I be able to do that?
    Thanks!

    • @SebastianRaschka
      @SebastianRaschka a month ago

      There would be two general approaches: (1) Finetune the model on your dataset or (2) build a RAG application around the model. RAG is a system that feeds a model with chunks from the dataset during inference. I have a brief outline here: github.com/rasbt/RAGs
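
Option (2) in the reply above, feeding the model chunks from the dataset at inference time, can be sketched as a toy retrieval step in plain Python. The assumptions are loud ones: relevance is scored by simple word overlap as a stand-in for the embedding similarity a real system would more likely use, the function names and the prompt template are hypothetical, and the final call to an LLM is left out.

```python
def chunk_text(text, chunk_size=8):
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def score(query, chunk):
    """Toy relevance score: number of distinct query words found in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words)


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks with the highest overlap score."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:top_k]


def build_prompt(query, chunks):
    """Place the retrieved chunks in front of the question (hypothetical template)."""
    context = "\n".join(retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Two made-up reading notes standing in for the blog posts in the question.
notes = [
    "Layer normalization keeps activations at zero mean and unit variance.",
    "Byte pair encoding merges frequent character pairs into tokens.",
]
prompt = build_prompt("what does byte pair encoding do", notes)
print(prompt)
```

The model's weights stay unchanged in this approach; only the prompt grows. That is the main contrast with option (1), finetuning, where the dataset changes the weights instead.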

  • @nguyenhuuuc2311
    @nguyenhuuuc2311 2 months ago +3

    A year ago, I really wished there was a video like this! Congrats on finishing the project (book) ahead of schedule and distilling a year's work into a 3-hour video 😂

    • @SebastianRaschka
      @SebastianRaschka 2 months ago +2

      Thanks! Working on the book has been intense but also a lot of fun :). The workshop covers only like 10% (otherwise it would be 30 rather than 3 hours) but I hope it’s useful!

  • @shahedmomenzadeh
    @shahedmomenzadeh 2 months ago +1

    Thanks for this workshop. Did you finish the book or is it still under development?

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      I finished the last chapter a few months ago, and it has now been laid out and sent to the printer as of last week, which means the print version should be available soon :)

  • @thehard-coder9398
    @thehard-coder9398 2 months ago

    Thanks for creating such an amazing video!!! Just one quick question, I failed to open the Studio in the Lightning Studio. Any idea? Your response is much appreciated.

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Thanks for letting me know. Was there any particular error or issue you were getting? Or, if you don’t mind, could you describe the problem in a bit more detail?

    • @thehard-coder9398
      @thehard-coder9398 2 months ago +1

      @SebastianRaschka - Hi, thank you for your prompt reply. Kindly see the error message here. The error message pops up when I hit the button "Open in Studio". Thanks in advance.

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      @thehard-coder9398 Huh, that's a weird one, I will ask my colleagues to see what's up. Thanks!

    • @thehard-coder9398
      @thehard-coder9398 2 months ago +1

      @SebastianRaschka - Thanks! I look forward to hearing your response soon. :)

    • @SebastianRaschka
      @SebastianRaschka a month ago

      @thehard-coder9398 We tried to reproduce this issue but couldn't find the cause. Could you give it another try?

  • @ChocolateMilkCultLeader
    @ChocolateMilkCultLeader 2 months ago +1

    Dropping heat as usual

  • @Humble_Electronic_Musician
    @Humble_Electronic_Musician 2 months ago +2

    Awesome 👏🏻
    (Even though awesome is an understatement…)

  • @klncgty
    @klncgty 27 days ago

    many thanks!

  • @michaelodonnell5710
    @michaelodonnell5710 2 months ago

    I'm now 2:00 into this video and I think I'm going to enjoy it! He seems to be one of those who have that distracting verbal tic where he says "Yeah" every 7th word but, fortunately, his S/N ratio appears to be high so we can forgive him...

    • @SebastianRaschka
      @SebastianRaschka 2 months ago +1

      Yeah, the free version has a lot of these

  • @bezozo97
    @bezozo97 a month ago +1

    I have a question regarding the outputs of the LLM: what's the point of having the vectors of existing tokens in the output, instead of only the next token's vector? If I understand correctly, those are discarded anyway.

    • @SebastianRaschka
      @SebastianRaschka a month ago

      You use them for the next-word prediction task during training. If you have the sentence "the world is round", then this gives you 3 prediction tasks "the -> world", "the world -> is", and "the world is -> round" instead of just one prediction task "the world is -> round"
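
The three prediction tasks listed in the reply above fall out of a single shift by one position. A minimal sketch, using whole words as tokens for readability (the workshop uses subword token ids) and hypothetical function names:

```python
def next_token_tasks(tokens):
    """Return (context, target) pairs: every prefix predicts the token after it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def shifted_pair(tokens):
    """One training example: the targets are the inputs shifted left by one."""
    return tokens[:-1], tokens[1:]


tokens = ["the", "world", "is", "round"]

tasks = next_token_tasks(tokens)
# [(['the'], 'world'), (['the', 'world'], 'is'), (['the', 'world', 'is'], 'round')]

inputs, targets = shifted_pair(tokens)
# inputs:  ['the', 'world', 'is']
# targets: ['world', 'is', 'round']

for position, (seen, target) in enumerate(zip(inputs, targets)):
    print(f"output at position {position} (last input: {seen!r}) is scored against {target!r}")
```

The output vector at each position is compared with the token that follows it, not with the token at that position, and in a causally masked model that position cannot attend to later tokens. So the extra outputs are further next-token predictions, not copies of the input.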

    • @bezozo97
      @bezozo97 a month ago

      @SebastianRaschka Thanks for your reply, I'm trying to understand the rationale. More prediction tasks, so this is mainly a way to increase training efficiency. But it seems to me that by doing this we're training a copy machine alongside the next-word prediction. I need to read up more on this topic. Thank you so much for the great video!

  • @MannyBernabe
    @MannyBernabe a month ago

    lovely. thank you

  • @first-fundamental-field
    @first-fundamental-field 2 months ago

    Way to go, Seb! 🖐️

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 2 months ago +1

    awesome!

  • @neuralfalcon
    @neuralfalcon 2 months ago +1

    Thank You

  • @juliogodel
    @juliogodel 2 months ago +1

    Thanks!

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      Thanks for the very kind support!

  • @SanjeevKumar-j6u
    @SanjeevKumar-j6u 2 months ago

    Is the print version of the book available? Amazon shows availability sometime in late October.

    • @SanjeevKumar-j6u
      @SanjeevKumar-j6u 2 months ago

      @SebastianRaschka Is the print version of the book available? Amazon shows availability sometime in late October.

  • @SettimiTommaso
    @SettimiTommaso 2 months ago +1

    Yes!

  • @nagahemachandchinta5498
    @nagahemachandchinta5498 21 days ago +1

    There's so much happening in this field. I feel overwhelmed. I start with the basics, but the field is moving so fast and jobs need advanced skills. How do I learn quickly and stay updated? Please advise.

  • @jasonjimenez9116
    @jasonjimenez9116 a month ago +1

    Is this a companion video of your LLM Book?

    • @SebastianRaschka
      @SebastianRaschka a month ago

      Good question: yes and no. It's based on the book, but it only covers about 10%. The code notebooks have also been substantially simplified; otherwise it would be a much longer video.

  • @PradeepKumar6
    @PradeepKumar6 2 months ago

    Is the working of BPE covered in your book? You mentioned in the video that it is a very long topic to talk about, so I'm just asking if it's covered in the book. Thanks for this video, though. Very useful.

    • @SebastianRaschka
      @SebastianRaschka 2 months ago

      The book is focused on implementing the LLM, training and finetuning it, etc. But I am planning to add bonus material on implementing BPE. I implemented the algo a while back, just need some time to add explanations.

    • @PradeepKumar6
      @PradeepKumar6 2 months ago +1

      @SebastianRaschka Thank you. I read your other book on PyTorch and machine learning. It was very good. I will buy this one as well. Thanks.

  • @YaswanthPrasad-f3w
    @YaswanthPrasad-f3w 20 days ago

    At 39:20, my code throws an error stating there is no recognised package called supplementary.
    Can anyone please help me tackle this?

    • @SebastianRaschka
      @SebastianRaschka 20 days ago

      Hey there. I just double-checked and the supplementary.py file seems to be present in both the GitHub repository and the Studio. Maybe you accidentally deleted or moved it?
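
Since the reply above says `supplementary` is a local `supplementary.py` file and not an installable package, one quick check is whether that file sits in the folder the notebook runs from. A minimal sketch using only the standard library. The helper name `make_importable` is hypothetical, and it assumes the notebook and `supplementary.py` are meant to live in the same folder:

```python
import importlib.util
import sys
from pathlib import Path


def make_importable(module_name, search_dir):
    """Add search_dir to sys.path if it contains <module_name>.py.

    Returns True when the module can be found afterwards, False otherwise.
    """
    search_dir = Path(search_dir).resolve()
    if not (search_dir / f"{module_name}.py").is_file():
        return False
    if str(search_dir) not in sys.path:
        sys.path.insert(0, str(search_dir))
    importlib.invalidate_caches()  # make the import system rescan the folder
    return importlib.util.find_spec(module_name) is not None


# Assumption: this runs from the folder that should hold supplementary.py.
found = make_importable("supplementary", Path.cwd())
print("supplementary.py found:", found)
```

If this prints `False`, the file is missing from the working directory (moved, deleted, or the notebook was started from a different folder), which matches the explanation in the reply.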

  • @superfreiheit1
    @superfreiheit1 2 months ago

    The code area is too small, can't see it.

  • @ChocolateMilkCultLeader
    @ChocolateMilkCultLeader a month ago

    Doctor- You only have 2:45:10 to live.
    Me: