Entry Point AI
  • 13
  • 202 088
RLHF & DPO Explained (In Simple Terms!)
Learn how Reinforcement Learning from Human Feedback (RLHF) actually works and why Direct Preference Optimization (DPO) and Kahneman-Tversky Optimization (KTO) are changing the game.
This video doesn't go deep on math. Instead, I provide a high-level overview of each technique to help you make practical decisions about where to focus your time and energy.
0:52 The Idea of Reinforcement Learning
1:55 Reinforcement Learning from Human Feedback (RLHF)
4:21 RLHF in a Nutshell
5:06 RLHF Variations
6:11 Challenges with RLHF
7:02 Direct Preference Optimization (DPO)
7:47 Preferences Dataset Example
8:29 DPO in a Nutshell
9:25 DPO Advantages over RLHF
10:32 Challenges with DPO
10:50 Kahneman-Tversky Optimization (KTO)
11:39 Prospect Theory
13:35 Sigmoid vs Value Function
13:49 KTO Dataset
15:28 KTO in a Nutshell
15:54 Advantages of KTO
18:03 KTO Hyperparameters
These are the three papers referenced in the video:
1. Deep reinforcement learning from human preferences (arxiv.org/abs/1706.03741)
2. Direct Preference Optimization: Your Language Model is Secretly a Reward Model (arxiv.org/abs/2305.18290)
3. KTO: Model Alignment as Prospect Theoretic Optimization (arxiv.org/abs/2402.01306)
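As a rough illustration of the objective in the DPO paper (not taken from the video), the loss for a single preference pair can be sketched in plain Python. The function name, the `beta=0.1` default, and the log-probability values are assumptions made for this example; a real training loop would compute these quantities in batches with an ML framework.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full completion under
    either the model being trained (policy) or the frozen reference model.
    """
    # How far the policy has moved from the reference on each completion.
    chosen_shift = policy_chosen_logp - ref_chosen_logp
    rejected_shift = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_shift - rejected_shift)
    # The loss is -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Hypothetical log-probabilities, made up for illustration.
untrained = dpo_loss(-10.0, -10.0, -10.0, -10.0)  # policy identical to reference
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)    # policy now favors the chosen answer
print(untrained, improved)
```

The loss falls as the policy raises the chosen completion's likelihood relative to the rejected one, measured against the reference model, which is why no separate reward model is needed.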
The Hugging Face TRL library offers implementations for PPO, DPO, and KTO:
huggingface.co/docs/trl/main/en/kto_trainer
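To make the difference between the two dataset shapes concrete, here is a minimal sketch that turns paired preference rows (as used for DPO) into unpaired, labeled rows (as used for KTO). The field names `prompt`, `chosen`, `rejected`, `completion`, and `label` are assumptions based on common TRL conventions, so check the linked docs before relying on them. The helper name and the sample row are hypothetical.

```python
def preference_to_kto(pairs):
    """Split each (chosen, rejected) pair into two rows with a boolean label."""
    rows = []
    for pair in pairs:
        rows.append({"prompt": pair["prompt"],
                     "completion": pair["chosen"],
                     "label": True})    # desirable example
        rows.append({"prompt": pair["prompt"],
                     "completion": pair["rejected"],
                     "label": False})   # undesirable example
    return rows


# A made-up preference row for illustration only.
pairs = [{
    "prompt": "What is the capital of France?",
    "chosen": "The capital of France is Paris.",
    "rejected": "France is a country in Europe.",
}]

kto_rows = preference_to_kto(pairs)
for row in kto_rows:
    print(row)
```

The conversion only goes cleanly in this direction: KTO data can also be collected as independent thumbs-up or thumbs-down ratings, which have no natural pairing to rebuild a preference dataset from.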
Want to prototype with prompts and supervised fine-tuning? Try Entry Point AI:
www.entrypointai.com/
How about connecting? I'm on LinkedIn:
www.linkedin.com/in/markhennings/
Views: 5,494

Videos

Ask GPT-4 in Google Sheets
Views: 1.8K · 9 months ago
Learn an amazing trick to use OpenAI models directly inside Google Sheets. Watch as I transform whole columns of data using an AI prompt in seconds! After this video, you'll be able to do awesome things like write creative copy, standardize your data, or extract specific details from unstructured text. Topics covered: 0:27 Demo of LLM calls directly in Google Sheets 5:33 The custom function in ...
Fine-tuning Datasets with Synthetic Inputs
Views: 4.2K · 10 months ago
👉 Start building your dataset at www.entrypointai.com There are virtually unlimited ways to fine-tune LLMs to improve performance at specific tasks... but where do you get the data from? In this video, I demonstrate one way that you can fine-tune without much data to start with - and use what little data you have to reverse-engineer the inputs required! I show step-by-step how to take a small s...
How Large Language Models (LLMs) Actually Work
Views: 2.4K · a year ago
In this video, I explain how language models generate text, why most of the process is actually deterministic (not random), and how you can shape the probability when selecting a next token from LLMs using parameters like temperature and top p. I cover temperature in-depth and demonstrate with a spreadsheet how different values change the probabilities. Topics: 00:10 Tokens & Why They Matter 03...
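A small sketch of the temperature and top p ideas mentioned in this description, written in plain Python and not taken from the video. The function names and the example logits are made up. Real inference stacks apply the same steps to tensors over the whole vocabulary.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of top tokens whose total probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize so the surviving candidates sum to 1 again.
    return {i: probs[i] / total for i in kept}


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    print(t, softmax_with_temperature(logits, t))
print(top_p_filter([0.5, 0.3, 0.2], top_p=0.7))
```

Everything up to this point is deterministic. Randomness only enters when a token is finally sampled from the resulting distribution.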
LoRA & QLoRA Fine-tuning Explained In-Depth
Views: 60K · a year ago
👉 Start fine-tuning at www.entrypointai.com In this video, I dive into how LoRA works vs full-parameter fine-tuning, explain why QLoRA is a step up, and provide an in-depth look at the LoRA-specific hyperparameters: Rank, Alpha, and Dropout. 0:26 - Why We Need Parameter-efficient Fine-tuning 1:32 - Full-parameter Fine-tuning 2:19 - LoRA Explanation 6:29 - What should Rank be? 8:04 - QLoRA and R...
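To show how Rank and Alpha fit together, here is a minimal sketch of the LoRA update in plain Python, following the LoRA paper's formulation rather than anything shown in the video. The function names and the tiny matrices are hypothetical, and dropout is left out for brevity.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def merge_lora(W, A, B, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) using nested lists."""
    scale = alpha / rank
    d_out, d_in = len(W), len(W[0])
    merged = [row[:] for row in W]  # leave the frozen base weights untouched
    for i in range(d_out):
        for j in range(d_in):
            update = sum(B[i][k] * A[k][j] for k in range(rank))
            merged[i][j] += scale * update
    return merged


# Compare a full 4096 x 4096 weight matrix with its rank-8 adapter.
full = 4096 * 4096
adapter = lora_trainable_params(4096, 4096, rank=8)
print(full, adapter, adapter / full)

# Tiny made-up example: a 2 x 2 base weight with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # rank x d_in
B = [[1.0], [0.0]]  # d_out x rank
print(merge_lora(W, A, B, alpha=2, rank=1))
```

Only A and B are trained. Alpha acts as a scaling factor on their product, so changing Rank without adjusting Alpha also changes how strongly the adapter affects the base weights.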
Prompt Engineering, RAG, and Fine-tuning: Benefits and When to Use
Views: 114K · a year ago
Explore the difference between Prompt Engineering, Retrieval-augmented Generation (RAG), and Fine-tuning in this detailed overview. 01:14 Prompt Engineering RAG 02:50 How Retrieval Augmented Generation Works - Step-by-step 06:23 What is fine-tuning? 08:25 Fine-tuning misconceptions debunked 09:53 Fine-tuning strategies 13:25 Putting it all together 13:44 Final review and comparison of technique...
Fine-tuning 101 | Prompt Engineering Conference
Views: 7K · a year ago
👉 Start fine-tuning at www.entrypointai.com Intro to fine-tuning LLMs (large language models) from the Prompt Engineering Conference (2023) Presented by Mark Hennings, founder of Entry Point AI. 00:13 - Part 1: Background Info -How a foundation model is born -Instruct tuning and safety tuning -Unpredictability of raw LLM behavior -Showing LLMs how to apply knowledge -Characteristics of fine-tun...
"I just fine-tuned GPT-3.5 Turbo…" - Here's how
Views: 1.2K · a year ago
🎁 Join our Skool community: www.skool.com/entry-point-ai In this video, I'm diving into the power and potential of the newly released GPT-3.5's fine-tuning option. After fine-tuning some of my models, the enhancement in quality is undeniably remarkable. Join me as I: - Demonstrate the model I fine-tuned: Watch as the AI suggests additional items for an e-commerce shopping cart and the rationale...
No-code AI fine-tuning (with Entry Point!)
Views: 1.4K · a year ago
👉 Sign up for Entry Point here: www.entrypointai.com Entry Point is a platform for no-code AI fine-tuning, with support for Large Language Models (LLMs) from multiple platforms: OpenAI, AI21, and more. In this video I'll demonstrate the core fine-tuning principles while creating an "eCommerce product recommendation" engine in three steps: 1. First I write ~20 examples by hand 2. Then I expand t...
28 April Update: Playground and Synthesis
Views: 84 · a year ago
28 April Update: Playground and Synthesis
How to Fine-tune GPT-3 in less than 3 minutes.
Views: 4.4K · a year ago
🎁 Join our Skool community: www.skool.com/entry-point-ai Learn how to fine-tune GPT-3 (and other) AI models without writing a single line of python code. In this video I'll show you how to create your own custom AI models out of GPT-3 for specialized use cases that can work better than ChatGPT. For demonstration, I'll be working on a classifier AI model for categorizing keywords from Google Ads...
Can GPT-4 Actually Lead a D&D Campaign? 🤯
Views: 395 · a year ago
If you want to create / fine-tune your own AI models, check out www.entrypointai.com/
Entry Point Demo 1.0 - Keyword Classifier (AI Model)
Views: 365 · a year ago
Watch over my shoulder as I fine-tune a model for classifying keywords from Google Ads.

Comments

  • @saqibkhatib-y7e
    @saqibkhatib-y7e 15 hours ago

    What a brilliant channel! thank you thank you! looking forward to all your videos!

  • @atommax_1676
    @atommax_1676 a day ago

    Really good explanation! Thank you<3

  • @starlingdavidhunter3
    @starlingdavidhunter3 3 days ago

    Thanks. I found this very informative. I studied Prospect Theory and behavioral decision theory in grad school and so found it fascinating how K&T's work has been adapted to LLMs.

  • @arcryptllm
    @arcryptllm 18 days ago

    Super useful. Thanks.

  • @rhythmzhengxinchengyi
    @rhythmzhengxinchengyi 20 days ago

    Thx a lot! Great explanations!

  • @Code-and-Chords-s2g
    @Code-and-Chords-s2g 24 days ago

    8:06 Qlora

  • @PaulFidika
    @PaulFidika 26 days ago

    Great explanations! I’d like to see how this looks in a PyTorch training loop too

  • @mugomuiruri2313
    @mugomuiruri2313 28 days ago

    good

  • @nekokokone
    @nekokokone a month ago

    As s beginner of AI, this is very useful to me! Thank you!

  • @abdelrahmanmagdi6767
    @abdelrahmanmagdi6767 a month ago

    just perfect explaination !

  • @BijouBakson
    @BijouBakson a month ago

    Thank you

  • @joshuatettey7771
    @joshuatettey7771 a month ago

    Great video

  • @omarsherif88
    @omarsherif88 2 months ago

    Very useful, thank you!

  • @luckelly2378
    @luckelly2378 2 months ago

    Such an incredibly clear presentation! Thank you

  • @hoyinleunghk
    @hoyinleunghk 2 months ago

    Excellent explanation thank a lot!

  • @pamelamadingdong
    @pamelamadingdong 2 months ago

    Fact that you gave a concrete examples really helped me go through this! Thank you for the great video

  • @web3namesai
    @web3namesai 2 months ago

    Great insights on AI techniques! With Web3NS, you can fine-tune AI agents on YourName.Web3, combining RAG and prompts to create smarter, decentralized responses tailored to your needs. 🚀

  • @gayathrisaranath666
    @gayathrisaranath666 2 months ago

    Thanks for this clear explanation about the topic! Your way of relating back to research papers is very interesting and helpful!

  • @dickerjunge2119
    @dickerjunge2119 3 months ago

    I think there is a mistake your explanation. It uses higher precision while finetuning, but it does **not** recover "back to normal". It stays in the quantized dataformat.

  • @princekhunt1
    @princekhunt1 3 months ago

    Nice

  • @rameshpjain
    @rameshpjain 4 months ago

    Such a simple way to explain. Thank you.

  • @AbdoGhazala-y5p
    @AbdoGhazala-y5p 4 months ago

    can you share the presentation document

  • @pratikmandlecha6672
    @pratikmandlecha6672 4 months ago

    Loved the explanation. Despite knowing these terms well, I was curious to see how it would be explained and I am glad that I watched this video

  • @jb-mk5ln
    @jb-mk5ln 4 months ago

    Incredibly high quality content, thank you.

  • @Gayatritravelandfitnessvlogs
    @Gayatritravelandfitnessvlogs 4 months ago

    Thanks a ton!

  • @priscillaleapman2367
    @priscillaleapman2367 4 months ago

    Martin Shirley Jackson Kenneth Allen Mary

  • @chrisder1814
    @chrisder1814 4 months ago

    hello can you help me

  • @kritarthlohomi3305
    @kritarthlohomi3305 4 months ago

    bradley cooper in limitless tf

  • @MarshallRoy-h9e
    @MarshallRoy-h9e 4 months ago

    Melisa Branch

  • @mandrakexTV
    @mandrakexTV 5 months ago

    This is the best detailed video and nicest explanation on youtube right now. I do think your channel will grow because you are doing an EXCELENT job. Thank you man.

  • @TheDinosaurDemocracy
    @TheDinosaurDemocracy 5 months ago

    This is amazing! How do you deal with request limits? I'm set to 3RPM for all models, so I'm unable to drag and drop this into a sheet as it will only do 3 at a time.

    • @EntryPointAI
      @EntryPointAI 5 months ago

      Good question. I don't have a good answer right now except to find yourself a higher limit :) 3rpm is really really low and you should be able to find even a free service or wrapper that offers more. Check out Groq or Together.ai

  • @bobrarity
    @bobrarity 5 months ago

    damn bro thanks, I'm new to fine-tuning, needed to migrate my rag model to the ragtag model, and your video was very clear and helped me a lot 😊

    • @EntryPointAI
      @EntryPointAI 5 months ago

      Sick, glad to hear it!

  • @tgzhu3258
    @tgzhu3258 5 months ago

    so good!!

  • @NathanielMaymon
    @NathanielMaymon 5 months ago

    What's the name of the paper you referenced in the video?

    • @EntryPointAI
      @EntryPointAI 5 months ago

      Here's LoRA: arxiv.org/abs/2106.09685 and QLoRA: arxiv.org/abs/2305.14314

  • @adriadebatlle5755
    @adriadebatlle5755 5 months ago

    It may be a stupid qüestion, is it possible to download the fine tunned model at entrypoint? Thankyou. Grate video! I came trying to see if it is possible to download it but I understood more about the website and it's grate.

    • @EntryPointAI
      @EntryPointAI 5 months ago

      It depends on what service you use to fine-tune, the proprietary ones like OpenAI don't let you download it. Others like Replicate do.

  • @AryanKumarBaghel-cp1jv
    @AryanKumarBaghel-cp1jv 5 months ago

    Perfect differentiations. Thankyou

  • @partymarty1856
    @partymarty1856 5 months ago

    blud why your eyes like that

  • @someguyO2W
    @someguyO2W 5 months ago

    Best video I've seen on the topic. Thank you!

  • @andrepemmelaar8728
    @andrepemmelaar8728 5 months ago

    Very useful! Marvelous clear explanation with the right amount of detail about a subject that’s worth understanding

  • @routergods
    @routergods 5 months ago

    I've been binge watching LLM-related videos and most have been regurgitation of docs or (probably) GPT/AI generated fluff pieces. This video clearly explained several concepts that I was trying to wrap my head around. Great job!

    • @someguyO2W
      @someguyO2W 5 months ago

      Similar comments. He actually explained it properly.

  • @UfcFan-d6s
    @UfcFan-d6s 6 months ago

    Amazing for struggling students. Love from Korea😂

  • @thndesmondsaid
    @thndesmondsaid 6 months ago

    Thank you!

  • @archchana7756
    @archchana7756 6 months ago

    very well explained, thanks :)

  •  6 months ago

    Awesome. Thanks

  • @yesid3777
    @yesid3777 6 months ago

    Thanks!, an amazing explanation!, +1 Sub.

  • @mohammedAlbared
    @mohammedAlbared 6 months ago

    "Great video with an amazing explanation! Thank you for sharing."

  • @aliumut33
    @aliumut33 6 months ago

    Thanks. Is it possible to make a webscraping from a given url in a column?

    • @EntryPointAI
      @EntryPointAI 6 months ago

      This is a task better suited for another tool. I'd check out apify.com/

  • @Larimuss
    @Larimuss 6 months ago

    QLORA let's me train on a 4070ti with only 12gb vram. Though I can't go over 7b model

  • @CatarinaReis-g3y
    @CatarinaReis-g3y 6 months ago

    Thisa saved me. Thank you. Keep doing this :)

  • @brianbarnes746
    @brianbarnes746 6 months ago

    Great explanation, best that I've seen