[AI, Machine Learning, Deep Learning] (Advanced) Direct preference optimization (DPO)

  • Published on Sep 27, 2024
  • - Direct Preference Optimization: Your Language Model is Secretly a Reward Model
    - NeurIPS 2023

Comments • 2

  • @Tony-ed3ke 5 months ago

    The detailed, easy-to-follow explanation was a big help in understanding the material. I also liked that you walked through the code as well. Thank you.

  • @dayol2026 6 months ago

    Thank you for the easy-to-follow explanation! However, the audio is so quiet that it is still hard to hear even at maximum volume ㅠㅠ