Efficient Self-Attention for Transformers

  • Published Jan 20, 2025

Comments • 11

  • @javadkhataei970 (1 year ago, +1)

    Very informative. Thank you!

    • @PyMLstudio (1 year ago)

      Glad it was helpful!

  • @pabloealvarez (1 year ago, +1)

    Good explanation, very clear.

    • @PyMLstudio (1 year ago, +1)

      Thank you for the nice comment! Glad you find the videos useful!

  • @brianlee4966 (10 months ago, +1)

    Thank you so much

  • @benji6296 (7 months ago, +1)

    What would be the advantage of these methods vs. FlashAttention? FlashAttention speeds up the computation and is an exact computation, while most of these methods are approximations. If possible, I would like to see a video explaining other attention types, such as PagedAttention and FlashAttention. Great content :)

    • @PyMLstudio (7 months ago, +1)

      Thank you for the suggestion! You're absolutely right. In this video, I focused on purely algorithmic approaches, not hardware-based solutions like FlashAttention. FlashAttention is an IO-aware exact attention algorithm that uses tiling to reduce memory reads/writes between GPU memory levels, which results in significant speedup without sacrificing model quality.
      I appreciate your input and will definitely consider making a video to explain FlashAttention!
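[Editor's note] The tiling idea described in the reply above can be sketched as a toy NumPy implementation. This is an illustrative sketch only, not code from the video or the FlashAttention library: it shows how an online softmax over key/value tiles yields the exact attention output without ever materializing the full N x N score matrix (the actual speedup comes from doing this in fast on-chip GPU memory). Function names are the editor's own.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Standard softmax attention: materializes the full N x N score matrix.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def tiled_attention(Q, K, V, block=4):
    # FlashAttention-style tiling (CPU toy): walk over K/V in blocks,
    # maintaining a running row-wise max `m` and normalizer `l` (the
    # "online softmax" trick), so only a block of scores exists at a time.
    N, d = Q.shape
    out = np.zeros_like(Q)
    m = np.full(N, -np.inf)   # running max of scores per query row
    l = np.zeros(N)           # running softmax denominator per query row
    for start in range(0, N, block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        s = Q @ Kb.T / np.sqrt(d)               # scores for this tile only
        m_new = np.maximum(m, s.max(axis=-1))
        scale = np.exp(m - m_new)               # rescale earlier partial sums
        p = np.exp(s - m_new[:, None])
        l = l * scale + p.sum(axis=-1)
        out = out * scale[:, None] + p @ Vb
        m = m_new
    return out / l[:, None]                     # normalize once at the end
```

Because the running max/normalizer rescaling is exact, the tiled result matches the naive computation to floating-point precision, which is the "exact, not approximate" point made in the comment.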

    • @PyMLstudio (5 months ago)

      Thanks for the suggestion. I made a new video on FlashAttention:
      FlashAttention: Accelerate LLM training
      th-cam.com/video/LKwyHWYEIMQ/w-d-xo.html
      I would love to hear your comments and any other suggestions.

  • @buh357 (8 months ago)

    You should include axial attention and axial position embeddings; they are simple yet work great on images and video.

    • @PyMLstudio (8 months ago, +1)

      Thanks for the suggestion, yes, I agree. I briefly described axial attention in the vision transformer series:
      th-cam.com/video/bavfa_Rr2f4/w-d-xo.htmlsi=0SB9Yc_0SasafhJN

    • @buh357 (8 months ago)

      @PyMLstudio That's awesome, thank you!