Ernest Ryu
  • 11
  • 9 301
Aaron Defazio Talk (12.06.2024, UCLA)
Title. Schedules & Schedule-Free Learning
Abstract. I will introduce an alternative view of learning rate schedules, in which they are treated as a technique for ensuring optimal convergence rates for the last iterate of an optimization procedure. This view leads to a highly predictive theory of optimal learning rate schedules, explaining the learning rate warmup and annealing procedures used in practice. Going beyond this, I will show how this viewpoint suggests Schedule-Free approaches, where learning rate schedules are replaced by iterate averaging schemes. These yield a number of benefits: no need to specify the stopping time in advance, smoother loss curves, and often better eval metrics.
Views: 369

Videos

Mathematics and Science of Large Language Models (Ernest Ryu, UCLA Applied Math Colloquium)
373 views, 2 months ago
UCLA Applied Math Colloquium, Ernest Ryu, Oct 31, 2024. Title: Mathematics and Science of Large Language Models Abstract: Large language models (LLMs) represent an engineering marvel, but their inner workings are notoriously challenging to understand. In this talk, we present two analyses of LLMs. The first result is a mathematical guarantee on LoRA fine-tuning for LLMs, showing that the traini...
Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model
96 views, 3 months ago
Presentation of the paper "Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model" by J. Y. Choi, J. R. Park, I. Park, J. Cho, A. No, and E. K. Ryu, Transactions on Machine Learning Research, 2024. openreview.net/forum?id=38P40gJPrI
Python Tutorial 3
1.2K views, 3 years ago
Python Tutorial 2
1.6K views, 3 years ago
Python Tutorial 1
4.2K views, 3 years ago
Plug-and-Play Methods Provably Converge with Properly Trained Denoisers
839 views, 5 years ago
This video presents the paper "Plug-and-Play Methods Provably Converge with Properly Trained Denoisers", published at ICML 2019 (proceedings.mlr.press/v97/ryu19a.html). The slides are available at www.math.ucla.edu/~eryu/papers/Plug_and_Play_Slides.pdf
Proof of Theorem 3 of Uniqueness of DRS as the 2 Operator Resolvent-Splitting
124 views, 5 years ago
This video explains the details of a proof from the paper "Uniqueness of DRS as the 2 Operator Resolvent-Splitting and Impossibility of 3 Operator Resolvent-Splitting". A regular talk that explains the results at a high level, without going into the details of the proof, is available at: th-cam.com/video/NTY-BDKXwUE/w-d-xo.html
Proof of Theorem 2 of Uniqueness of DRS as the 2 Operator Resolvent-Splitting
129 views, 5 years ago
This video explains the details of a proof from the paper "Uniqueness of DRS as the 2 Operator Resolvent-Splitting and Impossibility of 3 Operator Resolvent-Splitting". A regular talk that explains the results at a high level, without going into the details of the proof, is available at: th-cam.com/video/NTY-BDKXwUE/w-d-xo.html
Proof of Theorem 4 of Uniqueness of DRS as the 2 Operator Resolvent-Splitting
107 views, 5 years ago
This video explains the details of a proof from the paper "Uniqueness of DRS as the 2 Operator Resolvent-Splitting and Impossibility of 3 Operator Resolvent-Splitting". A regular talk that explains the results at a high level, without going into the details of the proof, is available at: th-cam.com/video/NTY-BDKXwUE/w-d-xo.html
Proof of Theorem 1 of Uniqueness of DRS as the 2 Operator Resolvent-Splitting
271 views, 5 years ago
This video explains the details of a proof from the paper "Uniqueness of DRS as the 2 Operator Resolvent-Splitting and Impossibility of 3 Operator Resolvent-Splitting". A regular talk that explains the results at a high level, without going into the details of the proof, is available at: th-cam.com/video/NTY-BDKXwUE/w-d-xo.html