How to Train Sparse Large Language Models with Vithu Thangarasa
- Published May 29, 2023
- In this episode of the Cerebras podcast we explore the latest research in sparse neural networks. We discuss:
• What sparsity is and why it is important in training neural networks
• The latest progress in applying sparsity to large language models
• Our latest paper, “SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models” (arxiv.org/abs/2303.10464)
• Why Cerebras hardware is uniquely suited to training large sparse models
• Future directions in sparsity research
Speakers:
Vithu Thangarasa (@vithursant19) - Senior ML Research Scientist, Cerebras
James Wang (@draecomino) - Senior Product Marketing Manager, Cerebras