1st Multilingual Model Workshop - Pretraining the Jais Bilingual Arabic-English Language Models

  • Published Feb 11, 2024
  • In this talk, Joel presents details of the pretraining process for the Jais models, a series of bilingual Arabic-English language models. Jais models are currently the best-performing Arabic models, and their English capability is competitive with state-of-the-art models even though they were trained on fewer English tokens. Jais models have been scaled to 13B and 30B parameters.
    Joel discusses the pretraining techniques that produce state-of-the-art Arabic capabilities. The vocabulary selection process ensures balanced capability in both Arabic and English. Joel also describes the use of Maximal Update Parameterization (µP), which simplifies hyperparameter selection and leads to predictable model scaling. The scaling-law tests show that Arabic and English can be mixed in a 1:2 ratio while achieving near-perfect scaling in both languages.
    Jais models are developed through a collaboration between Core42’s Inception, the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and Cerebras Systems.
  • Science & Technology
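The hyperparameter transfer idea behind Maximal Update Parameterization mentioned in the talk can be sketched roughly as follows. This is a minimal illustration of the general µP scaling rules, not the exact Jais recipe; the base width, base learning rate, and function names are assumptions for the example.

```python
import math

# Minimal sketch of muP-style hyperparameter transfer (illustrative only,
# not the Jais training code). Hyperparameters are tuned once at a small
# "base" width and transferred to larger widths via a width multiplier.

def width_multiplier(width: int, base_width: int) -> float:
    """m = d / d_base: how much wider the target model is than the base."""
    return width / base_width

def hidden_lr(base_lr: float, width: int, base_width: int) -> float:
    """Under muP, the Adam learning rate for hidden (matrix-like) weights
    scales like 1/m, so the tuned base LR transfers across widths."""
    return base_lr / width_multiplier(width, base_width)

def hidden_init_std(fan_in: int) -> float:
    """Hidden weights are initialized with std on the order of 1/sqrt(fan_in)."""
    return 1.0 / math.sqrt(fan_in)

def output_logit_scale(width: int, base_width: int) -> float:
    """Output (unembedding) logits are multiplied by 1/m so their scale
    stays stable as the model is widened."""
    return 1.0 / width_multiplier(width, base_width)

# Tune once at a hypothetical base width of 256, then transfer to 4096
# without re-tuning: the hidden-weight LR shrinks by the width ratio.
base_lr = 1e-2
print(hidden_lr(base_lr, 4096, 256))
print(output_logit_scale(4096, 256))
```

The practical payoff described in the talk is that small-model hyperparameter sweeps remain valid at the 13B and 30B scales, which is what makes the scaling behavior predictable.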
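The 1:2 Arabic-English mixing ratio mentioned above could be realized in many ways; one simple sketch is interleaving chunks from two token streams in a fixed repeating pattern. The stream contents, chunking, and function name here are placeholders, not Jais's actual data pipeline.

```python
import itertools

# Hypothetical sketch of mixing two data streams at a fixed ratio
# (e.g. 1 Arabic chunk for every 2 English chunks). Real pipelines
# typically sample shards probabilistically; this shows the ratio idea.

def mix_streams(arabic, english, ratio=(1, 2)):
    """Yield chunks in a repeating pattern: ratio[0] Arabic chunks
    followed by ratio[1] English chunks, until either stream ends."""
    ar, en = iter(arabic), iter(english)
    while True:
        try:
            for _ in range(ratio[0]):
                yield next(ar)
            for _ in range(ratio[1]):
                yield next(en)
        except StopIteration:
            return

# Placeholder chunk labels standing in for tokenized text chunks.
mixed = list(itertools.islice(
    mix_streams((f"ar{i}" for i in itertools.count()),
                (f"en{i}" for i in itertools.count())),
    6))
print(mixed)  # ['ar0', 'en0', 'en1', 'ar1', 'en2', 'en3']
```

The claim in the talk is that this kind of fixed-ratio bilingual mix still follows near-perfect scaling laws in each language individually.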
