LLM frenzy week in March
- Published 12 Sep 2024
- Efficiency meets performance: Comparing open-source LLMs - DBRX, Jamba, Qwen
The last week of March 2024 was an LLM frenzy week.
Databricks/Mosaic launched DBRX as a new SOTA open model. AI21 launched Jamba, combining the best of both worlds: Transformer + Mamba. Alibaba's Qwen team released Qwen1.5-MoE-A2.7B. Around the same time, Mistral mentioned they were releasing their 7B v0.2 base model; it's unclear whether that was only for the hackathon they ran in San Francisco.
These companies are competing not only against each other but also against other SOTA open-source models like LLaMA 2, as well as proprietary models like Claude 3, GPT-3.5, and GPT-4. Some of them even beat those models on benchmarks, or come close to the performance of the proprietary ones.
In this video, I compare these three models against each other to see which one fares better. Let's take a look.
DBRX: www.databricks...
Jamba: www.ai21.com/b...
Qwen: qwenlm.github....