An LLM frenzy week in March

  • Published Sep 12, 2024
  • Efficiency meets performance: Comparing open-source LLMs - DBRX, Jamba, Qwen
    The last week of March 2024 was an LLM frenzy week.
    Databricks/Mosaic launched DBRX as a new SOTA open model. AI21 launched Jamba, combining the best of both worlds: Transformer + Mamba. Alibaba's Qwen team released Qwen1.5-MoE-A2.7B. Around the same time, Mistral mentioned they were releasing their 7B v0.2 base model; it's unclear whether that was only for the hackathon they ran in San Francisco.
    These companies are competing not only with each other but also with other SOTA open-source models like Llama 2, as well as proprietary models like Claude 3, GPT-3.5, and GPT-4. Some of them even beat those models on benchmarks or come close to the proprietary models' performance.
    In this video, I compare three of these models against each other and see which one fares better (a quick-start sketch follows the links below). Let's take a look.
    DBRX: www.databricks...
    Jamba: www.ai21.com/b...
    Qwen: qwenlm.github....
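
    Below is a minimal sketch of how one might load and prompt these checkpoints with the Hugging Face transformers library. The repo IDs are assumptions based on each lab's public Hugging Face organization (they are not given in this description), and the larger checkpoints like DBRX need substantial GPU memory.

```python
# Minimal sketch: load one of the three models and generate a completion.
# Repo IDs below are assumed, not taken from the video description.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_IDS = {
    "DBRX": "databricks/dbrx-instruct",      # assumed repo ID
    "Jamba": "ai21labs/Jamba-v0.1",          # assumed repo ID
    "Qwen MoE": "Qwen/Qwen1.5-MoE-A2.7B",    # assumed repo ID
}

def generate(model_id: str, prompt: str, max_new_tokens: int = 64) -> str:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # half precision to fit large checkpoints
        device_map="auto",           # spread layers across available GPUs
        trust_remote_code=True,      # some of these repos ship custom model code
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate(MODEL_IDS["Qwen MoE"], "Explain mixture-of-experts in one sentence."))
```

    Running the same prompt through each entry in MODEL_IDS gives a simple side-by-side comparison harness like the one used in the video.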
