Exciting Breakthrough in Extreme Quantization - HF Cracks Code of BitNet - HF1BitLLM

  • Published 21 Sep 2024
  • This video shows how to locally install Llama3-8B-1.58-100B-tokens, the model with which Hugging Face has finally cracked the code for BitNet: no pretraining from scratch is needed, thanks to warmup quantization. By fine-tuning a Llama3 8B model, they have achieved results close to Llama 1 and 2 7B on key downstream tasks (a minimal loading sketch appears after the description).
    🔥 Buy Me a Coffee to support the channel: ko-fi.com/fahd...
    🔥 Get 50% Discount on any A6000 or A5000 GPU rental, use following link and coupon:
    bit.ly/fahd-mirza
    Coupon code: FahdMirza
    ▶ Become a Patron 🔥 - / fahdmirza
    #bitnet #warmupquantization #hf1bitllm
    PLEASE FOLLOW ME:
    ▶ LinkedIn: / fahdmirza
    ▶ YouTube: / @fahdmirza
    ▶ Blog: www.fahdmirza.com
    RELATED VIDEOS:
    ▶ Resource huggingface.co...
    All rights reserved © Fahd Mirza
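
LOADING SKETCH:
A minimal sketch of loading the model covered in the video through the standard transformers API. It assumes a recent transformers release with BitNet support and a CUDA GPU, and that the checkpoint ships its own tokenizer; the prompt and generation settings are illustrative, not taken from the video:

    # Load the 1.58-bit checkpoint with the standard transformers API.
    # Assumes a recent transformers build with BitNet support and a CUDA GPU;
    # if the checkpoint has no tokenizer, substitute the base Llama 3 tokenizer.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "HF1BitLLM/Llama3-8B-1.58-100B-tokens"
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="cuda", torch_dtype=torch.bfloat16
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    inputs = tokenizer("What is BitNet?", return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))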

Comments • 3

  • @TheZEN2011 • 3 hours ago +2

    A great solution for those with lighter-weight computer systems. Awesome research!

    • @fahdmirza • 2 hours ago +1

      It indeed is.

  • @Dr.UldenWascht • 12 minutes ago

    Great video, as always! My brain is doing somersaults upon hearing about the concept of 1.58 bits 😄 since I always imagined 'bits' to be binary by definition, so whatever they are doing sounds fascinating. If my math is correct, a 70B model in 1.58Q should be around 16 gigabytes in size, which means it will be viable for some commercially available GPUs. I'm really curious how a Q1.58 version of such large models will actually perform in practice.
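
    For a rough check of that estimate, here is a back-of-envelope sketch in Python. It assumes every weight is stored at the quoted bit width; real checkpoints keep embeddings and norms in higher precision, and implementations typically pack ternary weights at 2 bits each, so actual sizes land above the theoretical figure:

        # Where 1.58 comes from, and what it implies for a 70B model.
        # Assumption: all weights stored at the quoted width; embeddings and
        # norms in real checkpoints use higher precision, so files run larger.
        import math

        bits_per_ternary = math.log2(3)  # ternary weights {-1, 0, 1} carry log2(3) ≈ 1.58 bits
        n_params = 70e9

        def size_gb(bits_per_weight: float) -> float:
            """Approximate storage in gigabytes (1 GB = 1e9 bytes)."""
            return n_params * bits_per_weight / 8 / 1e9

        print(f"theoretical 1.58-bit: {size_gb(bits_per_ternary):.1f} GB")  # ~13.9 GB
        print(f"2-bit packed ternary: {size_gb(2.0):.1f} GB")               # ~17.5 GB

    The 16 GB guess sits between those two figures, so it is in the right ballpark.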