Byte Latent Transformer - BLT explained (Entropy of Next Byte, META)

  • Published Dec 23, 2024

Comments • 11

  • @code4AI  4 days ago +4

    Please note, with the automatic dubbing from YouTube/Google you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.

  • @mrpocock  4 days ago +12

    Byte-level LLMs are obviously the way forward for that first round of training where you're predicting 1..n tokens given the prefix, particularly for multi-language models. Tokenization is clearly a hack, like in the dark ages of image neural networks, where we would hand-craft feature detection kernels.

  • @wwkk4964  4 days ago +1

    Thank you so much for covering this paper! I had been thinking about this specific implementation for a year, and I believe it's a significant step toward a truly general learning architecture that minimizes hand-crafted human priors.

  • @ProgrammingWIthRiley  3 days ago

    Brother, you are amazing.
    Thank you for doing this.

  • @davidwynter6856  4 days ago +1

    Can you clarify that pre-training will have to use the BLT embeddings? I.e., unless models pre-trained with BLT start appearing on huggingface or elsewhere, we mere mortals will not be able to take advantage of this new method?

  • @TalsBadKidney  4 days ago +1

    very very cool

  • @JeomonGeorge  4 days ago

    Does the small transformer use BPE? Then, for H(x_i), is it computing the cross-entropy? 26:13
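
    For reference, the entropy model in the paper is a small byte-level transformer that runs directly on bytes, with no BPE, and H(x_i) is the Shannon entropy of its predicted next-byte distribution, H(x_i) = -Σ_v p(v | x_<i) log p(v | x_<i); a patch boundary is placed where this entropy crosses a threshold. Below is a minimal sketch of that patching rule, with a hypothetical next_byte_probs() standing in for the small LM:

        import math

        def next_byte_probs(prefix: bytes) -> list[float]:
            # Hypothetical stand-in for the small byte-level LM: returns a
            # probability distribution over the 256 possible next-byte values.
            # A uniform placeholder here, so the sketch runs end to end.
            return [1.0 / 256] * 256

        def entropy(probs: list[float]) -> float:
            # Shannon entropy H = -sum(p * log p) of the next-byte distribution.
            return -sum(p * math.log(p) for p in probs if p > 0.0)

        def entropy_patches(data: bytes, threshold: float) -> list[bytes]:
            # Start a new patch whenever the model's next-byte entropy exceeds
            # a global threshold (the simpler of the two boundary rules in the
            # BLT paper; the other also looks at the change in entropy).
            patches, start = [], 0
            for i in range(1, len(data)):
                if entropy(next_byte_probs(data[:i])) > threshold:
                    patches.append(data[start:i])
                    start = i
            patches.append(data[start:])
            return patches

        print(entropy_patches(b"hello world", threshold=5.0))

    With the uniform placeholder every position has entropy log(256) ≈ 5.55 > 5.0, so this prints single-byte patches; a trained entropy model would instead produce long patches over predictable spans and boundaries at hard-to-predict bytes.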

  • @themax2go  4 days ago +1

    I'm having a plant-based BLT right now

  • @King_Deundel  3 days ago

    BLT seems the way to go in an ideal world, but there are definitely problems with it. I think tokenizers have accomplished tremendous work, and we got to this state thanks to growing vocab sizes and better tokenization mechanisms; from this point on we may have the technology and resources to try BLT on a model (though I still don't think it would work that much better).

    • @augmentos  1 day ago

      Can you expand on the ‘definitely problems’ with it?

  • @ivangoncharuk607  4 days ago +1

    Bacon Lettuce Tomato