Wow, I can't believe it, two of my idols together! Congrats, Tristan! Go Contextual!
Such a great format of video! Very short, and I love it more than 2min papers because of the author's explanation.
Now to the questions: Great insight about words and tokens. Does this mean that we need bigger models that will learn on their own how to transform words into tokens? From my understanding, words are just less computationally convenient tokens. So if we do go to 400B or larger models, following the idea that "size is all you need", maybe it will work. I wonder what the performance of the largest published model would be.
Thanks for the kind words!
Now, to your question: Yes, we already kind of have such models that are tokenizer-free and work directly on characters or byte representations, but this makes the input sequence length grow much bigger. So, models like Mamba or long-context transformers could eventually, if big enough, surpass existing tokenizer-based models. But we have not seen that happen yet at GPT-4 scale. I do not know if they have even tried that path.
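To make the sequence-length point concrete, here is a minimal Python sketch; the roughly-4-characters-per-token ratio is my own rough assumption for English BPE vocabularies, not something from the video:

# Minimal sketch: input length for a byte-level model vs. a subword tokenizer.
# The ~4 characters-per-token ratio is a rough rule of thumb for English BPE
# vocabularies, used here only for illustration.

text = "Tokenizer-free models read every character or byte directly."

byte_len = len(text.encode("utf-8"))      # sequence length a byte-level model sees
approx_token_len = max(1, byte_len // 4)  # assumed subword-token count

print(f"bytes:  {byte_len}")              # 60
print(f"tokens: ~{approx_token_len}")     # ~15, several times shorter

Several-times-longer sequences matter because self-attention cost grows quadratically with sequence length, which is why Mamba-style or long-context architectures come up here.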
I mean, I'd definitely assume that sentences like these don't occur a whole lot in the training data.
I agree. But I am sure that some sentences like this were there. Every LLM today trains on Wikipedia, and they must have seen the entries on Douglas Hofstadter and his books.
@AICoffeeBreak Adding I Am a Strange Loop to my reading list. I agree that it is in Wikipedia, but how do you explain "how many R's in strawberry" if not with a tokenization bug? Doesn't that mean we need a different kind of model (or tokenizer) to solve this?
Agreed.
Interesting, another reason might be that the task is just too far out of distribution?
Training on data exactly like this might make models perform better on these examples. But I doubt these models didn't see any self-referential statements somewhere in their training data.
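To make the strawberry point above concrete, here is a minimal Python sketch; the subword split and token IDs shown are hypothetical, since the actual pieces depend on the tokenizer:

# Minimal sketch of why "how many R's in strawberry" is hard for a token-based LLM.
# The subword split and IDs below are made up for illustration; real BPE
# tokenizers may cut the word differently.

word = "strawberry"

# A character-level view sees individual letters, so counting is trivial.
print(word.count("r"))                    # 3

# A subword tokenizer hands the model opaque token IDs instead of letters.
hypothetical_split = ["str", "awberry"]   # assumed split, for illustration only
hypothetical_ids = [496, 675]             # made-up IDs standing in for vocab indices
print(hypothetical_split, hypothetical_ids)

# The model only ever receives the IDs, so it has to have memorized the
# spelling of each token to count characters correctly.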
Those E and F shape tests don't make sense unless fed as an image.
Well, it would still look different to the model from just feeding the sentences normally. I think most LLM tokenizers even preserve the line breaks.
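A small sketch of that point: even when ASCII-art letter shapes are fed as plain text, the newline characters stay in the input, so the token sequence differs from ordinary prose (the drawing below is just an illustrative example):

# Minimal sketch: an ASCII-art "E" fed as text still contains its line breaks,
# so the tokenizer sees a very different sequence than normal sentences.
ascii_e = (
    "#####\n"
    "#\n"
    "####\n"
    "#\n"
    "#####\n"
)
print(ascii_e)        # renders as a rough letter E
print(repr(ascii_e))  # the \n characters are part of what the model is given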
I understood completely nothing. I was focused on this guy's aspergery passion. How can I work with such cyborgs :D Sorry, I am a hater here, but I started questioning my role on the job market after watching it.
Changed my mind, I like the explanation. Sorry for my ADHD.