I forgot to mention that this model is trained like a normal transformer: since everything is causal, you should be able to train with the same efficient parallel technique the transformer uses, a single forward pass over an entire sequence of data.
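A minimal NumPy sketch of that idea (my own illustration, not code from the video): a linear SSM recurrence h_t = A h_{t-1} + B u_t, y_t = C h_t can be unrolled into a causal convolution with kernel K_k = C A^k B, so the whole output sequence can be computed in one parallel pass and matches the step-by-step loop.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 4, 8                              # state size, sequence length
A = np.diag(rng.uniform(0.1, 0.9, N))    # toy diagonal state matrix, kept stable
B = rng.standard_normal((N, 1))
C = rng.standard_normal((1, N))
u = rng.standard_normal(L)

# Sequential (recurrent) pass, one step at a time
h = np.zeros((N, 1))
y_seq = []
for t in range(L):
    h = A @ h + B * u[t]
    y_seq.append((C @ h).item())
y_seq = np.array(y_seq)

# Parallel pass: precompute the causal convolution kernel K_k = C A^k B,
# then every output y_t is just a dot product with the inputs up to time t
K = np.array([(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(L)])
y_par = np.array([np.dot(K[:t + 1][::-1], u[:t + 1]) for t in range(L)])

assert np.allclose(y_seq, y_par)   # both views give the same outputs
```

Because every y_t only depends on u_0..u_t, nothing about the parallel pass leaks future inputs, which is exactly the causality the comment relies on.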
I just opened your channel to ask you for a Mamba video, and here I see this video. You are awesome, dude. I can't express how much you contribute to my life. Thank you many times!!!
Please don't stop with this videos. They are extremely useful to go through with you. Much love
Wow Gabrial, great job!
I like your calm attitude and simple way of explaining this complex subject!
As an electrical engineer and a data scientist, I highly appreciate your content!
19:50 I think A is DxN because they use a diagonal matrix. They mention S4D, and that paper also has an example of a linear initialization: "A = -0.5 + 1j * np.pi * np.arange(N//2)  # S4D-Lin initialization". It's structured after all.
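To make that concrete, here is a runnable sketch of that S4D-Lin line (the shapes D and N are my own assumption for illustration): since A is diagonal, each of the D channels only needs to store its diagonal of complex poles, so a DxN-shaped array replaces a full NxN matrix per channel.

```python
import numpy as np

N, D = 64, 16   # state size and channel count (illustrative values)

# S4D-Lin initialization, as quoted from the S4D paper:
# one diagonal of N//2 complex poles with real part -0.5
A_diag = -0.5 + 1j * np.pi * np.arange(N // 2)

# Diagonal structure means a (D, N//2) array of per-channel diagonals
# is all that must be stored -- hence the DxN-style shape, not NxN
A = np.broadcast_to(A_diag, (D, N // 2)).copy()

assert A.shape == (D, N // 2)
assert np.all(A.real == -0.5)   # every pole starts in the stable half-plane
```

The negative real part keeps the continuous-time dynamics stable at initialization, and the linearly spaced imaginary parts spread the poles across frequencies.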
Thanks for the vid. I can't wait to see if it's overhyped or not, hehe. Tri Dao knows his attention mechanisms.
thx for doing this paper, I was a bit lost on state space models
I was a bit lost.. now I'm more lost. ;)
@acasualviewer5861 haha, I did watch some lectures by the first author though
It feels like we're getting closer and closer to concepts from physics.. 🙂
I think it's independent because you can diagonalize the state transition matrix and then each value only interacts with itself.
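A tiny NumPy check of that point (my own sketch, not from the video): in the eigenbasis of the state transition matrix, the recurrence acts elementwise, so each state coordinate evolves on its own with no cross-terms.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))      # a random (generically diagonalizable) state matrix
eigvals, V = np.linalg.eig(A)        # A = V @ diag(eigvals) @ V^{-1}

# In the eigenbasis, stepping the state is just an elementwise scaling:
# each coordinate z_i is multiplied by its own eigenvalue, independently
z = np.linalg.solve(V, rng.standard_normal(3).astype(complex))
z_next = eigvals * z                 # no interaction between coordinates

# The same step taken in the original basis, then mapped back, agrees
h_next = A @ (V @ z)
assert np.allclose(np.linalg.solve(V, h_next), z_next)
```

That elementwise form is why a diagonal(ized) A lets each state dimension be updated independently.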
24:28 Shouldn't A, B, and C be LxN, not LxD?
If all the matrices are learnable, I wonder why the authors use the HiPPO matrix to initialize A? What's the point?
I was actually wrong about the HiPPO "A" matrix being learnable. I think this matrix is actually static, which makes sense as it adds some basic structure to the model.