DeepSeek R1 Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (paper explained)

AI Bites

3,556 views

Published on 4 Feb 2025

Comments • 13

@PoleesuSrinivasCh 3 days ago ⁺¹
A Better explanation even which will lead in all Sources.
@SapienSpace 7 days ago ⁺²
Fascinating review. I glanced at the paper, particularly at GRPO and GAE. GRPO looks a lot like Fuzzy-Logic with nodes or attention heads adapted to experience (e.g. "relative" via K-means group clustering).
Looking more deeply at GAE (Generalized Advantage Estimation), it is for an adaptive control system.
I would not be surprised if the origin of the deep learning usage of Theta is an angle of a pendulum.
@SapienSpace 7 days ago ⁺¹
Overlapping membership functions used in Fuzzy Logic are very similar to KL.
@AIBites 5 days ago
Don't have much experience with fuzzy logic. But I like your perspective 🙂
@KhurramXahiL-py5dq 7 days ago ⁺¹
Great explanation
@AIBites 5 days ago
Thanks 👍
@francesclopez6192 7 days ago ⁺¹
Thank you for your explanation!
@AIBites 5 days ago
My pleasure 😊
@mukeshreddy7909 6 days ago ⁺¹
great video
@AIBites 5 days ago
Thanks!
@amortalbeing 7 days ago ⁺¹
thanks a lot.
@AIBites 5 days ago
Most welcome!
