At 6:49 he said activations have a batch dimension and the parameters have a batch dimension?? Is that correct? I used to think batch size is an independent dimension when defining the model and is initialised for all parameters, including W and the activation parameters.
Yes, batch size is an independent dimension, so the input of the whole model has a batch dimension, and so do the outputs of every layer, which are the activations. Activations are not parameters; you don't train the activations at all. What you want to train are W and V, not X (the input) or Y (the output). Parameters like W and V don't have a batch dimension, because every input in the batch is multiplied by the same W and V.
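A minimal numpy sketch of that point, with made-up layer sizes, just to show which tensors carry the batch dimension:

```python
import numpy as np

# Made-up sizes, only for illustration.
batch_size, d_in, d_hidden = 8, 16, 32

X = np.random.randn(batch_size, d_in)   # input: carries the batch dimension
W = np.random.randn(d_in, d_hidden)     # parameter: no batch dimension
V = np.random.randn(d_hidden, d_in)     # parameter: no batch dimension

H = X @ W   # activation: shape (batch_size, d_hidden)
Y = H @ V   # output:     shape (batch_size, d_in)

print(X.shape, W.shape, H.shape, Y.shape)  # (8, 16) (16, 32) (8, 32) (8, 16)
# Every example in the batch is multiplied by the same W and V, so only
# X, H and Y have a batch dimension; the trainable W and V do not.
```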
I'm not convinced. Let me say why. Only a happy few, very few, have the possibility to use a supercomputer or a TPU for that matter. But most of us already have access to a cluster of non-homogeneous nodes: some nodes faster and more powerful, some quite slow but perhaps with more memory/disk space. It makes more sense for TensorFlow to be able to detect capabilities, detect latencies, and build a graph that fits that cluster best. That way ALL of us could use model parallelism AND data parallelism with affordable equipment. One might take it a step further and even include nodes over the internet, without needing fiber, if the graphs are set up right. But this needs to be automated, whereas now it is pure manual work.
I believe one of the challenges with heterogeneity is the use of collective communications at the end, i.e. all-reduce. The fastest nodes end up waiting for the slowest node so that the outputs can be collected and the gradients redistributed.
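A toy back-of-the-envelope sketch of that straggler effect (the timings are invented, just to show that a synchronous all-reduce step is gated by the slowest worker):

```python
# Hypothetical per-step gradient compute times for three heterogeneous workers.
compute_times = {"fast_gpu": 1.0, "medium_gpu": 1.5, "slow_node": 4.0}  # seconds
allreduce_time = 0.3                                                    # seconds

# A synchronous all-reduce can only start once every worker has finished,
# so the step time is the slowest worker's time plus the collective itself.
step_time = max(compute_times.values()) + allreduce_time
idle_time = {name: max(compute_times.values()) - t for name, t in compute_times.items()}

print(f"step time: {step_time:.1f}s")  # 4.3s, dominated by slow_node
print(idle_time)                        # the fast workers spend most of the step waiting
```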
So make sure my five sensors are programmed by a TensorFlow supercomputer federated-learning / deep-learning / machine-learning replacement, because Bluetooth sensors are the very best definition of neuro-linguistic programming, like the FBI's COINTELPRO supercomputer NLP from the fifties.
So great!
Please refresh the timeline with this design.
No subtitles for this video?
added
I imagine we can also split some layers by h and some layers by d?
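You can; here is a small numpy sketch of what that might look like on two simulated devices (the sizes and the two-way split are assumptions for illustration). The catch is that when the split dimension changes between layers, the activation shards have to be gathered (a communication step) before the next layer can run:

```python
import numpy as np

batch, d, h = 4, 8, 16                 # made-up sizes
X  = np.random.randn(batch, d)
W1 = np.random.randn(d, h)             # layer 1: d -> h
W2 = np.random.randn(h, d)             # layer 2: h -> d

# Layer 1 split by h: each device holds a (d, h/2) column slice of W1.
W1_shards = np.split(W1, 2, axis=1)
# Layer 2 split by d: each device holds an (h, d/2) column slice of W2.
W2_shards = np.split(W2, 2, axis=1)

# Each device computes its slice of the hidden activation: (batch, h/2).
hidden_shards = [X @ w for w in W1_shards]

# All-gather: layer 2 needs the full (batch, h) activation on every device.
hidden_full = np.concatenate(hidden_shards, axis=1)

# Each device computes its (batch, d/2) slice of the output.
Y = np.concatenate([hidden_full @ w for w in W2_shards], axis=1)

assert np.allclose(Y, (X @ W1) @ W2)   # matches the unsplit computation
```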
Mostly, we don't need to train giant models.
WOOOOOOOOOO Noammmm!
Yay