Keynote: Accelerating AI Workloads with GPUs in Kubernetes - Kevin Klues & Sanjay Chatterjee

CNCF [Cloud Native Computing Foundation]

Views: 3,952

Published on 7 Feb 2025
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon North America in Salt Lake City from November 12 - 15, 2024. Connect with our current graduated, incubating, and sandbox projects as the community gathers to further the education and advancement of cloud native computing. Learn more at kubecon.io
Keynote: Accelerating AI Workloads with GPUs in Kubernetes - Kevin Klues, Distinguished Engineer & Sanjay Chatterjee, Engineering Manager, NVIDIA
As AI and machine learning become ubiquitous, GPU acceleration is essential for model training and inference at scale. However, effectively leveraging GPUs in Kubernetes brings challenges around efficiency, configuration, extensibility, and scalability.
This talk provides an overview of the capabilities needed to address these challenges, enabling seamless support for next-generation AI applications on Kubernetes.
- GPU resource-sharing mechanisms such as MPS (Multi-Process Service), Time-Slicing, MIG (Multi-Instance GPU), and GPU virtualization
- Flexible accelerator configuration using the traditional device plugin and the upcoming Dynamic Resource Allocation (DRA) feature
- Advanced scheduling and resource management techniques, including gang scheduling, topology-awareness, fault-tolerance and more
- Key learnings (and areas of improvement) necessary to scale multi-node AI/ML jobs in large production clusters
Some of these capabilities are already supported today and some of them are not. By addressing the remaining challenges, Kubernetes is poised to emerge as the go-to platform for accelerated AI/ML in the cloud, mirroring Linux's pervasive dominance in the datacenter.
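Gang scheduling, one of the techniques listed above, means that a multi-pod job is placed all at once or not at all, so a training job never holds some GPUs while waiting for the rest. The following is a minimal sketch of that all-or-nothing idea only. It is not how any Kubernetes scheduler is implemented; the function name `gang_schedule`, the node names, and the first-fit heuristic are all assumptions made for illustration, and the heuristic may reject a gang that a smarter packer could place.

```python
from typing import Dict, List, Optional


def gang_schedule(
    free_gpus: Dict[str, int], pod_requests: List[int]
) -> Optional[Dict[int, str]]:
    """Place every pod of a job, or none of them (hypothetical helper).

    free_gpus maps node name -> free GPU count and is updated in place
    only when the whole gang fits. pod_requests lists GPUs per pod.
    Returns pod index -> node name, or None if the gang is rejected.
    """
    # Work on a copy so a failed attempt leaves no partial allocation behind.
    remaining = dict(free_gpus)
    placement: Dict[int, str] = {}

    # Largest requests first (a simple first-fit-decreasing heuristic).
    order = sorted(range(len(pod_requests)), key=lambda i: -pod_requests[i])
    for pod_idx in order:
        need = pod_requests[pod_idx]
        # Pick the first node, in name order, with enough free GPUs.
        node = next((n for n in sorted(remaining) if remaining[n] >= need), None)
        if node is None:
            return None  # one pod does not fit, so the whole gang is rejected
        remaining[node] -= need
        placement[pod_idx] = node

    # Commit only after every pod has been assigned a node.
    free_gpus.update(remaining)
    return placement


if __name__ == "__main__":
    cluster = {"node-a": 4, "node-b": 2}
    print(gang_schedule(cluster, [2, 2, 2]))  # whole gang fits
    print(gang_schedule({"node-a": 4, "node-b": 2}, [4, 4]))  # rejected as a unit
```

The design choice worth noting is the copy-then-commit step: capacity is deducted from a scratch copy and written back only when every pod has a node, which is what keeps a rejected job from leaving GPUs half-reserved.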

Comments • 2

@luchen3414 9 months ago ⁺¹
A perfect overview of GPU with Kubernetes today. Thank you, Kevin and Sanjay.
@artemZinn 3 months ago
Welp, AMD is really behind on this, they urgently need a strong technical leader to execute on K8S integration.
Great overview talk and capabilities.
