Serving Gemma on GKE using vLLM

  • Published Sep 15, 2024
  • Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.
    vLLM is a fast and easy-to-use library for LLM inference and serving.
    In this video, Mofi Rahman and Ali Zaidi walk through the process of deploying Gemma on GKE using the vLLM serving engine (a minimal deployment sketch follows below).
    Find Gemma on Hugging Face - huggingface.co...
    Follow along with the guide: cloud.google.c...
    Find other guides for serving Gemma and other AI/ML resources for GKE: g.co/cloud/gke-aiml
    Find other resources for learning about Gemma: ai.google.dev/...
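
For reference, here is a minimal sketch of what a vLLM Deployment for Gemma on GKE might look like. This manifest is an illustration, not taken from the video: the container image tag, the model name (google/gemma-2-9b-it), the GPU type (nvidia-l4), and the hf-secret Secret holding a Hugging Face token are all assumptions; the linked guide has the exact manifest.

```yaml
# Sketch of a vLLM serving Deployment for Gemma on GKE (assumed values throughout).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-gemma
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-gemma
  template:
    metadata:
      labels:
        app: vllm-gemma
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest   # assumed image; the guide may pin a specific tag
        args:
        - --model=google/gemma-2-9b-it   # Gemma weights pulled from Hugging Face
        - --tensor-parallel-size=1       # single-GPU serving
        env:
        - name: HUGGING_FACE_HUB_TOKEN   # required to download gated Gemma weights
          valueFrom:
            secretKeyRef:
              name: hf-secret            # hypothetical Secret created beforehand
              key: hf_api_token
        ports:
        - containerPort: 8000            # vLLM's OpenAI-compatible API port
        resources:
          limits:
            nvidia.com/gpu: "1"
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4   # assumed GPU type
---
apiVersion: v1
kind: Service
metadata:
  name: vllm-gemma
spec:
  selector:
    app: vllm-gemma
  ports:
  - port: 8000
    targetPort: 8000
```

Once the Pod is ready, `kubectl port-forward svc/vllm-gemma 8000:8000` exposes vLLM's OpenAI-compatible API locally, so a request to `/v1/completions` can confirm the model is serving.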
