vLLM on Kubernetes in Production

Kubesimplify

Views: 3,673


Published on 12 Nov 2024

Comments • 12

@JohnCodes 5 months ago ⁺⁶
Thanks for having me on, Saiyam!! It was a lot of fun to show you how we use vLLM at OpenSauced!! Happy to answer any questions people might have here!
@DestinoDello a month ago
Can you share the YAML for the deployment, please?
@aireddy 4 months ago ⁺¹
This is an absolutely wonderful session for understanding how we can deploy LLMs in production on a Kubernetes cluster!!
@kubesimplify 4 months ago
@aireddy glad it was helpful!
@nickytonline 2 months ago
Great video and great breakdown @JohnCodes!
@DaewonSuh 3 months ago
Thanks for the wonderful demo!
I was wondering why you deploy the vLLM pods through DaemonSets rather than Deployments.
With a DaemonSet, you can only run one pod per node, with that pod occupying a single GPU.
Considering that nodes usually have multiple GPUs attached, I am afraid that using a DaemonSet might leave a lot of GPUs idle.
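The concern in the comment above can be put into numbers. The following is only a rough sketch under the commenter's stated assumption (one pod per node, each pod using a single GPU); the function name and the cluster sizes are hypothetical and do not come from the video.

```python
def idle_gpus_with_daemonset(nodes: int, gpus_per_node: int, gpus_per_pod: int = 1) -> int:
    """Count GPUs left unused when exactly one pod runs on each node.

    Assumes the scenario described in the comment: a DaemonSet places one
    pod per node, and each pod uses `gpus_per_pod` GPUs (1 by default).
    """
    if gpus_per_pod > gpus_per_node:
        raise ValueError("a pod cannot use more GPUs than the node has")
    return nodes * (gpus_per_node - gpus_per_pod)


# Hypothetical cluster: 3 nodes with 8 GPUs each, one single-GPU pod per node.
print(idle_gpus_with_daemonset(nodes=3, gpus_per_node=8))  # 3 * (8 - 1) = 21
```

If each pod were instead given all of a node's GPUs, the same calculation would return zero, so whether the concern applies depends on how many GPUs each pod actually requests.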
@umeshjaiswal5298 5 months ago
Thanks for this tutorial Saiyam.
@kubesimplify 5 months ago
Glad it's useful. Are you building something with LLMs?
@divyamchandel8734 4 months ago
Hi John / Saiyam. In the last part you mentioned, "In a lot of cases it could be cheaper."
In which cases is hosting it yourself cheaper, and in which cases is using OpenAI cheaper?
Does it just depend on the load we will have (RPD and max RPM)?
@matrix9083 4 months ago
OpenAI charges $0.50 per million tokens for GPT-3.5, for example. If you rent a GPU server for that same amount, you can generate tens or hundreds of millions of tokens in one hour, depending on which text generation model you choose. Models such as Mistral 7B, the Phi-3 series, Llama 3 8B, and Gemma 2B all deliver about the same results as GPT-3.5, if not better, and all fit on a GPU server that costs 44 cents per hour on RunPod (the A5000 GPU server, for example).
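To make the cost comparison in this reply concrete, here is a minimal break-even sketch. It uses only the two prices quoted in the comment ($0.50 per million tokens and $0.44 per GPU-hour); the function names, the throughput figure, and the utilization figure are hypothetical assumptions for illustration, not measurements.

```python
def break_even_tokens_per_hour(gpu_cost_per_hour: float, api_cost_per_million: float) -> float:
    """Tokens per hour at which a rented GPU costs the same as the hosted API."""
    return gpu_cost_per_hour / api_cost_per_million * 1_000_000


def self_hosted_cost_per_million(gpu_cost_per_hour: float,
                                 tokens_per_second: float,
                                 utilization: float = 1.0) -> float:
    """Effective cost per million tokens on a rented GPU.

    `utilization` is the fraction of each paid hour spent serving requests;
    an idle GPU still costs money, so low load raises the per-token cost.
    """
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


# Prices quoted in the comment above.
GPU_COST_PER_HOUR = 0.44      # A5000 server, per the comment
API_COST_PER_MILLION = 0.50   # GPT-3.5, per the comment

# Roughly 880,000 tokens per hour, or about 244 tokens per second.
print(break_even_tokens_per_hour(GPU_COST_PER_HOUR, API_COST_PER_MILLION))

# Hypothetical server: 1,000 tokens/s peak, but busy only 10% of the time.
print(self_hosted_cost_per_million(GPU_COST_PER_HOUR, tokens_per_second=1000, utilization=0.10))
```

Under these assumptions the answer to the load question is yes: below the break-even rate the per-token API is cheaper, and above it the rented GPU is.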
@shivangsharma1 3 months ago ⁺¹
Loved it...❤
@kubesimplify 3 months ago
Glad you found it useful!
