CONTEXT CACHING for Faster and Cheaper Inference

Trelis Research

Views: 1,955


Published on 25 Jan 2025

Comments • 12

@alleskepler9526 2 months ago
Bro u a gem
@TrelisResearch 2 months ago
appreciate it
@heski6847 4 months ago ⁺¹
Thank you, as always very useful content!
@TrelisResearch 4 months ago
you're welcome
@Rishab-l1u 2 months ago
How do we deal with hallucination resulting from our background info?
@TrelisResearch 2 months ago
Take a look at my video on synthetic data generation. I cover it there.
Unless I’m misreading your Q and it relates to caching?
@explorer945 4 months ago ⁺¹
How is this different from the caching done by UI libraries like Chainlit, where they use Redis to store the embeddings of the prompt and, if it matches, return the previous response without even hitting the LLM API? Which is better?
@TrelisResearch 4 months ago ⁺¹
Howdy! What you're describing is embedding-based caching, which is a complete cache (i.e. the whole answer is stored and returned if there's a match).
What's shown here is KV caching, which is a partial cache for LLM inference. When part of a prompt is reused (and it has to be the first part), there are intermediate values (k and v) that can be reused in the forward pass to generate the response.
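
To make the reply above concrete, here is a minimal sketch of prefix KV reuse in a single causal attention layer. It is a toy illustration under stated assumptions, not how any particular provider or server implements caching: the width `d`, the random weights `W_q`, `W_k`, `W_v`, and the helper `causal_attention` are all hypothetical. It computes keys and values for a shared prefix once, then processes only the new suffix and checks that the result matches a full recompute. In a real multi-layer model, the keys and values above the first layer depend on every earlier token, which is why only an identical leading part of the prompt can be reused.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hypothetical model width
# Hypothetical random projection weights standing in for a trained layer.
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))

def causal_attention(x_new, k_cache, v_cache):
    """Attend from new tokens over cached plus new keys/values.

    x_new: (n_new, d) embeddings of tokens not yet processed.
    k_cache, v_cache: (n_cached, d) keys/values of the shared prefix.
    Returns outputs for the new tokens and the extended cache.
    """
    q = x_new @ W_q
    k = np.concatenate([k_cache, x_new @ W_k])
    v = np.concatenate([v_cache, x_new @ W_v])
    n_cached, n_new = len(k_cache), len(x_new)
    scores = q @ k.T / np.sqrt(d)
    # Causal mask: new token i sees all cached tokens and new tokens <= i.
    cols = np.arange(n_cached + n_new)[None, :]
    rows = np.arange(n_new)[:, None] + n_cached
    scores = np.where(cols <= rows, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, k, v

empty = np.zeros((0, d))
prefix = rng.standard_normal((5, d))  # shared first part of the prompt
suffix = rng.standard_normal((3, d))  # part that changes per request

# Full recompute: process prefix and suffix from scratch.
full_out, _, _ = causal_attention(np.concatenate([prefix, suffix]), empty, empty)

# Cached path: process the prefix once, then only the suffix.
_, k_prefix, v_prefix = causal_attention(prefix, empty, empty)
cached_out, _, _ = causal_attention(suffix, k_prefix, v_prefix)

# The suffix outputs agree, but the cached path skipped the prefix work.
outputs_match = np.allclose(full_out[-len(suffix):], cached_out)
```

By contrast, the embedding-based cache from the question stores whole responses, so a hit skips the model entirely, while a miss gets no partial benefit.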
@explorer945 4 months ago
@TrelisResearch Got it. Why does it have to be the first part? I couldn't quite get it from the video. Also, is it based on the initial layers or the final layers? How does it help with RAG architectures?
@MrMoonsilver 4 months ago ⁺¹
Do you think this will come to open source, self-hosted models?
@TrelisResearch 4 months ago ⁺¹
Yup, I show SGLang (same approach for vLLM) in this video!
@MrMoonsilver 4 months ago
Super cool, thank you so much.
