Making Long Context LLMs Usable with Context Caching
- Published Jul 10, 2024
- Google's Gemini API now supports context caching, aimed at addressing the limitations of long-context LLMs by reducing processing time and costs. This video explains how to use the caching feature, its impact on performance, and implementation details with examples.
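As a rough idea of the flow covered in the notebook linked below, here is a minimal sketch assuming the google-generativeai Python SDK; the API key, file name, display name, and prompts are placeholders, not taken from the video:

```python
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")

# Load the large document; cached content must exceed the API's minimum token count.
with open("long_report.txt") as f:
    long_document = f.read()

cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",  # caching requires an explicit model version
    display_name="long-report-cache",
    system_instruction="Answer questions using only the cached document.",
    contents=[long_document],
    ttl=datetime.timedelta(hours=1),      # how long the cached tokens are stored
)

# The model reads the document from the cache instead of resending it on every call.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("Summarize the key findings.")
print(response.text)
print(response.usage_metadata)  # reports how many input tokens were served from the cache
```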
LINKS:
Context Caching: tinyurl.com/4263z4da
Vertex AI: tinyurl.com/yex8ua5h
Notebook: tinyurl.com/2et8spkf
Pricing: ai.google.dev/pricing
💻 RAG Beyond Basics Course:
prompt-s-site.thinkific.com/c...
Let's Connect:
🦾 Discord: / discord
☕ Buy me a Coffee: ko-fi.com/promptengineering
🔴 Patreon: / promptengineering
💼Consulting: calendly.com/engineerprompt/c...
📧 Business Contact: engineerprompt@gmail.com
Become Member: tinyurl.com/y5h28s6h
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
Sign up for the newsletter, localGPT:
tally.so/r/3y9bb0
TIMESTAMPS
00:00 Introduction to Google's Context Caching
00:48 How Context Caching Works
01:00 Setting Up Your Cache
03:07 Cost and Storage Considerations
04:46 Example Implementation
08:57 Creating and Using the Cache
11:06 Managing Cache Metadata
12:53 Conclusion and Future Prospects
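For the "Managing Cache Metadata" part of the walkthrough, the SDK exposes each cache as an object you can list, inspect, refresh, and delete. A minimal sketch, again assuming the google-generativeai Python SDK; the cache id is a placeholder:

```python
import datetime
from google.generativeai import caching

# List every cache in the project and inspect its metadata.
for c in caching.CachedContent.list():
    print(c.name, c.display_name, c.expire_time, c.usage_metadata)

# Look a cache up by name, extend its lifetime, or delete it when it's no longer needed.
cache = caching.CachedContent.get(name="cachedContents/your-cache-id")  # placeholder id
cache.update(ttl=datetime.timedelta(hours=2))
cache.delete()
```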
All Interesting Videos:
Everything LangChain: • LangChain
Everything LLM: • Large Language Models
Everything Midjourney: • MidJourney Tutorials
AI Image Generation: • AI Image Generation Tu...
Thanks! I found the ability to update the TTL very interesting. Imagine building an assistant application for answering questions or customer service. On the server side, we could extend the TTL by, say, another five minutes each time a new user sends a question, and extend it again on the next question. When no new questions come in, the cache simply expires. Five minutes is just an example, but it's a great way to keep your cache ready and let it clear itself when you don't need it.
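In the google-generativeai Python SDK, that keep-alive pattern could look roughly like this (a sketch only; the five-minute window, function name, and model usage are assumptions, not something shown in the video):

```python
import datetime
import google.generativeai as genai
from google.generativeai import caching

KEEPALIVE = datetime.timedelta(minutes=5)  # the example window from the comment above

def answer(question: str, cache: caching.CachedContent) -> str:
    # Push the expiry forward on every incoming question, so the cache stays warm
    # while users are active and expires on its own after five idle minutes.
    cache.update(ttl=KEEPALIVE)
    model = genai.GenerativeModel.from_cached_content(cached_content=cache)
    return model.generate_content(question).text
```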
I think the minimum token requirement is likely about economics. They need a minimum amount of cached content for the service to be cost-effective on their side; below that threshold it probably wouldn't be worth offering. That's my guess.
Dynamically controlling the TTL can be really helpful, and I agree the token limit is probably related to cost. I hope they implement the latency reduction soon, since that's when this will make the most sense.
This is very helpful, buddy: very time-saving, and it quickly updates my own biological cache without me having to search for it explicitly. Thanks!
Seems similar to vector storage, but more expensive. What am I missing?
A couple of things differentiate it from vector storage. When you retrieve info with vector-based search, you only get some "chunks", so the LLM never sees the whole context of the document; an approach like this provides the complete context to the LLM. Caching can also be really useful alongside RAG. I agree it is going to be more expensive than vector stores, but it will potentially save on the infra. It will be interesting to see how it evolves.
@engineerprompt Yeah, the chunking would have to be perfect to match the full context. But if the vector representation and the chunking are accurate, it should match in context quality. Time will tell, eh?
Great news! Thanks!!
thank you.