00:28 🛡 Prompt Injection Attack: A technique for large language models (LLMs) allowing attackers to manipulate model output via carefully crafted prompts, potentially accessing sensitive data or executing unauthorized functions. 01:39 🌐 Prompt Injection Example: Demonstrates injecting hidden instructions into web content, manipulating the model's output when interacting with scraped data. 03:42 🖼 Image-based Prompt Injection: Embedding instructions within an image, prompting the model to generate specific responses when processing visual content. 04:47 🔍 Hidden Instructions in Images: Obscuring prompts within images, exploiting the model's response to generate unexpected links or content. 06:22 📰 Prompt Injection via Search Results: Demonstrates how search engine responses can carry manipulated instructions, potentially leading to malicious actions. 07:43 🛠 Jailbreaks on LLMs: Techniques involve manipulating or redirecting the initial prompts of LLMs to generate unintended content, either through prompt or token level jailbreaks. 08:38 🕵♂ Token-based Jailbreak Example: Exploiting Base64 encoding to manipulate prompts and generate unexpected responses from the model. 09:49 🐟 Fishing Email Jailbreak: Using encoded prompts to coax the model into generating potentially malicious email content, exploiting its response. 11:37 🐼 Image-based Jailbreak: Demonstrating how carefully designed noise patterns in images can prompt the model to generate unintended responses, posing a new attack surface. 13:29 🔒 Growing Security Concerns: Highlighting the potential escalation of security threats as reliance on LLMs and multimodal models increases, emphasizing the need for a robust security approach.
🎯 Key Takeaways for quick navigation: 00:00 🧐 *Prompt injection attack is a new technique for manipulating large language models (LLMs) using carefully crafted prompts to make them ignore instructions or perform unintended actions, potentially revealing sensitive data or executing unauthorized functions.* 01:24 📝 *Examples of prompt injection include manipulating websites to execute specific instructions and crafting images or text to influence LLM responses, potentially leading to malicious actions.* 05:25 🚧 *Prompt injection can also involve hiding instructions in images, leading to unexpected behaviors when processed by LLMs, posing security risks.* 07:43 🔒 *Jailbreak attacks manipulate or hijack LLMs' initial prompts to direct them towards malicious actions, including prompt-level and token-level jailbreaks.* 10:03 💻 *Base64 encoding can be used to create malicious prompts that manipulate LLM responses, even when the model is not supposed to provide such information, potentially posing security threats.* 11:37 🐼 *Jailbreaks can involve introducing noise patterns into images, leading to unexpected LLM responses and posing new attack surfaces on multimodal models, such as those handling images and text.* Made with HARPA AI
Prompt injection: if anyone develops a website and implement code or content that is use to query or generate an output in the front end. They should not be writing code. That’s like putting and hiding sql in or API keys in the front end.
can you please make a video about hands-on comparing the new Gemini Pro(Bard) vs GPT3.5 vs GPT4? I am looking for a straight up comparison with real examples but everyone just uses the edited hand picked marketing material which is useless
00:28 🛡 Prompt Injection Attack: A technique for large language models (LLMs) allowing attackers to manipulate model output via carefully crafted prompts, potentially accessing sensitive data or executing unauthorized functions.
01:39 🌐 Prompt Injection Example: Demonstrates injecting hidden instructions into web content, manipulating the model's output when interacting with scraped data.
03:42 🖼 Image-based Prompt Injection: Embedding instructions within an image, prompting the model to generate specific responses when processing visual content.
04:47 🔍 Hidden Instructions in Images: Obscuring prompts within images, exploiting the model's response to generate unexpected links or content.
06:22 📰 Prompt Injection via Search Results: Demonstrates how search engine responses can carry manipulated instructions, potentially leading to malicious actions.
07:43 🛠 Jailbreaks on LLMs: Techniques involve manipulating or redirecting the initial prompts of LLMs to generate unintended content, either through prompt or token level jailbreaks.
08:38 🕵♂ Token-based Jailbreak Example: Exploiting Base64 encoding to manipulate prompts and generate unexpected responses from the model.
09:49 🐟 Fishing Email Jailbreak: Using encoded prompts to coax the model into generating potentially malicious email content, exploiting its response.
11:37 🐼 Image-based Jailbreak: Demonstrating how carefully designed noise patterns in images can prompt the model to generate unintended responses, posing a new attack surface.
13:29 🔒 Growing Security Concerns: Highlighting the potential escalation of security threats as reliance on LLMs and multimodal models increases, emphasizing the need for a robust security approach.
Thanks for keeping us up to date with understandable examples
🎯 Key Takeaways for quick navigation:
00:00 🧐 *Prompt injection attack is a new technique for manipulating large language models (LLMs) using carefully crafted prompts to make them ignore instructions or perform unintended actions, potentially revealing sensitive data or executing unauthorized functions.*
01:24 📝 *Examples of prompt injection include manipulating websites to execute specific instructions and crafting images or text to influence LLM responses, potentially leading to malicious actions.*
05:25 🚧 *Prompt injection can also involve hiding instructions in images, leading to unexpected behaviors when processed by LLMs, posing security risks.*
07:43 🔒 *Jailbreak attacks manipulate or hijack LLMs' initial prompts to direct them towards malicious actions, including prompt-level and token-level jailbreaks.*
10:03 💻 *Base64 encoding can be used to create malicious prompts that manipulate LLM responses, even when the model is not supposed to provide such information, potentially posing security threats.*
11:37 🐼 *Jailbreaks can involve introducing noise patterns into images, leading to unexpected LLM responses and posing new attack surfaces on multimodal models, such as those handling images and text.*
Made with HARPA AI
Until Sunday, what should I do? Okay, I'll soak up this stuff for now. Thanks Kris
Prompt injection: if anyone develops a website and implement code or content that is use to query or generate an output in the front end. They should not be writing code. That’s like putting and hiding sql in or API keys in the front end.
Wait. I don't get it.
Isn't generating output in the front end the safest way to process user info?
What are we supposed to do then?
Greate video!
Where do you found this scrapping python tool? did you created?
Heya man , was wondering if you could please do an updated whisper tutorial, please? just one on getting full transcripts with the python code 😀
can you please make a video about hands-on comparing the new Gemini Pro(Bard) vs GPT3.5 vs GPT4? I am looking for a straight up comparison with real examples but everyone just uses the edited hand picked marketing material which is useless
Wow 😮
🤓
Thanks for keeping us up to date with understandable examples