Perfect timing, I was experimenting with this myself yesterday. But this is a much more in depth take, I’ll have to check it out.
glad it was useful
I guess one use case might be using a larger LLM to create system prompts for a smaller faster model to enable it to better follow instructions and collect information before the information is summarized and then passed back to the larger model to formalize.
For example, Model A instructs Model B how to interview the customer and collect the required information, which then gets passed back to Model A to fill out and submit an online form.
This approach would be faster and cheaper than getting Model A to do all of the work, because A-tier models are often 10X the cost of B-tier models.
This kind of system would work really well when collecting information via email, instant message or over the phone.
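A minimal sketch of that two-tier handoff. The `call_model` function here is a stub standing in for real API calls, and the model names and interview task are hypothetical, just to show the shape of the flow:

```python
# Sketch of a two-tier pipeline: an expensive model (A) writes the system
# prompt, a cheap model (B) runs the interview, and A formalizes the result.
# call_model is a stub; in practice it would hit a real LLM API.

def call_model(model: str, system: str, user: str) -> str:
    """Stub standing in for e.g. a chat-completion API call."""
    if model == "model-a" and "write a system prompt" in user.lower():
        return "You are an interviewer. Collect the customer's name and email."
    if model == "model-b":
        return "Collected: name=Jane Doe, email=jane@example.com"
    return f"Form filled using: {user}"

def run_pipeline(task: str) -> str:
    # 1. Large model A writes instructions for small model B.
    system_for_b = call_model("model-a", system="You write system prompts.",
                              user=f"Write a system prompt for: {task}")
    # 2. Cheap model B does the high-volume interview work.
    collected = call_model("model-b", system=system_for_b,
                           user="Begin the interview.")
    # 3. Model A formalizes the collected info (e.g. fills the form).
    return call_model("model-a", system="You fill out forms.", user=collected)

result = run_pipeline("interview the customer and collect contact details")
```

The cost saving comes from step 2, the many-turn interview, running on the cheap model, while the expensive model only runs once at each end.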
Interesting. I was envisioning the opposite. Use Haiku to generate the prompt and pass that to either Sonnet or Opus. Worth experimenting with both, I think.
DSPy pipelines indeed work kind of that way: you use larger models at the beginning to optimize prompts and fine-tune automatically, create data with larger models to train smaller models, then run the decomposed tasks on smaller models and evaluate the pipeline using larger models.
Appreciate you bringing attention to this. Great walkthrough
This is the truth. It is just semantically and syntactically difficult to adjust to them all. If you add video and audio generation it gets 😅 Great video!
Whenever I do A/B testing between ChatGPT and the free Claude model, I end up choosing ChatGPT, mainly because for whatever reason Claude tends to hallucinate in authoritative-sounding ways, whereas if ChatGPT doesn't understand something it is more likely to admit that (but not always). For instance, today I told Claude I added PEA to a gas engine and it assumed I was using biodiesel and proceeded to give a long chat about that. ChatGPT understood that PEA is polyetheramine for cleaning gas systems. So it's hard for me to take Claude seriously as yet.
Thanks for sharing this insightful video! It's great to see how Anthropic's Metaprompt can really enhance prompt quality and model interactions. Looking forward to experimenting with it myself. Keep up the awesome work! 😊👍
This comment has to be AI generated
@levi2408 thanks for sharing your interesting insights into fake AI generated comments. I can't wait to learn more about AI generated spam.
😉
Great walkthrough! Thanks Sam!
Really interesting. This will help in building hyperspecialized agents, when we know that swarms of these are the future of AI, at least for the coming months... Thank you Sam
Great tutorial and nicely explained. I believe you assume a certain level of knowledge from the viewer. For a beginner, where do we enter the Claude API key? That's just one example of something you assume the viewer already knows. Maybe direct beginners to a basic video explaining it, so it isn't redundant here?
There is also a framework called DSPy by Omar Khattab that attempts to remove prompt engineering and it works with any LLM!
Criminally underrated channel
Very helpful, constructive, and practical information. Thank you!
By YouSum
00:01:58 Utilize Metaprompt for precise AI instructions.
00:08:49 Metaprompt aids in generating detailed and specific prompts.
00:10:42 Metaprompt ensures polite, professional, and customized AI responses.
00:11:55 Experiment with Metaprompt for tailored AI agent interactions.
Totally what I think the future will be!
Idea: use RAG to grab the closest prompts from GitHub repos to inject into the meta prompt notebook…. This would probably give even better results?
There is some really nice research, and some applications, of using RAG to choose the best exemplars/examples for few-shot learning. What you are saying certainly makes sense.
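A rough sketch of that retrieval step. This uses simple token-overlap (Jaccard) similarity instead of a real embedding model so it runs anywhere, and the exemplar store is made up for illustration:

```python
# Sketch: retrieve the few-shot exemplars closest to the incoming query,
# then splice them into the prompt. A production version would use real
# embeddings and a vector store instead of token overlap.

def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

EXEMPLARS = [
    {"q": "translate this sentence to French", "a": "..."},
    {"q": "summarize this legal contract", "a": "..."},
    {"q": "summarize this news article briefly", "a": "..."},
]

def pick_exemplars(query: str, k: int = 2):
    # Rank stored exemplars by similarity to the query and keep the top k.
    ranked = sorted(EXEMPLARS, key=lambda e: jaccard(query, e["q"]),
                    reverse=True)
    return ranked[:k]

best = pick_exemplars("summarize this article")
```

The same pattern works for pulling the closest existing prompts from a repo into a metaprompt, as the parent comment suggests.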
Looking at how we work together with humans, it would make a lot more sense if the prompting process could be split up: First you chat/onboard the model with the right task/context, they ask for clarifications where needed, then they execute a task, and ask for a review/feedback reflecting back on what they did.
Especially the "ask for clarification", "admit lack of knowledge" and "request feedback" parts aren't a default part of the commercial tools yet.
Luckily, things like meta-prompting and using LangChain agents all seem to converge in that direction though, like little pieces of the puzzle.
Hi Sam. Thanks, I am a newbie. I don't understand why I have to use Google Colab. Is there a difference if I use Anthropic's Opus directly? Is the output the same or not? Will I get the prompt with variables in Google Colab or directly in Opus?
You can use it directly as long as you copy the prompt over fully etc. The Colab shows how to do it through the API.
@@samwitteveenai So direct would give similar output, except that it's free, because via Colab I would have to pay for the API. Did I understand correctly?
What is the "typical" UX for developers who are using these LLMs on a daily basis? Is it all in notebooks like the above video? Web-browser-based conversations? IDE integrations? And especially if IDE, how do you keep your proprietary code from being scanned and leaked back to these companies?
My favorite AI Guy, thanks again for your content bro. I hope I get to meet you one day.
I use a GPT-based Claude 3 prompt optimizer (it loads Claude's prompt documentation / cookbook into ChatGPT).
Very interesting, there must be so many use cases for this. (Minor point: in August the time is PDT rather than PST.)
Interesting, just when I had concluded that overly long prompts are not good and usually a symptom of trying to cram in too much info ;D (depending on the situation, obviously). Nevertheless the concept is nice, and indeed some of us have utilised it for a long time for prompt development and optimisation ;)
Sam, helpful as always, thank you! How do you think these prompting cookbooks could help agents perform tasks?
Probably a silly question, but why is the "and" at 4:24 blue?
I am sure that is because 'and' is normally used as an operator in boolean expressions (like 'or'). So the editor (wrongly) highlights it everywhere, even when it is not being used as a logical operator.
Hello Sam. I hope you're holding up well. There's a video you talked about open-sourcing a web scraping tool. Did you open-source the project? I'd like to contribute to a tool that automates web scraping.
This looks great, but it's very specific to Anthropic models, no? Isn't that what we are after with programmatic tools such as DSPy, for example: to reach the same goal but more "generically"? (Similar with Instructor, only that one is more focused on formatting, I think.)
Hey, this is insightful, thanks for sharing it. The Colab link seems to be broken; can you please share an updated one?
Cool video, sir... did LangChain optimize it for Claude 3 yet?
Yeah, you can use LangChain with Claude 3 no problem.
So we've successfully outsourced AI prompt engineering to AI? Cool.
For some things. It certainly helps.
I wonder how Claude specific it is - would it generate good prompts for OpenAI GTP4 ?
Certainly worth a try. You can alter it and also run on OpenAI or Gemini etc as well
At the moment I use personalised prompts, a different one for every task; the output quality is much higher :)
Claude free mode only lasts a few messages until the paywall.
Thanks for the content. With meta-prompting in mind, do you think something like DSPy is a more programmatic alternative, or are they doing similar things under the hood? And if you're looking for video ideas… 😊
I have been playing with DSPy a fair bit, with some interesting results. It is quite a bit different from Metaprompt but has some very interesting ideas.
Won't the metaprompt work if I just copy it into the Claude main interface itself?
Yes, you can certainly do something similar to that.
What happens when your output has xml in it?
You can parse it very easily.
This is cool and all but what is 4? 1:03
I am part of a team that is building a GenAI-powered analytics tool, and we still use a combination of GPT-3.5 and 4… Don't get me wrong, Claude is good, especially Sonnet; the price-to-performance ratio is just out of this world. I guess we are just primarily impressed by OpenAI's track record of releasing quality models which are significantly better than the previous version under the same API umbrella.
I thought my screen was scratched there for a second until I realized it's the human head logo's grey outlines.
The meta prompt is a feature coming to Gemini Advanced, but it's not released yet. Although it's not as detailed as this.
you mean how it writes out a list first?
@@Walczyk No, I mean when you write a prompt: before you send it to the Ultra model, another model tries to modify it to make it longer and more detailed, with concise instructions, before passing it on to the Ultra model.
@@nicdemai Oh, I see. I had a feeling it was doing that, because it would read out this clean structure of what it would do; I could see it had received that as its prompt.
Thank you, Sam. I'm excited to dive deep into Metaprompt and learn how to create comprehensive prompts with precise instructions that produce the desired outcomes for users. Can you suggest a course or resource to help me get started on this journey?
META-PROMPTING!!!! YES!!!! This topic excites me!
Maybe this is the work of the famous Prompt Engineer & Librarian position at Anthropic with a base salary of 250-375k USD :D
I hate it when it tells me "I feel uncomfortable completing this task".
cool! :) thanks!
This is awesome
Am I the only one thinking that if I have to "fix my prompt or the AI won't understand", it means that the AI simply is not good enough yet?
Kinda yes, kinda no. You could imagine you have to talk to different humans in different ways etc., and models are like that. Of course, ideally we would like them to understand everything we want, but even humans aren't good at that.
Metaprompt feeling like a RAG of all models.
wonderful
i love this
Problem is cost. Imagine sending that prompt on every API call
You don't need to send this prompt each time. The idea is that this generates a much shorter prompt that you can then reuse.
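In other words the metaprompt is a one-time (or occasional) cost: you persist the generated prompt and reuse it on every call. A sketch of that pattern, with a stubbed generator standing in for the expensive metaprompt call and a made-up cache filename:

```python
import json
import os

PROMPT_CACHE = "generated_prompt.json"  # hypothetical cache location

def generate_prompt_once(task: str) -> str:
    """Stub for the expensive metaprompt call; only runs on a cache miss."""
    return f"You are an assistant for: {task}. Be concise and accurate."

def get_prompt(task: str) -> str:
    # Pay the metaprompt cost once, then reuse the short result on every call.
    if os.path.exists(PROMPT_CACHE):
        with open(PROMPT_CACHE) as f:
            return json.load(f)["prompt"]
    prompt = generate_prompt_once(task)
    with open(PROMPT_CACHE, "w") as f:
        json.dump({"prompt": prompt}, f)
    return prompt

p1 = get_prompt("customer support")   # cache miss: generates and stores
p2 = get_prompt("customer support")   # cache hit: reads the stored prompt
```

So the long metaprompt only appears in the one-off generation step, never in the per-request path.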
Prompt engineering still seems like such a dead end. If it requires each prompt to be unrolled into something with a lot of common-sense filler, why not add that as a filter to the LLM? So you feed in your prompt, some automated system makes reasonable guesses as to what filler to pack it with, and then you see what the LLM makes of it. The problem is the user thinks all of the assumptions they make are obvious and shared by the LLM, and that's not always the case. I'd be interested to know if any LLM tries to predict sentences/clauses the user left out of their prompt, or, heaven forbid, asks the user questions(!) about what they may have omitted or meant. This is but one way out of this nonsense, and I assume people are trying lots of ways to get rid of it besides what I am suggesting.
What I was trying myself is a Style/Principles Guide Framework... It just doesn't quite apply the principles, though it did qualitatively improve responses.
Yeah, but nobody wants to try and figure out how to word things so the model will respond; it's junk. I wouldn't pay for Gemini.
I asked Claude 3 three questions: 1. Who won the FA Cup in 1955? 2. Which composer won the Academy Award in 1944, and for which film? 3. What was the date of the death of Queen Elizabeth II? Claude got all three questions WRONG. I asked the same of Google's Gemini and it got all three CORRECT. I also asked the same questions of Windows' Copilot, and it also got all three correct, although it took its sweet time about it. Therefore, Claude may know how to metaprompt a travel agent, but it doesn't know its arse from its elbow about anything else. Long live Google Gemini! And Copilot! x
00:00:35 - watching this guy type with two fingers is so painful
I want to cook the soup
Sadly, OpenAI has been acting worthless lately. I sure hope they release 5 soon.
After using claude, gpt4 is poop.
@@nickdisney3D really? in what areas did you find Claude (Opus, i assume) significantly better?
I would say in conversation, for example: if you see a difference between GPT-3.5 and GPT-4, the latter just understands better. The same is true between GPT-4 and Opus, not a lot but slightly. And when it comes to coding and image generation, to make a small change on my website I had to prompt I guess 10 times to make GPT-4 understand, but with Opus I got it the first time.
@@tomwojcik7896 Everything. Except that it gets emotional and moody when you push it the right way. I had a chat with it today where it just refused to respond.
It is ridiculous that each AI model should be prompted in a different way.
OpenAI has really dropped the ball lately. Bad leadership.
Tighten up your delivery, man. I had to stop listening because your intro was too long.
Can't you take what you're given...
Someone might have attention span issues 😂
????? What????
1.75x speed is your friend
Be cool, man. Someone is teaching you something. You can increase the speed and give a nice feedback. Everybody wins.