Excellent explanation and demo. Thank you!
Thank you for the feedback! Glad you found it useful.
Really cool! Thanks a lot
Nice well explained content, subscribed!
Welcome aboard! Glad you found the tutorial useful 👍
Crazy stuff, thanks for the video
Many thanks and keep going, my friend!
Glad you liked it my friend!
Super cool! Thanks for the video! I wonder - would it be possible to have English prompts in a QGIS environment which connects to your Postgres database via PostGIS? I.e. you make the prompt (in written English) and the results of the prompt appear within your QGIS screen... 🤔
@@KayleighBarrow-z9b thanks for the feedback! I'm not aware of the QGIS capabilities, but you can certainly run an instance of Postgres with both the PostGIS and pgvector extensions. QGIS will be able to connect, and then you can explore whether the tool allows you to write and send prompts to Postgres.
Crazy useful and straight to the point!!!
Yep, this stuff is crazy. And that’s just the beginning. It’s gonna be much wilder soon )
Thanks for this video! Got some really promising results in just a few hours!
Quick question -- I don't need the context to have the full tables ["products", "users", etc.]. For my use case I only need it to have context for the user (i.e. products for user 1005, purchases for user 1005, etc.). If I provide the full tables in include_tables, it very quickly reaches the token limit. Is there a way to dynamically reduce the amount of context when initializing the LangChain database agent?
Excellent! Glad you managed to get it working on your end 👍
Try to instruct the LLM to retrieve that context dynamically. For instance, you can tell it to execute the query "select name, price from products where id = {id}", setting the id as a parameter. The LLM can then run this query against the database and pull a user-specific context.
Also, LangChain supports various tools/actions that let the LLM pull info from other sources or perform various actions: python.langchain.com/v0.1/docs/modules/tools/
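For illustration, here's a rough sketch of how that could look with the LangChain v0.1 SQL agent. The imports, table list, and wording are assumptions rather than the exact notebook code, and the same instruction could equally live in the agent's prefix/system message:

from langchain_community.agent_toolkits import create_sql_agent
from langchain_community.utilities import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri(
    "postgresql+psycopg2://sql_agent:password@localhost:5432/postgres",
    include_tables=["products", "purchases"],
)

agent = create_sql_agent(
    llm=ChatOpenAI(model="gpt-4", temperature=0),
    db=db,
    verbose=True,  # prints the queries the agent actually runs
)

# The prompt embeds the parameterized query, so only a user-specific slice
# of the data ever reaches the model's context window.
agent.invoke({
    "input": "Execute the query select name, price from products where id = 1005 "
             "and summarize what you find."
})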
@@DevMastersDb That makes sense. The issue (unless I'm misunderstanding) is that passing entire tables to the LLM, regardless of whether it knows it should filter down to a subset of the data, seems to take a lot of tokens. I.e., we're providing a lot of context and then asking the LLM to disregard most of it (as opposed to providing the narrow context in the first place). As a result, if I add more than two tables, I get the error:
{'error': {'message': "This model's maximum context length is 4097 tokens, however you requested 10494 tokens. Please reduce your prompt; or completion length.", 'type': 'invalid_request_error', 'param': None, 'code': None}}
I'm extremely new to this stuff (just a day or two), so I could totally be missing something!
One thing I'm going to try next is to create a view with the data after I've applied some joins and filters to it, and then pass the view in include_tables instead, so I'm providing just the minimum context the model would need... not sure if that'll work, or if it's even the right way of thinking about it 🤔
@@anuragangara2619 how many tables do you folks have? First things first: yes, ask the agent to look only into the necessary tables by defining them in "include_tables".
The LLM will pull only the metadata of those tables, not the actual data, so it should fit in the context window.
Then, in your system message for the LLM, you also say to use this query to pull the actual data for a selected user. If that data set doesn't fit into the context, try filtering by the user_id/product_id and some time frame.
Anyway, take it slowly and learn LangChain's capabilities by playing with a few tutorials. You'll definitely get it figured out!
@@DevMastersDb Hmm, odd -- only 5 or 6 tables; personal project, so just me. I'll look into that then!
Thanks so much for the great video, really enjoyed this!
@@anuragangara2619 make sure the LLM doesn't pull the actual data; enable the "verbose" mode to see what happens. Also, start with 1-2 tables and then add more to see when it breaks. That might help you discover the root cause.
Glad you liked the video, thanks for the feedback!
Second query: I want to switch between different APIs using the chat prompt. In this context, how will agents work? Can you please create a video or share some documentation?
You mean you want to use another LLM, one that's different from OpenAI? If yes, LangChain supports many LLM providers.
@@DevMastersDb By API I mean our software application's API routes. I want to call an API based on the prompt, for example /teacher and /student: if the prompt is "show student details", the student route should execute. So shall I go with the agent concept? Please guide. Thank you so much for your prompt response.
I see. In fact, the model (at least OpenAI GPT-4) can make a call to your API depending on the user prompt. You need to create an OpenAPI spec that defines your /teacher and /student endpoints. Then the GPT model will execute one of the endpoints depending on the prompt. Check my latest video on this topic; it should be a perfect starting point for you: th-cam.com/video/Ysh9dwia8FM/w-d-xo.htmlsi=fr9bWEOFrmSaAsO1
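If you'd rather keep the routing inside your own backend instead of a hosted GPT, plain OpenAI function calling gives you the same effect. Here's a rough sketch; the route URLs, parameter names, and model are placeholders I made up, not anything from the video:

from openai import OpenAI
import requests

client = OpenAI()

# Hypothetical tool definitions mirroring the /teacher and /student routes.
tools = [
    {"type": "function", "function": {
        "name": "get_student",
        "description": "Fetch student details via the /student route",
        "parameters": {"type": "object",
                       "properties": {"student_id": {"type": "string"}},
                       "required": ["student_id"]},
    }},
    {"type": "function", "function": {
        "name": "get_teacher",
        "description": "Fetch teacher details via the /teacher route",
        "parameters": {"type": "object",
                       "properties": {"teacher_id": {"type": "string"}},
                       "required": ["teacher_id"]},
    }},
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Show student details for id 42"}],
    tools=tools,
)

# Assuming the model decided to call a tool, route the request accordingly.
call = response.choices[0].message.tool_calls[0]
if call.function.name == "get_student":
    data = requests.get("http://localhost:8000/student", params={"id": 42}).json()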
Thank you so much. Just one query about prepare_agent_prompt: I'm trying to create 5 different agent prompts, e.g. query with MySQL, query with Postgres, query with Aurora. Should I create 5 different agent prompts, or will it manage based on {input_text}?
Thanks! Glad you liked the video.👍
The {input_text} is the user prompt (your actual question) that you are asking later in the notebook. What you can do is to add {database_dialect} as another parameter to the agent prompt’s text and pass Postgres, MySQL or another db name. If the behavior between databases is very different, then just create a unique agent prompt for every database.
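To illustrate (just a sketch, not the notebook's actual helper), the dialect can simply become a second placeholder:

def prepare_agent_prompt(input_text, database_dialect):
    # Hypothetical variant of the helper with the dialect passed in as a parameter.
    agent_prompt = (
        f"You are working with a {database_dialect} database. "
        f"Answer the following question by querying it: {input_text}"
    )
    return agent_prompt

prepare_agent_prompt("Which product sells best?", "PostgreSQL")
prepare_agent_prompt("Which product sells best?", "MySQL")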
What model is this using? Can we do this with Azure OpenAI?
I used OpenAI GPT-4. Absolutely, you can use other models, including Azure OpenAI.
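If you go with Azure, the swap is usually just the chat model class. A sketch assuming the langchain-openai package (parameter names can vary between versions, and the deployment name is a placeholder):

from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    azure_deployment="my-gpt-4-deployment",  # hypothetical deployment name
    api_version="2024-02-01",
    # azure_endpoint and api_key are usually picked up from the
    # AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY environment variables
)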
Really wonderful video. Thanks for sharing it with everyone. I just have a question, can we use this with NoSql databases like MongoDB/DynamoDB?
Glad you found the video useful! As for MongoDB and other NoSQL databases, I don't see that LangChain supports agents for them. But some folks found a way to create custom agents using LangChain's foundational capabilities: omershahzad.medium.com/agent-for-mongodb-langchain-ccf69913a11a
@@DevMastersDb thank you so much for the quick reply. Appreciate it 🙏🏼
Wonderful video!!! What if the table or column names, or even the data, are not English-like? I.e. a product table is named PD and the columns are id, prodcode, typecode, or some combination of columns for a particular piece of info. The data can sometimes be some ridiculous abbreviation. How would you educate it? Via a prompt? Thanks so much 😂
Thanks, glad you liked it!
Non-English data is easy, since the OpenAI GPT LLM speaks and understands many languages. For instance, you can run the Jupyter notebook from the video and ask questions in a different language and/or insert non-English data. The LLM should handle it easily.
As for cumbersome abbreviations, I have no idea for now :) but I'll sort this out while preparing for the upcoming live stream next week. Join if you wish, I'll make sure to answer the question: th-cam.com/users/live-UvpSBHJFdU?si=oirCEwPCnHRJUBYQ
Excellent! Thanks a lot!
You're welcome!
What if I want my agent to add data to the database? For example, if I want to make an agent for a clinic that books a new appointment for the patient and stores their information in the database.
Yes, the agent can modify data by translating your requests into INSERT, UPDATE, DELETE, and other DML/DDL statements.
In the video, the agent connects to the database using a read-only user role. The agent can still generate a statement that updates the database, but the database won't let it execute that statement.
Overall, you can create a database role/user for the agent with broader permissions and use it at your own risk.
@@DevMastersDb Can you please tell me how to make a database user with broader permissions?
@@Mostafa_Sharaf_4_9 if you follow the Jupyter notebook shared in the description of the video, you'll find this code that opens a connection to the database:
SQLDatabase.from_uri(
    "postgresql+psycopg2://sql_agent:password@localhost:5432/postgres",
    include_tables=["products", "users", "purchases", "product_inventory"],
)
Now, replace "sql_agent" with the "postgres" user in that snippet, and the agent will connect to the database as the "postgres" user, which can do anything.
Thank you. One last question, please: how do I make an agent with a custom prompt and memory?
@@DevMastersDb Will it work with chat history?
Yes, you need to tweak the current implementation as follows:
1. Store the history in some variable like "chat_history".
2. Pass this variable to the agent prompt that is generated by the "prepare_agent_prompt" method.
3. Append the chat history to the end of the "agent_prompt" variable, e.g. "... also, consider the following chat history: {chat_history}". See the sketch below.
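Assuming prepare_agent_prompt simply formats a plain string (the notebook's actual helper may differ), the three steps could look roughly like this:

chat_history = ""  # 1. keep the conversation history in a variable

def prepare_agent_prompt(input_text, chat_history):
    # 2. the history is passed in alongside the question
    agent_prompt = (
        f"Answer the following question: {input_text}. "
        # 3. the history is appended to the end of the prompt
        f"Also, consider the following chat history: {chat_history}"
    )
    return agent_prompt

# After every turn, append the user's question and the agent's answer to
# chat_history so the next prompt carries the conversation forward.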
What is RAG?
RAG stands for retrieval-augmented generation. It's a technique to enhance the behavior of an LLM by providing it with more context. Usually, you get that context from your own database that stores your own data.
For instance, let's say you ask ChatGPT to recommend a few places to stay in NYC between April 1-7, 2023. ChatGPT doesn't know those details; it was trained on some generic data from the past and didn't have access to the private data of Expedia or Booking.com.
But Expedia/Booking's own AI assistant can easily address this task by using the RAG approach. You ask their assistant to recommend the places, it queries data from the database and feeds it as context to an LLM (which can be GPT), and then the LLM responds to you like a human would.
@@DevMastersDb Great explanation, thanks. So is your LangChain example RAG, because it's providing extra metadata etc. to your query? I've looked at a few examples of LangChain and it seems to match my idea of what RAG is, but LangChain doesn't call itself RAG, so maybe I'm missing something. Trying to figure out what all these new buzzwords mean hah
@@slowjocrow6451 yep, LangChain doesn't have any RAG-specific APIs and it doesn't need them. But when you create those chains (with LangChain) and some parts of the chain retrieve additional information from a database or another resource and feed this information as an extra context to an LLM - then you're effectively creating a RAG-based solution with LangChain. Hope it makes sense.
Also, I found this LangChain cookbook useful, take a look: python.langchain.com/docs/expression_language/cookbook/retrieval
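If a concrete example helps, here's a bare-bones retrieval chain in the spirit of that cookbook. The sample text, prompt, and vector store are placeholders rather than code from the video, and it assumes the LangChain 0.1 packages plus faiss-cpu are installed:

from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Your own data, indexed so it can be retrieved later.
vectorstore = FAISS.from_texts(
    ["Hotel Alpha in NYC has rooms available April 1-7"],
    embedding=OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)

# The retrieved documents are fed to the LLM as extra context: that's the RAG part.
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4")
    | StrOutputParser()
)

print(chain.invoke("Recommend a place to stay in NYC between April 1-7"))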
Can I ask you a couple of questions on Telegram?
Really cool! Thanks a lot
Glad you liked it! Anything else you'd like to learn about? It should be related to databases (the focus of my channel)