Mastering Natural Language to SQL with LangChain and LangSmith | NL2SQL | With Code 👇

  • Published on 10 Mar 2024
  • Embark on a journey to redefine database querying with "Mastering Natural Language to SQL with LangChain | NL2SQL." This in-depth video guide will navigate you through the revolutionary process of converting natural language queries into SQL commands using LangChain. Here's what we'll cover:
    🌟 Introduction to NL2SQL: Understand the basics of translating natural language to SQL with LangChain.
    🔨 Building Your First NL2SQL Model: Step-by-step guide on creating a foundational NL2SQL model.
    🎯 Incorporating Few-Shot Learning: How to enhance model accuracy using few-shot examples.
    🔄 Dynamic Few-Shot Example Selection: Tailor your model's learning process with dynamic example selection for improved relevance.
    🔍 Identifying Relevant Database Tables Dynamically: Techniques to automatically determine which tables to query, streamlining your SQL commands.
    ✍️ Customizing Prompts for Accurate Responses: Learn to customize prompts for clearer, more precise model outputs.
    💬 Adding Conversational Memory: Enable your model to handle follow-up questions by integrating memory into your chatbot.
    Designed for data scientists, developers, and anyone interested in leveraging AI for more intuitive database management, this video equips you with the knowledge to make your data querying as simple as having a conversation. Explore the forefront of natural language processing technology and transform your approach to database interactions.
    👉 Code and explanation: blog.futuresmart.ai/mastering...
    If you're curious about the latest in AI technology, I invite you to visit my project, AI Demos, at www.aidemos.com/. It's a rich resource offering a wide array of video demos showcasing the most advanced AI tools.
    For even more in-depth exploration, be sure to visit my YouTube channel at @aidemos.futuresmart. Here, you'll find a wealth of content that delves into the exciting future of AI and its various applications.
    🚀 Top Rated Plus Data Science Freelancer with 8+ years of experience, specializing in NLP and Back-End Development. Founder of FutureSmart AI, helping clients build custom AI NLP applications using cutting-edge models and techniques. Former Lead Data Scientist at Oracle, primarily working on NLP and MLOps.
    💡 As a Freelancer on Upwork, I have earned over $100K with a 100% Job Success rate, creating custom NLP solutions using GPT-3, ChatGPT, GPT-4, and Hugging Face Transformers. Expert in building applications involving semantic search, sentence transformers, vector databases, and more.
    #NL2SQL #LangChain #NaturalLanguageProcessing #DataScience #AI #DatabaseManagement #SQL

Comments • 94

  • @carloseduardogabrielsantos9939 4 days ago

    That's amazing, congratulations and thank you very much for making your time available to share your knowledge.

  • @raumwerk09 1 month ago +7

    Excellent! I've already watched some 10+ other tutorials on LangChain SQL chats, but none of these really got beyond the basic demo for simplistic databases, completely ignoring the problems of token limits and providing custom table info to the LLM for more complex databases...so this one really stands out. Kind Thanks for the insights, and Greetings from Germany :-)

    • @FutureSmartAI 1 month ago

      Glad it was helpful!

  • @JohnBoen 1 month ago +1

    I have been a database engineer for decades. I dream SQL...
    You made the *perfect* tutorial for me :)

  • @user-zl4vl6oo1l 3 months ago +5

    GREAT session. You are a legend, smart, and helpful. Thank you.

  • @diptimanraichaudhuri6477 3 months ago +3

    Very well explained Pradip! Keep up the good work !

  • @aguntuk10 2 months ago +3

    Very good explanation, Pradip, really helpful. Thanks a ton.

  • @mtsoul9834 17 days ago +1

    This is what I was looking for. Really appreciated.

  • @RibsCribs 4 days ago

    Thank you for this video. I learned a lot from it.

  • @sakalyamitra9935 3 months ago +1

    Went through the video and it is explained in great detail. The chaining explanation, along with the parallel visualization in LangSmith, made it super easy to understand how things get executed. Great video to master NL2SQL using LangChain.

    • @FutureSmartAI 3 months ago

      You're very welcome!

    • @shaktidharreddy6822 3 months ago

      @FutureSmartAI Please share the Streamlit code; the Git repo doesn't have it.

  • @kenchang3456 2 months ago +1

    Hi Pradip, thank you very much for a detailed video. It was valuable and insightful for my POC.

    • @FutureSmartAI 1 month ago

      Glad it was helpful!

  • @SarbaniMaiti 3 months ago +1

    This is awesome, Pradip, you just built an application ready to go into production :) Very useful.

    • @FutureSmartAI 3 months ago

      Glad it was helpful!

  • @dhruvparthasarathy2050 23 days ago +2

    Hats off

  • @SanjayRoy-vz5ih 3 months ago +2

    Absolutely fantastic tutorial. One remark on "top-k": remember that it is only used with vector embeddings, for nearest-neighbor similarity search (cosine similarity, as it is called). Otherwise it is of no use if you are not embedding.

  • @YatanveerSingh 2 months ago

    Hi Pradip,
    Thanks for your blog and this video. They helped me understand how to go from natural language to SQL in a very easy way.
    I am able to use this on my database, but sometimes I get "list index out of range". This happens especially when there are a lot of rows in my tables.
    So I am not sure whether I can use this in production.
    Can you advise?
    Thanks,
    Yatan

  • @kallolbecs 3 months ago +1

    Great video. Can you make one for natural language to MongoDB as well?

  • @user-cq7mu7fu6o 1 month ago +2

    Really amazing content, sir. I learned so much from this video.
    Thanks, sir :)
    Love from Hyderabad

  • @RAJARAMRamamoorthi 3 months ago

    Have you used any GraphQL chain? If so, please do a detailed video on that. Very helpful session, thanks.

  • @KumR 3 months ago

    Hey Pradip. Can we connect this to Oracle Fusion ERP?

  • @soumyaparida9231 2 months ago

    Great session. But can you tell me how to implement it using Gemini Pro?

  • @chihebnouri5541 2 months ago

    I think building an SQL LLM with Gemini is better, no?

  • @wealth_developer_researcher 3 months ago

    Thanks for the nice video. I have one question. What if I treat this .py file as an API endpoint and want to use the same chain per user? I can see you have used the Streamlit cache resource, so the chain comes from the cache on the same server. But my requirement is to expose this as an endpoint so that I can call it from any server.

  • @naveennoelj 2 months ago

    Hi Pradip, very detailed explanation of NL2SQL, the most comprehensive I have seen. Thank you. One quick question: is it possible to take the results of the SQL query and display a chart (bar/pie/line), but only if the results can be put on a chart or if the user asks for it? What would it take to do this?

    • @FutureSmartAI 1 month ago +1

      Yes. First you should access the intermediate steps that give the DB results, and then you can visualise them using normal Python libraries.
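A minimal sketch of the "only if the results can be put on a chart" part of the question, assuming the query result is already available as column names plus row tuples. The function name `chartable` and the two-column rule are illustrative assumptions, not something shown in the video; the actual plotting step with a charting library is left out.

```python
from numbers import Number


def chartable(columns, rows, max_points=50):
    """Return (label_column, value_column) if the result looks like a
    simple bar/pie chart (text label + numeric value), else None."""
    if len(columns) != 2 or not (1 < len(rows) <= max_points):
        return None
    labels_ok = all(isinstance(row[0], str) for row in rows)
    # bool is a subclass of int in Python, so exclude it explicitly.
    values_ok = all(
        isinstance(row[1], Number) and not isinstance(row[1], bool) for row in rows
    )
    return (columns[0], columns[1]) if labels_ok and values_ok else None


# Hypothetical query result: order totals per country.
axes = chartable(["country", "total"], [("India", 10), ("Germany", 7)])
print(axes)
```

If the function returns a pair, the rows can be handed to whichever plotting library the app already uses; if it returns None, the app falls back to the plain text answer.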

  • @YashNanaware-fe5jz 2 months ago

    Hi, I have a problem. My data is in a cloud datamart and I want to load it into the DB, but I am not able to, because I connect to the datamart through a UI where I pass my ID and authentication is done through an Azure credentials token. Is there any way I can do it, given that I don't have a password for it?

  • @aryanmishra4594 1 month ago

    I also want it to answer questions outside of this database context. How can I do that?

  • @tanzeelmohammed9157 2 months ago +1

    Hi Pradip,
    I have been trying to do the same thing. The problem is that I do not have credit with OpenAI, and I wanted to know if there is any other way to achieve the same result using open-source models, especially in LangChain.

    • @FutureSmartAI 1 month ago

      Yes, you can use an open-source LLM instead of OpenAI.

  • @gan13166 2 months ago

    How can I display the column names? The output records are coming through, but not the column names.
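One possible approach, sketched here with Python's built-in `sqlite3` module rather than the LangChain setup from the video: run the generated SQL through a DB-API cursor and read the column names from `cursor.description`. The helper name `run_with_headers` and the sample table are assumptions for illustration.

```python
import sqlite3


def run_with_headers(conn, sql):
    """Execute a query and return (column_names, rows)."""
    cursor = conn.execute(sql)
    # Each entry in cursor.description is a 7-tuple; index 0 is the column name.
    columns = [col[0] for col in cursor.description]
    return columns, cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, country TEXT)")
conn.execute("INSERT INTO customers VALUES ('Asha', 'India')")

columns, rows = run_with_headers(conn, "SELECT name, country FROM customers")
print(columns, rows)
```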

  • @rsivci 3 months ago

    Hi Pradip,
    Excellent explanation. We have PrestoDB in our organisation, and I somehow don't see how to connect LangChain to PrestoDB. Are there any references you can help me with?

    • @FutureSmartAI 3 months ago

      You should look for the PrestoDB SQLAlchemy URI. Check this: www.twilio.com/en-us/blog/access-presto-database-python-sqlalchemy

  • @user-zl4vl6oo1l 1 month ago

    Thank you very much.
    What do you think about ChatGroq? Can we use it for the same purpose?

  • @ShivamGupta-ek4rd 2 months ago

    How Can I use this with IBM DB2 Database?

  • @pavancharan4236 1 month ago

    Hi Pradip. Thanks for the explanation. So you say we create a CSV file for choosing the correct tables when handling larger datasets. Correct me if I am wrong. My question is: how big can the CSV file be? How many tables can we use? Does it work well even with hundreds of rows in the CSV file?

    • @FutureSmartAI 1 month ago +1

      Let's say you have 200 tables and each table has 100 columns, so basically 20,000 lines of schema information. That is sometimes not feasible to put in the prompt, and even if it is possible it will increase cost.
      As we know, most queries can be answered using at most 10 tables.
      So you can create a table selection prompt with a 2-3 line description of each table. Once the tables are selected, you only need to add the schema of those tables to the final prompt.
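The two-step idea in this reply can be sketched in plain Python. The table names, descriptions, and schemas below are made-up placeholders, and both helper functions are hypothetical; the LLM call between the two steps is omitted.

```python
# Short, hand-written descriptions used only for the table selection step.
TABLE_DESCRIPTIONS = {
    "customers": "One row per customer: name, contact details, country.",
    "orders": "One row per order: customer, order date, status.",
    "payments": "Payments received from customers: amount, date.",
}

# Full schemas, added to the final prompt only for the selected tables.
TABLE_SCHEMAS = {
    "customers": "CREATE TABLE customers (id INT, name TEXT, country TEXT)",
    "orders": "CREATE TABLE orders (id INT, customer_id INT, order_date TEXT, status TEXT)",
    "payments": "CREATE TABLE payments (id INT, customer_id INT, amount REAL, paid_on TEXT)",
}


def build_table_selection_prompt(descriptions: dict[str, str]) -> str:
    """Step 1: a compact prompt listing every table with a short description."""
    lines = [f"- {name}: {text}" for name, text in descriptions.items()]
    return (
        "Return the names of ALL tables that might be relevant to the user "
        "question, as a comma-separated list.\nThe tables are:\n" + "\n".join(lines)
    )


def build_final_schema(selected: list[str], schemas: dict[str, str]) -> str:
    """Step 2: include full schemas only for the tables the model selected."""
    return "\n\n".join(schemas[name] for name in selected if name in schemas)


selection_prompt = build_table_selection_prompt(TABLE_DESCRIPTIONS)
# Pretend the model answered "orders, payments" to the selection prompt.
final_schema = build_final_schema(["orders", "payments"], TABLE_SCHEMAS)
print(selection_prompt)
print(final_schema)
```

With 200 tables, the selection prompt grows by a few lines per table, while the final prompt only carries the handful of schemas that were selected.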

  • @polly28-9 1 month ago +1

    Is this create_extraction_chain_pydantic deprecated? What to use instead? Any alternatives? I got error: ValueError: A pending deprecation cannot have a scheduled removal. Please, help me!

    • @FutureSmartAI 5 days ago

      You can actually use a simple prompt that outputs the names of the tables.
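If the table names come back as plain text from a simple prompt, the output still has to be parsed and checked. A minimal sketch, assuming the model was asked for a comma-separated or line-separated list; `parse_table_names` is a hypothetical helper, not a LangChain function.

```python
import re


def parse_table_names(llm_output: str, known_tables: set[str]) -> list[str]:
    """Split the model's answer on commas/newlines and keep only real tables."""
    selected = []
    for candidate in re.split(r"[,\n]", llm_output):
        # Remove whitespace, list dashes, quotes, backticks, and brackets.
        name = candidate.strip().strip("`\"'[]- ").lower()
        if name in known_tables and name not in selected:
            selected.append(name)
    return selected


known = {"customers", "orders", "payments"}
tables = parse_table_names("customers, Orders\n- payments\nunknown_table", known)
print(tables)
```

Filtering against the known table names guards against the model inventing a table that does not exist in the database.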

  • @prathamgupta3984 2 months ago

    Can anyone help me? It shows an error that my OpenAI key credit is exhausted. I tried every possible way to get some free credit but always failed.

  • @dataacademy369 3 months ago

    Do you provide any online training on NLP for someone who aspires to become a freelancer?

    • @FutureSmartAI 3 months ago

      Hi, no. Did you check my freelancing playlist? I shared my experiences and tips there.

  • @prashanthganna2545 1 month ago

    Hi Pradip,
    I'm using Azure OpenAI. How can I generate dynamic few-shot examples?

    • @rubberuk 5 days ago

      It all has to go in the system prompt so if you’ve got a large schema plus many examples it adds up to a lot of tokens on every request.
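Dynamic selection is one way to keep that token count down: only the few examples most similar to the current question go into the prompt. The sketch below uses word-overlap (Jaccard) similarity as a dependency-free stand-in for embedding similarity, so it is not the embedding-based selector used in the video. The example pairs and function names are assumptions.

```python
import re

# Hypothetical curated pairs of natural-language question and SQL query.
FEW_SHOT_EXAMPLES = [
    {"input": "List all customers in France",
     "query": "SELECT * FROM customers WHERE country = 'France';"},
    {"input": "How many orders were placed in 2023?",
     "query": "SELECT COUNT(*) FROM orders WHERE strftime('%Y', order_date) = '2023';"},
    {"input": "What is the total payment amount per customer?",
     "query": "SELECT customer_id, SUM(amount) FROM payments GROUP BY customer_id;"},
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_examples(question: str, examples: list[dict], k: int = 2) -> list[dict]:
    """Return the k examples whose question is most similar to the user's."""
    q_tokens = tokenize(question)
    ranked = sorted(
        examples,
        key=lambda ex: jaccard(q_tokens, tokenize(ex["input"])),
        reverse=True,
    )
    return ranked[:k]


best = select_examples("Show customers located in Germany", FEW_SHOT_EXAMPLES, k=1)
print(best[0]["input"])
```

Swapping `jaccard` for a cosine similarity over embeddings (from Azure OpenAI or any other provider) keeps the same structure while matching on meaning rather than on shared words.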

  • @polly28-9 1 month ago

    Can you tell me how the LLM learns (gets to know) the structure of our database? How does the LLM know the structure of our database? Does it learn it just from the connection? I am confused.

    • @FutureSmartAI 5 days ago

      You can see in the video that when we connect to the DB, LangChain can extract the schema and sample records from the tables. These details get added to the prompt for the LLM.
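To make the idea concrete, here is a rough sketch of that kind of extraction using only Python's built-in `sqlite3` module. It is not LangChain's implementation; the function `describe_database`, the output layout, and the sample table are assumptions for illustration.

```python
import sqlite3


def describe_database(conn: sqlite3.Connection, sample_rows: int = 3) -> str:
    """Build prompt-ready text: the CREATE statement plus a few sample rows
    for every table in the database."""
    parts = []
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for name, create_sql in tables:
        cursor = conn.execute(f'SELECT * FROM "{name}" LIMIT ?', (sample_rows,))
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
        lines = [create_sql, f"/* {len(rows)} sample rows from {name}:", "\t".join(columns)]
        lines += ["\t".join(str(value) for value in row) for row in rows]
        lines.append("*/")
        parts.append("\n".join(lines))
    return "\n\n".join(parts)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers (name, country) VALUES (?, ?)",
    [("Asha", "India"), ("Lena", "Germany")],
)

schema_text = describe_database(conn)
print(schema_text)
```

The model never connects to the database itself. It only sees text like `schema_text` inside the prompt, which is why large schemas run into token limits.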

  • @Thangarajtest 2 months ago

    Hello Sir, could you please share a video on how to deploy this application using LangServe?

    • @FutureSmartAI 1 month ago

      I already have one video on LangServe.

  • @rameshh3821 1 month ago

    Hi Pradip. I'm not able to add memory to my chatbot. I'm getting this error-
    "The SQL result is an error message indicating that the SQL query was not executed successfully. Therefore, the user question cannot be answered based on the given information."
    Please help.

    • @rameshh3821 1 month ago

      Anyone getting the same error? Please respond.

  • @user-wk1ou9lv5w 3 months ago

    Thanks for the video. Can you please answer my query below?
    How can we create dynamic examples? What if we don't know the DB domain?

    • @FutureSmartAI 3 months ago

      We can use ChatGPT to come up with few-shot examples. For example, we can give it the table schemas and sample records and ask it to come up with pairs of natural-language and SQL queries. But few-shot examples need to be correct, so it is better to curate or verify them before use.
      Another approach is a feedback mechanism: every time we show the user a result we can ask for feedback, and later use the results with positive feedback as few-shot examples.

    • @user-wk1ou9lv5w 3 months ago

      @FutureSmartAI Thanks for your response. I have seen that you made 3 videos on converting text to SQL: 1) using a LangChain agent, 2) using a LangChain chain (the current video), 3) using LlamaIndex. So which one is better?

    • @FutureSmartAI 3 months ago +1

      @user-wk1ou9lv5w Since those libraries keep changing their syntax, you should always refer to the latest video.

  • @arunsnmimtimt 19 days ago

    Hi Pradip, can we do it without an API? I can't expose data.

    • @FutureSmartAI 11 days ago

      You mean the OpenAI API? Then you will need a local LLM.

  • @n3cr0ph4g1st 3 months ago

    The query after you do invoke is "SQLQuery: " so it never executes anything. you need to include all your package versions as well.
    Thanks for the replies

    • @FutureSmartAI 3 months ago +1

      Hi, we don't need any custom code for LangChain. You only need the code snippet below, which is already in the notebook:
      import os
      os.environ["LANGCHAIN_TRACING_V2"] = "true"
      os.environ["LANGCHAIN_API_KEY"] = ""
      I agree I should include package versions. I will update the notebook with the versions of LangChain and other relevant packages.

    • @n3cr0ph4g1st 3 months ago

      @FutureSmartAI I tried implementing it and the AI output always includes "sqlquery:" before the query, so the query can never be executed in the LCEL chain... Are you seeing this behavior? Could it be related to me using GPT-4 instead of 3.5?

    • @FutureSmartAI 3 months ago +1

      @n3cr0ph4g1st Yes, we have seen this behavior with GPT-3.5. Basically we get something extra along with the SQL query, and that results in a failure during execution. You can use GPT-4.
      You can also add a post-processing function after query generation and before query execution using python.langchain.com/docs/expression_language/how_to/functions

    • @FutureSmartAI 3 months ago +1

      You can even modify the prompt to explicitly say not to write anything other than the SQL query itself.
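The post-processing step mentioned in this thread could look something like the sketch below: a plain Python function that removes a leading "SQLQuery:" label and an optional Markdown code fence before the query is executed. The name `clean_sql` is an assumption, and wiring it into the chain (for example by wrapping it as a runnable step between query generation and execution) is left out.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Matches a fenced block with an optional "sql" language tag.
FENCED_BLOCK = re.compile(
    FENCE + r"(?:sql)?\s*(.*?)" + FENCE, flags=re.DOTALL | re.IGNORECASE
)


def clean_sql(llm_output: str) -> str:
    """Strip wrappers the model sometimes adds around the SQL query."""
    text = llm_output.strip()
    fenced = FENCED_BLOCK.search(text)
    if fenced:
        # Keep only the content inside the code fence.
        text = fenced.group(1).strip()
    # Drop a leading "SQLQuery:" label, in any letter case.
    text = re.sub(r"^\s*SQLQuery\s*:\s*", "", text, flags=re.IGNORECASE)
    return text.strip()


cleaned = clean_sql("SQLQuery: SELECT COUNT(*) FROM customers;")
print(cleaned)
```

A function like this is cheap insurance even when the prompt already tells the model to output only SQL, since models do not always follow that instruction.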

  • @kamaleshnew 3 months ago

    How do you map column values so that the user need not mention column names in every query?

    • @FutureSmartAI 3 months ago +1

      Hi, can you give an example of it?

    • @kamaleshnew 3 months ago

      @FutureSmartAI Say the word 'india' appears in 10 columns. Which column should the query be routed to?

    • @kamaleshnew 3 months ago

      @FutureSmartAI Also, not all text values are identified easily, so the data is not always shown...

  • @protimaranipaul7107 1 month ago

    BadRequestError: Error code: 400 - {'error': {'message': "Sorry! We've encountered an issue with repetitive patterns in your prompt. Please try again with a different prompt.", 'type': 'invalid_prompt', 'param': 'prompt', 'code': None}}

  • @Animesh_SGIS 22 days ago

    Sir, the OpenAI API is not free as of today, so what should I do?

    • @FutureSmartAI 5 days ago

      You can use an open-source LLM, but in my experience many of my clients prefer to use OpenAI in production.

  • @ajaym4257 3 months ago

    Hello, I want to do "Identifying Relevant Database Tables Dynamically". When I run the same code as you, shown below, I basically get empty square brackets [ ] as output. Can you help?
    table_details_prompt = f"""Return the names of ALL the SQL tables that MIGHT be relevant to the user question. \
    The tables are:
    {table_details}
    Remember to include ALL POTENTIALLY RELEVANT tables, even if you're not sure that they're needed."""
    table_chain = create_extraction_chain_pydantic(Table, llm, system_message=table_details_prompt)
    tables = table_chain.invoke({"input": "what is relative utilization is coming from which table?"})
    tables

    • @rameshh3821 1 month ago

      Hi. I'm also getting the same error.

    • @FutureSmartAI 1 month ago

      What natural-language query are you asking? Also, you can skip dynamic table selection, since it will not work with follow-up questions. We would need to add the history to the table selection prompt to make it work.

  • @Mostafa_Sharaf_4_9 3 months ago

    Can we turn this code into an API using FastAPI and deploy it on AWS?
    If yes, please make more videos about APIs ❤

    • @FutureSmartAI 3 months ago

      Yes, it can be integrated into FastAPI; it's just normal Python code. I have some other videos where I integrated OpenAI and FastAPI, deployed on an AWS VM.

  • @shaktidharreddy6822 3 months ago +1

    Please post the Streamlit code also.

    • @FutureSmartAI 1 month ago

      Everything is there; check the link in the description.

  • @Rasmiya_M 1 month ago +1

    Sir can I get a data science internship at your company?

    • @FutureSmartAI 5 days ago

      I post on LinkedIn when we have openings.

  • @manikandanr1242 2 months ago

    Can we create a chatbot that can also answer friendly messages such as 'hi'?

  • @ansumansatpathy3923 3 months ago

    I see two audio messages overlap around 56:47, very distracting. Also pls plan your content ahead to avoid revisiting after breaking existing flow.

    • @FutureSmartAI 3 months ago +1

      Thanks for pointing that out. My recorder actually stopped working after some time and I re-recorded some parts, which resulted in some overlap when I combined them.

  • @RaushanKumar-ut2ke 2 months ago

    When I used SemanticSimilarityExampleSelector with AzureOpenAIEmbedding I got: ValueError: Expected EmbeddingFunction.__call__ to have the following signature: odict_keys(['self', 'input']), got odict_keys(['args', 'kwargs']). Can you explain why I am getting this error?

  • @rubberuk 8 days ago +1

    Be very careful before putting this in production environment, almost impossible to constrain.

    • @FutureSmartAI 5 days ago

      One basic thing we can do is to use a DB user that has only read or limited permissions.

    • @rubberuk 5 days ago +1

      If you’re using sql server you could use a security policy and a predicate but ultimately you’ll end up with something that you can’t 100% secure especially in a multi tenant environment. You can also wrap the sql statement in a read only transaction to stop modifications.