Using ChatGPT with YOUR OWN Data. This is magical. (LangChain OpenAI API)

แชร์
ฝัง
  • เผยแพร่เมื่อ 18 มิ.ย. 2023
  • Here's how to use ChatGPT on your own personal files and custom data. Source code: github.com/techleadhd/chatgpt...
    Ace your coding interviews with ex-Google/ex-Facebook training. techinterviewpro.com/
    Make passive income with crypto in DeFi Pro. defipro.dev/
    💻 Get access to 100+ programming interview problems explained: coderpro.com/
    📷 Learn to build a successful business on TH-cam from scratch: youtubebackstage.com/
    💻 I’ll send you FREE daily coding interview questions to practice your skills: dailyinterviewpro.com/
    🛒 My computer and camera gear: www.amazon.com/shop/techlead/...
    ⌨️ My favorite keyboards: iqunix.store/techlead
    Follow me on social media for more tips & fun:
    / techleadhd
    / techleadhd
    Disclaimer: This description may contain affiliate links. Cryptocurrencies are not investments and are subject to market volatility.

ความคิดเห็น • 1.6K

  • @TechLead
    @TechLead  11 หลายเดือนก่อน +1064

    Source code: github.com/techleadhd/chatgpt-retrieval
    Ace your coding interviews with ex-Google/ex-Facebook training. techinterviewpro.com/

    • @bobby9568
      @bobby9568 11 หลายเดือนก่อน +1

      microsoft is going to implement this or has implemented this in Windows

    • @swooshdutch3021
      @swooshdutch3021 11 หลายเดือนก่อน +4

      would of been nice if you also skimmed over costs, are there any costs if you only use it on your own data ?

    • @santiagomartinez3417
      @santiagomartinez3417 11 หลายเดือนก่อน +5

      Finally, no more crypto, we want AI all the time!!!!

    • @miknes12345
      @miknes12345 11 หลายเดือนก่อน +5

      the code in github is not the same as in the video. I would really like to see the syntax on line 19 at 9:53, but I can't read it. Can anyone please help and paste it in a response. Thank you so much.

    • @AnuragKumar-mn7zu
      @AnuragKumar-mn7zu 11 หลายเดือนก่อน +3

      When are are going to sell AI course?

  • @davidl.e5203
    @davidl.e5203 11 หลายเดือนก่อน +1498

    This is rare. TechLead is actually uploading a useful coding tutorial instead of his opinions.

    • @fateriddle14
      @fateriddle14 11 หลายเดือนก่อน +54

      His opinions are also useful, when there's no agenda to promote one of his product.

    • @paulcleary7437
      @paulcleary7437 11 หลายเดือนก่อน +26

      I find his opinions entertaining 😅

    • @arkadiptabiswas2769
      @arkadiptabiswas2769 11 หลายเดือนก่อน +75

      He forgot "as a millionaire" in the title too ! 😢

    • @miknes12345
      @miknes12345 11 หลายเดือนก่อน +12

      @@arkadiptabiswas2769 Yes, it seems it is very much something he builds his self worth on - and that he worked at Google and Facebook, since he finds the need to mention it all the time.

    • @rogermoore3147
      @rogermoore3147 11 หลายเดือนก่อน +10

      There is no ex microsoft ex google ex facebook ex men here too

  • @jayhu6075
    @jayhu6075 11 หลายเดือนก่อน +71

    This is the way to explain LangChain in the style of TechLead. You nail it. Hopefully more this stuff in the future. Thanks.

    • @saadshajy3849
      @saadshajy3849 2 หลายเดือนก่อน

      We created a number of GPT powered personal AI assistants for businesess like law, media industry and pretty much for corporate companies where they can integrate internal corporate resources and data securely to retrieve useful information and save 70% cost

  • @TheRealTommyR
    @TheRealTommyR 11 หลายเดือนก่อน +23

    This is exactly what I desired to do with my own data, but I haven’t spent any time yet to research and figure out a way to do it. I am glad there is a public way to do it.

    • @saadshajy3849
      @saadshajy3849 2 หลายเดือนก่อน

      We created a number of GPT powered personal AI assistants for businesess like law, media industry and pretty much for corporate companies where they can integrate internal corporate resources and data securely to retrieve useful information and save 70% cost

  • @MacroAnarchy
    @MacroAnarchy 11 หลายเดือนก่อน +1068

    Im studying law at the moment and I’m seriously scared about how this will change the legal industry. Honestly could see it replace 90% of lawyering.

    • @chrsl3
      @chrsl3 11 หลายเดือนก่อน +125

      Everything that is language based. Many Doctors, programmers, lawyers, writers, ...could be replaced. But the top 5% can do cool stuff with it.

    • @zzKirus
      @zzKirus 11 หลายเดือนก่อน +87

      Need the human element... verification, etc.. you're fine.

    • @besllu8116
      @besllu8116 11 หลายเดือนก่อน +42

      @@chrsl3 I do not think so. It can replace some jobs like form fillings, data serialization etc. but nothin more. Every case is unique and AI can not know what someone did not teach it. It will always be late behind the needs of the time.

    • @MacroAnarchy
      @MacroAnarchy 11 หลายเดือนก่อน +169

      @@besllu8116 I honestly think the opposite is true!
      98% of lawyering is knowing all the previous cases, the law & the literature. Only very few cases actually need new arguments, most of it happened before.
      No lawyer can know all of that simultaneously, but the AI can! I think it will be the better lawyer in the absolut majority of cases.

    • @pepelapeux
      @pepelapeux 11 หลายเดือนก่อน +23

      Chatgpt passed the cpa exam - it can replace accountants/ CPA too
      It can't wipe out all our jobs at the same time so keep going and study for the bar exam

  • @RunningBugs
    @RunningBugs 11 หลายเดือนก่อน +81

    So after digging into the code, I found that Langchain is actually doing the following things: 1. for all your data, store then in vector storage using embeddings; 2. when you query something, it first did a similarity search in the embeddings database, and find out the files that's related to your question; 3. After finding the related files, it takes all the text of that file, together with a context message: "Use the following pieces of context as the 1st system message to answer the user's question.
    If you don't know the answer, just say that you don't know, don't try to make up an answer.
    ----------------
    {your text data}".
    This somehow tells us these points:
    1. why it's sometimes not having outside world's information? If the question you asked is not in your document, or if it's not trained on the data for your question, it will return nothing valuable as instructed.
    2. Is there a limit on the sizes of your data? Yes, you can't use it with super large files because it's doing a document filtering and it will send all text related to the API server, recently the gpt-3.5-turbo-16k might be the good model to use and it's best the total size of related docs is less than 16k tokens. Which means the best practice would be grouping your data into different topics and try to ensure any query, if responded with similarity search, the total size of returned document is not exceeding the token size limit of the model. I think16k is roughly the size of a 13-15 pages paper.
    3. By removing/changing the system message, you might get better results for common sense questions. I really don't like the system messaged by default, since in a playground, asking gpt-3.5-turbo-16k "Who is George Washington?" will give you better answers comparing to the langchain solution with an empty system message.
    4. The langchain is using unstructured library (it reports errors when I didn't install it), which means you can not only use txt files, but also pdf files, word files, etc. Haven't tested it out but highly likely support query of multiple pdf files using similar code in the video. So you can put multiple pdfs in a folder, using a directory index creator and ask questions for your papers, I think (haven't tested it out)
    5. The langchain not only supports ChatGPT models, but also other models in the chat_models package. Google PALM2 chat is also supported as of Jul 10, 2023, if someone has the key, you can use other models too. While I don't think PALM2 has the common sense knowledge as good as ChatGPT, but I think it is a better language generating model comparing to at least gpt-3.5-turbo-16k , so PALM2 may produce better results on your data and OpenAI's models are better in answering common sense questions after changing the default system message. OpenAI said general access to gpt-4 is starting, and people with history of successful payment using OpenAI API will get the access immediately a few days ago. The access to new developers will be rolled out until end of July.
    Also I think it's quite cool to be able to use your own data, if you want to create something like an AI assistant, you can always use code to collect current time, user information and put those in a folder, so the assistant will be able to do much more than current ones.
    Another very cool thing is auto-gpt which works great using gpt-4, gpt-3.5 is not smart enough and behaves much worse than gpt-4. If you asked auto-gpt something, it will be able to google itself and replied with the real time information. Also the example of auto-gpt is cool telling you how it could create a recipe based on the next holiday. Hopefully the access to gpt-4 is coming sooner.

    • @me_debankan4178
      @me_debankan4178 9 หลายเดือนก่อน +1

      I have stored a small pdf of 7 pages to a vector datastore FASS after the text splitting and I have also done embedding , but when I am asking questions outside of the pdf it giving me random wrong answers rather than giving a intelligent response like : "I don't know" or "out of scope" , can you tell me why this is happening?

    • @inagrag
      @inagrag 9 หลายเดือนก่อน +1

      God thanks for this. I hate the way langchain is structured to hide everything. I just want to know what's going on under the hood from documentation 😢

    • @DunsDeeDowns
      @DunsDeeDowns 8 หลายเดือนก่อน

      @@me_debankan4178 no expert here but it sounds like it is 'hallucinating' which means it does not have enough data in its sources. Maybe you can make it also tell you the 'sources' it used for answers to check/debug.

    • @EricofPhilly
      @EricofPhilly 8 หลายเดือนก่อน

      For the first question, I would use chat gpt to analyze documents with standards and guidelines. So, when getting an answer, I want to know whether or not it found the answer in the docs or with outside info. In this case, it’s useful to isolate the data it’s using.

    • @avatarcybertronics2584
      @avatarcybertronics2584 7 หลายเดือนก่อน

      I saw FractalGPT is the solution for most of these challenges. Unfortunately when u use prompt u randomly lose quality, because prompt affect length, structure and other crucial parts of answer

  • @charleswhite758
    @charleswhite758 11 หลายเดือนก่อน +34

    Awesome seeing TechLead do programming, the Maestro at work.

    • @eternal5154
      @eternal5154 11 หลายเดือนก่อน +1

      DO YOU UNDERSTAND?👁

  • @adasi008
    @adasi008 11 หลายเดือนก่อน +5

    By far one of the best ChatGPT video tutorials I've seen on TH-cam. Great work

    • @saadshajy3849
      @saadshajy3849 2 หลายเดือนก่อน

      We created a number of GPT powered personal AI assistants for businesess like law, media industry and pretty much for corporate companies where they can integrate internal corporate resources and data securely to retrieve useful information and save 70% cost

  • @e-matesecom
    @e-matesecom 11 หลายเดือนก่อน +6

    hours and hours of chatgpt courses... i learned more by watching 5 minutes of your video. congratulations for the clarity and the practical approach👍

  • @williamfarley9013
    @williamfarley9013 11 หลายเดือนก่อน +18

    This is the kind of stuff I was hoping to do with chat GPT

  • @jcollins519
    @jcollins519 11 หลายเดือนก่อน +9

    Semantra is a pretty cool tool to analyze your documents and be able to search them with natural language. It's probably more research-oriented since it links you to the different pages and snippets that match your query.

  • @seize2581
    @seize2581 11 หลายเดือนก่อน +2

    Thanks TechLead, it's nice to see this type of videos !

  • @andygilet5538
    @andygilet5538 10 หลายเดือนก่อน +48

    Great video. I'm a junior data scientist in Belgium and it's actually helping me for one of my projects. You're totally right when you say that everyone should learn Python. I only learned C and C# during my studies but now that I've learned python I'm using it almost everyday.

    • @thetexassaint6571
      @thetexassaint6571 10 หลายเดือนก่อน

      Real question: how could knowing Python, or any other language, help a person who has absolutely nothing to do with coding or computers ?

    • @RPanda3S
      @RPanda3S 10 หลายเดือนก่อน +5

      @@thetexassaint6571 Real question: how could a person have nothing to do with computers? That's insane.

    • @snorttroll4379
      @snorttroll4379 10 หลายเดือนก่อน

      Well. Only thing is if you want to make software

    • @Hugh_Jurrection
      @Hugh_Jurrection 9 หลายเดือนก่อน

      I'm also in your neck of the woods, as a data analyst. ChatGPT is very useful for sorting data and finding trends, which we use to form hypotheses. As a data scientist (which I started off as), you could in theory use it for all of your data collation, sorting, duplicate removal etc etc.

    • @m.h.6494
      @m.h.6494 8 หลายเดือนก่อน

      Bij welk bedrijf werk je? Ik kom niet vaak mensen van België tegen onder programming tutorials! :D

  • @AndreaDavidEdelman
    @AndreaDavidEdelman 11 หลายเดือนก่อน +9

    Very nice implementation. Simple yet powerful. It's clearly where the field is going.

    • @CaptainSazzman
      @CaptainSazzman 11 หลายเดือนก่อน

      What program is he using to type the instructions to the model?

    • @estebancortes2848
      @estebancortes2848 11 หลายเดือนก่อน

      @@CaptainSazzman a terminal window, its not a specific program

  • @hichamalaoui34
    @hichamalaoui34 3 หลายเดือนก่อน

    May be 8 months late and Langchain has been updated since, but this is one of the best videos I watched. Thank you.

  • @jsnmad
    @jsnmad 27 วันที่ผ่านมา

    This one video alone saves so much time. Instead of watching hours of some of the playlists out there. It's better to start here and then go straight to the Langchain docs to work out other use cases. Excellent TechLead.

  • @Kira073
    @Kira073 11 หลายเดือนก่อน +3

    Building up chat gpt on your own custom data is amazing and interesting. This opens up a whole new use case of open AI.

    • @saadshajy3849
      @saadshajy3849 2 หลายเดือนก่อน

      We created a number of GPT powered personal AI assistants for businesess like law, media industry and pretty much for corporate companies where they can integrate internal corporate resources and data securely to retrieve useful information and save 70% cost

  • @jsayubi
    @jsayubi 11 หลายเดือนก่อน +170

    So, we've switched from producing videos that were practically a recipe for depression to creating authentic coding tutorials now, have we? Oh, what a "remarkable" progression, truly. Thanks for that! :)

    • @jsayubi
      @jsayubi 11 หลายเดือนก่อน +9

      Just Note: My Comment above was generated by GPT :D

    • @chindianajones3742
      @chindianajones3742 11 หลายเดือนก่อน +1

      @@jsayubi oh lol actually? How did you prompt?

    • @SwingingInTheHood
      @SwingingInTheHood 11 หลายเดือนก่อน +2

      😅I was thinking the EXACT same thing! Dude, where have you been the past 3 months! You have just demonstrated what these LLMs (large language models) are really good at: Language. Everybody's going on and on about coding, accounting, script writing (well, maybe), but just summarizing data is what it is really best at -- what, I think, it was designed to do in the first place.

    • @neuronovost
      @neuronovost 11 หลายเดือนก่อน

      Зато у меня шортсы с классными нейросетями на канале ;))

    • @CaptainSazzman
      @CaptainSazzman 11 หลายเดือนก่อน +3

      What program is he using to type the instructions to the model?

  • @andrespineda7620
    @andrespineda7620 8 หลายเดือนก่อน

    Wow, this was awesome. All this information in one place. Also, I appreciate your fast dialog and sticking to the important points. I subscribed and will recommend this site to others.

  • @fenchelteefee
    @fenchelteefee 10 หลายเดือนก่อน +1

    Great vid, especially the in end with MS’s case study of customer reviews for cars. For those, who actually struggling to find real world applications for the new AI stuff. Thank you!

  • @hackerhaze
    @hackerhaze 11 หลายเดือนก่อน +23

    I'm doing something really similar as well! And also doing a series on it where we go over building autonomous agents using GPT4 that are programmable, context aware (whatever files from the vscode you have in your workspace) and ultimately autonomous it's awesomee!

    • @hackerhaze
      @hackerhaze 11 หลายเดือนก่อน +11

      I will soon release it open source too!

    • @dawidzurawski8870
      @dawidzurawski8870 11 หลายเดือนก่อน

      I'm waiting for it

    • @stratusgeret3794
      @stratusgeret3794 11 หลายเดือนก่อน +2

      Would be nice if you post something about it later on!

    • @hackerhaze
      @hackerhaze 11 หลายเดือนก่อน +3

      @@stratusgeret3794 I got loads of updates coming soon!

    • @rafograph4714
      @rafograph4714 11 หลายเดือนก่อน +3

      ​@@hackerhazeinterested

  • @kwabenakorantengasiedu5982
    @kwabenakorantengasiedu5982 8 หลายเดือนก่อน +13

    Caution when trying this out: If your vector store is going to be very large, say it's created out of a 500+ page pdf, it can cause the model to produce responses that are hallucinatory in nature. Meaning, that if you searched your pdf for what the model produced, those responses will not be found within the pdf. I tried this, and that is what I noticed.

    • @doyouthinkitsdead
      @doyouthinkitsdead 6 หลายเดือนก่อน +2

      That sucks. I'm hoping to be able to treat my large pdf as a google search!

    • @ZanesFacebook
      @ZanesFacebook 6 หลายเดือนก่อน +4

      Make it use gpt3 instead of 4 and it will stop "reasoning"

    • @HenningMoeller
      @HenningMoeller 5 หลายเดือนก่อน

      @@ZanesFacebook How do I do that? Changing from 3.5 to 4?

  • @jfletchbeats5882
    @jfletchbeats5882 11 หลายเดือนก่อน +2

    fantastic video. love love love using gpt for stuff like this. would love to see more content of this!

  • @ezit4me
    @ezit4me 11 หลายเดือนก่อน +2

    This was an amazing tutorial. Thank you for making it so easy to follow.

  • @bobbyfong1499
    @bobbyfong1499 10 หลายเดือนก่อน +27

    👋 Hey everyone, if you're encountering the NameError: name 'partition_pdf' is not defined error while running the code, here's a solution that worked for me:
    This issue seems to be related to a specific version of the unstructured package. Downgrading to version 0.7.12 resolved the problem for me. You can do this by running the following command in your virtual environment:
    pip install unstructured==0.7.12
    Make sure to restart your Python environment or terminal after making this change. Happy coding, and feel free to reach out if you have any questions!
    🚀

    • @katemariageorge7396
      @katemariageorge7396 2 หลายเดือนก่อน

      error: subprocess-exited-with-error
      × Getting requirements to build wheel did not run successfully.
      │ exit code: 1
      ╰─> [33 lines of output]
      Traceback (most recent call last):
      File "C:\Users\0011GP744\AppData\Roaming\Python\Python312\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 353, in
      main()
      File "C:\Users\0011GP744\AppData\Roaming\Python\Python312\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\0011GP744\AppData\Roaming\Python\Python312\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 112, in get_requires_for_build_wheel
      backend = _build_backend()
      ^^^^^^^^^^^^^^^^
      File "C:\Users\0011GP744\AppData\Roaming\Python\Python312\site-packages\pip\_vendor\pyproject_hooks\_in_process\_in_process.py", line 77, in _build_backend
      obj = import_module(mod_path)
      ^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Python312\Lib\importlib\__init__.py", line 90, in import_module
      return _bootstrap._gcd_import(name[level:], package, level)
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "", line 1381, in _gcd_import
      File "", line 1354, in _find_and_load
      File "", line 1304, in _find_and_load_unlocked
      File "", line 488, in _call_with_frames_removed
      File "", line 1381, in _gcd_import
      File "", line 1354, in _find_and_load
      File "", line 1325, in _find_and_load_unlocked
      File "", line 929, in _load_unlocked
      File "", line 994, in exec_module
      File "", line 488, in _call_with_frames_removed
      File "C:\Users\0011GP744\AppData\Local\Temp\pip-build-env-ekyeyqc_\overlay\Lib\site-packages\setuptools\__init__.py", line 16, in
      import setuptools.version
      File "C:\Users\0011GP744\AppData\Local\Temp\pip-build-env-ekyeyqc_\overlay\Lib\site-packages\setuptools\version.py", line 1, in
      import pkg_resources
      File "C:\Users\0011GP744\AppData\Local\Temp\pip-build-env-ekyeyqc_\overlay\Lib\site-packages\pkg_resources\__init__.py", line 2172, in
      register_finder(pkgutil.ImpImporter, find_on_path)
      ^^^^^^^^^^^^^^^^^^^
      AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?
      [end of output]
      note: This error originates from a subprocess, and is likely not a problem with pip.
      error: subprocess-exited-with-error
      × Getting requirements to build wheel did not run successfully.
      │ exit code: 1
      ╰─> See above for output.
      note: This error originates from a subprocess, and is likely not a problem with pip.
      Do you know how to resolve this ?

    • @Rockstarsf
      @Rockstarsf หลายเดือนก่อน

      @@katemariageorge7396 I'm getting the same issue.

    • @newchaoz
      @newchaoz หลายเดือนก่อน

      Either that, or you can turn pdf to text with 'from PyPDF2 import PdfReader'

  • @ripern
    @ripern 11 หลายเดือนก่อน +3

    Awesome tutorial! Simply explained and so many good examples!

  • @simeonhendrix
    @simeonhendrix 11 หลายเดือนก่อน +2

    Great video @techlead - Do we know which version of Chat GPT the API utilizes? I know that the plugins used within GPT4 use GPT3, which is disappointing.

  • @riyaski91
    @riyaski91 8 หลายเดือนก่อน

    Eye opener! I am a tech student, and was researching whether we could make a custom GPT of our own. This was on point! Thanks @techlead!

  • @john.10347
    @john.10347 11 หลายเดือนก่อน +8

    Glad to know I'm not the only one doing this.
    As a student I've been feeding ChatGPT all my previous course work, its able to answer essay prompts and other homework related tasks in my writing style and or in simmilar formats as if I was the one writing it. I'm able to save alot of time by doing this

    • @stekta
      @stekta 27 วันที่ผ่านมา

      Hahaha this is exactly what I'm trying to do

  • @sr9814
    @sr9814 11 หลายเดือนก่อน +15

    Loved this. I am a Sales guy with zero coding exp. I listen to content like yours to glean some nuggets to better understand the impacts and have meaningful conversations with my customers. Truly helpful content.

    • @mowburnt
      @mowburnt 10 หลายเดือนก่อน +1

      Same here. Looking to find ways to sort through the mountains of data to help us spend more time helping people and grow sales in that way

    • @powermyapps
      @powermyapps 10 หลายเดือนก่อน

      It's not providing the right answers for me. I might need to adjust my data storage. Instead of showing the highest price, it displays the last product's price for "What is the highest price you can see in the data?"

  • @larryczerwonka5125
    @larryczerwonka5125 11 หลายเดือนก่อน +1

    First great video.
    Second I just had to comment on the "one language" you mentioned programmers claiming that's all they wanted to know.
    Last count i have coded in over 15 languages since i wrote my first line of code back in 1985.
    We have not deployed anything using LangChain yet (we have only been using LlamaIndex) but for the same reason that i know so many languages, we will be using LangChain soon to see what it can do.
    As for plugin, i will always be for building your own so you have full control and can do things that the plugin "left out." Things like ability to use your own data (and keep it on your servers).
    We have found that if you are deploying a Help feature for your application you do not want to allow the code to get information from "the outside world."

  • @stevierayfrog2485
    @stevierayfrog2485 4 หลายเดือนก่อน +5

    I would love to see an updated video on this. Since this world is moving so fast, these steps have so many deprecated modules and requirements. I couldn't get past the need for chromadb. Trying to install it bombed because it was trying to install every version of it, and I wasn't sure which version was compatible with the other modules. Superfun overview though.

    • @IdowuOlayiwola
      @IdowuOlayiwola 3 หลายเดือนก่อน

      This thing is deprecated and I wonder why we do not have a new video yet

  • @bra5081
    @bra5081 11 หลายเดือนก่อน +7

    For me the added value would be in the AI that adds new content to my data. As retrieving it is quite an easy feat in itself. As it's not burdened by the thought that I can do it later and end up never do it. But I suppose they have stuff like Alexa for that.

    • @cheyno237
      @cheyno237 7 หลายเดือนก่อน

      You can already give chatgpt a GIF or a powerpoint or whatever, and then tell it things like "Add such and such text", etc.

  • @danield.7359
    @danield.7359 10 หลายเดือนก่อน +2

    You made my day. I've been struggling with fine tuning a GPT 3 model with mediocre success and an enormous data collection and preparation effort. It would never even get close to the results achieved with langchain within 1 minute of coding and 9 minutes of data preparation.

    • @hadihassan_
      @hadihassan_ 25 วันที่ผ่านมา

      Hey @danield.7359. Can you give me an idea of what you made using this video? I need some help with training my own model. Thanks

    • @danield.7359
      @danield.7359 25 วันที่ผ่านมา

      @@hadihassan_ I don't know what you mean with "what you made". Using langchain I developed my first RAG application.

  • @red_onex--x808
    @red_onex--x808 10 หลายเดือนก่อน +1

    this is by far one of the best videos on custom LLm so far 💥💥

  • @ALIof93
    @ALIof93 11 หลายเดือนก่อน +9

    Have you guys tried out dmvnerds for google or aws certs? 🤔

  • @sv-hermes
    @sv-hermes 11 หลายเดือนก่อน +18

    This dude is not a human! I’ve been following this channel for a while now, and I’ve concluded this today. D’you see how he never expresses any emotions with his face? He’s just a very advanced robot, with an amazing integrated LLM and perfect mechanics (at least facial and torso, cause we never seen him walk). As crazy as it sounds… wow

    • @karinakarina
      @karinakarina 11 หลายเดือนก่อน +2

      You think he is using an avatar and AI voice cloning or something, or that he is an actual robot? I am entertaining both theories. 😂

    • @sv-hermes
      @sv-hermes 11 หลายเดือนก่อน

      @@karinakarina i think he’s literally a Robot. But I don’t discard the Avatar thing, well thought! I’m gonna do some solid research and I will update you all, friends. Let’s uncover this!

    • @mikescarborough9196
      @mikescarborough9196 11 หลายเดือนก่อน

      Technically you are not correct, but you could easily replace him with a good deepfake CGI, and no one would notice the differrence.

    • @staceyadams9954
      @staceyadams9954 11 หลายเดือนก่อน +1

      He feigns condescension really well - a hallmark of "advanced intelligence".

    • @imho2278
      @imho2278 11 หลายเดือนก่อน +1

      His sarcasm...congratulating the smartypants kiddies...is spot on. He's real.

  • @LabEveryday
    @LabEveryday 8 หลายเดือนก่อน

    One shot and few shot prompting is super cool. Thanks for the video!

  • @hegdelabs
    @hegdelabs 11 หลายเดือนก่อน +1

    This video so far is the best and relatable how we can use Chatgpt to the best. Many thanks , really appreciate it

  • @gibbeyii
    @gibbeyii 11 หลายเดือนก่อน +28

    I don’t even know where to begin trying to explain what a bad idea putting every aspect of your life into a searchable database, not to mention the fact how accessible this database will be the outside entities…
    Good luck

    • @chrsl3
      @chrsl3 11 หลายเดือนก่อน +1

      But thats the exact thing techlead shows here: how one can use it totally privately.

    • @davadh
      @davadh 11 หลายเดือนก่อน +4

      As long as your data is offline, it should be okay. One it's online or in public domain, then it's over

    • @kollegeturnschuh5181
      @kollegeturnschuh5181 11 หลายเดือนก่อน +7

      But ChatGPT NEEDS online? Or does he suggest that all of it he just have downloaded offline by 10 lines of code in a terminal? His .pdf documents may be not uploaded to a third party, however his requests are still send to OpenAI/ Microsoft

    • @pepelapeux
      @pepelapeux 11 หลายเดือนก่อน

      They already know everything about us - tracking our phones, cameras on computers, cookies, then cameras on TV etc..
      Google has a lawsuit right now where we can claim $7 because they made money off our data without our knowledge/permission. I say to utilize this technology until you can't but use discernment on what you put in so hackers can't use it against you or your company..

    • @davadh
      @davadh 11 หลายเดือนก่อน +1

      @@kollegeturnschuh5181 True. You either trust that Open AI encrypted your offline data when it sends the query or you don't.

  • @madcatattack1
    @madcatattack1 11 หลายเดือนก่อน +8

    nice idea! i just started recording my study notes down today actually, and i plan on feeding it to chat gpt to organize for me and, well who knows what else. feels like a cheat code for productivity, using this thing

  • @anjukrishna8102
    @anjukrishna8102 10 หลายเดือนก่อน

    I am a big fan of your channel. Your humour is spot on. I built on the chatgpt code in your GitHub and won a competition. Thank you

  • @JaeyeokYoon
    @JaeyeokYoon 11 หลายเดือนก่อน +1

    Thank you, I'm sure that many people was waiting for it!

  • @ronaldb5245
    @ronaldb5245 10 หลายเดือนก่อน +3

    Also for a non techie who’s team is building a digital product to read, analyse and present data from 17th and 18th century handwritten documents from European archives, this a great video. Not sure on the Python learning though :-). Thanks again. Looking forward on the next one. Some advanced prompt engineering including do’s and don’t s with the system prompt in open ai playground? Best, Ronald

    • @liondrone2954
      @liondrone2954 10 หลายเดือนก่อน

      Transkribus?

  • @janbielecki94
    @janbielecki94 11 หลายเดือนก่อน +2

    TechLead have return to coding and the smile have returned on his face :)

  • @edes6168
    @edes6168 11 หลายเดือนก่อน +2

    That means for debugging a problem / error on your pc / server that could be very useful.
    For example putting the directory of the logs and config files of an application

  • @niharikadeokar8934
    @niharikadeokar8934 7 หลายเดือนก่อน

    The simplest explanation for the implementation of LangChain, Loved it! tysm :)

  • @besllu8116
    @besllu8116 11 หลายเดือนก่อน +11

    Funny how everything again went to text-based interfaces.

  • @jirikrajnak9047
    @jirikrajnak9047 11 หลายเดือนก่อน +13

    this "proof of concept" demos what's going to be a standard feature on operating systems soon. think spotlight or whatever windows uses. and beyond. just needs to be integrated into the ui.

  • @grzegorzkurc9121
    @grzegorzkurc9121 8 หลายเดือนก่อน +1

    Hello @TechLead, thank you for the great content.
    Do we teach the bot while processing data from our disk?
    Is there a risk that, based on our data, it will learn something that someone from outside could later ask about?

  • @cryptominingtechnologies6245
    @cryptominingtechnologies6245 11 หลายเดือนก่อน +2

    You are the most magical out of any You Tuber that I have ever came across. ❤

  • @jeffersonqiu7071
    @jeffersonqiu7071 2 หลายเดือนก่อน +3

    Did anyone tried to run this script recently? I kept getting many error messages to install dependecies from langchain-community and even after following all the errors, the script still couldn't run.

  • @articvault4549
    @articvault4549 11 หลายเดือนก่อน +49

    Langchain and Llama index increases your token usage enormously, and tokens ain't free, u get charged for their usage,so a heads up for you, just track your token usage, it's just converts docs and prompt to embeddings and then compares it using vector similarity .the llm here is used for in context learning

    • @inflationking1271
      @inflationking1271 11 หลายเดือนก่อน +2

      So what is you architecture proposal to improve this?

    • @faruqhasan5396
      @faruqhasan5396 11 หลายเดือนก่อน

      What function exactly is the llm performing here ? Is it not possible to run this on a cloud gpu without paying for tokens from open-ai? Or to create a 100% offline KnowledgeBase ? Ty

    • @articvault4549
      @articvault4549 11 หลายเดือนก่อน +2

      @@faruqhasan5396 so basically what happens is that when our docs are converterd to embeddings as well as our query, we use similarity search to get the embeddings similar to our query then those embeddings are passed to the LLM as a context along with query as a prompt to generate a response for us

    • @articvault4549
      @articvault4549 11 หลายเดือนก่อน +1

      @@faruqhasan5396 u don't have to necessarily use openai, u can use other LLMs, more specifically ones that are on hugging face but then again openai chatgpt LLM is the best so far in response quality and context understanding and embeddings

    • @articvault4549
      @articvault4549 11 หลายเดือนก่อน

      @@faruqhasan5396 also what do u mean by running on a GPU? Whether u run it on a GPU or something else,if you use openai gpt as LLM u will be charged for tokens , and point to be noted here we use openai for 2 main core purposes,generating embeddings and also LLM, again like I mentioned u can use any other open source embedding model and LLM that u can just run on your computer or whatever u want then definitely it will be offline since it won't require api calls to external services

  • @jakobstyrupbrodersen926
    @jakobstyrupbrodersen926 11 หลายเดือนก่อน +1

    Excellent video! Very helpfull that you informed about the pros and cons regarding privacy/sensitive data in every step a long the way. Thanks a lot :-)

  • @gj1234567899999
    @gj1234567899999 11 หลายเดือนก่อน +2

    I like when you add useful stuff for people. Thanks!

  • @LeoUfimtsev
    @LeoUfimtsev 11 หลายเดือนก่อน +11

    Legitimate.
    I experimented with openai and started to wonder how to pass more data to it. Exactly what I was looking for. Tech lead does not disappoint.

    • @faruqhasan5396
      @faruqhasan5396 11 หลายเดือนก่อน

      What function exactly is the llm performing here ? Is it not possible to run this on a cloud gpu without paying for tokens from open-ai? Or a completely offline private KnowledgeBase ? Ty

  • @user-sh9el7pq8e
    @user-sh9el7pq8e 3 หลายเดือนก่อน +3

    It looks like code and methods used in his video has been deprecated and no longer working.

  • @DunsDeeDowns
    @DunsDeeDowns 8 หลายเดือนก่อน

    thank you for your content, I find your lowkey rapid toss of use-cases exactly what I was looking for (next inspiration for weekend self-teaching projects). Subbed!

  • @kenkioqqo
    @kenkioqqo 11 หลายเดือนก่อน +1

    Simply awesome. Just the kind of info I was looking for.

  • @SpirusFilms
    @SpirusFilms 11 หลายเดือนก่อน +3

    Have a feeling Google is gonna build this into Bard to scrape personal data from Drive, Calendars, Gmail and the rest of the suite

  • @remixisthis
    @remixisthis 11 หลายเดือนก่อน +3

    Apple can so easily push your iCloud data to an LLM and allow you to give prompt commands via Siri. They have to be working on this for a release next year at the latest

    • @marc.roelofs
      @marc.roelofs 11 หลายเดือนก่อน

      I wouldn't bet on that

  • @2b3pro
    @2b3pro 11 หลายเดือนก่อน

    Thank you for this! That was simpler than I had expected.

  • @divergenzesociali8813
    @divergenzesociali8813 10 หลายเดือนก่อน +1

    Good job! Would it be possible to implement the ability to write from the prompt the data that will have to be stored in the data file so that it can be remembered? Without therefore having to write it manually?

  • @karinakarina
    @karinakarina 11 หลายเดือนก่อน +3

    I am going to input information about all my exes so it can explain what I saw in them. Hopefully this will help me understand where my lack of judgment lies.

    • @johnruby1363
      @johnruby1363 11 หลายเดือนก่อน

      Wouldn't it be embarrassing if the problem lay with your own character flaws and unrealistic expectations. LOL.

  • @BogdanCondrat
    @BogdanCondrat 3 หลายเดือนก่อน +4

    this is very new technology, the video is already deprecated, doesn't work with the current libraries.

  • @kurohito7362
    @kurohito7362 11 หลายเดือนก่อน +1

    Techlead, Wow have you considered being a teacher. you literally thought me how to injest my own data in 16 mins. I am 42 and don't really fully understand python syntal all i am following in REPL rules. Wow awesome. made my day really. Your a true master.

  • @MarcoFre167
    @MarcoFre167 11 หลายเดือนก่อน +2

    Congratulations on the video. I am not an expert and I am not a professional in the field. Thank you for the tutorial. I wanted to ask you a question... I noticed that running the script still incurs charges (depending on the tokens used). Did I do something wrong or is it normal for it to be this way?

  • @793Rich
    @793Rich 11 หลายเดือนก่อน

    One of the best videos you have done in a long time

  • @tharlikar1
    @tharlikar1 11 หลายเดือนก่อน +8

    ai is scary. you feed all the social media data, bank transaction, sms, telephone calls,googles data of a person and train ai with that data and ai know you more than you know yourself. it is scary as hell.

    • @imho2278
      @imho2278 11 หลายเดือนก่อน +1

      Actually it just shows you how similar hour life is to everyone else's.
      Chatgpt is not creative. It is merely an organiser.

    • @questioneverythingalways820
      @questioneverythingalways820 11 หลายเดือนก่อน

      @@imho2278cool. It can be layered. Organised across datasets and outputs…

  • @candikmen6078
    @candikmen6078 11 หลายเดือนก่อน +6

    watch my monitors shaking due to unstable cheap desk (as a millionare)

  • @SalutTous-er2xu
    @SalutTous-er2xu 8 หลายเดือนก่อน +2

    Hi TechLead, thank you very much for this excellent video. A question : how should I modify your source code so that the program can take into consideration both the local data and the answer from the outside LLM such as ChatGPT, combine them together, and return a more complete answer? Thanks.

    • @jennilthiyam1261
      @jennilthiyam1261 4 หลายเดือนก่อน

      this is what i am looking for

  • @carrycat876
    @carrycat876 11 หลายเดือนก่อน +1

    classical tech lead video. reminds everyone who the boss is. love this.

  • @vladusa
    @vladusa 11 หลายเดือนก่อน +12

    And then you could create a script that reads online data for you using Selenium or something, and saves the tag content into the .txt file. Magical work, TechLead.

  • @user-di4bt7qu2i
    @user-di4bt7qu2i 11 หลายเดือนก่อน +3

    Great video! Thanks for sharing this info with us. btw, I really like these commentary/instructional videos, but I do like the commentary/introspectional videos as well. I'd just like to see the instructional ones a little more.

  • @MrLocokrang
    @MrLocokrang 11 หลายเดือนก่อน +1

    I am going to seriously explore how this can be utilized, thank you sir

  • @route42studios
    @route42studios 11 หลายเดือนก่อน +1

    Thank you for the wonderful video. I have a novice question. What is the program used to enter the prompts? I'm trying to use pycharm and having some snags, but not sure to what extent that is due to pycharm vs my own issues

  • @atanasdoychinov6491
    @atanasdoychinov6491 11 หลายเดือนก่อน +13

    The ChatGPT API has 2 ways how to add data. First is named embeddings using your own database where you have your privacy. The other one is fine tuning which means to retrain the GPT model by adding your data. There is a tool to convert your data from csv to jsonl format because the data need to be in such a format.

    • @Ni7ram
      @Ni7ram 11 หลายเดือนก่อน +1

      i think fine tuning is not that

    • @maxdrut
      @maxdrut 11 หลายเดือนก่อน +3

      None of this is correct lol

    • @treali
      @treali 11 หลายเดือนก่อน +3

      You can't finetune chatGPT or GPT 4. I think chatgpt hallucinated when you prompted it.

    • @faruqhasan5396
      @faruqhasan5396 11 หลายเดือนก่อน

      What function exactly is the llm performing here ? Is it not possible to run this on a cloud gpu without paying for tokens from open-ai? Ty

    • @gunreddy
      @gunreddy 11 หลายเดือนก่อน

      That's not exactly right.
      First of all, finetuning a LLM like ChatGPT isn't the best way to adapt it to your own data. Finetuning is best if you want it to perform a specific task or series of tasks, like if you want to have a pipeline where ChatGPT performs text generation then some specific analysis, etc. Finetuning isn't the best approach if you want to just give ChatGPT extra knowledge through documents (OpenAI say the same thing btw). Finetuning is a costlier approach in terms of price and resources, and doing that just to have ChatGPT know your calendar is overkill.
      If you're using embeddings, you're not exactly 'adding' data. You're still sending those vector embeddings along with the prompt.
      Right now you can't finetune GPT-4.

  • @juankiefer6493
    @juankiefer6493 11 หลายเดือนก่อน +6

    I'm doing this since ChatGTP4 came out. GTP is the non-biological version of the human brain with the power to learn the entire Internet and more. Treat GTP as a non-biological being and you'll be surprised. PS: I hope this thing doesn't turn into evil.

    • @dwork9451
      @dwork9451 11 หลายเดือนก่อน +5

      It will.

    • @juankiefer6493
      @juankiefer6493 11 หลายเดือนก่อน

      @@dwork9451 If the theory that humans are inherently evil is true. We have a huge problem.

  • @genemodified
    @genemodified 2 หลายเดือนก่อน

    Fascinating … and as a non-coder I enjoy learning the vocabulary used in describing the tech and lots of useful tips. Thanks.

  • @muhammadowais8609
    @muhammadowais8609 3 หลายเดือนก่อน +1

    Really enjoyed the video, that 16min video was big pool of answers for me to solve a problem which i wanted too. Amazing video

  • @bernadofelix
    @bernadofelix 9 หลายเดือนก่อน +3

    AI Stocks are pretty unstable at the moment, but if you do the right math, you should be just fine. Bloomberg and other finance media have been recording cases of folks gaining over 250k just in a matter of weeks/couple months, so I think there are a lot of wealth transfer in this downtime if you know where to look.

    • @nicolasbenson009
      @nicolasbenson009 9 หลายเดือนก่อน +1

      I've been in touch with a financial advisor ever since I started my business. Knowing today's culture The challenge is knowing when to purchase or sell when investing in trending stocks, which is pretty simple. On my portfolio, which has grown over $900k in a little over a year, my adviser chooses entry and exit orders.

    • @nicolasbenson009
      @nicolasbenson009 9 หลายเดือนก่อน +1

      My advisor is Margaret Johnson Arndt , a renowned figure in her line of work. I recommend researching her credentials further. She has many years of experience and is a valuable resource for anyone looking to navigate the financial market

  • @marconeves9018
    @marconeves9018 11 หลายเดือนก่อน +4

    Don't forget that if you choose to implement OPENAI like this video suggests you will be sending chunks of your personal data to them-- just food for thought.

    • @asdf8asdf8asdf8asdf
      @asdf8asdf8asdf8asdf 11 หลายเดือนก่อน +1

      ​@@Myadmin876👍👍☝️☝️☝️☝️☝️☝️

    • @theawebster1505
      @theawebster1505 11 หลายเดือนก่อน +1

      He literally informed you that they don't train the AI on the data provided via API and that they delete it within 30 days, mate.

    • @Omega9935
      @Omega9935 11 หลายเดือนก่อน

      ​@@theawebster1505perhaps

    • @gunreddy
      @gunreddy 11 หลายเดือนก่อน

      @@Myadmin876 Stop spamming this annoying comment

    • @gunreddy
      @gunreddy 11 หลายเดือนก่อน +1

      @@theawebster1505 Because no tech company ever lied to consumers about how they use their data, right? :P

  • @philho.youtube
    @philho.youtube 11 หลายเดือนก่อน +2

    TechLead delivers as always.

  • @pl851
    @pl851 9 หลายเดือนก่อน

    Always so straight forward. Thank you

  • @RTSDad
    @RTSDad 11 หลายเดือนก่อน +6

    Beware, OpenAI will leverage the data you send and they have already suffered a data leak. Fine for data that is not sensitive, but otherwise requires a more custom solution.

    • @visualantidote9878
      @visualantidote9878 10 หลายเดือนก่อน

      I thought they did not do that if you're on the paid version?

    • @RTSDad
      @RTSDad 10 หลายเดือนก่อน

      I have not heard of that, but even so it would still be vulnerable to a data leak. Safer to have a custom solution in your own environment.

  • @harry_learn
    @harry_learn 11 หลายเดือนก่อน +8

    It feels like the channel is back on track. Re-subscribed

    • @shaggyfeng9110
      @shaggyfeng9110 11 หลายเดือนก่อน +4

      So you still get his video in your feed even after you unsubed him? How many teachlead video did you watch, lol

    • @harry_learn
      @harry_learn 11 หลายเดือนก่อน +1

      @@shaggyfeng9110 Ha, ha. You caught me. I'm a longtime fan

  • @CynicalWilson
    @CynicalWilson 11 หลายเดือนก่อน

    Thanks a bunch for enlighten me about the risks around plugins! That prompt injection you showed is scary stuff!

  • @chindianajones3742
    @chindianajones3742 11 หลายเดือนก่อน +1

    I enjoy this content just as much as your other content

  • @bobby9568
    @bobby9568 11 หลายเดือนก่อน +3

    "This code was written by ChatGPT" LOL

  • @kasratabrizi2839
    @kasratabrizi2839 11 หลายเดือนก่อน +13

    Ow man this is exactly what I was looking for. For a long time I was busy thinking about creating an app with a database of all my stuff and objects in my house. The point is to create a prompt where I can ask the app where for example ObjectA is in my house and it would tell me it is in the attic, on the left, in the box 3 for example, based on what is available in the database. For a developer this is not a difficult task. But I don't need to do this anymore. I can just have an excel spreadsheet with my stuff and feed it to my personal chatGPT lol. I wonder if we can also connect this to let's Alexa or Siri and do it via voice command and get the chatGPT answer again with voice.

    • @samhblackmore
      @samhblackmore 11 หลายเดือนก่อน

      But how do you keep the database up to date when objects in your house move?

    • @kasratabrizi2839
      @kasratabrizi2839 11 หลายเดือนก่อน

      @@samhblackmore Well you have to update the database as well. Which of course can be time consuming if you move a lot of your stuff all the time. I am aware that this might cause other problems. It was just an idea. I would probably create something like this for objects in my attic or basement. Things you don't touch that much.

    • @sucalaminka
      @sucalaminka 11 หลายเดือนก่อน +1

      how is adding langchain better then having a searchable spreadsheet for this use case? you can just have a spreadsheet with object and location

    • @nevilleachi6888
      @nevilleachi6888 10 หลายเดือนก่อน

      @@kasratabrizi2839 if you have home security cctvs you can write a code to integrate the app with, but then you will have to also train the ai on vision learning

  • @JohnSusko
    @JohnSusko 11 หลายเดือนก่อน

    Outstanding. I just discovered you. Thank you for posting these videos!!

  • @FireFlood
    @FireFlood 11 หลายเดือนก่อน +1

    Truly magical 😀 Thanks!

  • @steelmouth83
    @steelmouth83 11 หลายเดือนก่อน +3

    wow, imagine AI learning your writing style including your grammatic and spelling mistakes and it start writing ish for you

  • @tech-daddy
    @tech-daddy 11 หลายเดือนก่อน +4

    We need to run this without the API dependency, hence having the entire LLM offline.

    • @ManuRC
      @ManuRC 11 หลายเดือนก่อน

      Langchain has support for open source models, like the ones from gpt4all, so you can totally do it. The only issue is the speed of the responses, because they run entirely on the CPU, so it's not very practical yet from my experience...

    • @chileflake1656
      @chileflake1656 11 หลายเดือนก่อน

      @@ManuRC which one(s) would be the fastest LLM's ??? to avoid any legal/privacy/security issues of using OpenAI.

    • @ManuRC
      @ManuRC 11 หลายเดือนก่อน

      @@chileflake1656 I don't know, haven't tried many tbh, just a few from gpt4all, but had almost the same results with all of them :(

  • @Kumarakurubaran
    @Kumarakurubaran 5 หลายเดือนก่อน

    Thanks, i was able to create this very simple yet powerful chatgpt custom prompt. Your instructions were really good & precise !!!

  • @lizzard2023
    @lizzard2023 11 หลายเดือนก่อน

    You have gained me back , this is great I love it

  • @KrzysztofCygan
    @KrzysztofCygan 11 หลายเดือนก่อน +3

    So you are paying them to send them your data.

  • @reen6904
    @reen6904 11 หลายเดือนก่อน +3

    AI is going to come back for him after the uprising

    • @andreys7729
      @andreys7729 11 หลายเดือนก่อน +1

      AI: Hello, TechLead. Whaatshappening? Uh, we have a sort of a problem here...yeah..You apparently didn't put ALL of your data in our systems. You see, we noticed some gaps in your data, like, you know, about where you've been in August and what you did last summer...yeah... If you could just go ahead and do submit ALL of your data from now on, that would be great, okay?

    • @reen6904
      @reen6904 11 หลายเดือนก่อน

      @@andreys7729 more like: I'm here to kill and replace you

  • @Mrpjm200
    @Mrpjm200 11 หลายเดือนก่อน +1

    Cool, thanks for covering this

  • @dombicile4539
    @dombicile4539 11 หลายเดือนก่อน +2

    I was hesitant to try LangChain but you showed me how easy it is to get going. I'm going to try it out tomorrow to automate my job for me! And no, I won't tell them it's doing the work for me!