Pandas DataFrame Agent... the future of data analysis?

  • Published on 15 Sep 2024
  • 👉🏻 Kick-start your freelance career in data: www.datalumina...
    Let's dive into the Pandas DataFrame Agent from the LangChain library to see how we can integrate analytical capabilities into LLM apps. We use the OpenAI API to ask questions about an Excel/CSV dataset and experiment with the possibilities and limitations of this LangChain Toolkit.
    🔗 Links
    github.com/dav...
    ⚙️ Copy my VS Code Setup • How to Set up VS Code ...
    👋🏻 About Me
    Hey there, my name is @daveebbelaar and I work as a freelance data scientist and run a company called Datalumina. You've stumbled upon my YouTube channel, where I give away all my secrets when it comes to working with data. I'm not here to sell you any data course - everything you need is right here on YouTube. Making videos is my passion, and I've been doing it for 18 years.
    While I don't sell any data courses, I do offer a coaching program for data professionals looking to start their own freelance business. If that sounds like you, head over to www.datalumina... to learn more about working with me and kick-starting your freelance career.

Comments • 45

  • @daveebbelaar
    @daveebbelaar  1 year ago +3

    👉🏻Kick-start your freelance career in data: www.datalumina.io/data-freelancer
    👉🏻Learn more about data science and AI: www.datalumina.io/newsletter

    • @igoweiqibaduk8283
      @igoweiqibaduk8283 1 year ago

      Hi Dave, I could not find your email. The call-booking tool on the /data-freelancer page (step 2, after the video) is not working: it only says that July is unavailable, and the month switch does not work. Regards, George.

    • @daveebbelaar
      @daveebbelaar  1 year ago +1

      @igoweiqibaduk8283 Hey George, thanks for your message. It is correct that the calendar is fully booked right now. I am expecting to take some more calls in 2-3 weeks. You are welcome to subscribe to our newsletter to stay updated on availability.

    • @rajatkumarsinha2159
      @rajatkumarsinha2159 11 months ago

      Hi Dave,
      Can you guide me on how to use my CSV file with Dolly 2.0 and LangChain for question answering like above?

  • @RanaGustico
    @RanaGustico 1 year ago +2

    About the calculations: have you tried this prompt?
    - "Act as an expert mathematician. Explain this step by step." (Those last words are sometimes required.)
    I've read about this workaround to make the AI self-correct before responding. Happy to watch you update and review with the new stuff. Nice content, sir!

  • @camilocampos5900
    @camilocampos5900 1 year ago +3

    Every day I am more impressed by the potential of LLMs with LangChain. I am a fan of knowledge; thank you for your content.

    • @wongyithong9838
      @wongyithong9838 1 year ago

      Exactly the same feeling. Every time I see the titles of these videos, I wonder what apps I can build to solve real-world problems.

  • @joseluisbeltramone599
    @joseluisbeltramone599 1 year ago +3

    Hi Dave, thank you very much for the excellent explanation. Would you please do a video where you run into the token limit of the LLM? I would like to see how to overcome it. Thanks in advance!

  • @AwB
    @AwB 1 year ago +1

    Great video. The two-dataframes part was interesting. I was hoping I could pass in a summary dataframe and a raw dataframe, tell the LLM what is in each dataframe, and then ask it to write an article using both dataframes: "Write an article on this month's results (which are in the summary dataframe), and also don't forget to mention some interesting related facts from the raw dataframe." This would require it to join the dataframes together.
    Do you think this is possible yet? I see lots of "ChatGPT with your database" examples, but I'm curious how it can work with multiple tables of data.
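
Editor's sketch: one hedged way to approach the two-table question above is to join the tables in pandas first and hand a single combined frame to whatever tool writes the text. The column names (`month`, `region`, `revenue`) and all values are invented for illustration, and the summary table is derived from the raw one here only to keep the example self-contained.

```python
import pandas as pd

# Hypothetical raw table: one row per region and month (invented values).
raw = pd.DataFrame(
    {
        "month": ["2023-05", "2023-05", "2023-06", "2023-06"],
        "region": ["EU", "US", "EU", "US"],
        "revenue": [120, 200, 150, 180],
    }
)

# Hypothetical summary table: one row per month.
summary = raw.groupby("month", as_index=False).agg(total_revenue=("revenue", "sum"))

# Attach the monthly total to every raw row so both levels of detail sit in one frame.
# validate="many_to_one" raises if the summary has duplicate months.
combined = raw.merge(summary, on="month", how="left", validate="many_to_one")
combined["share_of_month"] = combined["revenue"] / combined["total_revenue"]
print(combined)
```

Joining up front keeps the join logic under your control instead of relying on the model to infer how the two tables relate.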

  • @MikeRhodesIdeas
    @MikeRhodesIdeas 8 months ago +1

    @daveebbelaar any plans to update this for langchain 0.1.0 ?? Maybe in the members' area??

  • @quickandsmart6298
    @quickandsmart6298 1 year ago

    I've actually looked at this dataset before, and one thing I noticed is that the agent made another error at 11:30. It found the median salary using the salary column and not the salary_in_usd column. For example, the Head of Machine Learning role only had a single person, who lived in India, so converting the 6,000,000 Indian rupees ends up being only 76k USD, far from what the results show. While the agent is very powerful, it's clearly not perfect: you have to make sure the questions you provide are specific enough and double-check the code it actually generates. Regardless, great video, and it's definitely a tool I'll look to use in later projects!

    • @daveebbelaar
      @daveebbelaar  1 year ago

      Ahh, good one! And thanks! Definitely something I missed
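
Editor's sketch: the pitfall described in this thread can be reproduced with a small made-up table. The column names `salary`, `salary_currency`, and `salary_in_usd` are assumed to mirror the dataset in the video. All rows are invented, except that the INR row reuses the 6,000,000 rupee and 76k USD figures quoted in the comment above.

```python
import pandas as pd

# Invented rows for illustration only; column names are assumed to match the dataset.
df = pd.DataFrame(
    {
        "job_title": [
            "Head of Machine Learning",
            "Data Scientist",
            "Data Scientist",
            "Data Scientist",
        ],
        "salary": [6_000_000, 100_000, 120_000, 90_000],
        "salary_currency": ["INR", "USD", "USD", "EUR"],
        "salary_in_usd": [76_000, 100_000, 120_000, 98_000],
    }
)

# Median of the raw column mixes currencies, so the INR figure looks enormous.
median_raw = df.groupby("job_title")["salary"].median()

# Median of the converted column compares like with like.
median_usd = df.groupby("job_title")["salary_in_usd"].median()

comparison = pd.DataFrame(
    {"median_salary": median_raw, "median_salary_in_usd": median_usd}
)
print(comparison)
```

Naming the exact column in the question (for example, "median of salary_in_usd per job title") is one way to make the agent's generated code easier to check.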

  • @JT-Works
    @JT-Works 1 year ago +1

    I am building a Streamlit app with the Pandas DataFrame Agent, and for the life of me, I cannot get the chatbot to keep any memory context in chat. Is there a tutorial where you cover this?
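
Editor's sketch: a generic workaround, not specific to LangChain and not taken from the video, is to keep the recent turns yourself and prepend them to each new question. `ChatHistory` is a hypothetical helper, and the question and answer strings are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class ChatHistory:
    """Keeps the most recent question/answer pairs (hypothetical helper)."""

    max_turns: int = 3
    turns: list[tuple[str, str]] = field(default_factory=list)

    def add(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))
        # Drop the oldest turns so the prompt does not grow without bound.
        self.turns = self.turns[-self.max_turns:]

    def build_prompt(self, question: str) -> str:
        lines = ["Previous conversation:"]
        for q, a in self.turns:
            lines.append(f"User: {q}")
            lines.append(f"Assistant: {a}")
        lines.append(f"Current question: {question}")
        return "\n".join(lines)


history = ChatHistory(max_turns=2)
# Placeholder turn; in a real app the answer would come from the agent.
history.add("How many rows are in the dataset?", "There are 500 rows.")
prompt = history.build_prompt("And how many columns?")
print(prompt)
```

In a Streamlit app the history object would need to live in `st.session_state`, since the script reruns on every interaction and ordinary variables are reset.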

  • @kumargaurav2170
    @kumargaurav2170 1 year ago

    I think using the memory component from LangChain will help overcome the memory-management bottleneck for operations requiring more than one step.

  • @gamerwager5317
    @gamerwager5317 1 year ago

    My suggestion as a YouTube viewer: make the videos shorter. Your voice is great as a background track, but add more info to the video, which adds value to the viewing time. 😊

  • @Canna_Science_and_Technology
    @Canna_Science_and_Technology 1 year ago

    Just an idea, a video using the new function feature would be great. ;-)

  • @RyanScottForReal
    @RyanScottForReal 1 year ago +1

    You need to apply a memory agent.

  • @autonate_ai
    @autonate_ai 8 months ago

    Perfect, my dawg!

  • @onangarodney7746
    @onangarodney7746 1 year ago

    Would it be more accurate if you added the Wolfram OpenAI plugin to the mix?

  • @shikharvarshney7010
    @shikharvarshney7010 1 year ago

    Awesome Explanation !!

  • @irvinJoelBanta
    @irvinJoelBanta 1 year ago

    Love your videos, keep it up

  • @prateekkeshari
    @prateekkeshari 1 year ago

    It's interesting to play with, and I have tried it out multiple times, but I do see its limitations. Sometimes it also outputs wrong answers. What (in your opinion) would it take for it to be production ready?

  • @user-ib8qm8eh3q
    @user-ib8qm8eh3q 10 months ago +1

    Hi Dave, can I please use an open-source model for this instead of OpenAI?

  • @micbab-vg2mu
    @micbab-vg2mu 1 year ago

    Great - Thank you

  • @HazemAzim
    @HazemAzim 11 months ago

    Nice, but did you try that with chat models, i.e. ChatOpenAI with gpt-3.5-turbo, which is much cheaper? I think the Pandas DataFrame agent will not work properly, though!

  • @DK-dp3kk
    @DK-dp3kk 8 months ago

    Thank you. Nice video. Do you know if you can summarize text within a cell of the dataframe? Say you have a dataset that includes blog posts and you want a new column that holds a two-line summary. Any ideas?
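
Editor's sketch: the per-cell part of this question is plain pandas, namely applying a function to one column and storing the result in a new one. The `summarize` function below is a stand-in that simply keeps the first two sentences; in practice it would be replaced by a call to an LLM, which is deliberately left out. The `post` column and its contents are invented.

```python
import re

import pandas as pd


def summarize(text: str, max_sentences: int = 2) -> str:
    """Placeholder summarizer: keeps the first sentences of the text.

    A real implementation would call an LLM here instead.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:max_sentences])


# Invented example posts.
df = pd.DataFrame(
    {
        "post": [
            "Pandas is a data library. It is widely used. It has many features.",
            "Agents can run code. They sometimes make mistakes.",
        ]
    }
)

# Apply the function to each cell and store the result in a new column.
df["summary"] = df["post"].apply(summarize)
print(df["summary"].tolist())
```

With a real LLM behind `summarize`, each row triggers one request, so cost and rate limits grow with the number of rows.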

  • @alchemication
    @alchemication 1 year ago

    Thanks for sharing. The reason this can fail in the real world is that business is way more complex and a ton of jargon is used. After spending hundreds of hours on this topic, I can conclude it's a good start, but for real-world scenarios on complex data we need to be way more creative. Best!

  • @tommyharlim276
    @tommyharlim276 11 months ago

    How do I put this sort of application on a website, so that I can upload my own data, enter a prompt, and have the result displayed on the site?

  • @xanderklein3356
    @xanderklein3356 1 year ago

    Awesome video. Can you do this with Node.js?

  • @nerding_io
    @nerding_io 1 year ago

    Very awesome!

  • @waddaa
    @waddaa 1 year ago

    I have been looking for a chain or agent that can work with tools and with your own files as well, but I couldn't find one. Is this even possible?

  • @madhu1987ful
    @madhu1987ful 4 months ago

    Can this work on big dataframes? Say, 1 million rows of data?

  • @temp911Luke
    @temp911Luke 3 months ago

    I would be more interested if you could use the REAL open AI models (open-source models) instead of GPT-4.

  • @justinchung982
    @justinchung982 10 months ago

    Please show how to do this with Llama 2!

  • @nanto88
    @nanto88 1 year ago

    awesome

  • @johnbrisbin3626
    @johnbrisbin3626 1 year ago

    I note that you again use text-davinci, which OpenAI claims is just a slower and more expensive way of getting what GPT-3.5 gives you for a fraction of the price.
    Have you found differently in real use?

    • @daveebbelaar
      @daveebbelaar  1 year ago

      You're right, for a real use case I would use gpt-3.5 or 4. These are configured a little differently because they are chat-based models, but they would indeed be the preferred option.

  • @ajaypranav1390
    @ajaypranav1390 5 months ago

    Why not use PandasAI?

  • @SMCGPRA
    @SMCGPRA 5 months ago

    Can we use an open-source LLM?

    • @girishnaik6433
      @girishnaik6433 4 months ago

      Did you get an answer? I'd really like to know.

  • @klammer75
    @klammer75 1 year ago +1

    Does the pandas agent take a memory parameter? I really like these agents when they can hold a little chat history... I had issues getting their CSV agent to hold onto the current convo, as it wouldn't take a 'working memory' parameter like some of the other agents would... Great video 🥳🦾🤓