PostgreSQL as VectorDB - Beginner Tutorial

  • Published on Dec 20, 2023
  • Want to get started with freelancing? Let me help: www.datalumina.com/data-freel...
    Need help with a project? Work with me: www.datalumina.com/consulting
    🔗 Links in this video
    github.com/daveebbelaar/langc...
    github.com/pgvector/pgvector
    dev.to/confidentai/why-we-rep...
    👤 Connect with me on LinkedIn
    / daveebbelaar
    👋🏻 About Me
    Hey there, my name is @daveebbelaar and I work as a freelance Data Scientist / AI Engineer and run a company called Datalumina. You've stumbled upon my YouTube channel, where I give away all my secrets when it comes to working with data. If you want to learn more about what I do, then head over to www.datalumina.com/

Comments • 38

  • @fabsync
    @fabsync several months ago +1

    A new fan here! It would be great to see a video where you use Streamlit or something similar to build a search with pgvector (full-text search).

  • @gr8tbigtreehugger
    @gr8tbigtreehugger 7 months ago

    Thanks for this! I was leaning towards pgvector and your video convinced me!

  • @bin4ry_d3struct0r
    @bin4ry_d3struct0r 7 months ago +1

    One of the things I learned in the past few months working with RAG-based LLMs is that it's definitely not one size fits all. The quality of inference depends on the embedding algorithm as well as the indexing and retrieval mechanism of the vector database.
    This was a great video!

  • @abhishekchopda4100
    @abhishekchopda4100 6 months ago

    Great Video! Helped me in my work! Thanks :)

  • @myhificloud
    @myhificloud 7 months ago

    Clean solution. This is helpful, thank you for this.

  • @tushaar9027
    @tushaar9027 several months ago

    Hi Dave, this is a great video, thanks for sharing the knowledge. I really liked the idea of using PostgreSQL. Could you please make a video on setting up Postgres on Azure?

  • @ConnorLeech
    @ConnorLeech 24 days ago

    In the video you are creating the data from text files, but it seems like a main advantage of having it in your Postgres db is being able to use / query the data in your tables.
    I'd love to see how to build a full-text search or something from data stored in regular Postgres tables!

  • @Michael-jl7wn
    @Michael-jl7wn 5 months ago

    How would this work if you were using more structured data that needed to be stored in columns and rows?

  • @krunkey
    @krunkey 2 months ago

    Thanks for the video. I'll be trying PGVector! Do you know of any good alternative to OpenAI embeddings that can be run locally?

  • @MaliciousCode-gw5tq
    @MaliciousCode-gw5tq several months ago

    I have a follow-up question: if, let's say, one chapter of a book has a total word count of 3k, will it be able to store all 3k words?

  • @erwinl7794
    @erwinl7794 6 months ago

    What about an open source vector store like qdrant?

  • @jennymelia
    @jennymelia 6 months ago

    LOL Dave, I was googling if I can use Postgres somehow instead of Pinecone and your video popped up 🤣🤣👍🏽👍🏽👍🏽 Love it!

    • @daveebbelaar
      @daveebbelaar 6 months ago +1

      Haha you're becoming a true engineer Jenny. Those are some pretty serious Google searches haha. Let me know if you need further help!

    • @jennymelia
      @jennymelia 6 months ago

      @@daveebbelaar for sure dude! 🤌🏽 trying to get in that coder level 😂😂😂

  • @say.xy_
    @say.xy_ 6 months ago

    Hi Dave, I'm also using pgvector, but the outputs are not really that good. Could you make a video on improving the performance of a RAG pipeline in LangChain and pgvector? Thanks.

  • @anand-st7mo
    @anand-st7mo 3 months ago

    Bro, did you do any indexing?

  • @eyemazed
    @eyemazed 6 months ago

    The thing that bothers me about using Postgres for RAG is that the vector search works fine, but its full-text search capabilities are severely handicapped. It doesn't support partial or fuzzy matching, so you can't really do a nice reciprocal rank fusion between resources retrieved by multiple channels (vector + full text). I'm going to try Elasticsearch next, as I've previously worked with it and it's really good at full-text search (TF/IDF, fuzzy search, partial search, stemming...), and the newer versions also support vector search. The downside is having to sync Elastic with your main db all the time...
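
    For readers unfamiliar with the reciprocal rank fusion mentioned in this comment, here is a minimal sketch of how two ranked result lists (one from vector search, one from full-text search) could be merged. It is independent of any database. The document ids, the function name `reciprocal_rank_fusion`, and the constant `k=60` (a commonly used default) are assumptions for illustration, not something taken from the video.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from two retrieval channels, best match first.
vector_hits = ["doc3", "doc1", "doc7"]
fulltext_hits = ["doc1", "doc9", "doc3"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
print(fused)
```

    Documents that appear in both lists (here `doc1` and `doc3`) rise to the top because their reciprocal ranks add up, which is the point of fusing the two channels.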

  • @henkhbit5748
    @henkhbit5748 6 months ago

    Thanks for showing pgvector. Weaviate is also free and can be run locally using Docker. I agree, I am for open source.

  • @touma4659
    @touma4659 several months ago

    thank you💖💖

  • @izzatirfan2794
    @izzatirfan2794 19 days ago

    Great!! I enjoy watching your videos. I tried running the code from your GitHub, but I am getting this error:
    ModuleNotFoundError: No module named 'pgvector_service'
    Then I tried to pip install pgvector_service, but this occurred:
    ERROR: Could not find a version that satisfies the requirement pgvector_service (from versions: none)
    ERROR: No matching distribution found for pgvector_service
    Do you have any ideas how to overcome this?

  • @EmilioGagliardi
    @EmilioGagliardi 7 months ago +1

    This was super interesting. Do you have a video that explains your PGVector setup (do you install the database locally or do you have a cloud account)? I'd love to have a setup where I can view my document collections and embeddings in my editor like that. I use VSCode right now, so not sure ... good stuff!

    • @daveebbelaar
      @daveebbelaar 7 months ago

      I talk about this near the end of the video

  • @DanielWeikert
    @DanielWeikert 7 months ago

    How do you update the vector store (e.g. replace outdated data)?
    br

    • @gr8tbigtreehugger
      @gr8tbigtreehugger 7 months ago

      Just update the outdated data like you would in any db.
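
      As a rough sketch of what "update it like in any db" could look like with pgvector: the row's text and its embedding are replaced together, since the embedding has to be recomputed from the new text with the same model. The table `documents(id, content, embedding)` and the helper names are hypothetical and are not the schema used in the video; the code only builds the statement and parameters for a driver such as psycopg and does not connect to a database.

```python
def to_vector_literal(embedding):
    """Format a list of floats in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


# Hypothetical table: documents(id bigint, content text, embedding vector)
UPDATE_SQL = (
    "UPDATE documents "
    "SET content = %s, embedding = %s::vector "
    "WHERE id = %s"
)


def build_update(doc_id, new_content, new_embedding):
    """Return (sql, params) for a parameterized update; nothing is executed here."""
    return UPDATE_SQL, (new_content, to_vector_literal(new_embedding), doc_id)


# The embedding values below are placeholders; in practice they would come
# from re-embedding the revised text with the same model as before.
sql, params = build_update(42, "Revised paragraph.", [0.1, 0.2, 0.3])
print(sql)
print(params)
```

      With a real connection, the pair would be passed to something like `cursor.execute(sql, params)`.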

  • @SigAiOC-ke3ss
    @SigAiOC-ke3ss 7 months ago +1

    I didn't fully understand it from the video, but are you comparing times between using Pinecone on a remote host vs Postgres run locally?

    • @daveebbelaar
      @daveebbelaar 7 months ago

      Not only processing time (because I know that's not a truly fair comparison), but also ease of use and data management.

    • @SigAiOC-ke3ss
      @SigAiOC-ke3ss 7 months ago +3

      @@daveebbelaar I get that, but in a production environment it makes a big difference, especially when you think of use cases. I would be curious to see a comparison between a cloud-hosted Postgres and Pinecone, or between the locally hosted Postgres and something like Chroma.

  • @3wcdev878
    @3wcdev878 7 months ago

    But you tested it with a small dataset, most relational databases go slower as they grow.

  • @MichaelHoughton_
    @MichaelHoughton_ 7 months ago +1

    Could you put the vectors inside Firebase? That'd be epic

    • @3wcdev878
      @3wcdev878 7 months ago

      Nope, Firebase has a limit, tried it.

    • @MichaelHoughton_
      @MichaelHoughton_ 7 months ago

      @@3wcdev878 dang that’s unfortunate

  • @gilbertb99
    @gilbertb99 7 months ago

    Pinecone is managed, isn't it? There are more reasons why enterprises would use and pay for it. For simple side projects, then yeah, pgvector locally makes sense.

  • @greendsnow
    @greendsnow 6 months ago

    pgvector is the WORST performing vector db according to all comparison charts.
    you need to tell people if you're sponsored by supabase, otherwise this is not ethical.

    • @daveebbelaar
      @daveebbelaar 6 months ago +3

      Can you share some more insights on this? And no, I am not sponsored or affiliated with Supabase.