Dude, I was just researching how to do this yesterday, and today you make a video! Very timely!
I got you!
We're sharing the same timeline, brothers.
Thank you! ✨ Had been hoping for this video for a while :)
Awesome!
There is a high probability that the results returned by "keyword match" are always dropped by the "re-ranker". Because the re-ranker is basically powered by an LLM, if the keyword-based results are not picked by semantic matching in the first place, then the re-ranker will also reject them, since it also matches/compares results semantically. Hope that makes sense!
Good point and that definitely makes sense.
I found this description about Rerankers on the Voyage AI website:
"Unlike embedding models that encode queries and documents separately, rerankers are cross-encoders that jointly process a pair of query and document, enabling more accurate relevancy prediction. Thus, it is a common practice to apply a reranker on the top candidates retrieved with embedding-based search (or with lexical search algorithms such as BM25 and TF-IDF)."
Perfect! Awesome.
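For anyone curious what that cross-encoder step looks like in code, here is a minimal sketch of reranking hybrid-search candidates with an open-source reranker; the model name, query, and candidate documents are placeholder assumptions, not from the video:

```python
# Minimal sketch: rerank hybrid-search candidates with a cross-encoder.
# Assumes sentence-transformers is installed and the candidates already
# come from your embedding + keyword (BM25 / full-text) search.
from sentence_transformers import CrossEncoder

query = "how do I tune pgvector index parameters?"
candidates = [
    "pgvector supports HNSW and IVFFlat indexes...",
    "TimescaleDB hypertables partition data by time...",
    "You can tune hnsw.ef_search per session to trade recall for speed...",
]

# The model name is just an example of a public cross-encoder checkpoint.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

# The cross-encoder scores each (query, document) pair jointly,
# unlike the embedding model that encodes query and document separately.
scores = reranker.predict([(query, doc) for doc in candidates])

# Keep the top-scoring documents, regardless of whether they came from
# the semantic or the keyword branch of the hybrid search.
reranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
for doc, score in reranked[:2]:
    print(f"{score:.3f}  {doc[:60]}")
```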
I hope you continue this amazing RAG playlist.
Btw, why do you prefer pgvector over Pinecone and Qdrant?
I have a few more things coming! Why pgvector? It's open-source, simple, and you can use one database per project instead of a relational database plus a vector DB.
It's built on Postgres, which has an incredible track record of stability and security and is the most widely loved database out there. It's also a lot faster than any dedicated vector database and comes with a range of interesting open-source features like TimescaleDB's "hypertables", allowing much faster querying on time-stamped data (there are some cool hacks to make this work for any type of data).
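To make the hypertable idea concrete, here is a minimal sketch of a time-stamped table with a pgvector column turned into a hypertable. It assumes a Postgres instance that ships both extensions (for example the TimescaleDB Docker image); the connection string, table, and column names are made up for illustration:

```python
# Minimal sketch: a documents table with an embedding column, partitioned
# by time as a TimescaleDB hypertable. Requires the timescaledb and vector
# extensions to be available on the server.
import psycopg2

conn = psycopg2.connect("postgresql://postgres:password@localhost:5432/postgres")
conn.autocommit = True
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS timescaledb;")
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")

cur.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        id         BIGSERIAL,
        created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
        content    TEXT,
        embedding  VECTOR(1536)
    );
""")

# Partition by created_at so time-bounded queries only scan relevant chunks.
cur.execute("SELECT create_hypertable('documents', 'created_at', if_not_exists => TRUE);")
```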
Excellent video, Dave! Some possible extensions could be: orchestration with LangGraph, caching (Redis?), storing chat history, etc. We're building something beautiful for the community and it's great to see we're on a similar path. Btw, regarding TimescaleDB - I didn't realize the timescale-vector extensions were available for self-hosting? Interesting. We were using only SQL before, but now we will only use it for schema changes (Alembic). Cheers!
What would be good options for handling the memory of the agent, e.g. during continuous chat conversations? Or is there any strategy so the agent knows it has already summarized the same records retrieved from the RAG? I am thinking it could save some processing/resources + avoid loops of answering the same thing.
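One simple pattern (not from the video, just a sketch) is to track which record IDs each conversation has already seen and only send unseen chunks to the model. The `retrieve` callable and the record shape below are hypothetical placeholders:

```python
# Minimal sketch: per-conversation deduplication of retrieved records so the
# agent does not re-summarize the same chunks and burn tokens in a loop.
from collections import defaultdict

seen_records: dict[str, set[str]] = defaultdict(set)

def retrieve_new(conversation_id: str, query: str, retrieve) -> list[dict]:
    """Return only records this conversation has not summarized yet."""
    results = retrieve(query)  # e.g. your pgvector hybrid search
    fresh = [r for r in results if r["id"] not in seen_records[conversation_id]]
    seen_records[conversation_id].update(r["id"] for r in fresh)
    return fresh

# If nothing new comes back, the agent can answer from chat history
# instead of summarizing the same records again.
```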
Awesome as always.
Would be great to add pg_search (ParadeDB) to it, instead of using the default full-text search in Postgres. Curious, is this your plan for the next tutorial? Would really appreciate it.
I wanted to do this for this tutorial but got stuck after three hours tinkering with both Docker images. But I'll have one of my more technical engineers look into it!
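In the meantime, for anyone wondering what the built-in alternative looks like, here is a minimal sketch of Postgres's default full-text search, the piece that ParadeDB's BM25 index would replace. Table and column names are illustrative assumptions:

```python
# Minimal sketch: default Postgres full-text search with tsvector matching
# and ts_rank ordering over a hypothetical documents table.
import psycopg2

conn = psycopg2.connect("postgresql://postgres:password@localhost:5432/postgres")
cur = conn.cursor()

query = "shipping policy refunds"
cur.execute(
    """
    SELECT id, content,
           ts_rank(to_tsvector('english', content),
                   plainto_tsquery('english', %s)) AS rank
    FROM documents
    WHERE to_tsvector('english', content) @@ plainto_tsquery('english', %s)
    ORDER BY rank DESC
    LIMIT 10;
    """,
    (query, query),
)
for row in cur.fetchall():
    print(row)
```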
Thanks for the video @Dave! Much appreciated! Are you planning, by any chance, to make a video in the near future about how to utilize graph databases in the RAG world?
Interesting topic. Will note it.
Excellent video, love your approach, thank you.
You're very welcome!
Is it better than Elasticsearch? Have you evaluated the retrievals using metrics like hit rate and MRR?
I am planning on doing another video on evaluation.
Great content, thanks for sharing 😂
Thank you! Could you add a video where you implement multi-query retrieval? 🙏😃
Hey, thank you for your video.
I want to use this PostgreSQL and pgvector system on my local machine without Docker and API keys, for example with Llama 3.2. Is there an option for that, and what is the best-performing setup?
Or could you make a video about a fully local system and the best configuration for it?
Please lead me. Regards
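It should be doable without Docker or cloud API keys: run Postgres with pgvector natively and get embeddings from a local model server such as Ollama. A rough sketch; the model name, vector dimension, and schema are assumptions, not from the video:

```python
# Minimal sketch of a fully local setup: embeddings from Ollama's local REST
# API stored and queried in pgvector. Assumes Ollama is running on
# localhost:11434 with an embedding model pulled (`ollama pull nomic-embed-text`)
# and Postgres has the vector extension installed.
import requests
import psycopg2

def embed(text: str) -> list[float]:
    resp = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": text},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

conn = psycopg2.connect("postgresql://postgres:password@localhost:5432/postgres")
cur = conn.cursor()

# nomic-embed-text returns 768-dimensional vectors.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute(
    "CREATE TABLE IF NOT EXISTS docs "
    "(id BIGSERIAL PRIMARY KEY, content TEXT, embedding VECTOR(768));"
)

text = "Our return policy lasts 30 days."
cur.execute(
    "INSERT INTO docs (content, embedding) VALUES (%s, %s::vector)",
    (text, str(embed(text))),
)
conn.commit()

# Cosine-distance search with pgvector's <=> operator.
cur.execute(
    "SELECT content FROM docs ORDER BY embedding <=> %s::vector LIMIT 3",
    (str(embed("how long do I have to return an item?")),),
)
print(cur.fetchall())
```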
Could you combine this with Anthropic’s Contextual Retrieval?
@@bramjanssen8865 Yes, that blog post inspired me to make this video. I will do a separate video on how to integrate that as well.
How rich are you? Contextual Retrieval is crazy expensive.
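It is pricey because every chunk gets an extra LLM call, though prompt caching brings the cost down a lot. The core of the technique is small; here is a hedged sketch of the idea from Anthropic's blog post (the model alias, prompt wording, and helper name are my assumptions, not the video's code):

```python
# Minimal sketch of Contextual Retrieval: before embedding each chunk, ask a
# small model to situate it within the whole document and prepend that context.
# Prompt caching (not shown here) is what keeps the cost manageable.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def contextualize(document: str, chunk: str) -> str:
    prompt = (
        f"<document>\n{document}\n</document>\n\n"
        f"Here is a chunk from that document:\n<chunk>\n{chunk}\n</chunk>\n\n"
        "Write a short context (1-2 sentences) situating this chunk within "
        "the overall document, to improve search retrieval. Answer with the "
        "context only."
    )
    message = client.messages.create(
        model="claude-3-5-haiku-latest",
        max_tokens=150,
        messages=[{"role": "user", "content": prompt}],
    )
    context = message.content[0].text.strip()
    # Embed and index this combined string instead of the raw chunk.
    return f"{context}\n\n{chunk}"
```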
Does anyone know if it's as straightforward to use Supabase Postgres and pgvector instead? Supabase now offers vector index compatibility out of the box; you just have to come up with your own embedding solution.
I think it should be possible. We're also using Supabase for a lot of our projects. I'll look into it!
@@daveebbelaar Thanks. If you could make a shorter video that compares reranking or contextual retrieval using Timescale vs Supabase at a higher level, maybe like 10 minutes, it would help a ton for those of us that use TypeScript and other stacks! I have been going down the rabbit hole; the possibilities are insane.
Hey @@daveebbelaar, I just finished my RAG chatbot application using TypeScript, Vercel AI SDK, Next.js, and Supabase vector-enabled tables.
*If you ever get questions about that in the future:*
It's definitely much slower, 500ms on average for retrieving 25 records. It's fine with text streaming and proper state and load management, but I would highly suggest building these technologies in Python unless you really need to serve users or make an app for a client. Even then, I would probably make a Python app and call its API from a lighter Next.js application if there was money on the table. Lots of bugs... AI on JavaScript is an absolutely horrible experience.
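For the Supabase route discussed in this thread, the retrieval path can also be driven from Python. A minimal sketch, assuming you have created a `match_documents` SQL function like the one in Supabase's pgvector guide; the function name, its parameters, and the OpenAI embedding call are assumptions:

```python
# Minimal sketch: query Supabase pgvector from Python via an RPC call to a
# user-defined match_documents() Postgres function, with OpenAI embeddings
# as the "bring your own embedding" piece.
import os
from openai import OpenAI
from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])
openai_client = OpenAI()

query = "what is the shipping policy?"
embedding = openai_client.embeddings.create(
    model="text-embedding-3-small", input=query
).data[0].embedding

# Call the match_documents() function defined in your Supabase database.
response = supabase.rpc(
    "match_documents",
    {"query_embedding": embedding, "match_count": 5},
).execute()

for row in response.data:
    print(row)
```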
👑👑 👑
🙏🏻
How can I create a real app (in production)? Can you make a video about that, or do you offer consulting?
What are the pros of using vector and hybrid search in a PostgreSQL database, instead of just using, let's say, Azure AI Search?
Pricing and flexibility: It's WAY cheaper and you'll have more granular control over the system without vendor lock-in.
@@daveebbelaar Ah okay! I get the benefit of avoiding vendor lock-in. But is it really cheaper for a production-ready RAG application? As far as I know, at least on Azure, AI Search costs approximately between $450 and $650 per month, while the costs for a PostgreSQL database with similar performance are in the same range.
@@jonaaapschl1204 You can get a managed Postgres database on Azure for about $25 per month, although it will not work with pgvectorscale yet because the extension is not supported. If you deploy the database straight on a server, you only pay the server cost, which will also be about $30 or so per month on Azure. Of course, Azure AI Search is really convenient, but it's built so that when you scale, you're going to pay a lot - and they can increase prices whenever they want. So you really have to consider your use case, team, budget, and long-term plans to pick the best solution.
@@daveebbelaar Hey Dave, just joining this discussion. Why not use Elasticsearch? It's open source, it allows the reranking logic to be customized to your liking, and the reranking happens on the Elasticsearch instance as the search is executed.
@@daveebbelaar I am quite sure that in GCP the Postgres instances support the vector extension. Worth checking. And as stated above, the cheapest Postgres instance will cost around $25, a slightly better one around double that.
Hi Sir! I am 15. I am not even a beginner, but I want to make a career in AI. However, I don't know which field in AI I should choose.
Can you help me learn about the career options in AI and what they are, or point me to sources where I can learn about them?
🙏 Please reply