Extremely complex concepts published in the simplest way! I could run the whole notebook typed without errors! Thank you for the clarity!
Thanks a lot, really happy to hear that! Cheers!
Most excellent. I am now a monthly supporter. You deserve to be paid.
Just yesterday I thought "pgvector would be interesting to see a video about".
And then you publish this! 👏👏👏🥳
Thank you Lyle. 🙏
Thanks a lot Sil!
oh man.. it has been a while and it is still the best tutorial out there.. It will be great to see something with pgvector again with django-ninja...
Thanks a lot! I'd love to do some more on PGVector - if anyone has any project ideas, let me know here!
Dude i fricken love your accent. The way you say pgvector sounds awesome, almost as awesome as this video, big thanks!
Haha thanks, glad you like it! :D
Excellent tutorial, well structured. I followed your tutorial on Alembic as well and it really helped me in my development 😂
Thanks for the note! And glad you liked the videos, cheers.
@bugbytes3923 I had to amend my note. You were correct: the objective is to minimise cosine distance, which is 0 for the most similar vectors and 2 for the most dissimilar 😅
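For anyone who wants to sanity-check those bounds, here is a minimal sketch in plain Python. It does not touch pgvector at all; `cosine_distance` is just an illustrative helper name, and it assumes both vectors are non-zero and of equal length. It computes 1 minus the cosine similarity, which is the quantity pgvector's `<=>` operator is documented to return.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Same direction (different length), orthogonal, and opposite direction.
same = cosine_distance([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
opposite = cosine_distance([1.0, 2.0], [-1.0, -2.0])

print(same, orthogonal, opposite)
```

The three cases land at (approximately) the bottom, middle and top of the range, so ordering search results by ascending distance puts the most similar rows first.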
Fantastic comprehensive walkthrough of how to use PGVector and Python to work with vectors for your AI stuff 😀
Thanks a lot Mattias!
@bugbytes3923 thank YOU, now looking into the one where you use Django as the front end to all of this 😊
What a fantastic video.
Thank you, BugBytes !
Thanks a lot!
If anyone has a problem getting Docker Desktop to work (you need it on Windows, for example, to get docker commands to work) and you get WSL errors while launching it, I recommend:
- either reinstalling or updating WSL
or
- (what worked for me) going into your BIOS and enabling the virtualization setting (if you cannot find it, it might be listed as "SVM" mode).
I wasted a bit of time getting this to work and was kind of disappointed that there was no mention of the virtualization settings, but besides that (I'm being very picky) great video! As a PGVector & Jupyter beginner I had almost no problems and everything worked for me. Thanks for the vid.
clear and well structured. you have an amazing style of teaching.
Awesome to hear, thanks a lot!
If you're curious how to use a local model instead of paying for OpenAI, I successfully followed this tutorial with Mistral 7B running on my 8GB M1 Mac!
Nice!
i'd love more information on how you did this? could you provide some code samples?
Really appreciate your efforts you have put in for this tutorial
Thanks a lot!
Brilliant content. Concise, no waffle. Thank you
Thanks a lot!
Thank you so much for sharing the details. Your informative YouTube videos have been incredibly helpful. Great job on putting together such valuable content! Keep up the outstanding work and continue enlightening us. We truly appreciate your contributions!
Thanks a lot, glad to hear that the videos have been helpful - thanks for watching and supporting the channel!
Straightforward explanation. Thank you
Thanks a lot!
Thank you so much for this tutorial! Very, very high quality!
Thanks a lot, glad you liked!
Dude thanks for making this. I always learn something from your videos. Thank you!
Thanks a lot, glad to hear that! Thank you for the support!
Thanks,
I am having this error when creating the "vector" extension
ERROR: Could not open extension control file "C:/Program Files/PostgreSQL/16/share/extension/vector.control": No such file or directory
Have you solved this problem? Pls help me to do this
Very thorough walkthrough. Thanks!
Fantastic video! Would be interesting to see a follow up on how this might work with Django?
Thanks a lot - I am planning a short video on Django and pgvector. There's a useful extension that integrates the two - coming soon!
@bugbytes3923 Could I ask what the extension is so I could have a look while you're creating the video? Love your content!
@sa_codes Thanks a lot! It's the same library I installed in this video to work with pgvector - this library has modules for working with Django - more details here:
github.com/pgvector/pgvector-python#django
@bugbytes3923 Amazing, thanks!
Thanks for sharing it. Very useful.
No problem - thanks for watching!
well done, thank you so much
Thank you, glad it helped!
thank you so much for this content!
Thanks a lot for watching!
This was super helpful, thanks!
Glad to hear that - thanks for watching!
fantastic content, thank you! would be great if you could do a more in-depth video on how to do indexing (HNSW) with the same jupyter notebook example
Edit:
- Problem 1: My Postgres container runs within WSL2, and I cannot connect to it with pgAdmin from Windows
- Solution: connect the pgAdmin container with the pgvector container.
- Problem 2: Object of type PosixPath is not JSON serializable
- Solution: convert my PosixPath to a string and pass that to TextLoader
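In case it helps someone hitting Problem 2, here is a small sketch of why the error appears and what the string conversion changes. It uses only the standard library; the file path is a made-up example, and the dictionary merely stands in for whatever metadata ends up being JSON-encoded. I am assuming the same `str(...)` value is what you would then hand to TextLoader.

```python
import json
from pathlib import Path

# Hypothetical file location, used only for illustration.
doc_path = Path("data") / "state_of_the_union.txt"

# json cannot encode Path objects, so this raises TypeError.
try:
    json.dumps({"source": doc_path})
    raised = False
except TypeError:
    raised = True

# Converting to a plain string first makes the value serializable.
serialised = json.dumps({"source": str(doc_path)})

print(raised, serialised)
```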
thanks for this, it was a great help!
Glad to hear that! Thank you for watching.
Thanks man, Great content!
Thanks a lot!
Great job! Extremely useful! Thanks.
Thanks a lot!
Is there any way I can use the data from the Postgres database directly, instead of using documents data?
Great video, thanks! I have a question: How can I link a document to a specific user when inserting and querying documents?
metadata
Great Video Sir
Thanks a lot!
I have a question. How do we load existing embeddings from the database? It should work offline, as we have already generated them; we just need to load them into memory, right?
thanks for the video!
do you know if there's a way to save the database locally after it's been initialised with `db = PGVector.from_documents(embedding=embeddings, documents=chunks, connection_string=connection_string)`?
e.g. Faiss has a save_local() function
I'm a beginner. What's the best free OpenAI alternative?
Thanks this is very helpful
Thanks a lot!
Excellent video. Any chance you could show S-BERT for generating the embeddings instead of OpenAI ada embeddings? A code snippet would be appreciated. Thanks and love your content.
Supabase uses their vec client for postgres/pgvector. This does not need docker but we are then limited to their free plan storage of 50MB. What do you think?
Fantastic! Where is the Jupyter notebook?
great video. How does this compare to FTS for search? When would you want to use that over this? Would they get the same results in this case for example?
Thanks! The mechanism for FTS is different, so there's no guarantee that the same results would be reached. Maybe I could do a video quickly comparing these methods!
@bugbytes3923 That would be a nice video, I think. One of the advantages of FTS over this for searching products is that, on a public website, you can't be DDoS'd into a much higher API bill.
Thank you so much for the great video! Can you please cover Anthropic Claude with PGVector? That would be a great help!
Good contents. Thanks.
Thanks a lot!
i am having this error, pls help me how to solve this
Could not open extension control file "/PostgreSQL/16/share/extension/vector.control": No such file or directory.extension "vector" is not available.
Super interesting video. I’m wondering if you know about how to prompt properly to openai to generate the vectors. By this I mean if there are ways to improve the quality of the vectors to query so the answer can be more precise. Thanks
With embedding models there is no prompting. These are not chat models.
@nedyalkokarabadzhakov5405 so basically the embedding needs to be created from the most accurate text that you can provide, right?
Hey, Can you also try to experiment with Langfuse and how it can be leveraged ?
I'll need to look into Langfuse. But possibly! I'm planning more GPT/vector/langchain videos.
Very interesting!
Thanks!
Hello, thank you for the video explanation,
I have a problem when running CREATE EXTENSION vector (at minute 20:07)
ERROR: Could not open extension control file "C:/Program Files/PostgreSQL/16/share/extension/vector.control": No such file or directory.extension "vector" is not available
ERROR: extension "vector" is not available
SQL state: 0A000
Detail: Could not open extension control file "C:/Program Files/PostgreSQL/16/share/extension/vector.control": No such file or directory.
Hint: The extension must first be installed on the system where PostgreSQL is running.
Can anyone help?
Is there any tutorial for the case where I already have a table in Postgres? I uploaded all the documents and created the index without LangChain, and now I want to access that database, but all the tutorials I found start from raw data and create the vector store in the process.
loving your videos man, thank you for the clear, concise explanation of these topics. Do you have any videos using RAG and agents in Django? I am using Django RestAPI and have been struggling with an agent controller that works fine in the notebook but then times out in my API request with the exact same code using Char ReAct Description?
hello. great video, helped me a lot with exactly what I was looking for!
Keep up the good work.
I have a question. I followed your video and downloaded the docker image, and I have pgAdmin 4, but when I try to create the extension, it says: Could not open extension control file "C:/Program Files/PostgreSQL/15/share/extension/vector.control": No such file or directory.extension "vector" is not available
Do you maybe know what is going on?
Thank you in advance
Thank you!
Regarding your problem: did you add the port mapping in the Docker run command? From port 5432:5432?
I suspect that pgAdmin is trying to connect to Postgres running locally on your machine, rather than in the Docker container. Do you have Postgres running on your machine locally? You may need to stop that if Postgres is running on the same port in the Docker container.
Not sure though, but let me know if you get it fixed or if you're still stuck!
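If you want to check for that port clash before digging further, a rough sketch like the one below can help. It only uses Python's standard library; `port_in_use` is an illustrative helper name, and 5432 is assumed because that is the port mapping used in the video. It tells you whether *something* is accepting connections on the port, not whether that something is the local Postgres or the Docker container.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


print("5432 in use:", port_in_use(5432))
```

Running it once with the container stopped is one way to see whether a local Postgres service is already holding the port; if it is, either stop that service or map the container to a different host port.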
@bugbytes3923 oh, thank you sooo much!
postgres did run locally on my machine on the same port as the docker container. so i had to stop those processes, and now it works!
can't wait for the django video with pgvector! keep up the good work
I am also facing this. Can you please add the steps so I can solve it too?
Thank you in advance
@ajaypalsingh6329 if you have both the Docker and the local Postgres in your pgAdmin, you should stop those processes in Task Manager. Go to Processes and end all processes related to Postgres. That is what worked for me, honestly.
I don't know if you have the same issue.
@ajaypalsingh6329 Windows or Mac?
What PostgreSQL permissions or operator functions are required or recommended for pgvector?
Is there any way to store the data in a custom schema instead of the public schema?
thanks. really helpful
Thanks for watching!
@bugbytes3923 Hey, I have this error. Do you know why?
connection_string = "postgresql+psycopg2://user:pass@localhost:5432/db"
collection_name = 'financial_qa'
db = PGVector.from_documents(
    embedding=instructor_embeddings,
    documents=texts,
    collection_name=collection_name,
    connection_string=connection_string
)
File ~\.conda\envs\financial_qa\lib\site-packages\langchain\vectorstores\pgvector.py:578, in PGVector.from_documents(cls, documents, embedding, collection_name, distance_strategy, ids, pre_delete_collection, **kwargs)
574 connection_string = cls.get_connection_string(kwargs)
576 kwargs["connection_string"] = connection_string
--> 578 return cls.from_texts(
579 texts=texts,
580 pre_delete_collection=pre_delete_collection,
581 embedding=embedding,
582 distance_strategy=distance_strategy,
583 metadatas=metadatas,
584 ids=ids,
585 collection_name=collection_name,
586 **kwargs,
587 )
File ~\.conda\envs\financial_qa\lib\site-packages\langchain\vectorstores\pgvector.py:453, in PGVector.from_texts(cls, texts, embedding, metadatas, collection_name, distance_strategy, ids, pre_delete_collection, **kwargs)
445 """
446 Return VectorStore initialized from texts and embeddings.
447 Postgres connection string is required
448 "Either pass it as a parameter
449 or set the PGVECTOR_CONNECTION_STRING environment variable.
450 """
451 embeddings = embedding.embed_documents(list(texts))
--> 453 return cls.__from(
454 texts,
455 embeddings,
456 embedding,
457 metadatas=metadatas,
458 ids=ids,
459 collection_name=collection_name,
460 distance_strategy=distance_strategy,
461 pre_delete_collection=pre_delete_collection,
462 **kwargs,
463 )
File ~\.conda\envs\financial_qa\lib\site-packages\langchain\vectorstores\pgvector.py:213, in PGVector.__from(cls, texts, embeddings, embedding, metadatas, ids, collection_name, distance_strategy, pre_delete_collection, **kwargs)
210 metadatas = [{} for _ in texts]
211 connection_string = cls.get_connection_string(kwargs)
--> 213 store = cls(
214 connection_string=connection_string,
215 collection_name=collection_name,
216 embedding_function=embedding,
217 distance_strategy=distance_strategy,
218 pre_delete_collection=pre_delete_collection,
219 **kwargs,
220 )
222 store.add_embeddings(
223 texts=texts, embeddings=embeddings, metadatas=metadatas, ids=ids, **kwargs
224 )
226 return store
TypeError: langchain.vectorstores.pgvector.PGVector() got multiple values for keyword argument 'connection_string'
@bugbytes3923 nvm. The cause was that there is another connection_string in the virtual environment
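For anyone else who lands on this TypeError: the message itself is plain Python behaviour, and the sketch below reproduces it without LangChain. `make_store` is a hypothetical stand-in, not the library's real constructor, and I can't say for certain which second source supplied the value in the case above. The point is only that the same keyword arriving both explicitly and inside `**kwargs` triggers the error.

```python
def make_store(connection_string, collection_name="default", **kwargs):
    """Hypothetical stand-in for a constructor; not LangChain's real code."""
    return {
        "connection_string": connection_string,
        "collection_name": collection_name,
    }


# Extra keyword arguments that already carry a connection_string.
extra = {"connection_string": "postgresql://from-somewhere-else"}

try:
    make_store(connection_string="postgresql://explicit", **extra)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Dropping the duplicate key before forwarding avoids the clash.
extra.pop("connection_string", None)
store = make_store(connection_string="postgresql://explicit", **extra)
```

So if the traceback ends in "got multiple values for keyword argument", it is worth looking for a second place where that same setting is being supplied.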
Hi, I find this video very informative and easy to understand.
However, I am getting the below error
when downloading pgVector image: Error response from daemon: pull access denied for arcane/pgvector, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
try this:
docker pull pgvector/pgvector:pg16
How do we connect a local ollama version of this to langsmith?
Is there any way to do hybrid search with this? Meaning, is it possible to do something like keyword search or some other filtering before doing semantic similarity? Or is this kind of feature only available in specific paid vector databases?
hi, do you know what dimensions value should I use when creating vector column?
In this video, it should be 1536-dimensions. We used OpenAI's latest embedding model to create the embeddings, which has output dimensions of 1536.
platform.openai.com/docs/guides/embeddings/second-generation-models
@bugbytes3923 thank you
Hi, I followed the steps you mentioned in the blog but I'm facing an issue while connecting and inserting vectors into the Postgres database.
Please find the error below:
texts = [d.page_content for d in documents]
^^^^^^^^^^^^^^
AttributeError: 'tuple' object has no attribute 'page_content'
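One way this exact AttributeError can come about (a guess, since the calling code isn't shown) is passing a list of `(Document, score)` tuples, such as the result of a "with score" similarity search, to something that expects plain documents. The sketch below reproduces it with a minimal stand-in `Document` class, which is only an illustration and not LangChain's real class.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Minimal stand-in for a LangChain Document; for illustration only."""
    page_content: str
    metadata: dict = field(default_factory=dict)


# Shape of a "with score" search result: (Document, score) pairs.
scored = [(Document("first chunk"), 0.12), (Document("second chunk"), 0.34)]

# Treating each tuple as a Document reproduces the error from the traceback.
try:
    texts = [d.page_content for d in scored]
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Unpacking the pairs first gives a plain list of documents.
documents = [doc for doc, _score in scored]
texts = [d.page_content for d in documents]

print(error_message)
print(texts)
```

If your `documents` variable comes from somewhere else, printing `type(documents[0])` before the insert should show quickly whether tuples are the problem.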
can I use any open source LLM instead of openAI for this?
There's a table of the available integrations here: docs.trychroma.com/integrations
Cohere, Google Gemini, Hugging Face, Ollama and more are supported, in addition to OpenAI.
There's no mention of installing PostgreSQL first.
The installation is done via the Docker commands.
@bugbytes3923 ers\Administrator> docker run --name pgvector5-demo -e POSTGRES_PASSWORD=test -p 5432:5432 ankane/pgvector
popen failure: Cannot allocate memory
initdb: error: program "postgres" is needed by initdb but was not found in the same directory as "/usr/lib/postgresql/15/bin/initdb"
Despite following the post steps several times, the error still appears. Maybe it's because I'm using Win10.
getting this ERROR: could not open extension control file "C:/Program Files/PostgreSQL/12/share/extension/vector.control": No such file or directory
Are you running Postgres in a Docker container?
If you're on Windows, you'll need to install PGVector:
github.com/pgvector/pgvector?tab=readme-ov-file#windows
@ naah I got the fix, it's basically to change the port number of either the docker run command or the local Postgres; if they are the same then we get the above error
Anyways thanks and keep doing what you do!!
Hi, how to change default table names? like langchain_pg_collection to something else
Kindly help me with the below error..
When I try to execute CREATE EXTENSION vector I'm getting the below error
ERROR: Could not open extension control file "/usr/share/postgresql/16/extension/vector.control": No such file or directory.extension "vector" is not available
ERROR: extension "vector" is not available
SQL state: 0A000
Detail: Could not open extension control file "/usr/share/postgresql/16/extension/vector.control": No such file or directory.
Hint: The extension must first be installed on the system where PostgreSQL is running.
Note - both Postgres and pgvector running in docker
This: CREATE EXTENSION vector; , worked for me
And i used this docker: docker pull pgvector/pgvector:pg16
Is it possible to use Chroma DB to load SQL data into a vector DB? There are not a lot of resources and I need to learn that.
hey! how do I get the uuid of records in the langchain_pg_embeddings table so I can delete them later?
What is the rationale behind calling embed_query vs embed_documents?
where can I get the notebook for this?
how to close pgvector connection, after everything is done.
-p 5432:5432
The postgresql and its pgvector have the same port mapping, is that right?
Yes, that's right!
how to do this with docs, csv and pptx files?
Typo in the blogpost:
`CREATE EXTENSION vector;` instead of `CREATE EXTENSION pgvector;`
GPT FineTune and Embeddings
Hi, I'm getting the error below while running the CREATE EXTENSION vector query in the database. Can you please help?
ERROR: Could not open extension control file "C:/Program Files/PostgreSQL/16/share/extension/vector.control": No such file or directory.extension "vector" is not available
ERROR: extension "vector" is not available
SQL state: 0A000
Detail: Could not open extension control file "C:/Program Files/PostgreSQL/16/share/extension/vector.control": No such file or directory.
Hint: The extension must first be installed on the system where PostgreSQL is running.
could you find a solution to this issue?
Same issue I'm facing
😢khoa học hỏi
create extension vector; ( recent update )
super awesome! It would be great to see this integrated with django-ninja to build a chat with pdf (but without using chatgpt, something similar to this th-cam.com/video/rIV1EseKwU4/w-d-xo.html which is essentially from primordial privategpt)