I’ve been waiting for a video like this, as I was quite interested in using TimescaleDB.
If you already have a PostgreSQL database, this makes sense. But if you are starting from scratch, you could also use a dedicated vector database that supports hybrid search, which most do. That also gives you more options for embedding models, which could be critical for a given use case.
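For anyone unfamiliar with the term: hybrid search means combining a lexical (keyword) ranking with a vector-similarity ranking. One common way to merge the two result lists is reciprocal rank fusion. Below is a minimal sketch of that idea only; the function name, the document IDs, and the constant `k = 60` are illustrative assumptions, not something taken from the video or from any particular database.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by several searches rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword search and a vector search.
lexical_hits = ["a", "b", "c"]
vector_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
```

Documents that appear in both lists ("a" and "b" here) end up ahead of those found by only one search.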
Interesting take on embedding management! Love the sync mechanics. What would you say are the main commonalities and differences between pgvector, pgvectorscale and pgai?
Cool video! At around 8:35 you have a Python file open and select some lines which are then run and shown in an interactive Python environment on the right side. This is also awesome! How can this be done? Is this a VS Code extension?
Edit: Or do you run the lines by highlighting them and then pressing Shift + Enter?
Excellent! 🙏 for sharing!
My pleasure!
Great video! Can you make a video about security for RAG-Applications, options to avoid exploits etc?
Keep it up bro
Awesome
Where is the lexical search? Is there even a separate lexical search index?
Check out this video: th-cam.com/video/TbtBhbLh0cc/w-d-xo.html
Is there a way to avoid using OpenAI for the embeddings?
This is really great! I've been running into so many limitations using ChromaDB and Redis (with RediSearch). ChromaDB in particular uses a lot of memory as the number of embeddings increases, which gets very expensive to host. Is that also the case with pgai? Is it affordable? Also, random question, but what VS Code extension are you using to run the script as a Jupyter notebook at th-cam.com/video/8oTnUtFYAes/w-d-xo.htmlsi=J21f6uJF09ipZ7sj&t=903?
How efficient and feasible is pgvector as a vector store? I want to scale it to production.
It can handle production workloads with millions of vectors using pgvectorscale.
@daveebbelaar Also, if I use the TimescaleDB image, would that cause any trouble down the line?
@ Nope, it's just Postgres in the backend with added extensions.
First #notificationGang
You're fast, haha!
Can I do this with a Celery worker, with more control over chunking logic and metadata handling, or is this something that pgai can already do?
I'm using Supabase, which doesn't support pgvectorscale or pgai. Not sure if I should host plain PostgreSQL with these extensions myself rather than using Supabase with the pgvector extension, which they do support.
Seems like handling the embedding-related code myself is more manageable than running vanilla PostgreSQL, though.
Turns out, a Celery worker can do this. It's quite tricky to implement, though, when the database gets updated very frequently.
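For anyone trying the same worker approach, here is a rough sketch of the two pieces that tend to matter when rows change often: custom chunking with per-chunk metadata, and a content hash so that only changed documents get re-embedded. It uses only the standard library. Every name in it (`content_hash`, `chunk_text`, `plan_sync`, the metadata keys) is hypothetical, and the embedding call and the Celery task wrapper are deliberately left out. This is not how pgai does it internally.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text (SHA-256 hex digest)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunk_text(text: str, doc_id: str, size: int = 200, overlap: int = 50) -> list[dict]:
    """Split text into overlapping character windows, each with its own metadata."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        chunks.append({
            "doc_id": doc_id,        # hypothetical metadata keys
            "chunk_index": index,
            "start": start,
            "text": text[start:start + size],
        })
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


def plan_sync(documents: dict[str, str], stored_hashes: dict[str, str]) -> dict[str, list[str]]:
    """Compare current documents with the hashes saved at the last embedding run."""
    to_embed = [doc_id for doc_id, text in documents.items()
                if stored_hashes.get(doc_id) != content_hash(text)]
    to_delete = [doc_id for doc_id in stored_hashes if doc_id not in documents]
    return {"embed": to_embed, "delete": to_delete}
```

A periodic worker task could call `plan_sync`, run `chunk_text` and the embedding model only for the IDs under `"embed"`, and drop stored vectors for the IDs under `"delete"`. Comparing hashes rather than reacting to every write is one way to keep frequent updates from triggering redundant embedding calls.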