I've been waiting for this functionality to pop up somewhere for months! Thanks a million for the heads up! Gonna go try it this evening.
nice! yeah it's definitely worth taking for a spin.
Connecting something like this to open source projects would be a great way to attract more contributors, since it would save a lot of time spent trying to understand all the code.
Your videos are very valuable and of high quality; I feel you are very knowledgeable. Thank you so much!
thanks for the kind words😎
This was a really informative and helpful video! Just what I needed! Thank you so much!
glad you got something out of it!
I tried this tool and I don’t recommend it. They say it’s open source, but you need an auth key to use it, and you can’t get away from giving them your information. They give you a 15-day free trial for the front end. It’s a waste of time, and now I know why more people don’t use it.
Dude this is awesome!
So many ideas spinning in my head. The most interesting project for me would be a vector DB of all the posts from my best friends in our group chat. We also meet every week to talk about a movie, book, or TV show that we all agree to watch, and I have recordings of all that.
It would be great to build this and integrate it into our private Slack or Discord channel as an interactive bot: "What would ___ think about this?" "Answer this question in the form of ___." Hilarity should ensue, and maybe some really interesting use cases too.
So potentially you could upload a book into a knowledge base and query it in natural language, ergo making reading books close to obsolete?
If you think that knowing things without having to ask an LLM is useless, then sure.
How exactly do you use books...
if your goal in reading books is to answer specific questions that you already have, then I think the answer is probably yes. But the thing about getting answers to questions is that it's only helpful when there's a gap in your knowledge that you're aware of - I think a lot of the knowledge to be gained from reading books comes from filling in gaps in our knowledge that we didn't even know existed.
@@codetothemoon fair point
Love your videos 👍🏻 and „great, good, excellent“ on the opposite end of a Cartesian plane to „JavaScript“ 😂
thank you! Glad someone caught this, was worried the joke would slip past everyone 😎
Wow, this product looks great. I can tell it's gonna be very successful!
I agree that the potential here is huge!
Thank you, very interesting. A question: let's say you have 1000 documents stored and you ask a natural language question to a system like this. Am I correct in assuming it does this:
- take the question and transform it into a vectorised search
- take the top results and look up the original text
- feed the resulting texts into a natural language bot like ChatGPT
- have the bot write an answer based on this data
If so, would this not severely limit what you can ask it? It would be a bit like the Bing integration of GPT-4: it has no overview of the information and can only get snippets, due to the limited input buffer of GPT models. Would this mean that it is impossible to get, for example, a summary of information in some documents you stored? (Since that would require full document access, not snippets and search results.)
I believe what you describe is roughly how it works. To your point, it seems like there'd be limitations to this approach - for example, if part of a document was deemed relevant to answering a question, but additional context provided elsewhere in the knowledge box changes the meaning of the excerpt deemed "relevant". That scenario might yield an answer that isn't accurate. I haven't personally tested these boundaries, so I can't say for sure whether this issue exists or how severe it is. LLMs that accept massive prompts (tens of thousands of tokens) are an active area of research and would probably help if this problem does exist.
@@codetothemoon Yes, I remember seeing a paper on models that handle over 10k tokens without much performance degradation. That would certainly help. Thank you for the reply, and I really enjoy your videos!
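For anyone curious what that "limited input buffer" constraint looks like in practice, below is a rough Rust sketch of the snippet-stuffing step the question describes. It's a generic illustration, not Nuclia's actual code: the word-count token estimate, the snippet separator, and the prompt wording are all assumptions.

```rust
// Pack already-retrieved snippets (assumed sorted best-first) into one prompt
// without exceeding the model's input limit. Generic sketch only; word count
// is used as a crude stand-in for real token counting.
fn build_prompt(question: &str, snippets: &[String], token_budget: usize) -> String {
    let mut context = String::new();
    let mut used = 0;
    for snippet in snippets {
        let cost = snippet.split_whitespace().count(); // rough token estimate
        if used + cost > token_budget {
            break; // anything beyond this point never reaches the model
        }
        context.push_str(snippet);
        context.push_str("\n---\n");
        used += cost;
    }
    format!("Answer the question using only the context below.\n\nContext:\n{context}\nQuestion: {question}")
}
```

That `break` is exactly where the "summarize everything" use case falls over: whatever doesn't fit in the budget is invisible to the model.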
Really enjoy all your videos, you do a great job. I would be really interested in a tutorial about how you make your videos.
thanks so much for the kind words! I'd love to make such a tutorial - just need to figure out what the right medium for distributing such a thing would be. In the meantime, check out this video from Jeff of Fireship if you haven't already - he's got some pretty helpful tips - th-cam.com/video/N6-Q2dgodLs/w-d-xo.html
@@codetothemoon Thanks, I'll check it out
How does it do the question answering? What is going on under the hood?
my understanding is that it first does a semantic search of your entire knowledge box to find information that is likely relevant to answering the question, then feeds all the relevant information, along with the question, into an LLM. Nuclia says there are also some other clever techniques involved that they haven't gone over with me, but that's the high level idea.
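To make the semantic search step a bit more concrete, here's a minimal Rust sketch of ranking stored paragraphs against a question embedding using cosine similarity. The Paragraph type, the precomputed embeddings, and the brute-force scan are assumptions for illustration; a real vector database would use an approximate index rather than comparing against every stored vector.

```rust
// One indexed paragraph from the knowledge box (illustrative type, not Nuclia's).
struct Paragraph {
    text: String,
    embedding: Vec<f32>, // vector computed when the paragraph was indexed
}

// Cosine similarity: how strongly two embedding vectors point the same way.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}

// Return the `k` paragraphs most similar to the question's embedding.
fn semantic_search<'a>(question_vec: &[f32], paragraphs: &'a [Paragraph], k: usize) -> Vec<&'a Paragraph> {
    let mut ranked: Vec<&Paragraph> = paragraphs.iter().collect();
    ranked.sort_by(|a, b| {
        cosine(question_vec, &b.embedding).total_cmp(&cosine(question_vec, &a.embedding))
    });
    ranked.truncate(k);
    ranked
}
```

The text of the winners, together with the original question, is what gets handed to the LLM in the final step.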
Rewind is also doing this, tied to locally recording everything you do on your computer during the day, so you can search back through every piece of content you've viewed, meetings you've had, etc. I'd love to figure out how I can vectorize a large SQL Server database in some way to get conversational queries against the data within the system.
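On the SQL idea, one possible approach (sketched in Rust) is to render each row as a short text snippet, embed that text, and keep the table and primary key alongside the vector so answers can point back to the source row. The IndexedRow type and the embed function below are hypothetical placeholders, not any particular product's API.

```rust
// Illustrative only: one database row turned into something a vector index can store.
struct IndexedRow {
    table: String,
    primary_key: String,
    text: String,        // human-readable rendering of the row
    embedding: Vec<f32>, // vector used for semantic search
}

// Turn one fetched row (column name/value pairs) into text and embed it.
fn index_row(table: &str, primary_key: &str, columns: &[(&str, String)]) -> IndexedRow {
    let text = columns
        .iter()
        .map(|(name, value)| format!("{name}: {value}"))
        .collect::<Vec<_>>()
        .join("; ");
    IndexedRow {
        table: table.to_string(),
        primary_key: primary_key.to_string(),
        embedding: embed(&text), // hypothetical embedding call
        text,
    }
}

// Stub so the sketch compiles; swap in a real embedding model or API.
fn embed(_text: &str) -> Vec<f32> {
    vec![0.0; 384]
}
```

From there, conversational queries work the same way as with documents: embed the question, find the closest rows, and let the LLM answer from those rows' text.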
I miss the coding aspects of your channel. I feel like I’m just watching adverts from the highest bidder now. This is a trend among quite a few channels nowadays.
thanks for the feedback! ooc, are you not interested in vector databases in general, or are you interested but just want to see things more from the perspective of the code? you may not have made it this far, but fwiw in this one the coding part starts at about 14:45. I think it's important for software developers to have vector databases on their radar and in their toolbox these days, be it Nuclia or something else.
@@codetothemoon I’m interested in learning, I guess. I’m definitely interested in all things databases and would love to see a rudimentary implementation.
I’d guess that, like me, most of your viewers are Rustaceans looking to enhance their understanding after reading the usual books.
Well, he has to maintain the channel and provide us with good content, and to be fair, Nuclia is a nice option for vectorDBing.
Keep em coming, @codetothemoon !
Honestly, coming from web dev with JavaScript, I find these videos highly informative. I would happily sit through a bunch of ads for the comp sci info alone. That said, Nuclia is actually useful, and I plan to work with it this weekend. I spent last weekend setting up private GPT only to realise its limitations; this looks so much more polished.
I replicated the code from this tutorial letter for letter, but the results won't print; it doesn't seem to enter the for loop.
Lol at the subtle dunk on JS at 1:30
Dude, you have the best YouTube channel in terms of helping individuals with minimal programming skills learn and use advanced topics, and thank you for going in and explaining everything, I appreciate that.
thank you so much for the kind words! I definitely aim to make these as accessible as possible, I'm very happy whenever I hear that I've been successful 😎
Thank you for this. Any clues as to the pricing of the API? Couldn't find anything on their website (they need a little search box themselves :P)
Damn, this is inspiring. Gonna try it out by making a knowledge DB.
nice! would love to hear how it goes.
I wonder if, after I add this video to NucliaDB, someone searching for Tupac will land at 14:43 of this video 😂
Thank you for this fantastic content!
Thanks for the video, this topic very much interests me.
But why do you keep saying `semantic meaning`? The word semantic means "relating to meaning in language". Just say "meaning" or "semantic", not both.
👍Thanks.
thanks for watching!
😎🤖
🚀
Rust nation!
I request that you not use the words "good", "great", and "excellent" with JavaScript.
😆
Right in front of my face 😂
Poor JavaScript 🤣
don't feel sorry for it! 🙃
I was excited about this until I learned you need a professional web account... I will stick with flexgpt.
yeah my guess is that they do that to throttle bot signups; maybe try using login with GitHub or Google Workspaces?
%s/sytem/system/g
I know! I didn't notice this until after posting the video, was hoping it'd slip under everyone's radar, but you caught me.... 😎