Awesome video on system design
Great work as always! This looks like Retrieval Augmented Generation if I am not wrong.
Great choice of problem statement
Pushing the algorithm ❤
I am currently developing a conversational AI application, similar to ChatGPT, which will be integrated within a financial investment platform. To maintain context throughout the conversation, I would appreciate guidance on the optimal implementation approach using LangChain, Azure OpenAI, and a vector database.
Since you’re developing an app with zero users, why would you want the ultimate optimal method used by big tech companies handling way more traffic than your tiny app?
Start small. I am building a chatbot as well. You can maintain context by storing the most recent messages, say the last 10. At the 11th message, you summarize the past 10 messages and store the summary. Then, with each request to the LLM, pass this summarized context along with the user query. If your app scales, you can move on to other optimization approaches.
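A minimal sketch of that rolling-summary idea, assuming an OpenAI-style chat client; the model name, prompts, and window size below are placeholders, not anything recommended in the video:

```python
# Rolling-summary chat memory: keep the last N messages verbatim,
# fold older ones into a running summary, and send summary + recent
# turns with every request. Sketch only -- names are placeholders.
from openai import OpenAI

client = OpenAI()            # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-4o-mini"        # placeholder model name
WINDOW = 10                  # keep this many recent messages verbatim

summary = ""                 # running summary of the older conversation
recent: list[dict] = []      # last few {"role", "content"} messages

def summarize(summary: str, messages: list[dict]) -> str:
    """Fold the overflowing messages into the running summary."""
    transcript = "\n".join(f'{m["role"]}: {m["content"]}' for m in messages)
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{
            "role": "user",
            "content": f"Current summary:\n{summary}\n\n"
                       f"New messages:\n{transcript}\n\n"
                       "Update the summary in a few sentences.",
        }],
    )
    return resp.choices[0].message.content

def chat(user_query: str) -> str:
    global summary, recent
    recent.append({"role": "user", "content": user_query})
    if len(recent) > WINDOW:                       # the 11th message arrived
        summary = summarize(summary, recent[:-1])  # fold the older 10 into the summary
        recent = recent[-1:]
    messages = [{"role": "system",
                 "content": f"Conversation summary so far: {summary or '(none)'}"}]
    messages += recent
    resp = client.chat.completions.create(model=MODEL, messages=messages)
    answer = resp.choices[0].message.content
    recent.append({"role": "assistant", "content": answer})
    return answer
```

LangChain also ships a summary-buffer conversation memory that implements essentially this pattern, so you don't have to hand-roll it once you adopt the framework.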
Very informative.
This is more of an AI chatbot design than a ChatGPT-style system design. Good try as always!!
Thank you for all the amazing content and hard work you put into your videos, Yogita!
Since caching and prefetching of documents are used for faster response times, how can cache invalidation be handled, especially when dealing with new data or updates in the vector database?
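One common pattern (my own sketch, not something covered in the video) is to key cached retrieval results by a collection version that gets bumped on every upsert to the vector DB, with a TTL as a safety net; all names below are hypothetical:

```python
# Version-keyed cache for retrieval results: bumping the collection version
# on every write makes old cache entries unreachable; a TTL cleans up the rest.
import time
import hashlib

class RetrievalCache:
    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self.version = 0              # bumped on every write to the vector DB
        self._store: dict[str, tuple[float, list[str]]] = {}

    def _key(self, query: str) -> str:
        raw = f"v{self.version}:{query}"
        return hashlib.sha256(raw.encode()).hexdigest()

    def invalidate(self) -> None:
        """Call after upserting or deleting documents in the vector DB."""
        self.version += 1             # old keys are now unreachable

    def get(self, query: str):
        entry = self._store.get(self._key(query))
        if entry is None:
            return None
        ts, docs = entry
        if time.time() - ts > self.ttl:   # TTL as a safety net
            return None
        return docs

    def put(self, query: str, docs: list[str]) -> None:
        self._store[self._key(query)] = (time.time(), docs)
```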
good
Thanks a lot! What kind of tool are you using to draw the diagrams here? Could you please let me know?
Miro
Please explain embeddings. It wasn't answered..
My two favourite people in one video ❤. It was pretty good; terms like HNSW and embeddings are new to me, and the explanation of the vector DB was pretty easy to understand. I loved the way of breaking the problem down into smaller problems and solving them one by one.
Informative video ❤
Bring Arpit B also.
👊
scary stuff
I don't get why Vector DBs are being used here. If we're searching for some documents, checking cosine similarity between indexed documents and user queries makes sense, but not for simple sequence completion. Am I missing something? Please help.
Edit: Okay, I looked it up and it's fairly simple. ChatGPT allows Retrieval Augmented Generation now, which in simple terms is automated/AI-assisted prompt engineering. The vector DB part just finds the relevant documents/chunks of documents and adds them to the prompt, which improves output quality. This video explains it quite well, it's actually much simpler than I expected: th-cam.com/video/Ylz779Op9Pw/w-d-xo.html&ab_channel=ShawTalebi
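To make that concrete, here is a bare-bones version of the retrieve-then-prompt flow the comment describes, assuming the OpenAI Python client; the model names are placeholders and the in-memory list stands in for a real vector DB (which would use an ANN index such as HNSW):

```python
# Bare-bones RAG: embed the query, find the nearest chunks by cosine
# similarity, and prepend them to the prompt. Sketch only.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=[text])
    return np.array(resp.data[0].embedding)

# A real system would store these in a vector DB; a list shows the idea.
chunks = ["Mutual funds pool money from many investors.",
          "An index fund tracks a market index such as the S&P 500."]
index = [(c, embed(c)) for c in chunks]

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    scored = sorted(
        index,
        key=lambda item: float(np.dot(q, item[1]) /
                               (np.linalg.norm(q) * np.linalg.norm(item[1]))),
        reverse=True)
    return [c for c, _ in scored[:k]]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Use the context below to answer.\n\nContext:\n{context}\n\nQuestion: {query}"
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content
```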
She said to assume that the intelligence is provided by the model and to consider it a black box. Why would you then go on and talk about the crawler?
This is kind of a RAG chatbot, not ChatGPT. Anyway, nice try 👍
Sounds more like low level design. I cannot really connect with this one.
I did find your previous content really helpful. But I am pretty sure Sam Altman didn't disclose the backend/system design of OpenAI to you both. As everyone knows, system design rounds are open-ended and no solution is perfect; you are just speculating about what they might have thought while designing the ChatGPT backend. Please back up your videos with research papers or other documentation/whitepapers before you discuss the system design of a company like OpenAI on a channel where many people come for interview preparation.
- FAANG engineer
We're happy you found our previous videos helpful, but the goal of these mock interviews is to showcase a thought process, not necessarily reveal the exact design of a system.
What's this? Does he even understand what GPT is? In the name of system design and making money, you all have made theory useless. Read something before doing all this BS.
At least we are trying. Besides, if you Googled any of us you would know that we really don't have to depend on money from YouTube lol
Hmm @@sudocode
This is a khichdi (a hodgepodge). These bhaiyas and didis can stop the world with a single "#". Haha.
Not favoring anyone, but whoever is the best, can you teach us on YouTube please? I need it.
@@ankushroy5606 I can do that, but it won't be free.
First!!! I'm the first.
Something about this mock interview seems unnatural. He proposes a design but doesn’t really go in depth, and tries to compensate for this by quoting white papers or saying “I’m choosing this algo because it’s easy to understand for ME”. Overall, this sounds like a solution that would give you a mid-level engineer offer.
Jealous much Mr Mystiks?
Not really. There are other videos you made, like the one on how to do resource estimation, which I’ve liked. But there are other channels, like Jordan Has No Life, that go into way more depth on specific questions, and reading some of DDIA makes it obvious why this solution wouldn’t get beyond a mid-level engineer offer.
First of all, one interview does not decide the level of the offer; for senior roles there are usually 2-3 system design rounds, and it is okay for a candidate to do okayish in one and really well in another. We spent time reading papers and recording a video that can provide some value to viewers. If you cannot appreciate it, there is no point in being negative in the comments. Even better, you can create your own channel and do a better job than us. We will be cheering for you. Thanks.
@@sudocode Try making a more in-depth video next time; it’ll help your interviewing skills and the channel. Read some DDIA, then come back.
Indeed he is very talented, but I don't know why he wanted to go deep into the vector DB and present it as if a ChatGPT-like tool relies on a vector DB only.
That is not even ML or gen AI or anything. Yes, the embedding model is an AI model that creates vectors out of text.
Actually, the system design question treats the AI model as a black box and asks for something like a RAG setup that also handles the chat history and the context-length limitation.
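In that spirit, the whole design boils down to prompt assembly around a black-box model: retrieved chunks plus trimmed chat history become the model's input. A hypothetical helper (not the video's design) that shows the shape of it:

```python
# The LLM stays a black box; the system's job is to assemble its input
# from retrieved chunks, a running summary, and recent turns.
def build_prompt(retrieved_chunks: list[str],
                 history_summary: str,
                 recent_turns: list[tuple[str, str]],
                 user_query: str,
                 max_chars: int = 8000) -> str:
    history = "\n".join(f"{role}: {text}" for role, text in recent_turns)
    prompt = (
        "Relevant documents:\n" + "\n---\n".join(retrieved_chunks) + "\n\n"
        "Conversation summary:\n" + history_summary + "\n\n"
        "Recent messages:\n" + history + "\n\n"
        "User: " + user_query
    )
    # Crude stand-in for the context-length limitation: truncate from the
    # front if the assembled prompt is too long (a real system would count
    # tokens and drop or re-summarize old history instead).
    return prompt[-max_chars:] if len(prompt) > max_chars else prompt
```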