Would have been nice to know that if we want to do something like this, we first need to request access to the Titan Embeddings G1 - Text model. This was a great tutorial, but it glossed over setting up the access key/secret on AWS as well as requesting model access on AWS. Subscribing in the hope of seeing more guidance on how to set things up on the AWS side. Well done
It would be helpful if you covered the AWS Bedrock setup in a video
Great content! It would be nice to learn how you mounted the credentials in Docker (I tried to follow what you answered here in the comments to another person, but it seems like some steps for doing the mounting with the AWS CLI are missing)
On the machine where you are running this, the AWS CLI must be configured. You can test it by simply running aws s3 ls
If that works, there is a file at ~/.aws/credentials which needs to be mounted into the Docker container by passing the -v flag (for example, -v ~/.aws:/root/.aws)
Thanks
Thank you Girish. It's a very, very good tutorial!
You are welcome. I am glad you liked it.
Hello sir, great tutorial. Can we do the exact same thing with other types of data, not only PDFs (SQL tables, for example)?
Yes, I don't see why not.
You have 2 options:
1) If you are already using PostgreSQL, you can use the pgvector extension to turn it into a vector data store.
2) If you do not want to use that, or are not using PostgreSQL, and still want to vectorize your data and store it in a vector index, you can certainly do that.
The steps will be something like this:
1. Let's assume you are using Elasticsearch as the vector store.
2. Read from the table(s) with a SQL query to get the data.
3. Call an embedding model with the column(s) of data you need to generate the embedding for.
4. Create a JSON object with the plain text and the embedding data.
5. Make API calls to Elasticsearch to index the data.
On the querying side:
1. Convert the query into an embedding using the same model you used while indexing.
2. Run a similarity search on Elasticsearch (rough sketch below).
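Here is a rough Python sketch of those indexing and querying steps. Everything specific in it is a placeholder, not from the video: the PostgreSQL connection string, the products table and column, the Elasticsearch endpoint and index (assumed to already have a dense_vector mapping for the embedding field), and the Titan embedding model id.

```python
import json

import boto3
import psycopg2                                  # assuming the rows live in PostgreSQL
from elasticsearch import Elasticsearch

bedrock = boto3.client("bedrock-runtime")
es = Elasticsearch("http://localhost:9200")      # hypothetical Elasticsearch endpoint


def embed(text: str) -> list:
    # Titan Embeddings G1 - Text; any embedding model works, but use the SAME
    # one for indexing and for querying.
    response = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]


# Steps 1-2: read the rows you want to make searchable
conn = psycopg2.connect("dbname=app user=app")   # hypothetical connection string
with conn.cursor() as cur:
    cur.execute("SELECT id, description FROM products")   # hypothetical table/column
    rows = cur.fetchall()

# Steps 3-5: embed each row and index plain text + vector together
for row_id, text in rows:
    es.index(
        index="products",                        # hypothetical index with a dense_vector field
        id=row_id,
        document={"text": text, "embedding": embed(text)},
    )

# Query side: embed the question with the same model, then run a kNN search
question_vector = embed("waterproof hiking boots")
hits = es.search(index="products", knn={
    "field": "embedding",
    "query_vector": question_vector,
    "k": 5,
    "num_candidates": 50,
})
for hit in hits["hits"]["hits"]:
    print(hit["_source"]["text"])
```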
I am going to make a tutorial on something similar soon. Stay tuned; I will add a comment here with a link when it's ready.
Thanks
Awesome video. I implemented it and it worked! I have a doubt: how do you decide which model to use? There are many models available on Bedrock and you are using the Anthropic one.
It depends on the use case, the data, and the computing resources at your disposal. For example, some embedding models output vectors of length 4096, while smaller ones output 1536 or even 768. I would assume a 4096-dimensional embedding gives you more hyperplanes (finer separation) than a 768-dimensional one, but it will be more compute intensive to index and retrieve.
There could be other considerations as well, like loss of accuracy, etc.
Hi, thanks for sharing this tutorial, much appreciated. I am trying to upload a custom PDF file, but it fails at the "Creating the Vector Store" step in the Admin module. Error: TypeError: expected string or bytes-like object, got 'NoneType'
This error normally occurs when the PDF reader is not able to parse the PDF file. Could you try a few different PDFs and see if it works?
Thanks
Thank you so much! It was an amazing tutorial.
I am glad you liked it. Thanks
Let's say I have more than one PDF but they are all related to the same topic. Do you recommend using the same FAISS db file for all of the vectors, or one FAISS file per PDF?
I have used Pinecone and OpenSearch (Elasticsearch on AWS) for indexing multiple documents. Basically, for a single topic, you want to create one index and append the embeddings and content to that index.
I haven't used FAISS for such a case, but if it supports it, I would use one db file; otherwise I am not sure how you would load everything into one searchable index. I might try something and share if I find a solution (sketch below).
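To make that concrete, here is a rough sketch (not from the video) that keeps a single LangChain FAISS store and appends each new PDF's chunks to it, so one saved index covers every document on the topic. The file names and model id are placeholders, and the exact import paths depend on your LangChain version.

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

index = None
for pdf_path in ["what-is-arthritis.pdf", "arthritis-treatments.pdf"]:   # placeholder files
    chunks = splitter.split_documents(PyPDFLoader(pdf_path).load())
    if index is None:
        index = FAISS.from_documents(chunks, embeddings)
    else:
        index.add_documents(chunks)          # append this PDF's vectors to the same store

# One saved index (a .faiss/.pkl pair) covering all the PDFs on the topic
index.save_local("faiss_index")
```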
Thanks
Hi, I am getting the error ValueError: not enough values to unpack (expected 2, got 1) on the line st.write(get_response(llm, faiss_index, question)). Could you please help?
You should put some debug (print) statements in the get_response method and see where the issue is. Maybe the LLM didn't respond with data.
Thanks
@@MyCloudTutorials Thanks for the reply. I was using the same model for the embeddings and the LLM. I have updated that and it's working now.
Sir, is the embedding model you are using paid, or is a free trial available?
It's a paid model, but it charges based on how many tokens you use.
Thanks
Hi, my request for the Anthropic Claude model was rejected. How can I use that model then?
Why was it rejected? Could you re-apply and mention that you are evaluating the model?
If not, you may need to use a Llama model, but you will have to change the request/response format accordingly in user.py (rough sketch below).
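For reference, the request/response shape for Llama models on Bedrock differs from Claude's. Here is a rough sketch of what the invoke call in user.py could look like; the model id is only an example, use whichever Llama variant you actually have access to.

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime")


def ask_llama(prompt: str) -> str:
    # Llama models on Bedrock take a "prompt" field and return "generation"
    response = bedrock.invoke_model(
        modelId="meta.llama3-8b-instruct-v1:0",      # example model id
        body=json.dumps({
            "prompt": prompt,
            "max_gen_len": 512,
            "temperature": 0.5,
        }),
    )
    return json.loads(response["body"].read())["generation"]
```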
@@MyCloudTutorials It seems like once we provide the details regarding why we require the model, we can't edit those details again. Initially I entered random information for the company details, purpose, etc., so I created another account and requested the model with an appropriate purpose, and then it was accepted.
Hi Girish, thank you so much for the tutorial. I am a bit confused about the pricing for Titan; it lists a really small cost per 1,000 tokens. Will end users query against the precomputed embeddings we store? Are the costs incurred only for the initial computation, the storage, and the query processing? Please continue the gen-AI series, I'm learning so much.
Cost applies every time you use any model from Bedrock (or OpenAI, or any other hosting platform).
So you pay an initial cost for embedding your content and storing it in a vector store.
Every time a user queries, you convert that query into a vector embedding (using the same model that you used initially), so a small cost is involved there; then you run the search, find the similar documents, and send all of this together to an LLM. There is an inference cost for the LLM's input and output tokens at this point.
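As a back-of-the-envelope illustration, the per-query cost is roughly the query-embedding tokens plus the LLM input and output tokens at their respective rates. The prices below are placeholders, not real quotes; check the current Bedrock pricing page.

```python
# Placeholder prices per 1,000 tokens -- substitute the current Bedrock rates
EMBED_PER_1K = 0.0001      # embedding model (embedding the user's query)
LLM_IN_PER_1K = 0.003      # LLM input (question + retrieved context)
LLM_OUT_PER_1K = 0.015     # LLM output (the answer)


def cost_per_query(query_tokens, context_tokens, answer_tokens):
    embedding_cost = query_tokens / 1000 * EMBED_PER_1K
    llm_cost = ((query_tokens + context_tokens) / 1000 * LLM_IN_PER_1K
                + answer_tokens / 1000 * LLM_OUT_PER_1K)
    return embedding_cost + llm_cost


# e.g. a 30-token question, ~2,000 tokens of retrieved context, a 300-token answer
print(f"~${cost_per_query(30, 2000, 300):.4f} per query")
```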
I hope this clarifies.
Thanks
Is it possible to use a Google LLM and embeddings for this project? And if I want to deploy this project to the cloud, can I use AWS Fargate?
Yes, any embedding model should work; you just have to use the same embedding model for creating and querying the index. Once you containerize the application, you can run it on AWS ECS (Fargate or EC2) or EKS.
Thanks
@@MyCloudTutorials thanks 👍
Where can I get your medical PDF to practice along?
You can find it online; search for "arthritis pdf".
I downloaded from www.versusarthritis.org/media/22726/what-is-arthritis-information-booklet.pdf
How much does this particular project cost on AWS Bedrock?
It depends on how many calls you make to Bedrock for embedding, querying, etc. S3 storage is cheap.
Overall it cost me less than $0.50 for this, but your costs may vary.
Thanks
Hey there, is it possible for the user to upload their own PDF and have the AI model act as a chatbot for it?
Well, if you do not want to use RAG (like I showed in this video), then you will have to take a foundation model and train it with your data. So yes, it's possible, but it would require more resources, time, etc. (depending on the base model you choose and your data).
For continuously changing data, RAG works out better, as you don't have to constantly fine-tune the model as new data becomes available.
Another thing to think about is cost: once you train and create a custom model, you have to host it somewhere (whether in the cloud or in a data center), which will incur some cost.
I hope it helps.
Thanks
Do you have a code download for this? Very good tutorial.
Thank you.
The source code is available on GitHub:
github.com/mycloudtutorials/generative-ai-demos/tree/master/bedrock-chat-with-pdf
@@MyCloudTutorials Thank you so much. I was able to download it, but I am getting strange errors on Windows. Is there any way I can contact you? I'm not a developer per se...
Hi there! Thank you so much for this tutorial! It's been very helpful! However, I was wondering how I would do it differently if I wanted to merge the admin and user side together such that the User themselves can upload their own pdf and then query the chatbot on their own document.
If so, how many clients would I need, and can I still use RAG? I suppose one S3 client for storing the PDF and one Bedrock client for the vector embeddings. Do I need any more clients for the LLM? I currently have an LLM I can chat with using ConversationBufferMemory and ConversationChain with predict.
Thank you once again for your help!
I am glad you liked the tutorial.
About the case you mentioned: you can merge the admin and user applications.
A slightly different approach can be taken. I am assuming the user is logged in (so you can get their id).
You can ask the user to upload the PDF and show them a "processing" spinner while you generate the embeddings and create the vector index. If you need to store their vector index, you can always make a subfolder in S3 with their user id, and maybe assign the processing a job id.
For example: s3://<bucket>/<user-id>/<job-id>/
You may want to consider saving this meta information in a database like DynamoDB or MySQL so you know where to load the index from for a particular user.
You can also ask them to upload directly to S3 and trigger an event that creates the index.
There are several possible ways to solve this; it depends on the use case and business requirements. A rough sketch of the per-user layout is below.
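A minimal sketch of uploading one user's FAISS files to their own S3 prefix and recording the location in DynamoDB. The bucket name, table name, and local file names are placeholders, not from the video.

```python
import time
import uuid

import boto3

s3 = boto3.client("s3")
jobs_table = boto3.resource("dynamodb").Table("pdf-index-jobs")   # hypothetical table


def store_user_index(user_id: str, local_dir: str = "/tmp") -> str:
    """Upload one user's FAISS files to their own S3 prefix and record where they live."""
    job_id = str(uuid.uuid4())
    prefix = f"indexes/{user_id}/{job_id}"        # s3://<bucket>/indexes/<user-id>/<job-id>/
    for suffix in (".faiss", ".pkl"):
        s3.upload_file(
            Filename=f"{local_dir}/my_faiss{suffix}",
            Bucket="chat-with-pdf-demo",          # hypothetical bucket name
            Key=f"{prefix}/my_faiss{suffix}",
        )
    # Remember where this user's index lives so the user app can load it later
    jobs_table.put_item(Item={
        "user_id": user_id,
        "job_id": job_id,
        "s3_prefix": prefix,
        "status": "READY",
        "created_at": int(time.time()),
    })
    return prefix
```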
Please let me know your thoughts!
Thanks
Hi, I need to add memory for remembering context in the chatbot. Could you please suggest how I can implement that in the above code?
LangChain has contextual memory. Please check this: js.langchain.com/v0.1/docs/modules/memory/
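Here is a rough Python sketch of the same idea: wiring ConversationBufferMemory into a ConversationalRetrievalChain on top of the FAISS index from the video, so follow-up questions are answered in context. The import paths, model id, and load_local flags depend on your LangChain version and on which Bedrock models you have enabled.

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.embeddings import BedrockEmbeddings
from langchain_community.llms import Bedrock
from langchain_community.vectorstores import FAISS

embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")
faiss_index = FAISS.load_local("faiss_index", embeddings,
                               allow_dangerous_deserialization=True)

llm = Bedrock(model_id="anthropic.claude-v2")     # whichever Bedrock LLM you have access to
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

chat = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=faiss_index.as_retriever(),
    memory=memory,
)

print(chat.invoke({"question": "What is arthritis?"})["answer"])
print(chat.invoke({"question": "How is it treated?"})["answer"])   # uses the earlier turn
```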
I will make a video soon; once it's ready, I will put a link in this comment.
Thanks
I'm getting an error when I'm trying to run the code.
ValueError: Error raised by inference endpoint: Unable to locate credentials
Traceback:
File "C:\Users\DELL\Desktop\bedrock-chat-with-pdf\venv_bedrock\Lib\site-packages\streamlit
untime\scriptrunner\script_runner.py", line 600, in _run_script
exec(code, module.__dict__)
File "C:\Users\DELL\Desktop\bedrock-chat-with-pdf\Admin\admin.py", line 88, in
main()
File "C:\Users\DELL\Desktop\bedrock-chat-with-pdf\Admin\admin.py", line 78, in main
result = create_vector_store(request_id, splitted_docs)
Can you help me with this?
Hi @KeerthanaPriyaDevaraj
Do you have the AWS CLI installed and set up with an IAM credential that has access to Bedrock?
If not, please follow the AWS documentation for installing the AWS CLI here: docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
If you already have an IAM user created, create an access_key and secret_access_key in the IAM console and download the credentials.
Use the `aws configure` command to set up the CLI.
Also, make sure that you have model access enabled in the Bedrock console.
I hope this helps.
Thanks
Is this possible with AWS Lambda?
Very much possible in Lambda. If you are using LangChain with Lambda, you need to include it as a dependency (for example, as a layer). I will make a video on how to do it in Lambda; no ETA as of now, but I will surely tag you when I post it. In the meantime, a minimal sketch that avoids the LangChain dependency is below.
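This sketch calls Bedrock directly with boto3 (which the Lambda Python runtime already ships), so no LangChain layer is needed at all. The model id and the event/response shapes are assumptions, not from the video.

```python
import json

import boto3

bedrock = boto3.client("bedrock-runtime")


def lambda_handler(event, context):
    # Expecting an event like {"question": "..."}; adapt to your trigger (API Gateway, etc.)
    question = event.get("question", "")
    response = bedrock.invoke_model(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",   # example model id
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 512,
            "messages": [{"role": "user", "content": question}],
        }),
    )
    answer = json.loads(response["body"].read())["content"][0]["text"]
    return {"statusCode": 200, "body": json.dumps({"answer": answer})}
```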
Thanks
@@MyCloudTutorials I'm trying, but I get an error: "Unable to import module" for "langchain community". I generated a layer for the Lambda function, but it doesn't work. I'm waiting for your video, thanks.