Great job, the code is amazing, I used it as a starting point for a college project, thank you very much!
Great to hear. Thanks for the comment
Great content. Thank you for this one.
Thanks for your comment @Henrique
great content, thanks much!😀
Thanks mate. Glad that helps you 🙏🙏
Great video , Thank you ☺️
Glad it was helpful!. Thanks
Great video. Thanks
Thanks
Hey, great video.
I needed some assistance.
How do I save the output in a S3 bucket?
Hi Sushil,
You can easily use boto3 to save the content to S3. The code is something like this:
```
import boto3

S3_RESOURCE = boto3.resource('s3')
MY_BUCKET = S3_RESOURCE.Bucket('your-bucket-name')  # your bucket name here
req = 'this is your content'
MY_BUCKET.put_object(Key='your_s3_object_key', Body=req)
```
Let me know how it goes
@@lovetocode4486 Hi, I tried this but I'm not getting the output in the S3 bucket. Could you please assist me?
@@lavanyasivaperumal
Do you get any error in the CloudWatch log?
Try adding the 'AmazonS3FullAccess' policy to the role attached to the Lambda. This is likely a permission issue.
If this doesn't work, let me know. Happy to help :)
@@lovetocode4486 Yes, I tried that, but I'm still facing the same issue. Can you give me some similar code for this? I want to store multiple outputs in the same S3 bucket.
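A minimal sketch for storing multiple outputs in the same bucket (the bucket/prefix names are made up, and the actual upload assumes boto3 plus AWS credentials): give each result a unique key so later uploads don't overwrite earlier ones.

```python
import json
import uuid

def build_unique_key(prefix, doc_name):
    """Build a unique S3 object key so repeated runs don't overwrite each other."""
    return f"{prefix}/{doc_name}-{uuid.uuid4().hex}.json"

def save_result(bucket_name, prefix, doc_name, content):
    """Save one result as JSON under a unique key (needs AWS credentials)."""
    import boto3  # imported here so the key logic above is testable offline
    bucket = boto3.resource('s3').Bucket(bucket_name)
    key = build_unique_key(prefix, doc_name)
    bucket.put_object(Key=key, Body=json.dumps(content))
    return key

# Example key this would produce (suffix varies per call):
example = build_unique_key('textract-output', 'invoice1')
```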
Hi, this video helped a lot.
I need help extracting the output into CSV/DB. Please help.
Hi @user,
Glad that helped you.
I am making a video on the async API (start_document_analysis) and will include the CSV part in that.
Where are you stuck in terms of CSV/DB?
If you need urgent help, please drop an email to johnsonp908060@gmail.com
Hi @user,
This video shows how to get the table data into an object. From there you can save it in any file format you need.
th-cam.com/video/BNnFfTZsmjc/w-d-xo.html
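For the CSV part, here is a minimal sketch using only the standard library. It assumes the table has already been extracted as a list of rows (the sample rows below are made up):

```python
import csv
import io

def rows_to_csv(rows):
    """Serialize extracted table rows (list of lists of cell strings) to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical table data, as it might come out of the extraction step
table = [["Name", "Amount"], ["Alice", "120.50"], ["Bob", "99.00"]]
csv_text = rows_to_csv(table)
```

You can write `csv_text` to a local file, or pass it as the `Body` of an S3 `put_object` call.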
Hello, what event did you use?
Do you mean how the Lambda is triggered? If so, you can add an event notification to the S3 bucket that triggers the Lambda. Please check this for more info: docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html
Hope this helps you. :)
@@lovetocode4486 I did the part where we use the bucket as the trigger, but my code gives me an error saying it doesn't recognize this line: `file_obj = event["Records"][0]`. I was thinking it has something to do with the event for the file.
@@SamuelGrabois Please try printing the "event" object. Does it have a "Records" prop?
print(f'event: {event}')
While using the console and trying to detect text for a sample image, I am facing this error: "Textract encountered an error while processing this document." Any idea about it? Thank you!!
Hi Yash,
It is hard to say why you are getting the error without more info. Do you get any error if you use the Python code I mentioned above for the same image?
I did not get any output. I also checked the logs and there are no errors. Did we need to substitute some values in the lambda function code before triggering the process? Thanks for your help.
Hi Mate,
As per the code piece below, the Lambda is triggered by an S3 event.
github.com/CodeSam621/Demo/blob/e5b5f538d41411478981008e8d9956d83d9da5e1/AWSTextract/lambda_function.py#L63
If you are triggering the Lambda using the Lambda testing utility, then you need to send an event object that has the `event["Records"][0]` data.
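To make that concrete, here is a sketch of pulling the bucket and key out of an S3 event payload. The event shape matches what S3 notifications send to Lambda; the bucket and key names are hypothetical:

```python
def bucket_and_key(event):
    """Pull the bucket name and object key out of an S3 event payload."""
    record = event["Records"][0]
    return record["s3"]["bucket"]["name"], record["s3"]["object"]["key"]

# A minimal event shaped like what the S3 trigger delivers (names made up);
# this is also what you would paste into the Lambda console's test utility.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-textract-bucket"},
                "object": {"key": "uploads/sample.pdf"}}}
    ]
}
bucket, key = bucket_and_key(sample_event)
```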
@@lovetocode4486 Yeah, I triggered the lambda by placing the file in the S3 bucket but it produced no output and there were no errors in the logs. I was just curious if you have seen this issue before and if you had any suggestions what the issue might be?
@@betallyoungattractive644 ,
What I recommend doing:
- Add a log at the very beginning of the lambda function to make sure the Lambda is triggered, then upload a file to S3. This adds some entries to the CloudWatch logs.
- Once you have verified the above step, add logs as necessary to see where it broke.
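The first step above might look like this (a sketch; the handler body and return value are placeholders):

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    # Log first, before any other work: if this line never shows up in
    # CloudWatch, the trigger itself is the problem, not the code below.
    logger.info('lambda triggered, event: %s', event)
    # ... real processing goes here, with a log after each major step ...
    return {'statusCode': 200}
```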
@@lovetocode4486 I will try this. Thank you for promptly helping!
@@lovetocode4486 I got the output now. Thank you
Can you please tell me how to merge cells using Python with the AWS Textract API?
Hi, great video. I need your assistance. How do I store the output in S3?
Hi,
Saving to S3 is easy. Just pass the bucket name, the object key, and the content (JSON) to the method below.
```
import json
import logging

from boto3 import resource

logger = logging.getLogger()
logging.basicConfig()
logger.setLevel(logging.INFO)

S3_RESOURCE = resource('s3')

def save_payload(bucket, key, content):
    coi_bucket = S3_RESOURCE.Bucket(bucket)
    # Serialize the content to JSON and write it under the given object key
    response = coi_bucket.put_object(Key=key, Body=json.dumps(content))
    logger.info(f'save_payload to s3. Response: {response}')
    return response
```
Let me know how this goes.
I tried the above code but I didn't get output. Could you please assist me?
You have not shown what is under the get_text method. Can you please share that too?
Hey mate, The source code is in GitHub. Please check this.
github.com/CodeSam621/Demo/blob/de01e4e1910155fea6f913f4750f7907043515dd/AWSTextract/lambda_function.py#L48
Getting an error in the code. I have sent the CloudWatch logs; please check your mail. Any help on the error will be appreciated.
Sure, what is the email you sent to?
@@lovetocode4486 any update on the error?
Hello, my model is not giving any output even after checking in CloudWatch.
Do you see any errors in the CloudWatch logs?
@@lovetocode4486 no there was no error
If there is an error, it should be shown in the logs. If you send me the code and the PDF/image, I can check it for you. Please drop an email to: johnsonp908060@gmail.com
Hi, thanks for the video! I need your assistance please, how can I reach you?
Hey mate, drop an email here: johnsonp908060@gmail.com
Hello, I did not get any output in CloudWatch and it doesn't show any error either. What should I do? Can you please assist me 🙏
Hey mate,
Do you mean the Lambda logs in CloudWatch? If that is the case, it seems the Lambda is not triggered. How did you trigger the API Gateway? Do you get any error when triggering it?
Can you try doing it with Rekognition?
Hi @Nancy,
Textract is more suitable and advanced for this kind of work, e.g. extracting content from forms.
It comes with more advanced features like "QUERIES", which can be used to extract specific elements.
On the other hand, Rekognition is better at identifying the context of an image, e.g. given an image, telling what sort of image it is.
I am struggling with a Textract issue.
Could you please elaborate a little more? Where are you stuck? What are the error messages, etc.?
@@lovetocode4486 We have a Bio-Rad site; from there we get PDFs into S3. After that, each PDF is converted into images and then JSON files. Since the PDF has 22 pages, it creates 22 images and JSON files. Now my task is to fetch only the table included in the JSON file. All the setup is done; it is only while grepping for the table that I get an error.
@@lovetocode4486 I am getting neither output nor an error.
What is the error? Is it in the CloudWatch logs?
@@lovetocode4486 no error no output
Change your thumbnail; it says "lamba" instead of "lambda".
Thanks mate :) :)
I needed some assistance.
Hi Fazilat, sure, happy to help. What is the question, please?
@@lovetocode4486 How do I extract the tables and forms from the picture in a single piece of code, as in a receipt?
@@lovetocode4486 And how do I filter the specific key/value pairs I want to extract?
@@fazilatbeigh6703
@fazilat beigh The 'FeatureTypes' is a string array, so you can pass multiple types as below:
response = client.analyze_document(Document={'S3Object': {'Bucket': bucket, 'Name': key}}, FeatureTypes=['FORMS', 'TABLES'])
github.com/CodeSam621/Demo/blob/8b4c0bc16c915ee1d9746ac1002195d101e2cad4/AWSTextract/lambda_function.py#LL8C6-L8C119
@@fazilatbeigh6703 This method ("get_kv_map") returns "key_map, value_map, block_map". You can filter whatever you want with the "key_map" array.
github.com/CodeSam621/Demo/blob/8b4c0bc16c915ee1d9746ac1002195d101e2cad4/AWSTextract/lambda_function.py#LL5C15-L5C15
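For the filtering itself, here is a sketch that assumes the key/value pairs have already been flattened into a plain dict of strings (the sample keys below are made up, and `filter_pairs` is a hypothetical helper, not part of the repo):

```python
def filter_pairs(kv_pairs, wanted_keys):
    """Keep only the key/value pairs whose key matches one of the
    wanted names (case-insensitive substring match)."""
    wanted = [w.lower() for w in wanted_keys]
    return {k: v for k, v in kv_pairs.items()
            if any(w in k.lower() for w in wanted)}

# Hypothetical pairs as they might come out of get_kv_map after flattening
extracted = {"Invoice Number:": "INV-001", "Date:": "2023-01-05", "Total:": "$45.00"}
result = filter_pairs(extracted, ["invoice", "total"])
```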