AWS Textract tutorial, Extract Forms, Tables from Image using Python

  • Published on 21 Aug 2024
  • AWS Textract using Lambda from image
    Lambda function
    #awslambda #textract #awstutorial #image #python
    Source code: github.com/Cod...
    Documentation: boto3.amazonaw...

Comments • 70

  • @frkael8255 10 months ago +5

    Great job, the code is amazing, I used it as a starting point for a college project, thank you very much!

    • @lovetocode4486 10 months ago

      Great to hear. Thanks for the comment

  • @henriquebonacelli2981 a year ago +5

    Great content. Thank you for this one.

  • @lavanyasivaperumal 10 months ago +3

    Great video, thank you ☺️

    • @lovetocode4486 10 months ago

      Glad it was helpful! Thanks

  • @SofianMW 11 months ago +3

    great content, thanks much!😀

    • @lovetocode4486 11 months ago

      Thanks mate. Glad that helps you 🙏🙏

  • @susank6856 a year ago +2

    Great video. Thanks

  • @sushilsawra a year ago +3

    Hey, great video.
    I needed some assistance.
    How do I save the output in an S3 bucket?

    • @lovetocode4486 a year ago +3

      Hi Sushil,
      You can easily use the boto3 S3 resource to save the content to S3. The code is something like this:
      import boto3
      S3_RESOURCE = boto3.resource('s3')
      MY_BUCKET = S3_RESOURCE.Bucket('your bucket name')
      req = 'this is your content'
      MY_BUCKET.put_object(Key='your_s3_object_key', Body=req)
      Let me know how it goes.

    • @lavanyasivaperumal 10 months ago +2

      @@lovetocode4486 Hi, I tried this but I am not getting the output in the S3 bucket. Could you please assist me?

    • @lovetocode4486 10 months ago

      @@lavanyasivaperumal
      Do you get any error in the CloudWatch log?
      Try attaching the 'AmazonS3FullAccess' policy to the role attached to the Lambda. I suspect this is a permission issue.

    • @lovetocode4486 10 months ago

      If this doesn't work, let me know. Happy to help :)

    • @lavanyasivaperumal 10 months ago +1

      @@lovetocode4486 Yes, I tried, but I am still facing the same issue. Can you give some similar code for this? I want to store multiple outputs in the same S3 bucket.
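
One way to keep several outputs in the same bucket is to give each result its own object key, so that later uploads do not overwrite earlier ones. The following is a minimal sketch under that assumption; the helper names `build_output_key` and `save_outputs` and the `textract-output` prefix are hypothetical and do not come from the video or its repository.

```python
import json
import uuid


def build_output_key(source_key, prefix="textract-output"):
    """Return a unique S3 key for one result, derived from the source file name."""
    file_name = source_key.rsplit("/", 1)[-1]
    # A random suffix keeps two results for the same file from colliding.
    return f"{prefix}/{file_name}-{uuid.uuid4().hex}.json"


def save_outputs(bucket, outputs, prefix="textract-output"):
    """Write each (source_key, content) pair to its own object; return the keys used.

    `bucket` is expected to behave like a boto3 Bucket resource,
    i.e. it must offer put_object(Key=..., Body=...).
    """
    saved_keys = []
    for source_key, content in outputs:
        key = build_output_key(source_key, prefix)
        bucket.put_object(Key=key, Body=json.dumps(content))
        saved_keys.append(key)
    return saved_keys
```

Inside the Lambda this could be called as `save_outputs(boto3.resource('s3').Bucket('your bucket name'), [(key, result)])`. The role attached to the Lambda still needs permission to write to that bucket.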

  • @yashmahajan7897 a year ago +2

    While using the console and trying to detect text in a sample image, I am facing this error: "Textract encountered an error while processing this document." Any idea about it? Thank you!!

    • @lovetocode4486 a year ago

      Hi Yash,
      It is hard to say why you are getting the error without more info. Do you get any error if you use the Python code I mentioned above for the same image?

  • @saurabhgoreamazing9368 a month ago +2

    I am getting an error in the code and have sent the CloudWatch logs. Please check your mail. Any help with the error will be appreciated.

    • @lovetocode4486 a month ago

      Sure, which email address did you send it to?

    • @saurabhgoreamazing9368 a month ago

      @@lovetocode4486 any update on the error?

  • @user-tx6ti4yb6j a year ago +2

    Hi, this video helped a lot.
    I need help extracting the output into CSV/DB. Please help.

    • @lovetocode4486 a year ago

      Hi @user,
      Glad that helped.
      I am making a video on the async API (start_document_analysis) and will include the CSV part in it.
      Where are you stuck in terms of CSV/DB?
      If you need help urgently, please drop an email to johnsonp908060@gmail.com

    • @lovetocode4486 a year ago

      Hi @user,
      This video shows how to get table data into an object. From there you can save it in any file format you need.
      th-cam.com/video/BNnFfTZsmjc/w-d-xo.html

  • @betallyoungattractive644 4 months ago +1

    I did not get any output. I also checked the logs and there are no errors. Did we need to substitute some values in the lambda function code before triggering the process? Thanks for your help.

    • @lovetocode4486 4 months ago +1

      Hi mate,
      As per the code below, the Lambda is triggered by an S3 event.
      github.com/CodeSam621/Demo/blob/e5b5f538d41411478981008e8d9956d83d9da5e1/AWSTextract/lambda_function.py#L63
      If you are triggering the Lambda using the Lambda test utility, then you need to send an event object that contains event["Records"][0] data.
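
For testing from the Lambda console, a trimmed-down S3 notification event is usually enough. This is a sketch rather than the exact event S3 sends: the bucket name and object key are placeholders, and `get_bucket_and_key` is a hypothetical helper, not a function from the linked repository.

```python
from urllib.parse import unquote_plus

# Trimmed-down S3 "ObjectCreated" test event; bucket and key are placeholders.
SAMPLE_EVENT = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "your-bucket-name"},
                "object": {"key": "uploads/sample+form.png"},
            },
        }
    ]
}


def get_bucket_and_key(event):
    """Pull the bucket name and object key out of the first S3 record."""
    records = event.get("Records")
    if not records:
        raise ValueError("event has no 'Records'; was the Lambda triggered by S3?")
    s3_info = records[0]["s3"]
    # S3 URL-encodes object keys in notifications (a space arrives as '+').
    return s3_info["bucket"]["name"], unquote_plus(s3_info["object"]["key"])
```

Pasting the contents of `SAMPLE_EVENT` as JSON into the console test dialog should exercise the same `event["Records"][0]` path that a real upload does, provided the named object actually exists in the bucket.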

    • @betallyoungattractive644 4 months ago +1

      @@lovetocode4486 Yeah, I triggered the lambda by placing the file in the S3 bucket but it produced no output and there were no errors in the logs. I was just curious if you have seen this issue before and if you had any suggestions what the issue might be?

    • @lovetocode4486 4 months ago +1

      @@betallyoungattractive644,
      What I recommend is:
      - Add a log statement at the very beginning of the Lambda function to make sure the Lambda is triggered. Then upload a file to S3. This should add some entries to the CloudWatch logs.
      - Once you have verified the step above, add the necessary logs to see where it breaks.
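
The two steps above can be sketched as follows. This is an illustrative handler, not the one from the repository: the Textract call is left as a comment, and the returned dictionary is an assumption made only so the behaviour is easy to check.

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    # Step 1: log first. If this line never appears in CloudWatch,
    # the trigger is not firing at all.
    logger.info("Lambda triggered. Event: %s", json.dumps(event, default=str))

    records = event.get("Records", [])
    logger.info("Number of records: %d", len(records))
    if not records:
        logger.warning("No 'Records' in event; nothing to process.")
        return {"processed": 0}

    # Step 2: log around each stage to see where processing stops.
    for record in records:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        logger.info("About to process s3://%s/%s", bucket, key)
        # ... call Textract here, then log the number of blocks returned ...

    return {"processed": len(records)}
```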

    • @betallyoungattractive644 4 months ago +1

      @@lovetocode4486 I will try this. Thank you for promptly helping!

    • @betallyoungattractive644 4 months ago

      @@lovetocode4486 I got the output now. Thank you

  • @manoharmanohar8118 4 months ago +1

    You have not shown what is inside the get_text method. Can you please share that too?

    • @lovetocode4486 4 months ago +1

      Hey mate, The source code is in GitHub. Please check this.
      github.com/CodeSam621/Demo/blob/de01e4e1910155fea6f913f4750f7907043515dd/AWSTextract/lambda_function.py#L48

  • @paramountFootballReports a year ago +1

    Hello, my model is not giving any output; I checked in CloudWatch.

    • @lovetocode4486 a year ago

      Do you see any errors in the CloudWatch logs?

    • @paramountFootballReports a year ago

      @@lovetocode4486 No, there was no error.

    • @lovetocode4486 a year ago

      If there is an error, it should be shown in the logs. If you send me the code and the PDF/image, I can check it for you. Please drop an email to johnsonp908060@gmail.com

  • @matthieuchan7642 a month ago +1

    Hi, thanks for the video! I need your assistance please, how can I reach you?

    • @lovetocode4486 a month ago

      Hey mate, drop an email to johnsonp908060@gmail.com

  • @user-od3zg3is7j 10 months ago +2

    Hello, what event did you use?

    • @lovetocode4486 10 months ago +1

      Do you mean how the Lambda is triggered? If so, you can add an event notification to the S3 bucket, which can trigger the Lambda. Please check this for more info: docs.aws.amazon.com/AmazonS3/latest/userguide/EventNotifications.html
      Hope this helps. :)

    • @user-od3zg3is7j 10 months ago

      @@lovetocode4486 I did the part where we use the bucket as the trigger, but my code gives me an error saying it does not recognize this line: file_obj = event["Records"][0]. I was thinking it has something to do with the event of the file.

    • @lovetocode4486 10 months ago

      @@user-od3zg3is7j Please try printing the "event" object. Does it have a "Records" property?
      print(f"event: {event}")

  • @user-gt8iu2es8v a year ago +2

    Change your thumbnail; it says "lamba" instead of "lambda".

  • @preetishidgirimath7707 a month ago +1

    I am struggling with a Textract issue.

    • @lovetocode4486 a month ago

      Could you please elaborate a little more? Where are you stuck? What are the error messages?

    • @preetishidgirimath7707 a month ago +1

      @@lovetocode4486 We have a bio rad site from which we get a PDF into S3. After that, it is converted into images and then JSON files. Since the PDF has 22 pages, it creates 22 images and JSON files. My task is to fetch only the tables included in the JSON files. All the setup is done; only while grepping for the tables am I getting an error.

    • @preetishidgirimath7707 a month ago

      @@lovetocode4486 I am getting neither output nor an error.

    • @lovetocode4486 a month ago

      What is the error? Is it in the CloudWatch logs?

    • @preetishidgirimath7707 a month ago

      @@lovetocode4486 No error, no output.

  • @maheshm4358 6 months ago

    Hello, I did not get any output in CloudWatch, and it doesn't show any error either. What should I do? Can you please assist me 🙏

    • @lovetocode4486 6 months ago

      Hey mate,
      Do you mean the Lambda logs in CloudWatch? If that is the case, it seems the Lambda is not being triggered. How did you trigger the API Gateway? Do you get any error when you trigger the API Gateway?

  • @Imphanda a year ago +1

    Can you try doing it with rekognition?

    • @lovetocode4486 a year ago

      Hi @Nancy,
      Textract is more suitable and more advanced for this kind of work, e.g. extracting content from forms.
      It comes with more advanced features such as "QUERIES", which can be used to extract specific elements.
      Rekognition, on the other hand, is better at identifying context from an image, e.g. given an image, telling what sort of image it is.
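
As a rough illustration of the query feature mentioned above, the sketch below builds the arguments for `analyze_document` with `FeatureTypes=['QUERIES']` and reads the answers back from the response blocks. The helper names, the alias and the question text are hypothetical; the block layout (QUERY blocks linked to QUERY_RESULT blocks through ANSWER relationships) follows the Textract response format as I understand it, so check it against a real response before relying on it.

```python
def build_query_request(bucket, key, questions):
    """Build keyword arguments for analyze_document using the QUERIES feature.

    `questions` maps an alias (e.g. "NAME") to a natural-language question.
    """
    return {
        "Document": {"S3Object": {"Bucket": bucket, "Name": key}},
        "FeatureTypes": ["QUERIES"],
        "QueriesConfig": {
            "Queries": [{"Text": text, "Alias": alias} for alias, text in questions.items()]
        },
    }


def extract_answers(response):
    """Map each query alias to the answer text found in a Textract response."""
    blocks = {block["Id"]: block for block in response.get("Blocks", [])}
    answers = {}
    for block in blocks.values():
        if block["BlockType"] != "QUERY":
            continue
        alias = block["Query"].get("Alias", block["Query"]["Text"])
        for rel in block.get("Relationships", []):
            if rel["Type"] == "ANSWER":
                answers[alias] = " ".join(
                    blocks[i]["Text"] for i in rel["Ids"] if i in blocks
                )
    return answers


# Usage inside the Lambda (requires boto3 and Textract permissions):
# import boto3
# client = boto3.client("textract")
# request = build_query_request("your-bucket", "form.png", {"NAME": "What is the applicant's name?"})
# print(extract_answers(client.analyze_document(**request)))
```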

  • @akshayavarshini1341 10 months ago

    Hi, great video. I need your assistance. How do I store the output in S3?

    • @lovetocode4486 10 months ago

      Hi,
      Saving to S3 is easy. Just pass the bucket name, the file ID (used as the object key) and the content (JSON) to the method below.
      ```
      import json
      import logging
      from boto3 import resource

      logger = logging.getLogger()
      logging.basicConfig()
      logger.setLevel(logging.INFO)
      S3_RESOURCE = resource('s3')

      def save_payload(bucket, key, content):
          # Serialize the content as JSON and write it to s3://<bucket>/<key>
          COI_BUCKET = S3_RESOURCE.Bucket(bucket)
          response = COI_BUCKET.put_object(Key=key, Body=json.dumps(content))
          logger.info(f'save_payload to s3. Response: {response}')
          return response
      ```

    • @lovetocode4486 10 months ago

      Let me know how this goes.

    • @akshayavarshini1341 10 months ago

      I tried the above code but I didn't get any output. Could you please assist me?

  • @fazilatbeigh6703 a year ago +1

    I needed some assistance.

    • @lovetocode4486 a year ago

      Hi Fazilat, sure, happy to help. What is the question, please?

    • @fazilatbeigh6703 a year ago

      @@lovetocode4486 How do I extract both the tables and the forms from a picture, such as a receipt, in a single piece of code?

    • @fazilatbeigh6703 a year ago

      @@lovetocode4486 And how do I filter for the specific key-value pairs I want to extract?

    • @lovetocode4486 a year ago

      @@fazilatbeigh6703
      @fazilat beigh 'FeatureTypes' is a string array, so you can pass other types too, as below.
      response = client.analyze_document(Document={'S3Object': {'Bucket': bucket, "Name": key}}, FeatureTypes=['FORMS', 'TABLES'])
      github.com/CodeSam621/Demo/blob/8b4c0bc16c915ee1d9746ac1002195d101e2cad4/AWSTextract/lambda_function.py#LL8C6-L8C119

    • @lovetocode4486 a year ago

      @@fazilatbeigh6703 This method ("get_kv_map") returns "key_map, value_map, block_map". You can filter whatever you want using "key_map".
      github.com/CodeSam621/Demo/blob/8b4c0bc16c915ee1d9746ac1002195d101e2cad4/AWSTextract/lambda_function.py#LL5C15-L5C15
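
To make the filtering idea concrete, here is a minimal sketch that walks the `Blocks` list of a FORMS response and keeps only the requested keys. It is not the repository's `get_kv_map`; `block_text` and `extract_key_values` are hypothetical helpers, and the key matching (case-insensitive, trailing colon ignored) is an assumption you may want to change.

```python
def block_text(block, block_map):
    """Join the WORD children of a block into one string."""
    words = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == "CHILD":
            for child_id in rel["Ids"]:
                child = block_map[child_id]
                if child["BlockType"] == "WORD":
                    words.append(child["Text"])
    return " ".join(words)


def extract_key_values(blocks, wanted=None):
    """Return {key text: value text} for form fields in a Textract response.

    `wanted` is an optional set of lowercase key names (without a trailing
    colon); when given, only those keys are returned.
    """
    block_map = {block["Id"]: block for block in blocks}
    pairs = {}
    for block in blocks:
        is_key = block["BlockType"] == "KEY_VALUE_SET" and "KEY" in block.get("EntityTypes", [])
        if not is_key:
            continue
        key_text = block_text(block, block_map)
        value_text = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                # A KEY block points at its VALUE block(s) by ID.
                value_text = " ".join(block_text(block_map[v], block_map) for v in rel["Ids"])
        if wanted is None or key_text.strip().rstrip(":").lower() in wanted:
            pairs[key_text] = value_text
    return pairs
```

For example, `extract_key_values(response["Blocks"], wanted={"name", "date"})` would return only the "Name" and "Date" fields if the form contains them.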