AWS Glue DataBrew Demo Video For Beginners

  • Published on 13 Dec 2024

Comments • 14

  • @TerranIMBA123 4 years ago +7

    Power Query to the rescue! Lmao great work AWS ;)

  • @dokonnjounosuke 3 years ago +2

    How can we configure the character set for the input file? The text always comes out garbled :(

  • @YouSureTalkAlot 3 months ago

    Exactly what I needed to know. Thanks

  • @devoo778 4 years ago +5

    You should add all the pandas functions. I've just tested DataBrew and I can't convert nanoseconds into the DateTime format. In pandas it's so simple; with DataBrew I can't... It's not saving me any time.
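For reference, the pandas conversion this comment alludes to can be sketched as follows. This is a minimal illustration, assuming the values are integer nanoseconds since the Unix epoch; the sample values and the column name `event_ns` are hypothetical and do not come from the video or the thread.

```python
import pandas as pd

# Hypothetical sample data: integer nanoseconds since the Unix epoch.
event_ns = pd.Series([0, 1_600_000_000_000_000_000], name="event_ns")

# unit="ns" tells pandas to interpret the integers as nanoseconds.
event_time = pd.to_datetime(event_ns, unit="ns")

print(event_time)
```

The result is a `datetime64[ns]` series, so the usual `.dt` accessors (for example `.dt.date`) are available afterwards.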

  • @truliapro7112 4 years ago +4

    Amazing tool.

  • @juanses 4 years ago +2

    What would be great is to be able to export the steps as Python code.

    • @jimmyechan 4 years ago

      Check out Dropbase. It lets you export data processing steps as Python code

    • @juanses 4 years ago +1

      @jimmyechan I know there are tools that do it. I was just giving AWS some suggestions for improvement.

  • @edwinthatsnotmyname3670 4 years ago

    What about unstructured data?

    • @SurbhiDangi 4 years ago

      What kinds of unstructured data are you thinking about?

    • @edwinthatsnotmyname3670 4 years ago

      @SurbhiDangi XML or JSON from scraping websites, etc. It appears every dataset has to be a table in Glue.

    • @SurbhiDangi 4 years ago +2

      @edwinthatsnotmyname3670 Agreed, the tool supports JSON (including nested JSON) today! There is also a direct S3 connector, and you can upload a file from your local disk if you'd like. Take a look at the 3rd screenshot here: aws.amazon.com/blogs/aws/announcing-aws-glue-databrew-a-visual-data-preparation-tool-that-helps-you-clean-and-normalize-data-faster/

    • @CarlosICC 4 years ago +3

      Just in case this helps: 1. Create a Glue crawler to run on your unstructured (e.g. JSON) data (if the structure is complex, like highly nested documents, you can use Grok patterns or do it manually in Athena (for example: create table... column struct