AWS Glue Crawler Tutorial with Hands On Lab | AWS Glue Tutorials | AWS Glue Hand-On Tutorial

  • Published on 28 Sep 2024

Comments • 79

  • @puru7190
    @puru7190 2 years ago +7

    Arguably one of the best videos on Glue crawler - thanks a lot!

    • @AjayWadhara
      @AjayWadhara 2 years ago

      This means a lot to me😍
      Don’t forget to subscribe, this pushes me to create more such content 🚀

  • @aniketpatil4217
    @aniketpatil4217 2 years ago

    very helpful video, thanks.

    • @AjayWadhara
      @AjayWadhara 2 years ago

      Don’t forget to subscribe ✅

  • @viniciusfigueiredo6740
    @viniciusfigueiredo6740 1 year ago

    Great video!

    • @viniciusfigueiredo6740
      @viniciusfigueiredo6740 1 year ago

      Ajay, I need some help please! I've been trying all day without success. I cataloged a Parquet file that was saved by a Spark job. I'm writing another job to insert the data from the Parquet file into an RDS MySQL database, and I need the rows inserted in the same order as in the Parquet file to preserve the primary keys. I've tried several approaches, but the data is always inserted into the table in a random order. Can you tell me what I can do?

  • @Ottone84
    @Ottone84 3 years ago

    Well done

  • @macklonfernandes7902
    @macklonfernandes7902 3 years ago

    Can you do a video on adding a table from an existing schema?

  • @ganeshsundar1484
    @ganeshsundar1484 2 years ago +1

    Ajay, do you take online sessions on AWS Data Engineering? Pls let me know.

    • @AjayWadhara
      @AjayWadhara 2 years ago +1

      Hi Ganesh, sorry I don’t take any online classes.

  • @RohitPal-lz1wf
    @RohitPal-lz1wf 2 years ago

    Nice tutorial, Ajay. I have one question. I need to copy about 4 million records from one DynamoDB table (2017 version) to another table (2019 version), and I don't want downtime. Can you please suggest whether Glue will help in this use case? If yes, what do I have to consider?

    • @AjayWadhara
      @AjayWadhara 2 years ago

      Did you check the DMS service?
      You also get the option of Change Data Capture.

  • @kishlayamourya3141
    @kishlayamourya3141 2 years ago

    Awesome video!!
    I have a question:
    I want to push S3 data (CSV) to Redshift tables.
    Can I somehow use the table schema created by the crawler to create the table in Redshift?
    In every tutorial the instructor first creates the table in Redshift by hand, then uses a crawler to create the schema in Glue, then pushes the data to Redshift... so what is the use of creating the schema with a crawler?

    • @AjayWadhara
      @AjayWadhara 2 years ago

      Hey Kishlaya,
      You have to try this. Just search whether the Glue Data Catalog can be used directly in Redshift.
      I am aware that Redshift Spectrum can directly use the schema created by crawlers.
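
A minimal sketch of the Redshift Spectrum route mentioned in the reply above: it only renders the `CREATE EXTERNAL SCHEMA ... FROM DATA CATALOG` statement that exposes a Glue database to Redshift. The helper name `external_schema_sql`, the schema and database names, and the role ARN are all hypothetical placeholders, and the statement still has to be run against your own cluster.

```python
def external_schema_sql(schema_name: str, glue_database: str, iam_role_arn: str) -> str:
    """Render a Redshift statement that maps a Glue Data Catalog database
    to an external schema. All argument values are supplied by the caller."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema_name}\n"
        f"FROM DATA CATALOG\n"
        f"DATABASE '{glue_database}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


# Hypothetical names, for illustration only.
sql = external_schema_sql(
    "spectrum",
    "sales_db",
    "arn:aws:iam::123456789012:role/ExampleSpectrumRole",
)
print(sql)
```

Once the external schema exists, the crawled tables should be queryable as `spectrum.<table>`, and a `CREATE TABLE ... AS SELECT` from one of them is one possible way to materialize a native Redshift table without writing the column list by hand.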

  • @Sathyanathg
    @Sathyanathg 3 years ago

    Hi Ajay, is there a way to automate importing the data catalog from AWS Glue into Redshift Spectrum?

  • @santhoshm2197
    @santhoshm2197 2 years ago

    Wonderland mm

  • @saurabhagarwal5692
    @saurabhagarwal5692 2 years ago

    Hi,
    Nice explanation of AWS Glue crawlers, it was very helpful... Thanks for that.
    I have some questions about Glue crawlers and Athena.
    First try: I put two different files in my S3 bucket, one for a Stock table and the other for an Employee table, and ran the Glue crawler. Two different tables were generated, but with empty data. Is that correct?
    Second try: I put the same two files in my S3 bucket and ran the crawler with Exclude patterns, specifying employee.csv. A single table was generated, but the data from both tables was merged. Is that correct?
    Or have I done something wrong? Please let me know.

    • @AjayWadhara
      @AjayWadhara 2 years ago

      Hi Saurabh,
      You have to segregate the data into two different folders.
      If the query returns no data, check whether the schema in the Glue Catalog matches the files.
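
A rough sketch of the folder layout suggested in the reply above, assuming the crawler is pointed at a common prefix and each table's files live in their own subfolder. The helper `table_prefixed_key` and the `raw` base prefix are hypothetical; the function only computes the target S3 keys and does not move anything.

```python
from pathlib import PurePosixPath


def table_prefixed_key(key: str, base_prefix: str = "raw") -> str:
    """Return a key that places the file in a folder named after the file stem,
    so each table's files end up under a separate prefix."""
    name = PurePosixPath(key).name
    table = PurePosixPath(name).stem.lower()
    return f"{base_prefix}/{table}/{name}"


# File names taken from the question; the layout itself is an assumption.
layout = {key: table_prefixed_key(key) for key in ["stock.csv", "employee.csv"]}
for old_key, new_key in layout.items():
    print(f"{old_key} -> {new_key}")
```

With this layout the crawler would be run against the `raw/` prefix, with `raw/stock/` and `raw/employee/` each holding only files of one schema.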

  • @udayreddy3653
    @udayreddy3653 3 years ago

    Hello Ajay, your videos were very helpful. Can we get similar videos for AWS Lambda? Is it possible for you to put all your AWS videos (S3, Athena, Glue, Kinesis, Lambda, EMR) on Udemy so that we can buy them from you? Please share your thoughts on this.

    • @AjayWadhara
      @AjayWadhara 3 years ago

      I am starting a Lambda series soon. The first video is coming today. Stay tuned.
      Don't forget to share and subscribe ✅

    • @AjayWadhara
      @AjayWadhara 3 years ago

      I am working on a Udemy course, but it will take some time.

    • @udayreddy3653
      @udayreddy3653 3 years ago

      @AjayWadhara I have already subscribed to your series and shared it with friends too.

    • @udayreddy3653
      @udayreddy3653 3 years ago

      @AjayWadhara Please also include some hands-on practice sessions in the Udemy course so that all our friends can buy it.

  • @the_gamer2416
    @the_gamer2416 1 year ago

    Why is everyone teaching in English? Can anyone point me to a Hindi channel for this?

  • @GopalGanguly-i1g
    @GopalGanguly-i1g 1 year ago

    I have uploaded a CSV into an S3 bucket. The crawler creates the table in the Glue Data Catalog, but when I try to view the content of the CSV file in Athena with a query, the result is blank: the columns are present but without values.

    • @abhishekdubey-p9n
      @abhishekdubey-p9n 1 year ago

      Yes, same for me. With a single CSV file we can see the data, but when we crawl multiple files from the same folder it shows blank. Please help me out if you find the solution.

  • @sumanthb3280
    @sumanthb3280 1 year ago

    Thank you so much. It's helpful...N

  • @srbhmtr
    @srbhmtr 1 year ago

    Best video about crawlers

  • @madhavpalshikar9412
    @madhavpalshikar9412 3 years ago +1

    Wow.. that was really helpful video! Thanks!!

  • @RahulPatadiya
    @RahulPatadiya 1 year ago

    Nice explanation!

  • @adityawalia9
    @adityawalia9 1 year ago

    This is extremely helpful. Thanks

  • @ganeshrajv130
    @ganeshrajv130 2 years ago

    Do crawlers work on images? I tried, but didn't get any data in the Data Catalog.

    • @AjayWadhara
      @AjayWadhara 2 years ago

      No. There is a list of supported formats on the AWS docs website. Common types such as CSV, TSV, databases, logs, JSON, and Parquet are supported.
      You can also write a custom classifier, but that would not cover images.
      Try using Amazon Rekognition.

    • @ganeshrajv130
      @ganeshrajv130 2 years ago

      @AjayWadhara OK, thanks. So even a custom classifier won't support images, right?

  • @chitrangsharma
    @chitrangsharma 3 years ago

    Sir, how do we perform ETL operations? I think PySpark is what we use... but do we have any option other than Python?

  • @sabanaar
    @sabanaar 3 years ago +1

    Excellent presentation !!!

  • @TheCloudIsEverywhere
    @TheCloudIsEverywhere 3 years ago

    Hi Ajay, thanks for sharing this. Could you please share all the links related to Glue (end-to-end hands-on flow)?

  • @bujjinazeer
    @bujjinazeer 3 years ago

    How about crawling S3 in another account? Can you show that too?

  • @5uryaprakashp1
    @5uryaprakashp1 2 years ago

    Hi Ajay,
    Is there any way to automate this through CI/CD? I want to upload a bunch of crawler files, trigger the crawlers automatically, and then store the inferred schema in the local file system.
    Thanks in advance.

    • @AjayWadhara
      @AjayWadhara 2 years ago

      Hi Surya,
      You can use CI/CD to automate things. Also consider a scheduled Lambda function. You can upload your files and then either trigger or schedule the processing.
      Hope this helps!!
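
One possible shape for the scheduled Lambda mentioned in the reply above, as a sketch only. It uses the boto3 Glue client calls `start_crawler` and `get_table`; the helper names, the `crawlers` event key, and the idea of passing the client in as a parameter are assumptions made for illustration, not something shown in the video.

```python
from typing import Any, Dict, Iterable, List, Optional


def start_crawlers(names: Iterable[str], glue_client: Optional[Any] = None) -> List[str]:
    """Start each named crawler and return the names that were requested."""
    if glue_client is None:
        import boto3  # imported lazily so the helper can be tried without AWS access
        glue_client = boto3.client("glue")
    started = []
    for name in names:
        # start_crawler raises if the crawler is already running;
        # this sketch simply lets that error propagate.
        glue_client.start_crawler(Name=name)
        started.append(name)
    return started


def inferred_columns(glue_client: Any, database: str, table: str) -> List[Dict[str, str]]:
    """Read the column names and types the crawler stored for one catalog table."""
    response = glue_client.get_table(DatabaseName=database, Name=table)
    columns = response["Table"]["StorageDescriptor"]["Columns"]
    return [{"name": col["Name"], "type": col["Type"]} for col in columns]


def lambda_handler(event, context):
    # "crawlers" is a hypothetical event key listing crawler names to start.
    return {"started": start_crawlers(event.get("crawlers", []))}
```

From a CI/CD job, the list returned by `inferred_columns` could be written to a local file with `json.dump` once the crawler run has finished.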

  • @SurendraKapkoti
    @SurendraKapkoti 1 year ago

    Very informative and useful..🙏

  • @rakeshkadulkar5219
    @rakeshkadulkar5219 3 years ago

    Good and informative video for beginners. The pace of the content is just right. Thank you very much.

  • @jagadishdamodaran7602
    @jagadishdamodaran7602 2 years ago

    Great video Ajay . It's very clear and loud.

  • @Dean-Shepp
    @Dean-Shepp 1 year ago

    Thanks for the overview, this was really good.

  • @someusefullstuff
    @someusefullstuff 3 years ago

    How to create grok classifier for fixed width files?

  • @chitraalavanthar3729
    @chitraalavanthar3729 3 years ago

    How would you load a partitioned Postgres table into a data lake?

  • @whatssnots
    @whatssnots 2 years ago

    Great tutorial

    • @AjayWadhara
      @AjayWadhara 2 years ago +1

      Glad you liked it✅

  • @ashishdalvi2868
    @ashishdalvi2868 3 years ago

    Very nice explanation. Thanks !!

  • @khamkartejaswini8991
    @khamkartejaswini8991 2 years ago

    Very helpful for me . Thank you

  • @PraveenKumar-ic5zo
    @PraveenKumar-ic5zo 3 years ago

    everything was good except that horror sound in background..

    • @AjayWadhara
      @AjayWadhara 3 years ago

      😂 I will improve my bg music choice

  • @rohanrawat5386
    @rohanrawat5386 2 years ago

    Best video I have seen on Glue crawlers. Thanks for your efforts.

    • @AjayWadhara
      @AjayWadhara 2 years ago

      I am glad you liked the video. Don’t forget to subscribe.✅

  • @silpadas7070
    @silpadas7070 2 years ago

    can you do more advanced examples, please?

  • @ishankolapkar3100
    @ishankolapkar3100 2 years ago

    Very informative video. Where is the next part of this video?

    • @AjayWadhara
      @AjayWadhara 2 years ago

      I am glad you found this informative.🙌🏻
      Please check the next video on my channel.
      Don’t forget to subscribe to the channel ✅✅

  • @Snkhuntia172
    @Snkhuntia172 3 years ago

    Helpful video ..😀

  • @turabaliyev
    @turabaliyev 3 years ago

    Hi Ajay, I uploaded the CSV file to the S3 bucket and created a crawler. I see the table in the database, but when I try to preview the data I don't see any data in the table. Could you please let me know why?

    • @AjayWadhara
      @AjayWadhara 3 years ago

      Are you querying through Athena console?

    • @turabaliyev
      @turabaliyev 3 years ago

      @AjayWadhara yes

    • @AjayWadhara
      @AjayWadhara 3 years ago +1

      Check my Athena video, you will definitely get help from that

  • @Namaryop
    @Namaryop 2 years ago

    Really good tutorial, keep up the good work Ajay!

    • @AjayWadhara
      @AjayWadhara 2 years ago

      Thanks Namaryop😊
      Don’t forget to subscribe 🚀

  • @TheGuyDuffBand
    @TheGuyDuffBand 3 years ago

    Thanks!

  • @skadam3382
    @skadam3382 2 years ago

    Thanks for the detailed information bro ❣️

    • @AjayWadhara
      @AjayWadhara 2 years ago

      Always welcome. Do check my latest videos. Pretty exciting videos coming soon.

  • @guesswho2306
    @guesswho2306 2 years ago

    Great video, perfect explanation, loved it! Keep it up bro

  • @krishnakumar-gc6hb
    @krishnakumar-gc6hb 3 years ago

    Hello Ajay, I saw your LinkedIn post about the AWS data analytics certification.
    Please explain the detailed learning path you took to pass. Please make a video on this; it would be very helpful to anyone looking to take that exam.

    • @AjayWadhara
      @AjayWadhara 3 years ago +1

      I will upload a video on Sunday. Stay tuned for that. Subscribe ✅✅

    • @krishnakumar-gc6hb
      @krishnakumar-gc6hb 3 years ago +1

      @AjayWadhara okay Ajay

    • @krishnakumar-gc6hb
      @krishnakumar-gc6hb 3 years ago

      @AjayWadhara Any chance the video is coming?

    • @AjayWadhara
      @AjayWadhara 3 years ago +1

      Coming in next 3 hours

    • @krishnakumar-gc6hb
      @krishnakumar-gc6hb 3 years ago

      @AjayWadhara got it

  • @jineshwarnoraje568
    @jineshwarnoraje568 3 years ago +3

    Nice explanation of AWS Glue crawlers, it was very helpful... Thanks for that.
    If a column in the middle is deleted in the newest file compared with the earlier files, and the crawler modifies the schema, the deleted column is still present in the earlier files but the data gets shifted (I can see the data is misaligned). Is there any configuration in the crawler to validate the column names across the files in the S3 location?

    • @AjayWadhara
      @AjayWadhara 3 years ago +1

      Thanks for adding that jineshwar. I did not explain that in this video.✅✅

    • @jineshwarnoraje568
      @jineshwarnoraje568 3 years ago

      @AjayWadhara Is there any configuration to handle such scenarios?