8.2 Incremental data load in Azure Data Factory

  • Published 10 Nov 2024

Comments • 33

  • @shanthababu · 11 months ago +1

    Excellent approach. Please explain it thoroughly for learners.

  • @SantoshKumar-yr2md · 9 months ago

    You should explain from the very beginning, like when you created the table in your data source. Nice.

  • @teunjacobs5005 · 1 year ago +3

    Hi Cloud Guru!
    First of all, thanks for the clear explanation. It really helped me in creating an incremental data load in ADF.
    Nevertheless, I was wondering if you could make a video on making the tables that should be synced incrementally more dynamic. Right now I've hard-coded in the Lookup activity that it should look for a specific table. I have more tables that should be synced incrementally, so I assume I should start working with an iteration activity like a ForEach(?), so that I can define in a dedicated SQL table which table(s) should be synced incrementally and the pipeline picks them up automatically. Do you have any tips and tricks for things like this? Thanks in advance!
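
One way to approach the metadata-driven pattern asked about above is a control table that lists every table to be synced, read once by a Lookup activity and iterated by a ForEach activity. A minimal T-SQL sketch, with all object and column names hypothetical (etl.LoadControl, SourceSchema, SourceTable, WatermarkColumn, LastWatermark):

    -- Hypothetical control table that drives a Lookup + ForEach pipeline.
    CREATE TABLE etl.LoadControl (
        SourceSchema     sysname       NOT NULL,
        SourceTable      sysname       NOT NULL,
        WatermarkColumn  sysname       NOT NULL,                       -- e.g. a LastModifiedDate column
        LastWatermark    datetime2(3)  NOT NULL DEFAULT '1900-01-01',
        IsEnabled        bit           NOT NULL DEFAULT 1,
        CONSTRAINT PK_LoadControl PRIMARY KEY (SourceSchema, SourceTable)
    );

    -- Query for the Lookup activity (with 'First row only' unchecked);
    -- the ForEach activity then iterates over the returned rows.
    SELECT SourceSchema, SourceTable, WatermarkColumn, LastWatermark
    FROM   etl.LoadControl
    WHERE  IsEnabled = 1;

Inside the ForEach, the Copy activity's source query can then be built dynamically from @item().SourceTable, @item().WatermarkColumn and @item().LastWatermark.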

  • @saipraneeth1660 · 2 years ago +1

    Thank you for the video, sir. Yesterday I got the same question in an interview.

    • @cloudguru3178 · 2 years ago

      Hope my explanation helped you, Praneeth.

  • @davidjasonmusic · 2 months ago

    Hello, I need help. My data sources are APIs from BC, so what am I supposed to do with those datasets? Copy them into blob storage in order to filter by last modified date back in ADF? That makes no sense. What about the copy in to the blob itself? Do I just have to copy the whole fact table over every time? That can't be right. All these methods of incremental copy seem to start with a CSV in blob storage, but how do I even get to that point?

  • @chandandacchufan3242 · 1 year ago

    You should explain how you are updating the tbl_control table.
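
The video's exact tbl_control schema is not shown in this thread, so here is only a minimal sketch of the usual watermark-update step, assuming hypothetical columns (TableName, WatermarkValue) and a stored procedure that ADF calls after the Copy activity succeeds:

    -- Hypothetical watermark-update procedure; names are assumptions, not the video's.
    CREATE OR ALTER PROCEDURE dbo.usp_update_watermark
        @TableName    sysname,
        @NewWatermark datetime2(3)
    AS
    BEGIN
        SET NOCOUNT ON;
        UPDATE dbo.tbl_control
        SET    WatermarkValue = @NewWatermark
        WHERE  TableName = @TableName;
    END;

In the common pattern this is wired up as a Stored Procedure activity that runs after the Copy activity, passing in the maximum value of the source's last-modified column captured by a Lookup.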

  • @souranwaris142 · 1 year ago +1

    Hello Sir,
    I have a problem with the incremental load. I want to create an incremental pipeline from an on-premises Oracle server to Azure Data Lake (blob storage). I don't have Azure SQL; I just want to push the data into blob storage as a CSV file. In my case I'm confused about where I should create the watermark table and the stored procedure. Someone told me that in my case I have to use Parquet data. Please help me with this, I have been stuck for many days.

    • @sainathreddynandyal8761 · 1 year ago

      I ran into the same use case, brother. Can you tell me if you found a solution, please?
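
Not answered in the thread, but one common option when the sink is only blob/CSV is to keep the watermark table on the source side (or in any small relational store that is available) and have the pipeline update it after a successful copy. A rough Oracle-flavoured sketch, with every name hypothetical:

    -- Hypothetical watermark table kept in the Oracle source schema.
    CREATE TABLE etl_watermark (
        table_name      VARCHAR2(128) PRIMARY KEY,
        last_watermark  TIMESTAMP     NOT NULL
    );

    -- Source query for the Copy activity: only rows changed since the last run.
    SELECT *
    FROM   sales_orders
    WHERE  last_update_date > (SELECT last_watermark
                               FROM   etl_watermark
                               WHERE  table_name = 'SALES_ORDERS');

After the copy succeeds, the pipeline would update last_watermark to the maximum last_update_date value that was just loaded; the CSV or Parquet choice only affects the sink dataset format, not where the watermark lives.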

  • @suniguha · 2 years ago

    Very useful content. Thank you.

  • @trendy_techer2202 · 2 years ago

    Thank you so much, sir, for the valuable info.

  • @camvinh3522 · 2 years ago

    Why do we need a control table for the last update time, since we can get that info from the destination table?

    • @cloudguru3178 · 2 years ago

      What if loading to the destination table fails?

    • @camvinh3522 · 2 years ago

      @cloudguru3178 So it means that after we load successfully to the destination table we update the control table (if it fails we don't update it), and based on the control table we can handle the issue. Is that right?

    • @camvinh3522 · 2 years ago +1

      @cloudguru3178 Thank you for your support.

    • @cloudguru3178 · 2 years ago +1

      @camvinh3522 We update the control table whether the load fails or succeeds, but to get the maximum date for a table you should consider only the last successful load.
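
A minimal sketch of what the reply above describes: a control table that records every run with its status, and a watermark query that considers only successful loads. Table and column names are assumptions, not necessarily those used in the video:

    -- Hypothetical run-log style control table.
    CREATE TABLE dbo.tbl_load_log (
        TableName    sysname      NOT NULL,
        LoadEndTime  datetime2(3) NOT NULL,   -- high-water mark reached by this run
        LoadStatus   varchar(10)  NOT NULL    -- 'Succeeded' or 'Failed'
    );

    -- Watermark for the next run: taken from the last successful load only
    -- (shown for a hypothetical dbo.Orders table).
    SELECT MAX(LoadEndTime) AS LastWatermark
    FROM   dbo.tbl_load_log
    WHERE  TableName  = 'dbo.Orders'
      AND  LoadStatus = 'Succeeded';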

  • @harrydadson7467 · 1 year ago

    Can someone please assist me? I'm not able to get a Copy activity to use a stored procedure on a MySQL table in the sink settings. Please help. Thanks.

  • @muapatrick4174 · 1 year ago

    Is it possible to do these incremental loads on Parquet files on ADLS?

    • @cloudguru3178 · 1 year ago

      Hi Patrick, it is not possible with files.

    • @muapatrick4174 · 1 year ago

      @cloudguru3178 Thanks

  • @ssbeats677 · 2 years ago

    Hi sir, I have files in my source path and I have created a schedule trigger to run every hour. My issue is that duplicate files get copied from source to sink.
    E.g., last hour I had 10 files; before the next trigger I received 5 more files in my source path. When it triggers, all 10+5 files end up in the sink path.

    • @ssbeats677 · 2 years ago

      I need only the latest and modified files.

    • @cloudguru3178 · 2 years ago

      Kindly move the files to another location once they are loaded so that there are no duplicates.

    • @cloudguru3178 · 2 years ago

      If you would like to connect, call me on 9113070560.

  • @swaminoonsavath5363 · 2 years ago

    I am getting this error: "Expression of type: 'String' does not match the field: 'additionalColumns'". My source is ServiceNow and the destination is Azure SQL DB. Please help me with this.

    • @superyan4100 · 2 years ago

      Uncheck the 'First row only' checkbox.

  • @alexcarter-jones3251 · 2 years ago

    Great video! Have you ever set up an incremental load between an Oracle source table and a SQL sink table before? I am currently trying to do this but can only get my delta load to work when copying from SQL to SQL as you did in this video. Any guidance would be greatly appreciated :)

    • @cloudguru3178 · 2 years ago

      Hi Alex, please try this:
      @concat('select part of query ', 'WHERE si.schedule_date BETWEEN ', formatDateTime(variables('LastDate'), 'yyyyMMdd'), ' AND SYSDATE-1')

    • @alexcarter-jones3251 · 2 years ago +1

      Hi @cloudguru3178, thanks for replying and apologies for the delay in my reply. I managed to write my own working solution using a query script before a simple copy data pipeline, then an upsert on the ID column. The query script is as follows: 'select * from OracleSourceTable where LAST_UPDATE_TSTMP > (CURRENT_TIMESTAMP - 0.04)'. It looks very similar to your solution. I think the main issue I was having was trying to use templates that are only set up for Azure SQL DB to Azure SQL DB transfer.
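
For reference, a watermark-driven variant of the query above avoids relying on a fixed lookback window (0.04 days is roughly one hour, which assumes the trigger never slips). The table and column names are reused from the thread; the :last_watermark placeholder is hypothetical and would be supplied by the pipeline, e.g. concatenated into the query text with an expression like the @concat example above:

    -- Sketch only: incremental filter driven by a stored watermark instead of a fixed window.
    SELECT *
    FROM   OracleSourceTable
    WHERE  LAST_UPDATE_TSTMP > TO_TIMESTAMP(:last_watermark, 'YYYY-MM-DD HH24:MI:SS');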

  • @vasistasairam5483 · 2 years ago

    The explanation is very good, but while trying to implement the same I am getting errors, especially with the formulas. Kindly make the videos a little more detailed.

    • @cloudguru3178 · 2 years ago

      Let me know what error you are facing and I shall help you.