Complete Azure Data Factory CI/CD Process (DEV/UAT/PROD) with Azure Pipelines
- Published Oct 5, 2023
- This video goes over how to write code to package up and promote a DEV Azure Data Factory to a UAT Data Factory and PROD Data Factory.
Links:
-GitHub repo code: github.com/DataEngineeringWit...
-Data Factory automated publishing CI/CD documentation: learn.microsoft.com/en-us/azu...
-Npm Data Factory utilities package: www.npmjs.com/package/@micros...
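For reference, the automated-publish approach from the video boils down to build steps like the following sketch, based on the @microsoft/azure-data-factory-utilities package docs. The working directory, subscription ID, resource group, and factory name are placeholders you would replace with your own:

```yaml
# Sketch of build steps using @microsoft/azure-data-factory-utilities.
# Paths and resource names below are placeholders.
steps:
  - task: NodeTool@0
    inputs:
      versionSpec: '18.x'
    displayName: Install Node.js

  - task: Npm@1
    inputs:
      command: install
      workingDir: $(Build.Repository.LocalPath)/build  # folder with package.json
    displayName: Install npm packages

  # Validate all ADF resource JSON files in the repo
  - task: Npm@1
    inputs:
      command: custom
      workingDir: $(Build.Repository.LocalPath)/build
      customCommand: 'run build validate $(Build.Repository.LocalPath) /subscriptions/<subId>/resourceGroups/<rg>/providers/Microsoft.DataFactory/factories/<dev-adf-name>'
    displayName: Validate ADF resources

  # Generate the ARM template into the ArmTemplate folder
  - task: Npm@1
    inputs:
      command: custom
      workingDir: $(Build.Repository.LocalPath)/build
      customCommand: 'run build export $(Build.Repository.LocalPath) /subscriptions/<subId>/resourceGroups/<rg>/providers/Microsoft.DataFactory/factories/<dev-adf-name> "ArmTemplate"'
    displayName: Generate ARM template
```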
I really appreciate the layout and completeness of the .yaml files with the Azure Pipeline definitions. It gives an idea of how a real-life project could implement this (unlike the toy examples that are extremely popular in these kinds of tutorials).
absolutely stellar! Thank you for the clear instructions and examples. This really helped me understand!!
Great efforts and explanation!!!
Thanks a lot for sharing your knowledge
Great Content!
really great work
This is awesome, I would love one for Synapse as well as things are slightly different
Yoooo thanks
How would you get this to work with Azure Databricks as a linked service? The connection details aren't parameterised, so I can't pass in values such as the Databricks workspace URL.
Curious why there is no displayName in the .yml powershell sections?
How do I set pipeline arguments differently in dev, uat and prod?
can we use terraform script instead of using ARM templates?
If I use all the yaml files and other supporting files, do I still need to create the flow under Release under pipeline section of Azure Devops or do I need to still create Environment?
No, you don’t need a separate release pipeline flow. The yaml files do all the CI/CD required for ADF.
Did you create those yaml files manually or using ADF?
Using ADF, automatically. Then you just replace your connection/linked service/global parameters in the cicd/adf-cicd template parameter files for the uat and prod environments.
Can I use the Library Groups inplace of the KeyVault for subscriptions IDs or any variable which has the highest confidentiality?
Yes, you can use secret variables or library groups or whatever you’d like.
How to override global params when selecting "Include global parameters in ARM template"? Do I have to override parameters in AzureResourceManagerTemplateDeployment@3 somehow?
In my example I override them in the template-parameters.json files. For example, see the cicd/adf-cicd adf-prod-template-parameters.json and adf-uat-template-parameters.json files. I override the global parameters (default_properties_GL_STRING_value and default_properties_GL_NUMBER_value) in those files. Those are global parameters that are updated (different values) in each environment.
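For illustration, an entry in one of those environment parameter files might look like the following sketch. The factory name and values are placeholders; the global parameter names (default_properties_GL_STRING_value, default_properties_GL_NUMBER_value) are the ones mentioned above, as emitted into the generated ARM template:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "factoryName": { "value": "adf-uat-instance" },
    "default_properties_GL_STRING_value": { "value": "uat-string-value" },
    "default_properties_GL_NUMBER_value": { "value": 200 }
  }
}
```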
You can also override parameters in the AzureResourceManagerTemplateDeployment@3 (if you didn't want to use template-parameter files) using the overrideParameters input. For reference: learn.microsoft.com/en-us/azure/devops/pipelines/tasks/reference/azure-resource-manager-template-deployment-v3?view=azure-pipelines
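A minimal sketch of the inline-override alternative, assuming placeholder resource names and a published artifact containing the generated ARM template (the parameter names follow the generated template's naming convention):

```yaml
# Alternative to template-parameter files: override values inline.
# Service connection, subscription, resource group, and values are placeholders.
- task: AzureResourceManagerTemplateDeployment@3
  inputs:
    deploymentScope: Resource Group
    azureResourceManagerConnection: $(serviceConnection)
    subscriptionId: $(subscriptionId)
    resourceGroupName: rg-adf-uat
    location: eastus
    csmFile: $(Pipeline.Workspace)/adf-artifact/ARMTemplateForFactory.json
    overrideParameters: >-
      -factoryName "adf-uat-instance"
      -default_properties_GL_STRING_value "uat-string-value"
```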
Please suggest: if I need to run 10 pipelines every day at 2 pm, what is the best approach? Should we go with a schedule trigger or something else?
Schedule trigger.
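A single ADF schedule trigger can kick off all 10 pipelines at 14:00 daily. As a sketch (pipeline names and start time are placeholders, and the exact JSON shape should be checked against your factory's generated trigger files):

```json
{
  "name": "DailyAt2pm",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Day",
        "interval": 1,
        "startTime": "2023-10-05T14:00:00Z",
        "timeZone": "UTC",
        "schedule": { "hours": [14], "minutes": [0] }
      }
    },
    "pipelines": [
      { "pipelineReference": { "referenceName": "Pipeline1", "type": "PipelineReference" } },
      { "pipelineReference": { "referenceName": "Pipeline2", "type": "PipelineReference" } }
    ]
  }
}
```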
Yes I am interested to see how we can use linked templates for deployment. Could you please share it.
+1
Thanks for the comment. I've seen a couple of comments for this. Currently working on a linked template deployment for Data Factory video :)
Please! Thanks
Hi, I created this CI/CD pipeline a while back and it has been running fine without a single bug since it was deployed.
Now the ARM template size is more than 4 MB and I'm getting an error. Would you please create a video for that?
Yes, you'll need to use linked templates. I've seen a couple of comments about this and currently working on a video for it :)
I have tried this but it doesn't create the Global Params in UAT/PROD. Any idea what I could be doing wrong?
Did you by chance forget to check the "Include global parameters in ARM template" option in the Manage/ARM Template section of your dev data factory instance?
This is one of the best videos on this topic! I have one question and very high hopes that I will get my answer here.
Why is the dev ADF in the picture when the build is triggered on the main branch? Previously, in the manual process, when the dev branch was merged into main, one would "switch" to the main branch (i.e., the collaboration branch) in ADF to click the publish button. This ensured the ARM templates were generated from the main branch. However, the npm package method shown in the video uses the main branch as a trigger but also references the dev ADF. This confused me.
The reason I would like to know this is because we have a slightly different setup. Our main branch goes into the prod ADF. We have a feature branch for collaboration, as I have more than one data engineer working on their own dev branches. Once their code merges into the feature branch, we intend to deploy it to our test environment. Only upon successful testing will it be merged into main, which then triggers the prod deployment. Please help!
Great question. The build (packaging up the ADF code) is done from the repo (the main branch in this example) where all of the ADF JSON files are. From what I've seen, the DEV ADF /subscriptions/.../adf name in the "Validate And Generate ADF ARM Template And Scripts" step only helps set the default values/info for the artifacts (ARM template, etc.).
For example, I've tested completely deleting the actual DEV ADF (with the ADF JSON files still in the repo main branch) and the pipeline still builds it. I've also tested using a random ADF DEV name that doesn't exist and it still builds it using the repo JSON files but will use the default value in the ARM Template and files as the ADF name that doesn't exist.
So whatever your collaboration branch is (the feature branch you mentioned for example), as long as you checkout that branch in your build pipeline, you should be fine as that's where the code is. Then in a different pipeline (or different stage in the same pipeline) you can checkout the main branch and deploy to PROD separately. Hope this helps.
@dataengineeringwithnick7532 Thank you so much for getting back. I'm going to try it shortly and will let you know.
@dataengineeringwithnick7532 I'm starting to implement it and am stuck at one place. Do we really need to use the adf-uat-template-parameters.json and prod files? One thing I liked about the previous manual process was that I didn't have to worry about all the linked service details of my ADF (we have many ADLS, Lakehouse, SQL, KV). It autogenerated everything, and I used to enter override values for the parameters in the ADO UI. Is that not possible here? How can I achieve that without referencing these files at all? Thank you again! Please keep up the great work!
Could you please explain how to implement parameter replace like link service sql connection string?
Look at the cicd/adf-cicd folder; the uat and prod template parameter files are where you'd replace those values.
@dataengineeringwithnick7532 Do I need to create variables in DevOps?
I don't have a UAT environment. Can I use this for dev and prod only?
Yes you can. You would just remove the Deploy to UAT code in the pipeline.
@dataengineeringwithnick7532 Thanks. Did you also add the variables in the pipeline? The DevSubscriptionID and production ID?
@mainuser98 Those are actually secret variables (so the values don’t get logged/shown in plain text).
To create secret variables, in Azure Pipelines click on your pipeline then click edit and then variables.
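Once created in the pipeline UI, a secret variable such as DevSubscriptionID is referenced with macro syntax; note that Azure Pipelines does not map secrets into script environments automatically, so they must be passed explicitly. A sketch (the script path and variable names are illustrative):

```yaml
# Secret pipeline variables are referenced with $(name) macro syntax;
# map them explicitly into script steps via the env block.
steps:
  - task: PowerShell@2
    inputs:
      targetType: inline
      script: |
        # The secret value is masked in pipeline logs
        ./deploy.ps1 -SubscriptionId $env:SUBSCRIPTION_ID
    env:
      SUBSCRIPTION_ID: $(DevSubscriptionID)  # secret variable from the pipeline UI
    displayName: Deploy with secret subscription ID
```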
Do you have a Discord channel?
Error: ##[warning]Can't find loc string for key: Info_GotAndMaskAuth in the pipeline
This is just a warning from the Npm@1 task and has recently been updated by the Microsoft team (just not released yet). See here: github.com/microsoft/azure-pipelines-tasks/issues/20120 and here: github.com/microsoft/azure-pipelines-tasks-common-packages/pull/345. Code update should be in the next release (github.com/microsoft/azure-pipelines-tasks-common-packages/releases). Either way, it should resolve itself and there's no code changes needed on the ADF pipeline.
How to configure the service connection for uat and prod?
In the cicd/adf-cicd folder there are two files, adf-uat-template-parameters.json and the prod one. Replace your connection strings for each environment there. See the repo link in the description to get to the files.
@dataengineeringwithnick7532 Should I also add the service principal to the data factory as a member? Or just the connection strings in the files?
@dataengineeringwithnick7532 Should I do something in ADF with the service principal, like add it as a member? Or just the connection strings in the files? I am referring to 23:15, because you didn't show in detail what we should do with the Azure Resource Manager connection.
@mainuser98 The Azure Resource Manager connection (via a service principal) would need the RBAC Contributor role (or enough permissions to deploy an ARM template to a resource group) at the resource group level. You don’t need to add the service principal info to any of the template parameter files.
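As a sketch, granting that service principal Contributor at the resource-group scope with the Azure CLI could look like this (the object ID, subscription ID, and resource group name are placeholders; this requires an authenticated az session with sufficient rights):

```shell
# Grant the service connection's service principal Contributor
# on the target resource group (all IDs below are placeholders).
az role assignment create \
  --assignee "<sp-object-id>" \
  --role "Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/rg-adf-uat"
```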
@dataengineeringwithnick7532 Can you send me documentation on how to configure this?