Kevin is my fav channel now. Thank you for teaching us.
Kevin, it's my second video, and it's awesome the way that you show things step by step. Thanks a lot
Thank you Kevin for the amazing playlist on Azure ML
Nice explanation, keep going, good luck for the future
Thank you for the nice video, I am currently using it as an overview to prepare for the Azure Data Scientist certification! I think a slight mistake occurred in the explanation of high cardinality features at around 13:00 - from my understanding at least, high cardinality features are features with many unique values relative to the dataset size. For example, a bunch of unique IDs, which would have no predictive power and might trip up some types of ML models. Your explanation is still positioned correctly, since the relevant AutoML setting detects "high cardinality or no variance features", and your example sounded like a no/low variance feature. I'd post a reference to the relevant Microsoft doc but I don't want to risk my comment being auto-blocked for external links. 😇
Thanks for the correction. You are right that I said high cardinality (and that high cardinality is for things like GUIDs). In this case, that was an example of a "no variance" feature and even though the checkbox is the same for eliminating both, it was unclear.
For reference, here are the details on automatic featurization: learn.microsoft.com/azure/machine-learning/how-to-configure-auto-features?view=azureml-api-1#automatic-featurization
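To make the distinction in this exchange concrete, here is a minimal sketch that labels a categorical column as "no variance" (one distinct value) or "high cardinality" (nearly one distinct value per row, like GUIDs). The function names, the sample columns, and the 0.9 ratio threshold are all assumptions for illustration; this is not how automated ML implements its check, and it only makes sense for ID-like or categorical columns, since a continuous numeric feature would also have many distinct values.

```python
from typing import Dict, Sequence


def profile_column(values: Sequence[object]) -> Dict[str, float]:
    """Return the row count, distinct-value count, and their ratio."""
    n_rows = len(values)
    n_unique = len(set(values))
    return {
        "n_rows": n_rows,
        "n_unique": n_unique,
        "unique_ratio": n_unique / n_rows if n_rows else 0.0,
    }


def classify_column(values: Sequence[object], high_ratio: float = 0.9) -> str:
    """Label a column as 'no variance', 'high cardinality', or 'ok'.

    The 0.9 threshold is an illustrative assumption, not a value taken
    from automated ML.
    """
    stats = profile_column(values)
    if stats["n_unique"] <= 1:
        return "no variance"
    if stats["unique_ratio"] >= high_ratio:
        return "high cardinality"
    return "ok"


# Hypothetical sample data: an ID column, a constant column, a normal category.
rows = {
    "customer_id": ["a1", "b2", "c3", "d4", "e5", "f6"],  # unique per row
    "country": ["US", "US", "US", "US", "US", "US"],      # constant
    "plan": ["free", "pro", "free", "pro", "pro", "free"],
}
labels = {name: classify_column(col) for name, col in rows.items()}
print(labels)
```

Both flagged kinds of column carry little usable signal, which is presumably why a single setting covers them, but they sit at opposite ends of the distinct-value scale.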
These are great videos - short, concise, fast moving with no filler.
Unfortunately I'm kinda struggling with Azure ML, frankly not sure what it brings to the table. Seems easier and faster to train locally and serialize the model, wrap it up in an API with Flask, and deploy as a Docker container.
That's a great question. I've added it to my backlog for a video to give it a full treatment. In the meantime, the quick version is this.
Azure ML doesn't offer anything you absolutely cannot get anywhere else. What it does offer is the same capabilities in a form that is easier to use if you are not already deeply familiar with the entire end-to-end machine learning process.
If you don't have a solid data processing foundation, AML offers the concepts of datastores and data assets, letting you track and manage data.
If you don't have the hardware available, AML makes it easy to build data science VMs (compute instances) and inter-linked training machines (compute clusters). We can also get access to GPU-based machines for these clusters.
If you're not strong at the data science side of things, training/scoring models via automated ML or a drag-and-drop interface is available.
If you're not running MLflow locally, AML can act as a model registry and packaging solution.
If you're not strong at API development, AML gives you a pretty straightforward framework for that.
If you're not strong at container and server administration, AML containerizes and deploys container images, and then Azure Kubernetes Service (AKS) or Azure Container Instances (ACI) can host them.
And if you are strong at all of those, AML gives you an interface where you can track all of this stuff in one location, rather than a dozen different tools and web portals.
One thing I haven't gotten into in the series so far is how easy it is to access an Azure ML model from Power BI, allowing you to perform inference on datasets in Power BI. You can, of course, do this in other ways as well, including with your own deployed APIs, but this is an area where AML makes it much easier. There are some other integration points in Azure services (like Azure Stream Analytics, Azure Synapse Analytics, and Microsoft Fabric), making it easier to tie into those services.
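For comparison, the do-it-yourself route raised in the question (train locally, serialize the model, wrap it in an API) might look roughly like the sketch below. It is framework-agnostic on purpose: the toy one-feature regression, the function names, and the JSON body shape are assumptions for illustration, and a Flask route would simply pass the request body to something like score(). Everything around this (tracking, registry, packaging, hosting) is the part Azure ML would otherwise handle.

```python
import json
import pickle
from typing import Dict, List


def fit_line(xs: List[float], ys: List[float]) -> Dict[str, float]:
    """Ordinary least squares for one feature; returns slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def score(model_bytes: bytes, request_body: str) -> str:
    """Deserialize the model and score a JSON body like {"data": [1.0, 2.0]}."""
    model = pickle.loads(model_bytes)
    inputs = json.loads(request_body)["data"]
    preds = [model["slope"] * x + model["intercept"] for x in inputs]
    return json.dumps({"predictions": preds})


# Train locally and serialize the model (here, just its parameters).
model_bytes = pickle.dumps(fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]))

# Wrap it in an API: a web framework route would hand the request body to score().
response = score(model_bytes, json.dumps({"data": [4.0, 10.0]}))
print(response)
```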
@@KevinFeasel dude, you're a rockstar. Thanks for the response and I'm going to binge on the rest of your Azure ML videos this weekend.
Have you ever used the common data model in the azure ml studio? We recently integrated our data with the common data model (cdm) in azure data lake gen2. The docs say we can leverage azure machine learning with the cdm but I do not see an option to import the cdm in ml studio under the data assets or datastores.
I am not particularly familiar with the Common Data Model, so the best I can give you is a semi-educated guess. There's almost no information out there on how to *use* Azure ML with CDM-style data, but I could see the process working by registering the ADLS gen2 storage account as a Datastore, and then the Entity folders (using the example at learn.microsoft.com/en-us/common-data-model/data-lake) as uri_folder asset types. I'm not sure if there is an easier way to use the CDM metadata itself with Azure ML, unfortunately.
@@KevinFeasel thanks for the reply. I was thinking the same thing, registering the cdm folder as a uri_folder asset. Thanks again
Kudos to you. Very nice explanation. Can you also provide an example video for an object detection problem? I want to learn how Azure ML handles object detection problems. (Computer Vision, Images-Label, Problems)
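As a rough sketch of the approach discussed here, the snippet below builds one uri_folder data asset definition per CDM entity folder, pointing at a datastore that is assumed to be registered already against the ADLS Gen2 account. The datastore name, root path, and entity names are hypothetical. The dictionary keys mirror the name, version, type, and path fields of a data asset definition, and the path uses the azureml://datastores/<name>/paths/<path> URI form; nothing is registered by this code, it only prepares the definitions.

```python
from typing import Dict, List


def datastore_uri(datastore_name: str, relative_path: str) -> str:
    """Build an azureml:// URI for a path inside a registered datastore."""
    return f"azureml://datastores/{datastore_name}/paths/{relative_path.strip('/')}"


def entity_folder_assets(
    datastore_name: str, cdm_root: str, entities: List[str]
) -> List[Dict[str, str]]:
    """Return one uri_folder data asset definition per CDM entity folder."""
    root = cdm_root.strip("/")
    return [
        {
            "name": f"cdm-{entity.lower()}",
            "version": "1",
            "type": "uri_folder",
            # Trailing slash marks the path as a folder rather than a file.
            "path": datastore_uri(datastore_name, f"{root}/{entity}") + "/",
        }
        for entity in entities
    ]


# Hypothetical datastore name, CDM root folder, and entity names.
assets = entity_folder_assets("adls_cdm", "cdm/sales", ["Account", "Contact"])
for asset in assets:
    print(asset)
```

Each definition could then be saved as a data asset YAML file or passed to the SDK; the CDM metadata files themselves are not interpreted here.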
Thank you for the comment. I do have an image classification problem coming up, though it will be a little while before we get to it.
I am a complete noob with Azure and had a hard time loading the dataset. For anyone else struggling:
Select the Tabular datatype (Azure ML v1 APIs), then provide the URL at the next step. I mistakenly tried the Table format and File as well. Could not get them working. On that note, how can I delete datasets in Azure ML? It only gives me the option to archive them. Thanks!
Thanks for the note. Yes, the v2 Table type is a bit different from v1 Tabular and it makes use of the MLTable Python SDK to generate a file in a special format. I haven't spent much time working with that format, as I tend to focus on v1 Tabular or v2 Folder types.
As for deleting datasets, you typically don't do that because then, you'd lose the data associated with prior experiments and job runs. Archival is useful for hiding it away, so you don't have old versions of datasets cluttering the Data assets page. But if the concern is that you don't want to spend the money on file storage (especially if the files are large), you can delete a data asset's contents by navigating to the Data menu under Assets in Azure ML. Then, select your archived data asset--you may need to switch the "Include archived" toggle on the right-hand side of the menu bar that contains Create, Refresh, Archive, and Reset view. Select your data asset, and in the bottom-right box, labeled Data sources, there's a gray box that contains a link: "View in Azure Portal." That link will take you to where the actual file is located in Azure Blob Storage, and you can delete the file from there. Then, toggle off the "Include archived" toggle in Azure ML Studio and it'll be out of sight.
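For anyone curious about the "special format" behind the v2 Table type mentioned above: an MLTable asset is a folder containing a YAML file named MLTable that lists the data paths and the read transformations. The sketch below writes such a file by hand for a single CSV. The field layout reflects my understanding of the documented format and should be checked against the current MLTable reference; the folder and file names are hypothetical, and in practice the mltable Python package would normally generate this file for you.

```python
import tempfile
from pathlib import Path

# Assumed minimal MLTable layout for one delimited file; verify against
# the current MLTable schema before relying on it.
MLTABLE_TEMPLATE = """\
type: mltable
paths:
  - file: ./{csv_name}
transformations:
  - read_delimited:
      delimiter: '{delimiter}'
      header: all_files_same_headers
"""


def write_mltable(folder: Path, csv_name: str, delimiter: str = ",") -> Path:
    """Write an MLTable file next to the CSV it describes and return its path."""
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / "MLTable"  # The file has no extension.
    target.write_text(
        MLTABLE_TEMPLATE.format(csv_name=csv_name, delimiter=delimiter),
        encoding="utf-8",
    )
    return target


# Hypothetical folder and CSV name, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    mltable_path = write_mltable(Path(tmp) / "asset", "data.csv")
    mltable_text = mltable_path.read_text(encoding="utf-8")

print(mltable_text)
```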
Awesome video, can you please show how to deploy an automated Azure ML solution to an endpoint? (to be used in Power BI, for example)
I have a video coming up on using Power BI with Azure ML and it should launch within the next few weeks. It doesn't cover automated ML endpoints but I will dig into this scenario and see if there are any differences between a regular Azure ML solution and an automated ML solution as far as Power BI is concerned. If there is a difference, I'll create a video on that scenario as well.
Great
Oh no it seems like the setup steps have changed in Automated ML :(
It does look like they've shuffled around the UI a bit, though in a quick review, the components do look to be the same (mostly) when you look across all of the dialog options for the old UI and the new one. There's an additional option in Compute for serverless compute instead of using an existing compute instance or cluster, but that appeared to be the biggest single change.
@@KevinFeasel I see. I'm a Power BI developer but a complete novice who is tinkering with this at work. I would really love a video showing the new UI features, and also a newbie's guide for training a model in Automated ML and then deploying it.
FTR your channel is very helpful so far, by far the most comprehensive that I've come across on Azure ML Studio
That sounds like a good idea. I'll add it to my backlog.