This looks so powerful. Thanks for sharing! As a fellow beginner to intermediate Python enjoyer, I would definitely love a more in-depth video of The Wrangler!
We need more low-code, no-code tools! With visual, step-by-step data preparation. I wish there was a tool like SAS Enterprise Guide in Microsoft's basket of tools.
I love getting more low-code/no-code tooling. And also visually seeing the outcome. I'm excited to see the improvements on the Data Wrangler side of things.
Loved the video! Definitely a useful tool that I hope to use more often. Would love more videos like this.
@GuyInACube Last year I studied Alteryx for a while. If memory serves me, that tool is designed to perform what is being described in this video. MSFT Fabric Data Wrangler returns the code, as well as cleansed data, and I don't recall seeing that in Alteryx, and the user experience seems to be more magical and enjoyable. So impressive!
Patrick, I love the image from your camera. Sharpness, contrasts, background blur. It's perfect. It's easy to see when you're talking to other people who don't have such a pretty picture. What camera/lens are you using?
Love it.
Hoping to work with Microsoft Fabric in the future.
This is pretty cool but I can't help thinking this would create a slower process than what most modern analytics departments/teams would do. And maybe I am biased because I am a career software engineer and architect but also do extensive backend development and engineering.
For starters, someone with little coding experience would avoid Python altogether and should be directed towards friendlier tooling such as the existing Power BI framework and Power Query.
Someone supporting data scientists, or the data scientists themselves, would work through Databricks or strictly Python environments, because all the visuals are easily and quickly disseminated and extrapolated through inference.
I guess I don't really understand who this is marketed to or what's the advantage of the tooling here.
It's kinda cool in a way but I'm up a crick without a paddle as to who the actual user would be
Well, this is exactly what I was hoping for. I'm delivering DA to clients' environments, and often they lack the tooling. Usually all they have is a Microsoft stack (enterprise license). So I need a way to deliver and run my Python notebooks in their environment. Power Query is often not enough for my tasks. It's been a big challenge for me. Setting up a data factory and training the client on it is way too time-consuming and doesn't lead to good results. This is great for me, as it would allow me to deliver all my notebooks to the client's environment, and this step-by-step approach is way more palatable for a client.
As a PBI and Power Query guy the thing that jumped out for me from this video was the ability to take the created steps back into the notebook. Maybe I have missed this on another video but I wonder if we can do a similar thing now in Fabric with Power Query?
By this I mean: we create our applied steps in Power Query, and where previously we would right-click to get the native SQL query and copy it into a Synapse workspace to create a view, is there now a menu option or a more streamlined way of doing that inside Fabric?
"Create view and add to lakehouse" type button would be ideal ?!
Great, great comment. Instead of copying and pasting Power Query steps into something like a dataflow, it's an instant button press?
Cool, I'm going to have to get more involved with Python.
Please comment more on how this is related to Python!! Like what Python can do vs. what Wrangler can do. For all Python-noobs like me. Thank you! Also do we need a license to try this? What is the data limit with this tool? Larger than Power Query?
This is great. Can we automate Data Wrangler with Power Automate, or use it with some orchestrator? What if I want the Wrangler to run the notebook every time my data changes?
Hey Alex, your Data Wrangler steps would show up as notebook code, and you can just schedule the notebook to run regularly using the UI. That would result in effectively the same experience, I believe! :)
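For anyone wondering what "show up as notebook code" might look like, here is a rough sketch in plain pandas. It is not actual Data Wrangler output: the `clean_data` function, the `customer` and `amount` columns, and the sample rows are all made up for illustration. The idea is just that each visual step ends up as one ordinary line of Python you can rerun or schedule.

```python
import pandas as pd


def clean_data(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning steps, one line of pandas per visual step."""
    # Step 1: drop rows where the (made-up) 'amount' column is missing
    df = df.dropna(subset=["amount"])
    # Step 2: trim whitespace and normalise capitalisation in 'customer'
    df = df.assign(customer=df["customer"].str.strip().str.title())
    # Step 3: remove rows that are now exact duplicates
    df = df.drop_duplicates()
    return df.reset_index(drop=True)


# Made-up sample data standing in for whatever the notebook loads
raw = pd.DataFrame(
    {
        "customer": [" alice ", "BOB", "BOB", "carol"],
        "amount": [10.0, 20.0, 20.0, None],
    }
)

cleaned = clean_data(raw)
print(cleaned)
```

Because the result is a normal function in a normal notebook cell, scheduling the notebook simply reruns those same steps on fresh data.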
@@justynalucznik2491 It would, but I'm just cautious of resources, as I'd need to run this notebook very often. Realistically I'd need it to run only when something in an Excel file is changed by a user, which happens maybe once or twice per day at random times. However, the report needs to reflect changes ASAP; it can't wait for an hour.
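One possible way to keep frequent runs cheap, sketched below, is to have the scheduled job first check whether the file actually changed and skip the heavy work if it did not. This is a generic standard-library pattern, not a Fabric or Power Automate feature; `file_digest`, `has_changed`, and the file names are hypothetical, and how the check gets triggered is left out.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def has_changed(path: Path, state_file: Path) -> bool:
    """Compare the file's digest with the one stored from the last run.

    Stores the new digest and returns True when the content differs
    (or when no digest was stored yet); returns False otherwise.
    """
    current = file_digest(path)
    previous = state_file.read_text() if state_file.exists() else None
    if current == previous:
        return False
    state_file.write_text(current)
    return True


# Demo with a temporary stand-in for the Excel file
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "input.xlsx"
    state = Path(tmp) / "input.sha256"

    source.write_bytes(b"version 1")
    first = has_changed(source, state)   # first run: nothing stored yet

    second = has_changed(source, state)  # same content as before

    source.write_bytes(b"version 2")
    third = has_changed(source, state)   # content was edited

    print(first, second, third)
```

The cleaning code would then run only when `has_changed` returns True, so a check every few minutes costs little more than reading the file.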
If Python has all of these ETL tools, why would you use Python over Power Query? I'd imagine you'd use Data Factory for all ETL, then only use Python for data science.
I'm just starting to learn Python myself, so I may have misunderstood how it's used on Fabric.
There are a lot of different tools for similar types of activities. Traditionally the Python approach has been used by data scientists, but the tools are there for others to use as well. I think it's an interesting opportunity for folks on the citizen developer front, who could start leveraging Notebooks and Python by way of The Wrangler. 🙂
The traditional data warehousing approach would be to use Data Factory / Pipeline for your ETL. If that's what you are comfortable with, go for it! It works. I'd also say that a citizen developer persona wouldn't really know that approach either.
So, different options to get to your end goal.
@@GuyInACube Thank you for your reply, much appreciated :)
Any future plans to work with Polars dataframes?
It's currently not on our roadmap. Next up is Spark dataframe support and generation of PySpark code. I encourage you to pitch this in the Fabric Ideas site where others can upvote your idea. We actively monitor those ideas as we further develop experiences: ideas.fabric.microsoft.com/
@@nellieGson Polars is the fastest growing dataframe library, it's just a matter of time until it surpasses pandas.
Patrick is too cool.
Yes, could you please do and share a detailed video on Data Wrangler?
Thanks for that! 👊
Why can't we just use the Power Query style interface and output any code we want from the steps (Python, SQL, M) for whatever purposes we want? Line-by-line coding should have died in the 1980s. Why, in 2023, when we have a task to do with data, are we still asking people to learn multiple coding languages? It's just dumb.
geez.... yet another drag and drop point and click tool....