I'm definitely sharing this channel. Amazing practical, no nonsense content. Thank you for the fantastic tutorials!
Awesome, thanks Dave
Wyn is probably the closest thing to an Excel wizard that I've ever come across, the guy literally does magic with a piece of software, well done.
Hah! Too kind Anthony ☺️
This tip about referencing from a starting folder is huge!
Yeah that's a good one. Ken Puls (Skillwave) is a big proponent of that technique
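For illustration, a minimal Power Query (M) sketch of that starting-folder technique — connect with SharePoint.Contents and drill straight into the folder you need rather than listing every file on the site. The site URL and folder names here are placeholders, not the ones from the video:

```
let
    // Connect at the site level, then navigate straight to the starting folder
    Source = SharePoint.Contents("https://contoso.sharepoint.com/sites/Finance", [ApiVersion = 15]),
    Documents = Source{[Name = "Shared Documents"]}[Content],
    // Only the contents of this folder are listed, not every file on the site
    SalesFiles = Documents{[Name = "Sales"]}[Content]
in
    SalesFiles
```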
Thank you for sharing this tip, I've been frustrated trying to find this solution.
I appreciate you taking the time to let me know you found it useful
Thanks for making such a valuable video, really like it. I had a similar problem and used the incremental refresh feature to address it. Since SharePoint isn't a foldable query source, we just need to add a Date/Time column before all the transformation and data collection steps for it to work.
Thanks Khalid. An incremental refresh option is on my video "to-do" list. Good one.
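As a rough sketch of the idea Khalid describes (not the exact steps from the comment): keep a Date/Time column early in the query and filter it between the RangeStart and RangeEnd parameters that the incremental refresh policy supplies. RangeStart and RangeEnd must already be defined as DateTime parameters; the site URL and the choice of "Date modified" as the partitioning column below are assumptions:

```
let
    Source = SharePoint.Contents("https://contoso.sharepoint.com/sites/Finance", [ApiVersion = 15]),
    Files = Source{[Name = "Shared Documents"]}[Content],
    // Make sure the partitioning column is typed as datetime
    Typed = Table.TransformColumnTypes(Files, {{"Date modified", type datetime}}),
    // Filter against the incremental-refresh parameters before the heavy transformations
    Filtered = Table.SelectRows(Typed, each [Date modified] >= RangeStart and [Date modified] < RangeEnd)
in
    Filtered
```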
Having an archive file would also help with the file and folder renaming issues when each FY is saved to its own folder system. Thanks for showing these various approaches! I have a fair few files I now need to review and tweak! Thanks Wyn
You’re welcome Grainne. Thanks for the ongoing support
Just getting on board with dataflows now. Oh good, I'm only a year behind... Excellent video, as always.
Don't worry, you're at least 5 years ahead of most!
I've been using your videos in my work and the practical tips surely help! They are hard to find on the internet.
Cheers
Very good techniques, I've visited this video a few times. I liked the SharePoint.Contents approach, the split folder, exporting via Bravo, importing the archive file and appending the query.
I appreciate you taking the time to let me know you found it useful
Very good solution, will definitely use the dataflows! Thanks!
Cheers
I just found your channel and it's very helpful! Thank you for your content.
Thank you, glad it has helped
Thanks Wyn, very helpful and detailed tutorial.
You’re welcome, thanks
Great information Wyn! Thanks for sharing. Thumbs up!!
Cheers Wayne
Amazing, thanks a lot for your video. It helped me resolve an issue I was facing getting data from SharePoint folders.
Fantastic to hear, thanks for taking the time to let me know
Amazing video as always Wyn!
Thanks Fizek
We do that even with Oracle, as we have millions of records and the historical years are slow. So we have one query that refreshes daily, and the history parameters are updated once every quarter. So this is a pretty neat trick even beyond SharePoint. Also, if you leverage Premium, everything is done and ready in one place.
Good to know Khaled
Great, I learned much from this. Thank you so much
You’re welcome
Great video as always Wyn! Easily my favourite Power BI / Query sub 😊
This is very timely, as I have a use case generating >25k rows of monthly data, with an archive requirement of a 3-year trend. Several constraints exist, the primary one being no SQL, hence the data source being SharePoint.
I'm not sure whether dataflows are enabled, but if not, the CSV technique looks great. As you say, it's easy enough to perform that extraction in DAX Studio as well.
Cheers Wyn and happy Easter! 🐣
Cheers for the support 😀
When loading data, I would connect to the SharePoint site and then load the data. After loading, you can filter the folders in the right-most column to find the folder you are looking for, instead of changing to the SharePoint.Contents function.
Sticking with filtering (using SharePoint.Files) results in a slower refresh. The more files and folders on your SharePoint site, the longer your refresh will take using that method.
@@AccessAnalytic thanks for that info
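For comparison, a hedged sketch of the SharePoint.Files approach discussed above — it enumerates every file on the site and then filters, which is why refresh slows down as the site grows (contrast with the SharePoint.Contents navigation shown earlier). The URL and folder path are placeholders:

```
let
    // Lists every file on the whole site before any filtering happens
    AllFiles = SharePoint.Files("https://contoso.sharepoint.com/sites/Finance", [ApiVersion = 15]),
    // Filter down to one folder afterwards — simple, but the full listing has already been paid for
    OneFolder = Table.SelectRows(AllFiles, each Text.StartsWith([Folder Path], "https://contoso.sharepoint.com/sites/Finance/Shared Documents/Sales/"))
in
    OneFolder
```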
This is amazing. I am having the exact same problem as you stated. Receiving several files every day, it takes almost 20 minutes to refresh the data as the project grows larger.
Thank you so much for sharing! The dataflow solution is amazing!!
Is there a way to automate the process? For example, to take the last five files in the Desktop file and everything but the last five in the dataflow?
Thanks, you could potentially filter on modified date, it's a bit risky though, plus you'd still need to refresh the dataflow each time so not really worth it (I don't think). I'd just set up a separate folder for the dataflow source, put the files in there occasionally and refresh. You could use Power Automate to trigger the dataflow refresh as soon as files are added to the folder.
@@AccessAnalytic thanks for your response. I was thinking about using Power Automate to make the transition between folders every ten days or so. I made a copy of my Desktop file, given that I have too many queries to copy, and with this copied pbix I did the same thing you did in dataflows, but in Desktop, publishing the new dataset. Thanks for the inspiration!
No worries
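A rough illustration of the "filter on modified date" idea from the reply above — keep only the five most recently modified files in the Desktop query and leave the rest to the dataflow. As noted, this is risky if files get touched out of order; the site URL and folder names are placeholders:

```
let
    Source = SharePoint.Contents("https://contoso.sharepoint.com/sites/Finance", [ApiVersion = 15]),
    Incoming = Source{[Name = "Shared Documents"]}[Content]{[Name = "Incoming"]}[Content],
    // Newest files first, then keep just the five most recent for the "current" query
    Sorted = Table.Sort(Incoming, {{"Date modified", Order.Descending}}),
    LatestFive = Table.FirstN(Sorted, 5)
in
    LatestFive
```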
Tremendous again Wyn. Diolch yn fawr always.
Question: the CSV download of tables with Bravo. Does that export the table as it appears in Power Query, or is it the table from the model / Power BI front end, including any calculated columns etc.?
Diolch, it will be the table from the data model, and I'm pretty sure it will include calculated columns.
Thank you for sharing. As always, very helpful!
You’re welcome arnau, thanks for the kind comment
I just saw that at minute 4:08 some transformations are done, but it looks like you edited it because it doesn't show how you handled the custom function. In my case I do a lot of transformations before it's ready to combine tables.
Hi, no editing, the helper functions get auto-created. Take a look at my explanation here: th-cam.com/video/nPlrQUbEn4o/w-d-xo.html
What a great video! The first approach, appending historical data in PQ: when working with large datasets, it still takes a long time to load that historical data with every refresh (we disable it in its own query, but it's still being pulled through the appended table). Is there a way to load historical data once, and then disable it for future refreshes? Maybe skipping the append part would help, but then the historical/current data is split into two tables...
With files/folders it will always pull all the data in each time. Gilbert has a blog article that I've never tried out: www.fourmoo.com/2020/06/10/how-you-can-incrementally-refresh-any-power-bi-data-source-this-example-is-a-csv-file/
@@AccessAnalytic If it is pulling all the data in each time, then how is it doing it quickly vs one single query? I have also faced this problem with appending queries, but it did not solve the issue of heavy refresh time. Appending with DAX is the solution I ended up with, after loading both queries into the model and disabling refresh on the historical data query. It would be really interesting to know how your method reduces the refresh time, and whether it would be applicable to large data files as well using the append query method.
I’ve not tried Gilbert’s approach so I’m not sure. DAX append sounds an interesting approach
@@AccessAnalytic We have been using DAX append for several years now. Though it bloats the model a bit, refreshes are lightning fast...
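For reference, a minimal sketch of the first approach discussed in this thread: a one-off archive file appended in Power Query to a query that reads only the current files. The file path and the CurrentFiles query name are placeholders, not from the video:

```
let
    // One-off CSV export of the historical data (e.g. produced via Bravo or DAX Studio)
    Archive = Csv.Document(File.Contents("C:\Data\SalesArchive.csv"), [Delimiter = ",", Encoding = 65001, QuoteStyle = QuoteStyle.Csv]),
    ArchivePromoted = Table.PromoteHeaders(Archive, [PromoteAllScalars = true]),
    // CurrentFiles is assumed to be a separate query that reads only this year's folder;
    // its column names must match the archive for the append to line up
    Combined = Table.Combine({ArchivePromoted, CurrentFiles})
in
    Combined
```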
Hello Wyn, do you get the same performance if you do it from OneDrive? Thanks
Hi Osiel, you should do. OneDrive is your personal SharePoint.
This is to echo your approach. I did that in my work and it saved a really large amount of refresh time (there are 100+ files with more than 10M rows of records).
Ideally, incremental refresh should be the best way in my opinion... however, I just don't have that in my workspace... :(
So this technique is a good alternative in real life.
Thanks for the confirmation
@@AccessAnalytic Just to add one point: it works well when the original data files are Excel. If the original data files are already CSV, it won't make a big difference (at least in my case).
@@wmfexcel OK, thanks
Thank you
You're welcome
It throws up an error when consolidating sheets: "We didn't recognize the format of the first file (). Please filter the list of files so that it contains supported types." However, if we use SharePoint.Files instead of SharePoint.Contents, this works. Any suggestions?
Not come across that. Is it purely .xlsx files in the folder?
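One possible cause worth ruling out (a sketch, not a confirmed fix): with SharePoint.Contents, subfolders and other non-workbook items can end up in the file list, and Combine Files then fails with exactly that message. Filtering to genuine .xlsx binaries before combining may help; the site URL and folder names are placeholders:

```
let
    Source = SharePoint.Contents("https://contoso.sharepoint.com/sites/Finance", [ApiVersion = 15]),
    Folder = Source{[Name = "Shared Documents"]}[Content]{[Name = "Sales"]}[Content],
    // Keep only rows whose Content is a binary .xlsx file — subfolders and
    // unsupported item types are what trip up the Combine Files step
    XlsxOnly = Table.SelectRows(Folder, each [Extension] = ".xlsx" and [Content] is binary)
in
    XlsxOnly
```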
Could you please share how to resolve the Combine Files "didn't recognize the format" error? My files are all .xlsx but it fails in Power Query in Power BI.
I’d suggest posting some screenshots and explanations here aka.ms/excelcommunity for some help on that