I know at one point, we'd saved them as templates just to have the versions without all of the data, but still can't "diff" those. I appreciate the tips here and look forward to true versioning/diffing at some point. :)
I found this to be a big challenge. No easy solution. We have settled on ALM Toolkit. Considering that Microsoft has not shown any intention in simplifying the model structure to help "diff", I find that the ALM Toolkit does a decent job.
As some others have stated, *.pbix files aren't technically binary files; they're just re-badged zip files. If you change the extension and open one, basically everything inside (aside from actual binaries like images) is stored as plain text and is easily diff-able.
That said, there are ways to effectively use source control on pbix with some automation using pre-commit and post-commit hooks. The pre-commit hook would unpack the pbix into a folder of the same name, and THEN push those component pieces to source. The post-commit hook would then re-bundle what it gets from source control into the appropriate pbix (zip) archive. I have been looking to implement this at my workplace for some time and haven't done so yet, but there are examples out there of people who have done it.
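For anyone who wants to try it, here is a minimal sketch of the pre-commit side in PowerShell. The file handling is illustrative only, and be warned that a PBIX re-zipped by generic tooling does not always reopen cleanly in Power BI Desktop, so test the post-commit repack carefully before trusting it:

```powershell
# pre-commit hook body: unpack each staged .pbix into a same-named folder
# and stage the unpacked pieces alongside it. PowerShell 7+ and git assumed.
$staged = git diff --cached --name-only --diff-filter=ACM | Where-Object { $_ -like '*.pbix' }
foreach ($pbix in $staged) {
    $dest = $pbix -replace '\.pbix$', ''
    if (Test-Path $dest) { Remove-Item $dest -Recurse -Force }
    # Expand-Archive insists on a .zip extension, so work from a renamed copy.
    Copy-Item $pbix "$pbix.zip" -Force
    Expand-Archive -Path "$pbix.zip" -DestinationPath $dest -Force
    Remove-Item "$pbix.zip"
    git add -- $dest
}
```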
Mostly this whole situation is infuriating to me because Microsoft could provide solutions to this source control problem very easily within their own framework. Especially given they OWN GitHub and could VERY EASILY just add an (optional) feature to auto-unpack and re-pack PBIX files for source control.
I've played with a different approach in GitHub Actions. On every PBIX commit/change I parse the file and spit out a Markdown file that represents all the relevant pieces of the file in a more or less readable form. The areas I interpret are model stats, PQ, DAX measures, and visual types and positioning. Then I commit the .md file alongside the pbix file with the same commit message as the original change. This way the Markdown file works as a proxy for easy interpretation of what has changed in the original PBIX file. It's enough to answer the most often asked questions: what has changed, was the change only in the model or in the visuals as well, etc.
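Not the commenter's actual script, but one piece of the idea can be sketched: the Report/Layout part inside a PBIX is UTF-16 JSON, and each visual's config is itself a nested JSON string, so dumping visual types and positions into Markdown looks roughly like this (the file name is hypothetical, and the Layout format is undocumented and can change between Desktop releases):

```powershell
# Sketch: summarize pages and visuals from a PBIX's Report/Layout part as Markdown.
Add-Type -AssemblyName System.IO.Compression.FileSystem
$zip    = [System.IO.Compression.ZipFile]::OpenRead('MyReport.pbix')   # hypothetical name
$entry  = $zip.Entries | Where-Object FullName -eq 'Report/Layout'
$layout = [System.IO.StreamReader]::new($entry.Open(), [System.Text.Encoding]::Unicode).ReadToEnd() |
          ConvertFrom-Json
$md = @('# Report summary', '')
foreach ($section in $layout.sections) {
    $md += "## Page: $($section.displayName)"
    foreach ($vc in $section.visualContainers) {
        $cfg = $vc.config | ConvertFrom-Json   # each visual's config is nested JSON
        $md += "- $($cfg.singleVisual.visualType) at ($([int]$vc.x), $([int]$vc.y))"
    }
}
$md -join "`n" | Set-Content 'MyReport.md'
$zip.Dispose()
```

Group and image containers may not carry a singleVisual node, so a production version would need a bit more defensiveness than this.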
We have a setup where Power BI datasets and reports are stored in Azure Repos Git repositories. We also have CI/CD pipelines for a multitenancy-like scenario where the same datasets and reports can be deployed to different customer workspaces. The TL;DR version is it can be done, but it takes A LOT of work.
Outside of parameterizing the data source connections, we found it also helped to add a parameter for the number of rows you load. Saving the dataset with 0 rows loaded helps reduce the file size before you check in your code. Generating an HTML data dictionary helps a little with being able to compare differences in the model.
Creating pipelines that can deploy the same datasets and reports to different customer workspaces was, in our case, a lot of PowerShell Core scripts. I found it challenging to deploy the same source files to different workspaces that use different data source connections with their own respective on-premises data gateways. Trying to do this while preserving datasetIds and reportIds was really hard for me, and I felt like I needed to create "hacks" to finally get it to work.
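For readers who have not built this kind of pipeline, the per-customer rebinding step usually comes down to the REST API's UpdateParameters endpoint. A minimal sketch with the MicrosoftPowerBIMgmt module; the workspace, dataset, and parameter names (including the zero-rows trick mentioned above) are invented for illustration:

```powershell
# Sketch: rebind one deployed dataset to a customer-specific source via parameters.
# Requires the MicrosoftPowerBIMgmt module; all names here are made up.
Connect-PowerBIServiceAccount
$ws      = Get-PowerBIWorkspace -Name 'Customer A'
$dataset = Get-PowerBIDataset -WorkspaceId $ws.Id -Name 'Sales'
$body = @{ updateDetails = @(
    @{ name = 'ServerName'; newValue = 'customer-a.database.windows.net' },
    @{ name = 'RowLimit';   newValue = '0' }    # the zero-rows trick from above
) } | ConvertTo-Json -Depth 3
Invoke-PowerBIRestMethod -Method Post -Body $body `
    -Url "groups/$($ws.Id)/datasets/$($dataset.Id)/Default.UpdateParameters"
```

Parameter changes only take effect on the next refresh, which is why a refresh call typically follows this step in a pipeline.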
This update will be very welcome! I struggled a lot about a year ago trying to do something similar.
We keep the model definition in SSAS, which allows us to at least separate the model definition. I design the bim in VS2019. Once it's ready I check the project into DevOps, which kicks off a pipeline that creates a deployment package. That is then automatically pushed to our development environment, which installs the bim in SSAS, transforms the data source connections for the deployment environment, etc. It even creates a job in SQL Server to refresh the cube every 15/30 mins. We also manually add our PBIX files to source control. Happy to share the DevOps steps etc. if anyone is interested.
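Not this poster's actual pipeline, but the core deployment step can be sketched with the SqlServer module's Invoke-ASCmd and a TMSL createOrReplace command, relying on the fact that a .bim file's root object has the same shape as a TMSL database object (server and database names are placeholders):

```powershell
# Sketch: deploy a Model.bim to an SSAS Tabular instance with TMSL createOrReplace.
# Requires the SqlServer PowerShell module; server/database names are placeholders.
$bim  = Get-Content -Raw 'Model.bim'   # a .bim root is already shaped like a TMSL database object
$tmsl = '{ "createOrReplace": { "object": { "database": "SalesCube" }, "database": ' + $bim + ' } }'
Invoke-ASCmd -Server 'ssas-dev.contoso.local' -Query $tmsl
```

Rewriting the data source connection strings and creating the SQL Server Agent refresh job would be separate pipeline steps on top of this.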
Interesting. I have been thinking of doing the same.
Do you use SSAS on-premises or Azure?
I am quite frustrated with PBIX/PBIT. Almost impossible to do any practical Git integration, CI/CD, or pull request review.
@@sau002 yes, it's a big hammer approach to source control. That's why I put the model in SSAS using VS to build in all the detail. I've even written a translator to convert our software's internal formula language to DAX. All of which is easy to control in fine grain with source control from VS.
What I am wanting and waiting for is the ability to do version control on the visuals and actual report components. ALM Toolkit plus OneDrive is fine for my model tracking and deployment.
Adam, the easiest part of a PBIX to source-control is your Power Query. You can simply copy all the queries as if you were migrating the code from PBIX to a dataflow, but paste them into a regular text file. This gives you all your queries with comments and properties, no JSON nonsense, only M script, easily accessible via any text editor or source control platform. The best part: it's just as easy to re-integrate it back into the PBIX; copy/paste either the whole thing or the specific lines you need. As for binaries, I'm old school: a working directory with a version number added to each file name, and a subdir called "Promoted v.X" with "My Report.pbix" in it. I rename the subdir when a newer version is published, but always publish under the same name to prevent the reset of usage history.
We would love the opportunity to "diff" the files, but you can leave out the "div" feature you proposed :)) Thanks for the content!
We check in/out the entire pbix files to TFS. It's not very efficient, but in a team of six across four sites we need some way of actually locking the files so we don't end up working on the same reports at once. I love Power BI, but I'm still amazed source control wasn't on the initial roadmap. Being able to use Visual Studio with Reporting Services fully integrated and do a diff against previous versions is such a time saver, especially when things go wrong.
This is a great subject, I have been struggling a lot with version control
Hi Adam, That's a brilliant spacecraft version history maintained on your T-shirt as well 😀👌
Great video. To add on, deployment of a bim file is supported in Premium Per User, Premium per capacity, and Power BI Embedded.
Definitely this is really required.
Using the Power BI extensions available in the Marketplace for the CI/CD implementation in Azure DevOps.
Using ALM Toolkit for comparing two PBIT files.
At 0:53 I think you meant "Diffs", but the video shows "Divs"
Thank you, that had been causing me a lot of confusion~
Hi Adam. It's a pity we still don't have version control support for Power BI. In the 21st century there's no excuse for not having plain-text dev artefacts like JSON. Hopefully MS understands that and makes it a mandatory requirement for all new products. This is the reason why people might choose some other product instead. That said, diff/merge is more of an enterprise requirement, to support Git branching for example. At that level you can probably swap Power BI Premium for Azure SSAS. And BIML should support that translation of the model file into plain text. In Git you can store BIML files and expand them into actual models as part of CI/CD. Plus, as part of SSAS you get perspectives and partitioning, and it can be cheaper than Premium.
I am shocked that MS is not paying heed to such an important feature. Very frustrating. My company has started to explore Amazon QuickSight. Much better in this respect.
@@sau002 does Amazon QuickSight have version control on visuals too, or datasets only?
I like how the icon for "diff" is the html "div" icon < / >
When the complexity level of reports gets real, having appropriate version control is crucial. I do not know why they haven't implemented a way of handling version control yet. Even more so when the pbix file is a zip and there are traces that pbix works with configuration files and the like...
The .pbix file is a renamed .zip file. Change the extension to .zip and you can see what is in the file. You might be able to do something in there.
This is a great tip. If you unzip the pbix and then rename the DataMashup file to a zip as well, you can access your Power Query M code as a .m file, which is just text.
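If you want to script those manual steps, something like the following works in many cases. It assumes, based on the commonly described file layout, that the DataMashup part wraps a zip package in a short binary header and that the M source lives in a file conventionally named Section1.m inside it:

```powershell
# Sketch: pull the Power Query source (Section1.m) straight out of a PBIX.
# The DataMashup part wraps a zip package in a short binary header, so scan
# for the first zip signature (PK\x03\x04) and read the archive from there.
Add-Type -AssemblyName System.IO.Compression, System.IO.Compression.FileSystem
$pbix = [System.IO.Compression.ZipFile]::OpenRead('MyReport.pbix')   # hypothetical name
$ms = [System.IO.MemoryStream]::new()
($pbix.Entries | Where-Object FullName -eq 'DataMashup').Open().CopyTo($ms)
$b = $ms.ToArray()
$i = 0
while ($i -lt $b.Length - 3 -and
       -not ($b[$i] -eq 0x50 -and $b[$i+1] -eq 0x4B -and $b[$i+2] -eq 3 -and $b[$i+3] -eq 4)) { $i++ }
$inner  = [System.IO.Compression.ZipArchive]::new(
              [System.IO.MemoryStream]::new([byte[]]$b[$i..($b.Length - 1)]))
$mEntry = $inner.Entries | Where-Object Name -like '*.m' | Select-Object -First 1
[System.IO.StreamReader]::new($mEntry.Open()).ReadToEnd() | Set-Content 'Section1.m'
$pbix.Dispose()
```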
I think you need to save as .pbit first and then change the extension to .zip to decode the binary? Or can you do that with .pbix? After unzipping there is a file called DataModelSchema.
@@MrBird666 I've only ever changed the .pbix file. Not sure what a .pbit file is.
It can also be used to modify/update images in the pbix file, like a logo or background, as it can be tedious to do that in Power BI if you have multiple pages or images.
OMG Adam, thanks for showcasing the Version history button in Windows! Yesterday I royally messed up one of my PBI's by adding a composite model I couldn't get rid of (why Microsoft why???).
This is a huge help to go and recover the previous report w/o the composite model!
Thanks!!!!!!
You can remove datasets you bring into your model by removing them in the data source options. I looked for ages for how to remove them, as they don't appear in Power Query.
This will be a great update, but I think the PBI pipelines should support more than 3 stages, i.e. Dev/QA/UAT/.../Stage/Prod.
0:57 shouldn't that be "diffs", not "divs"?
For security reasons, you should never put the pbix in source control, as it contains the data. Use the pbit. I hope that the future true Microsoft solution for version management and source control of Power BI will take this into account: do not include the data.
I've noticed on my Azure DevOps there is a file size limit when loading into a branch, so do you suggest that I convert from pbix to pbit and then upload the pbit file into my branch? Will all the connections and DAX formulas be kept once this is converted back to a pbix?
Very good remark Raynald. You should actually unload the data somehow. Maybe by reloading it from an empty source. But that also sucks a little bit.
I now use live connection to a tabular model in the architecture so that solves the problem, but is not applicable everywhere...
Hey Adam, another option I've been using is PowerShell to archive all the pbix files of a Power BI workspace to Azure Blob Storage on a daily schedule.
Hi Keith
Could you please share the PowerShell script to save the PBIX files to blob storage? I am looking to do this as well. And how do we automate running the cmdlet on a daily basis?
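Not Keith's script, but a minimal version of the idea using the MicrosoftPowerBIMgmt and Az.Storage modules (workspace, account, and container names are invented). For the daily schedule you would wrap this in an Azure Automation runbook or a scheduled task:

```powershell
# Sketch: export every report in a workspace and drop the PBIX files into blob storage.
# Requires the MicrosoftPowerBIMgmt and Az.Storage modules; all names are placeholders.
Connect-PowerBIServiceAccount   # an unattended schedule would use service principal auth
Connect-AzAccount
$ws    = Get-PowerBIWorkspace -Name 'Finance'
$ctx   = New-AzStorageContext -StorageAccountName 'pbixarchive' -UseConnectedAccount
$stamp = Get-Date -Format 'yyyy-MM-dd'
foreach ($report in Get-PowerBIReport -WorkspaceId $ws.Id) {
    $file = Join-Path $env:TEMP "$($report.Name).pbix"
    Remove-Item $file -ErrorAction Ignore
    Export-PowerBIReport -WorkspaceId $ws.Id -Id $report.Id -OutFile $file
    Set-AzStorageBlobContent -Context $ctx -Container 'backups' -File $file `
        -Blob "$stamp/$($report.Name).pbix" -Force
}
```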
There is also Synapse, which has a workspace and Git integration. Same limitations as with Azure DevOps.
You said diffs, and I was also thinking fetch/merge as in a Git action. But the video showed "Divs".
I work exclusively with SAS Viya and SAS 9.4 with SAS DI Studio. Some few important files, like csv configs, are in Git. I'm now searching for a BI product that actually uses some form of intelligent source control, with support for branching and diffs. It's a great frustration that BI completely lacks this way of thinking. Love to see that there are more people out there who love version control (and OneDrive etc. doesn't count).
0:52 - divs or diffs??
Extracting templates can also help with version control.
Thoughts? Easy.
As long as Power BI does not have source control, using it relies on some external software/management process. Otherwise using it is really amateurish and error-prone.
I mean, if anything goes wrong but does not turn up in a dev pipeline (or there is none), it can't just be rolled back.
This feels so wobbly in our daily work.
Great video. Thanks Adam.
Which visual gives you that USA map in the report? Thanks
Rename your .pbix file to .zip and bingo! Once you extract that zip you can source control the different files and components that make up the pbix.
It would be possible to register a diff viewer for the pbix file extension which would unzip both sides and then do a text diff.
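A lighter-weight variant of that idea is Git's textconv mechanism: add "*.pbix diff=pbix" to .gitattributes, point "git config diff.pbix.textconv" at a script (e.g. pwsh -File pbix-to-text.ps1), and Git will diff whatever text the script prints. A hypothetical sketch of such a script:

```powershell
# pbix-to-text.ps1 -- sketch of a git textconv filter for .pbix (hypothetical script).
# Git passes the file path as an argument and diffs whatever this prints to stdout.
param([string]$Path)
Add-Type -AssemblyName System.IO.Compression.FileSystem
$zip = [System.IO.Compression.ZipFile]::OpenRead($Path)
foreach ($entry in $zip.Entries | Sort-Object FullName) {
    "===== $($entry.FullName) ($($entry.Length) bytes) ====="
    if ($entry.FullName -eq 'Report/Layout') {
        # Layout is UTF-16 JSON on one line; pretty-print it so line diffs are readable.
        $text = [System.IO.StreamReader]::new($entry.Open(), [System.Text.Encoding]::Unicode).ReadToEnd()
        $text | ConvertFrom-Json | ConvertTo-Json -Depth 50
    }
}
$zip.Dispose()
```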
@@JanekBogucki also, you can use Power BI Premium to view workspace differences to help migrate reports.
I copy the M code from the Power Query Advanced Editor and the SQL from the source steps and commit those, plus the PBIX, to our Azure DevOps repo. It adds extra steps, but in addition to the version control I can search the SQL file easily; especially when our database team says they're changing a field or table, I can see what is affected.
Hi Adam, have you had the chance to try one of the Azure DevOps extensions to get Power BI CI/CD working?
Would be great if you could directly publish a PBIT file, as storing the data contained in a PBIX file isn't a great idea for a source control system. We ended up creating a process to use a "deflated" PBIX file and then using the REST APIs to automate the publish through a DevOps CI/CD process. It isn't perfect, but it has made deploys a little less manual to the UAT and Prod spaces.
I share your sentiments. I am frustrated too.
Please, could you elaborate on your REST APIs approach?
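Not the original poster's exact process, but the publish step they describe generally comes down to the Imports REST API, which the MicrosoftPowerBIMgmt module wraps as New-PowerBIReport (workspace and file names below are invented):

```powershell
# Sketch: publish a PBIX to a workspace via the Imports API, overwriting by name.
# Requires MicrosoftPowerBIMgmt; workspace and file names are placeholders.
Connect-PowerBIServiceAccount   # a CI pipeline would use -ServicePrincipal instead
$ws = Get-PowerBIWorkspace -Name 'UAT'
New-PowerBIReport -Path '.\dist\Sales.pbix' -Name 'Sales' `
    -WorkspaceId $ws.Id -ConflictAction CreateOrOverwrite
```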
Do you have an example of how we can deploy a PBI report using GitHub Actions? The new PBIP integration only works with DevOps. How can we customize it for GitHub users?
At minute 5:40 you show using VS Code. Is there a tool that will compare BIM files but in an organized style, kind of like ALM Toolkit? I don't like that in VS Code it's just one big JSON file. I like when it is organized into tables, etc. Thoughts?
Can Tabular Editor be used for extracting a BIM file from a Power BI template (PBIT) file?
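You may not even need Tabular Editor for that step: a PBIT's DataModelSchema part is UTF-16 JSON in essentially the same shape as a Model.bim, so a short script can pull it out (a sketch; file names are hypothetical):

```powershell
# Sketch: extract the model definition from a .pbit; DataModelSchema is UTF-16 JSON
# in the same shape as a Model.bim. File names here are hypothetical.
Add-Type -AssemblyName System.IO.Compression.FileSystem
$pbit  = [System.IO.Compression.ZipFile]::OpenRead('MyReport.pbit')
$entry = $pbit.Entries | Where-Object FullName -eq 'DataModelSchema'
$json  = [System.IO.StreamReader]::new($entry.Open(), [System.Text.Encoding]::Unicode).ReadToEnd()
Set-Content -Path 'Model.bim' -Value $json -Encoding UTF8
$pbit.Dispose()
```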
Is there a way to actually track changes to the report design of a .pbix?
Do you think there is a way to take two different versions of a pbix file, unzip them, extract their configs, merge those together, and then package the file back up? The goal would be to merge changes on the front end.
Hi guys,
I'm using an Azure SQL DB with Azure AD. I can connect to this using SSMS and SSDT, but when I try to connect with Power BI Desktop it won't authenticate the Azure Active Directory account (EXTERNAL_USER role). If I use a SQL login (SQL_USER role) with Power BI it works fine. Any thoughts on what I need to do to get this up and running?
I now use an ODBC connection, but this is not optimal, and it is impossible to refresh the data via the Power BI portal.
Thanks for any help
What about checking in the .pbit? We keep our .pbit in DevOps so we don't end up checking in all that data.
Has anyone found a solution to get the conversation ID in Outlook? Can someone help me?
Thanks Adam, good to know there is integration with DevOps in the future. In regard to OneDrive, isn't there a way to link OneDrive to a workspace so that saving or updating the pbix file in OneDrive syncs it to the workspace? Or did I imagine it!! At the moment we're just using SharePoint for version control.
How do you manage Power BI reports on GitHub?
I just want a modified timestamp on the report in the service so I can see if it is the latest version. I tell my clients to do source control, typically in a SharePoint Online document library with version history, but this often isn't the case when they ask me to update their report.
You could put a V1.03 type of thing in a corner. Just make sure you update it if you make a change.
Hey, is this still the current state of CI/CD in Power BI? Are there any significant releases around this since you made this video?
Hi Sebastian, did you get any answers since you posted this? Thanks in advance.
@@guillermoleonrodriguez7858 Not really. It turned out, that multi-tenant CI/CD is a tricky thing and probably better suited for outside Pipelines like Azure DevOps, instead of using the built-on Power BI pipelines, which work great for single-tenant solutions.
Thanks! Appreciate your thoughts.
In the meantime, Tabular Editor is licensed software that you have to buy.
Power BI is a great product, but the opportunities for ALM and version control are embarrassing; not only for the pbix file, and even more so for dataflows!
What is the point of voting for an idea for developers when Microsoft clearly focuses on ideas for end users ?
Did your editor mess up by confusing div with diff?
haha, you mean "diffs" not "divs" right?
All these COTS click-and-drag products have the worst integration (especially granular integration). And it's sheerly moronic to develop these obvious model, view, and controller parts as a single binary. MicroSUCKS stupidity! I always separate my systems into loose controllers, loose models, and loose view components.
Also for the sake of ultimate flexibility and cooperation with other devs. I’d never use Power BI after seeing this. It’s just not enterprise grade in my opinion.