The correlation plot looks even better with one small tweak. Add a diverging setting to the conditional formatting and set it to a white color. This way, non-correlating items will be white instead of purple. The PBIX download already has this change.
I was thinking the same thing while watching the video. Nice job on this. It's amazing what you can do with some patience and understanding of how to make things work the way you want them to.
Great video! If anyone is following this, I would highly suggest removing outliers if you can. The Pearson r correlation is very sensitive to outliers and you might drown out your results if you don't remove them. A good way to remove outliers is to calculate your average and standard deviation for each attribute and then remove anything that isn't within (above or below) 3 standard deviations of that average. This can be quite tedious with larger datasets but it's worth it for finding those relationships between your variables.
Bro... you saved me with this, huge thanks! Otherwise I would have to learn Python to build this, and I am waaay to busy to tackle that right now. Besides, I am not even a data analyst, learning Python is above my paygrade.
I'm trying to get this to work I followed your video but when I add the correlation coefficient it's only displaying a number where x = y, is there a way to get it to display the calculation for everything else as well?
@@khushaalchaudhary6955 I got it to work by looking at the relationships between the tables/attributes tables. If you download the example file and ensure your relationships between the attributes tables/correlation tables are the same it should work!
Very helpful, thank you! (I haven't used virtual tables much and am still getting a grip on passing values in and out of variables so this was an awesome example to play with)
This could not have been better timed. I will be using it tomorrow to create a simple matrix like this. After that, if the results are as expected, I will be venturing into Python land to make a differen depiction where "r" will be controlling the size of the box. It's very hard to perceive color gradient to evaluate the correlation, but size makes a better "right in your face" representation.
Hi, I created all the fields but I'm running into issues with fields with text data. Ideally I do want to find correlation between different text paramters between different fields. Example - is there a positive correlation between 3 types of glasses Vs 8 types of cocktails. How would you recommend using the correlation formula for qualitative data?
Great technique and knowledge.Thank you. Btw,Why should we use MAX function in a CALCULATE function. I succeeded making the matrix but still have a question.Does anyone have an idea.
Hello, thank you so much to present this solution. Please, assuming that this matrix are with the same results spitted into two triangles, could you tell me how can we display only the one of those? Best, Rafael
looks great! I added a slicer for a hidden column that has a date_month but it doesn't seem to be filtering the data, do I have to adjust the measures to take the filter?
Great video. Thanks for sharing. However, my correlation coefficient return abnormal number, same attribute in X and Y do not return 1 but 0.xx instead. Has anyone face similar situation?
Hey man - I'm trying to make a tooltip on top of this. I've made a tooltip page in order to achieve it, and managed to get a header that (when hovering on a specific cell) displays the "[x variable] v [y variable]".. But the scatter plot within the tooltip itself ends up empty (unless it is the cell where x is the same as y). Any ideas on what my measures for X and Y should be on the scatter plot that simply needs to display the filtered data? Thanks so much.. My x measure so far is this btw (but it doesn't work!) Xall = VAR CurrentX = SELECTEDVALUE(Name2[Column1]) RETURN CALCULATETABLE( SELECTCOLUMNS( FILTER( 'Peers', 'Peers'[ISIN] = CurrentX ), "pChg", 'Peers'[pChg] ) )
Hi, Thanks for the video. Please, may you share the source file for the dataset used in this video, because the link you share on youtube description does not includes any excel files.
I'm somewhat new to this so my question is...is there any reason why this couldn't all be done in one measure using more variables instead of each measure creating the virtual table and iterating over it?
I just re-created a matrix following the steps and it didn't work because of the 'behind-the-scenes' linkages. If the matrix feels that it is presenting two sets of related data (related using joins) it only presents the diagonal axis. I had to delete the table associations for the matrix to work. I think, therefore, that using the same set for both axis would give the same result. It's an incontinence of PowerBI but one we can get around fortunately.
@@TemiAnData the main power bi screen had 3 icons in the left hand side. Report view, Data view and Table relationships. Go into table relationships and you'll notice your generated category tables are linked back to the table they were generated from. Delete the two associations. Done!
Hi i have implemented similiar code, somehow i end up getting only one's in the diagonal but no values anywhere else, not even zeros, did some one get , where exactly i might be making a mistake?? Thanks.
Unbelievable that you have to roll this yourself in Power BI! This is a one-liner in Python's Pandas library. df.corr().style.background_gradient(cmap='RdBu', vmin=-1, vmax=1)
thanks for the video.... i am actually needing some solution similar to this.... the video looks promising, but its really not working for me. I am being left with "1.00" acros the whole matrix
Hello sir, very helpful video thanks for that. Also i have a question, im looking for this matrix visual that you made but i need to make row and column makes effect from separate date filters i mean, i need 1 date filter effect rows and 1 date filter effects columns, how can i do it ?
Fantastic video! I tried to follow its steps, but I couldn't because it didn't show all the calculations of the "Cars Pivot" (Index, Attribute, Sum Value). Could you send the formulas for each "Cars Pivot"? Thank you very much!
Reviewing the video, I understood that a virtual table was created with the attribute and index, but I couldn't understand the measure of sum "value". Could you inform the calculation?
Yes,you have to work on the data modelling part..just download the corelation calculation file & follow the same..you will get the desired result..let me know if you able to solve it or not 😊😊😊
@@BIElite Pleeeeease if you find a way let us know, this is the best video I've seen about correlation matrix in Power BI, but I could't use it since all my data is in direct query mode, I'm so sad.
The correlation plot looks even better with one small tweak. Add a diverging setting to the conditional formatting and set it to a white color. This way, non-correlating items will be white instead of purple. The PBIX download already has this change.
Hi why don't you use zoom? :)
Can I get to filter by index?
I was thinking the same thing while watching the video.
Nice job on this. It's amazing what you can do with some patience and understanding of how to make things work the way you want them to.
XY =
VAR CurrentX = SELECTEDVALUE(Attributes2[Attribute])
VAR CurrentY = SELECTEDVALUE(Attributes[Attribute])
VAR Virtual =
SUMMARIZE(
IBMPivot,
IBMPivot[Index],
"X",CALCULATE(MAX('IBMPivot'[Value]),'IBMPivot'[Attribute] = CurrentX),
"Y",CALCULATE(MAX('IBMPivot'[Value]),'IBMPivot'[Attribute] = CurrentY)
)
RETURN
SUMX(
Virtual,
[X]*[Y]
)
Great video! If anyone is following this, I would highly suggest removing outliers if you can. The Pearson r correlation is very sensitive to outliers and you might drown out your results if you don't remove them. A good way to remove outliers is to calculate your average and standard deviation for each attribute and then remove anything that isn't within (above or below) 3 standard deviations of that average.
This can be quite tedious with larger datasets but it's worth it for finding those relationships between your variables.
Bro... you saved me with this, huge thanks! Otherwise I would have to learn Python to build this, and I am waaay to busy to tackle that right now. Besides, I am not even a data analyst, learning Python is above my paygrade.
A huge amount of effort for something that comes out of the box with some simple python commands
That´s really sick! Hope to see more data scienc-ish content in future.
I'll keep this in mind!
@@BIElite another good tutorial would be to how to integrate PowerBI and Python into a workstream
This was extremely useful. Thank you for this
I'm trying to get this to work I followed your video but when I add the correlation coefficient it's only displaying a number where x = y, is there a way to get it to display the calculation for everything else as well?
you should disable relationships between attributes tables and main data table. they get auto connected in PBI by "Attribute" column
I'm facing the same issue, did you found a solution?
@@khushaalchaudhary6955 I got it to work by looking at the relationships between the tables/attributes tables. If you download the example file and ensure your relationships between the attributes tables/correlation tables are the same it should work!
@@ilyastrelkov4246 OMG, I used it for my own dataset and had the same problem. Thanks a lot for advice!!!
Thank you! @@ilyastrelkov4246
Very helpful, thank you! (I haven't used virtual tables much and am still getting a grip on passing values in and out of variables so this was an awesome example to play with)
This could not have been better timed. I will be using it tomorrow to create a simple matrix like this. After that, if the results are as expected, I will be venturing into Python land to make a differen depiction where "r" will be controlling the size of the box. It's very hard to perceive color gradient to evaluate the correlation, but size makes a better "right in your face" representation.
Dude, awesome to hear Vlad!
Its awesome ! Thanks for sharing Parker..
No problem, Abhishek!
I liked the idea when I got it. great job
Amazing, thanks for this!
That's super amazing. Thanks for sharing.
Awesome! Looks good
Thank you!
Hi, I created all the fields but I'm running into issues with fields with text data. Ideally I do want to find correlation between different text paramters between different fields. Example - is there a positive correlation between 3 types of glasses Vs 8 types of cocktails. How would you recommend using the correlation formula for qualitative data?
Great technique and knowledge.Thank you.
Btw,Why should we use MAX function in a CALCULATE function.
I succeeded making the matrix but still have a question.Does anyone have an idea.
Hello, thank you so much to present this solution. Please, assuming that this matrix are with the same results spitted into two triangles, could you tell me how can we display only the one of those? Best, Rafael
looks great! I added a slicer for a hidden column that has a date_month but it doesn't seem to be filtering the data, do I have to adjust the measures to take the filter?
omg I was doing with with an R script, but it is not as interactive, thanks! :D
Great video. Thanks for sharing. However, my correlation coefficient return abnormal number, same attribute in X and Y do not return 1 but 0.xx instead. Has anyone face similar situation?
Hey man - I'm trying to make a tooltip on top of this. I've made a tooltip page in order to achieve it, and managed to get a header that (when hovering on a specific cell) displays the "[x variable] v [y variable]".. But the scatter plot within the tooltip itself ends up empty (unless it is the cell where x is the same as y). Any ideas on what my measures for X and Y should be on the scatter plot that simply needs to display the filtered data? Thanks so much.. My x measure so far is this btw (but it doesn't work!)
Xall =
VAR CurrentX = SELECTEDVALUE(Name2[Column1])
RETURN
CALCULATETABLE(
SELECTCOLUMNS(
FILTER(
'Peers',
'Peers'[ISIN] = CurrentX
),
"pChg", 'Peers'[pChg]
)
)
Hi,
Thanks for the video.
Please, may you share the source file for the dataset used in this video, because the link you share on youtube description does not includes any excel files.
Great trick, but it will be even better if you can show only the upper triangle or the lower one. Any ideas?!!
I thought about this briefly but couldn’t come up with a solution. If I ever do, I’ll add an update video or blog post.
Amazing!
Thanks!
I'm somewhat new to this so my question is...is there any reason why this couldn't all be done in one measure using more variables instead of each measure creating the virtual table and iterating over it?
I just re-created a matrix following the steps and it didn't work because of the 'behind-the-scenes' linkages.
If the matrix feels that it is presenting two sets of related data (related using joins) it only presents the diagonal axis.
I had to delete the table associations for the matrix to work. I think, therefore, that using the same set for both axis would give the same result. It's an incontinence of PowerBI but one we can get around fortunately.
What do you mean by delete table association cause I’m having the same issue too
@@TemiAnData the main power bi screen had 3 icons in the left hand side. Report view, Data view and Table relationships.
Go into table relationships and you'll notice your generated category tables are linked back to the table they were generated from.
Delete the two associations. Done!
Hi
i have implemented similiar code, somehow i end up getting only one's in the diagonal but no values anywhere else, not even zeros, did some one get , where exactly i might be making a mistake??
Thanks.
Unbelievable that you have to roll this yourself in Power BI! This is a one-liner in Python's Pandas library.
df.corr().style.background_gradient(cmap='RdBu', vmin=-1, vmax=1)
thanks for the video.... i am actually needing some solution similar to this.... the video looks promising, but its really not working for me. I am being left with "1.00" acros the whole matrix
Hello sir, very helpful video thanks for that. Also i have a question, im looking for this matrix visual that you made but i need to make row and column makes effect from separate date filters i mean, i need 1 date filter effect rows and 1 date filter effects columns, how can i do it ?
Fantastic video!
I tried to follow its steps, but I couldn't because it didn't show all the calculations of the "Cars Pivot" (Index, Attribute, Sum Value).
Could you send the formulas for each "Cars Pivot"?
Thank you very much!
Reviewing the video, I understood that a virtual table was created with the attribute and index, but I couldn't understand the measure of sum "value".
Could you inform the calculation?
so awesome and a lot of work. i may just run excel and get the numbers :-(
Is this pearson or spearman? What correlation coefficient is used?
Please tell me the software you use for
Screen recording
Zoom in and out
Camtasia 2020
@@BIElite thanks a lot
Top video, Parker (as usual). Would this work with Direct Query reports?
Interested in trying through DQ as well. Did it end up working?
Only getting values for matching varriables...☹️☹️
Same here. Have you discovered the problem?
Yes,you have to work on the data modelling part..just download the corelation calculation file & follow the same..you will get the desired result..let me know if you able to solve it or not 😊😊😊
@@sutapaghosh2226 thanks for the quick reply! :) What do you mean by the "correlation calc file"? I think i am missing something :)
@@sutapaghosh2226 I have found it! ! thanks again
There is a file attached in the description..calculation file..follow that one
Can this method applied to categorical data?
its being applied to categorical data
I think there's a quick measure for correlation coefficient nowadays
I couldn’t get this to work with multiple variables at a time. Have you?
@@BIElite true.
@@BIElite Pleeeeease if you find a way let us know, this is the best video I've seen about correlation matrix in Power BI, but I could't use it since all my data is in direct query mode, I'm so sad.
OMG! That's too complicated and lengthy. In Python, the `Panda` library can do this for you in a half line of very simple code!