Increasing compression in Power BI and Analysis Services - Unplugged #15

แชร์
ฝัง
  • เผยแพร่เมื่อ 27 ธ.ค. 2024

ความคิดเห็น • 48

  • @singhrakeshr
    @singhrakeshr 3 ปีที่แล้ว +5

    For someone like me dealing with a pbix file with 8 facts taking in more than a million rows per day and with 25 dims and watch the size of the model grow each day wondering when it will be large enough not to fit in memory, these tips are Priceless..

  • @JBartlett_DMU
    @JBartlett_DMU 3 ปีที่แล้ว +2

    You just completely blew my mind, Marco. Outstanding work, as usual!

  • @caronturner8009
    @caronturner8009 3 ปีที่แล้ว +1

    Gold! Thanks Marco. Excellent content and demo.

  • @1sefirot9
    @1sefirot9 3 ปีที่แล้ว +1

    Cool, already knew about the fixed decimal (also from sqlbi), but now we have added insights about tabular editor and the hierarchy sizes.

  • @mcnater
    @mcnater 3 ปีที่แล้ว +2

    Awesome stuff Marco. Love these sessions.

  • @rerangelt
    @rerangelt 3 ปีที่แล้ว +1

    The 5 minutes to wow seems to far away. As always, great content and explained so well. Thanks.

  • @thomasstroebel2266
    @thomasstroebel2266 2 ปีที่แล้ว +1

    Really great video !
    I use this at work !
    However, I did the same operation on a decimal column that’s encoded in HASH.
    After this operator, the column is encoded in HASH one more time.
    I don’t really understand.
    I ask myself if someone has an explanation or theory about this.
    Thanks a lot

  • @alexandredelim1
    @alexandredelim1 3 ปีที่แล้ว +1

    Amazing explanation and approach. Congrats :)

  • @mshparber
    @mshparber 3 ปีที่แล้ว +1

    Awesome! Thanks, Marco!

  • @natolira
    @natolira 3 ปีที่แล้ว +1

    Great as always!
    One thing I want to add is that I do a step further after deleting the unnecessary columns. If you do that from the Power BI interface might introduce duplicate rows on the fact table.
    So I prefer to do it in Power Query and group by so I guarantee that I don't have duplicate rows.
    So in the end I might have a even more compressed model. But the main reason I do that is because of context transition problem when you call it over a table that contains duplicate rows.
    Especially with the new composite models, because you can't always control how other people will create measures!

    • @AntrikshSharma
      @AntrikshSharma 3 ปีที่แล้ว +1

      Primary Key in a Fact Table in a Tabular model is bad. If you are worried about the duplicate values in case of context transition, then just remember never initiate Context Transition on a Fact Table. :)

  • @johansantacruz6464
    @johansantacruz6464 3 ปีที่แล้ว +1

    This is great! Thanks for your books and videos.

  • @sricharang9537
    @sricharang9537 3 ปีที่แล้ว +3

    Hi Marco,
    This is brilliant as usual!
    Does disabling the MDX property on all foreign keys and Dimkeys have any side effects, as these keys mostly will not be used in measures/filters?

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว +5

      There should be no side effects, as long as you don't use them to filter data (filters on other columns of the same dimensions are not affected).

    • @joelguay4452
      @joelguay4452 3 ปีที่แล้ว

      @@SQLBI Will this affect the performance of crossfiltering though relationships tho ?

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว +2

      no, no effects on cross filter.

  • @sbakkis
    @sbakkis 3 ปีที่แล้ว +1

    Great video, Marco. Will there be a video also for for sorting fact table, segmentation (DefaultSegmentRowCount) for maximizing data compression instead of the dictionary?

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว

      It's too specific for very large databases (> 10GB) and usually it does not worth the effort. I don't see much value for that, it's hard to replicate in small models.

  • @baskarvijayan3282
    @baskarvijayan3282 2 ปีที่แล้ว

    Hi Marco,
    I have changed this column encoding to Value from HASH with help of tabular editor,
    but this result surprised me , tha data size no getting double while comparing with previous size.

    • @SQLBI
      @SQLBI  2 ปีที่แล้ว

      Indeed, the engine usually does the best choice!

  • @kalinduSekara
    @kalinduSekara 3 ปีที่แล้ว +1

    Valuable lesson Marco. Thanks

  • @CharudattaAmbekar
    @CharudattaAmbekar 3 ปีที่แล้ว

    If we had already selected data type as fixed decimals already then would the VertiPaq have chosen the correct encoding mode (Value type) for these columns?

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว

      It depends on data distribution. Integer and fixed decimals number can be compressed using hash encoding if the algorithm establishes that it would be convenient. Because the algorithm decides based on the first rows of the table, it might be wrong. Encoding hints could be useful in that case.

  • @Obizzy8
    @Obizzy8 3 ปีที่แล้ว +1

    Excellent Marco!

  • @lucillagiannetti715
    @lucillagiannetti715 2 ปีที่แล้ว

    Minute 2 of the video: Is there any way to identify which columns are not used in the model so that I can remove them?

  • @premsagar3254
    @premsagar3254 2 ปีที่แล้ว

    What about the columns with string data type , can we change them from hash to value

    • @SQLBI
      @SQLBI  2 ปีที่แล้ว

      You cannot, string must be hash.

  • @csgexec
    @csgexec 3 ปีที่แล้ว +1

    Great info Marco.

  • @sylvainrobakowski5050
    @sylvainrobakowski5050 3 ปีที่แล้ว

    Thank you for this valuable insight! I tried to apply this tips on a high cardinality column with hash encoding. I set the column to fixed decimal and hint to value. Then I refresh data. But after refreshing DAX Studio, I still have hash encoding. I guess the engine still thinks it’s better but how to know when hint will work or not without trying it for each column? Is there a way to force value encoding in order to see compression and performance impacts? Numerical values shouldn’t always been encoded with value?
    Thanks for any reply.

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว +1

      A hint is a hint, not an enforcement. For example, if a column (int or currency) has values that cannot fit in 32 bit, the encoding is always hash. For floating point, if the number cannot be represented in a 32-bit integer applying simple transformations applying power of 10, you get always hash encoding.

    • @tangtom2478
      @tangtom2478 3 ปีที่แล้ว

      Hi, Marco, want to second this comment after trying out in my data model. I am just wondering if there’s any safe way to enforce the encoding to be VALUE? Thanks! 🙏🏻

  • @tarun007
    @tarun007 3 ปีที่แล้ว

    The column in the data model is not getting refreshed. You will have to close the pbix and then reopen it for it to work. This is a bug in power bi which came with the march update.

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว

      Which column? Did you try with April 2021 update?

    • @tarun007
      @tarun007 3 ปีที่แล้ว

      @@SQLBI Please load any new table into the existing pbix and go to the Data model tab, you will see that the table is not present in the data model until you close and open the pbix once again. Please check.

  • @QuLaZz
    @QuLaZz 3 ปีที่แล้ว

    Can the property availableInMDX = false, be used for sort by columns?

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว

      What do you mean? Please, can you clarify the question?

    • @QuLaZz
      @QuLaZz 3 ปีที่แล้ว +1

      @@SQLBI In SSAS tabular, I got an error when changing this property for a column that is used to sort another column.
      Error massage is: Column in table has the SortByColumn property set to an invalid column ID.

  • @emirtuncer
    @emirtuncer 3 ปีที่แล้ว +1

    Thanks for the excellent video 👏

  • @visheshjjain
    @visheshjjain 3 ปีที่แล้ว

    Would it have helped/worked, if instead of changing the Unit/Net Price to Fixed Decimal Number we keep it as Decimal Number only and reduce it to 2 decimal places?

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว

      Yes, the result should be similar in terms of granularity, but if you need to remove the dictionary you are usually more effective by using the Fixed Decimal Number.

    • @visheshjjain
      @visheshjjain 3 ปีที่แล้ว

      @@SQLBI in that case would you suggest to always use Fixed Decimal for all numeric value columns or only currency columns?

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว

      It depends. Fixed Decimal Numbers have a limited range compared to Decimal Number. If the behavior of Fixed Decimal Number fits your requirements, of course you should use them. But pay attention to the limitations. See this: www.sqlbi.com/articles/choosing-numeric-data-types-in-dax

  • @ivanmezev
    @ivanmezev 3 ปีที่แล้ว

    Encoding Hint optiion not available when connected to Analisys Services.

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว

      It depends on the version of SSAS but it has been there for a while. What is the version you are using?

    • @ivanmezev
      @ivanmezev 3 ปีที่แล้ว

      @@SQLBI standard edition, version 13

    • @SQLBI
      @SQLBI  3 ปีที่แล้ว +2

      It's AS2016, you need at least AS2017 (version 14).

    • @ivanmezev
      @ivanmezev 3 ปีที่แล้ว +1

      I see. Thanks

  • @janaykon
    @janaykon 3 ปีที่แล้ว +1

    Immortal...!!! Ex..