

Performance implications of using calculated columns in composite models on Powe...
source link: https://blog.crossjoin.co.uk/2024/02/04/performance-implications-of-using-calculated-columns-in-composite-models-on-power-bi-semantic-models/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Performance Implications Of Using Calculated Columns In Composite Models On Power BI Semantic Models
I don’t have anything against the use of calculated columns in Power BI semantic models in general but you do need to be careful using them with DirectQuery mode. In particular when you have a DirectQuery connection to another Power BI semantic model – also known as a composite model on a Power BI semantic model – it’s very easy to cause serious performance problems with calculated columns. Let’s see a simple example of why this is.
Let’s say you have an Import mode semantic model called Source Model containing sales data:
Here’s the contents of the Sales table:
And here’s the definition of the Sales Amount measure:
Sales Amount = SUM(Sales[Sales])
Now, let’s say this semantic model gets published to the Service and you open Power BI Desktop, create a Live connection to this model and then click the “Make changes to this model” button to create a composite model. You now have a local composite model in Power BI Desktop that is connected to the Source Model semantic model.
Next, you decide to create a calculated column on the Sales table and after reading Marco and Alberto’s article on circular dependencies (because you’ll definitely run into circular dependency errors) come up with the following code:
Tax =
CALCULATE (
[Sales Amount] * 0.1,
ALLEXCEPT ( Sales, Sales[TransactionNumber] )
)
Great! You can now build a report that looks like this:
This is where the first warning comes though. In Import mode semantic models calculated columns are evaluated when the model is refreshed, but in DirectQuery mode they are evaluated at query time if the query needs to use them. Performance Analyzer shows two DAX queries generated for the visual above. The first is the query that the visual runs against the local semantic model in Power BI Desktop, which looks like this:
DEFINE
VAR __DS0Core =
SUMMARIZECOLUMNS(
ROLLUPADDISSUBTOTAL(ROLLUPGROUP('Country'[Country], 'Product'[Product]), "IsGrandTotalRowTotal"),
"Sales_Amount", 'Sales'[Sales Amount],
"SumTax", CALCULATE(SUM('Sales'[Tax]))
)
VAR __DS0PrimaryWindowed =
TOPN(502, __DS0Core, [IsGrandTotalRowTotal], 0, 'Country'[Country], 1, 'Product'[Product], 1)
EVALUATE
__DS0PrimaryWindowed
ORDER BY
[IsGrandTotalRowTotal] DESC, 'Country'[Country], 'Product'[Product]
There’s nothing much interesting here. However, there’s a second DAX query generated: the one generated by the local model to get the data it needs from the Source Model semantic model in the Service. This query looks like this:
Define
COLUMN 'Sales'[ASDQ_Tax] = CALCULATE([Sales Amount] * 0.1, ALLEXCEPT(Sales, Sales[TransactionNumber]))
var ASDQ___DS0Core = SUMMARIZECOLUMNS( ROLLUPADDISSUBTOTAL(ROLLUPGROUP('Country'[Country], 'Product'[Product]), "IsGrandTotalRowTotal"),
"Sales_Amount", [Sales Amount],
"SumTax", CALCULATE(SUM('Sales'[ASDQ_Tax]))
)
var ASDQ___DS0PrimaryWindowed = TOPN(502, ASDQ___DS0Core, [IsGrandTotalRowTotal], 0, 'Country'[Country], 1, 'Product'[Product], 1)
EVALUATE
ASDQ___DS0PrimaryWindowed
ORDER BY
[IsGrandTotalRowTotal] DESC, 'Country'[Country], 'Product'[Product]
Notice, right at the top of the query, a DEFINE statement that defines the Tax calculated column. Every time this visual is rendered in your Power BI report the calculated column will be re-evaluated and that will have a performance overhead. If the calculated column had been created in Source Model (which is Import mode remember) it would have been evaluated when the model was refreshed and any queries that used it would probably be faster as a result.
There’s something else to watch out for though. Let’s say you define a second calculated column in the local model like so:
Apple Sales =
CALCULATE (
[Sales Amount],
FILTER ( ALLNOBLANKROW ( 'Sales'[Product] ), 'Sales'[Product] = "Apples" )
)
You can now create a table that looks like this, which does include the Apple Sales column but does not include the Tax column:
The DAX query sent by the visual to the local model is not worth looking at. However, the DAX query sent by the local model to Source Model does have something surprising in it:
Define
COLUMN 'Sales'[ASDQ_Apple Sales] = CALCULATE([Sales Amount], FILTER(ALLNOBLANKROW('Sales'[Product]), 'Sales'[Product]="Apples"))
COLUMN 'Sales'[ASDQ_Tax] = CALCULATE([Sales Amount] * 0.1, ALLEXCEPT(Sales, Sales[TransactionNumber]))
var ASDQ___DS0Core = SUMMARIZECOLUMNS( ROLLUPADDISSUBTOTAL(ROLLUPGROUP('Country'[Country], 'Product'[Product]), "IsGrandTotalRowTotal"),
"Sales_Amount", [Sales Amount],
"SumApple_Sales", CALCULATE(SUM('Sales'[ASDQ_Apple Sales]))
)
var ASDQ___DS0PrimaryWindowed = TOPN(502, ASDQ___DS0Core, [IsGrandTotalRowTotal], 0, 'Country'[Country], 1, 'Product'[Product], 1)
EVALUATE
ASDQ___DS0PrimaryWindowed
ORDER BY
[IsGrandTotalRowTotal] DESC, 'Country'[Country], 'Product'[Product]
There’s a DEFINE statement for the Apple Sales column as you would expect. However there is also a DEFINE statement for the Tax column, which means it is evaluated too even though it’s not shown in the visual and it seems as though the Apple Sales column doesn’t reference it – or does it? I’m not going to try to explain what’s going on here (again, this is best left to Marco and Alberto) but the important point is that both calculated columns are now being evaluated at query time which means the query will be even slower. If you have large data volumes and/or your calculated columns contain some complex DAX you could end up with some very slow reports. If you’re using Premium then you could also be putting a lot of extra load on your capacity.
Composite models are an important part of Power BI’s self-service story and it’s inevitable that at some point your developers will want to use build calculated columns in them. As always you need to monitor the performance of your reports carefully, both in terms of duration and CPU usage, while you are building them to make sure they are well optimised and keep the use of calculated columns in composite models to a minimum.
Share this:
Published by Chris Webb
My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel. View all posts by Chris Webb
Recommend
-
38
Recently I was working on an MDX calculation that calculated a value at a low granularity and aggregated the result – basically what I described in myblog post here. Here’s a simplified version of the calculation written...
-
14
PowerBI & Big Data – Using pre-calculated Aggregations of Semi- and Non-Additive Measures Calculating and visualizing semi- and non-additive measures like distinct count in Power BI is usually not a big deal. H...
-
3
Creating Calculated Columns in...
-
9
Increase Database Performance By Using Multiple Columns Last updated 07/11/2021 By thinking outside of the box, you can often get much faster database performance by splitting values across multiple database col...
-
10
Measures and calculated columns both use DAX expressions. You can compute values using calculated columns or using measures. This blog post will explain the difference between a calculated column and a measure and when you should use each one...
-
3
Composite models in Power BI — the Ultimate GuideDo you need to combine the data from multiple different Direct Query data sources? Or, mix imported data with DirectQuery? The composite models fe...
-
9
A Rails model has several ActiveRecord methods that make interacting with the database very simple. Models can be created, updated and destroyed very easily. However, in some situations, one might want to mark a model as “readonly”. Mean...
-
7
YOUSUF VAID 1 hour ago Changing the crosstab header and supressing the calculated columns 5 Views ...
-
5
December 9, 2022 ...
-
11
Perceptually uniform color models and their implicationsMax BoSoftware Engineer @ Canva | music, color theory, Australian democracy, situa...
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK