Uncategorized

The internal behavior of Calculated Columns, Row Contexts and the EARLIER() Function in Power BI

Evaluation context is the heart of the DAX language in Power BI  and it is essential to learn them in order to understand the internal behavior of the calculation engine in tabular data models.

In this article, We will closely understand how calculated columns are created, its relationship to the row context and the EARLIER function.

 

What are Calculated Columns?

With calculated columns, you can add new data to a table already in your model. But instead of querying and loading values into your new column from a data source, you create a Data Analysis Expressions (DAX) formula that defines the column’s values.

In Power BI Desktop, calculated columns are created by using the New Column feature in Report View. Calculated Columns are visible in the data model just like other columns of the table and the purpose of it is to extend the data model & support the advance calculations.

Let’s say that you have a very simple Item table like this:

Item 1

And you want to add a third column that is the total for each item:

item 2

This is the combination of magic between FILTER, Row Context and EARLIER functions. If you don’t really understand row context, you can simply follow the pattern here to add that third column for the moment being.

So the question is

what’s going on in that subtotal calculated column:

Let’s understand this step by step:

Since there are ten rows in the table, the calculated column formula needs to iterate over the table rows 10 times—once to calculate each row’s resulting value (in this case, 50, 90,120, 110, and 70, as pictured above). Within each one of those ten steps, a reference to a column will evaluate to the value of the specified column from that current row. So, for instance, in the second row of the table, a reference to Item[Amount] will return 2. Just like you want it to.

On the off chance that this seems complex already, we want to illustrate that there is nothing complex here yet because this is precisely the way you already understand things. Now say that you write another calculated column, this time with the following formula:

=CONCATENATE(Item[ItemName], Item[Amount])

You get, of course, precisely what you expect:

Row by Row iteration

item 3.png

This is because, at each one of the ten iterations in the calculated column (one step per row), a reference to a column is essentially limited to the current row. It would be quite unusual if, in the first row, a reference to the [Amount] column evaluated to 2! Of course you want it to evaluate to 1. ( Row by Row Iteration)

As a calculated column evaluates, once for each row in the table, the DAX term for the concept of “current row at each step/ Row by Row iteration”. It’s called row context in DAX.

Now Let’s comeback to the original formula to dig deep into this context:

Per Item Totals = CALCULATE(SUM(Item[Amount]),

                            FILTER(Item,

                            Item[ItemName]=EARLIER(Item[ItemName])

                             ))

The formula for the Per Item Totals column contains a FILTER() function, and

FILTER() is an iterator function in DAX. Just as the calculated column formula itself is evaluated once per each row context, FILTER(‘Table’,…) also steps through each row in Table, one at a time, much like the calculated column formula itself!

One of the key characteristics of FILTER() is that it ignores the row context generated by the calculated column and operates on the full table. This makes it simultaneously confusing (at first) and awesome (once you grasp it).

Since Item Table has ten rows, every time FILTER(‘Table’,…) is evaluated, FILTER() has its own ten steps to process. At each step, it performs the check “Hey, should I keep this row or filter it out?” All in all, then, there are 100 steps involved in evaluating the calculated column: (Something like CROSSJOIN in SQL)

 

So there are ten calculated column row contexts,” and tucked inside each of those 10 filter row contexts – an outer loop of ten row contexts and an inner loop of ten row contexts.

Now, within a FILTER() function, a “naked” reference to a column will evaluate according to the row context of the FILTER() function itself, not the calculated column row context. Inside FILTER(), the reference to ‘Table'[Column1] will be evaluated according to that “inner” row context generated by FILTER() function – completely ignoring the “outer” row context of calculated columns

(Remember that FILTER() has to iterate over every row in the table to see if it should be kept or discarded.)  So even when the formula is evaluating in the outer row context (where [ItemName]=”Item 1″), the “naked” reference to ‘Item'[ItemName] inside FILTER() will iterate over to evaluate “Item 2” – in the iteration of 100 steps.

Within FILTER(), how do you “get back” to the outer loop and see what is the value of [ItemName] in Row 1? You use the EARLIER() function!

EARLIER() basically says “pop out of the current row context and look at the outer (earlier) row context instead, when evaluating the value of the specified column.” EARLIER allows you, within FILTER(), to examine every row in the table and compare it to the current row of the calculated column. So it is focusing on a given row of a table and as well as going back at all the rows in the table, saying to each row and asking “Hey, are you similar to me? If so, I want to count you.”

And yes, it is possible (but very rare) to end up in situations where you have more than two nested loops of row contexts. That’s why EARLIER() has the optional second argument Number—because you can specify how many loops “outward” you want to step. When you omit that argument, it defaults to 1. This is also why there is an EARLIEST() function: Its purpose is to get back to the absolute outermost row context, no matter how deeply nested you might find yourself.

APPLICATIONS OF EARLIER PATTERNS

How do I reference the row immediately ‘above’/’below’ this one?

 

For the calculation of Running Total in Calculated column

Item 7.png

So what is happening here:

FILTER creates inner loop and EARLIER grabs the outer loop row context that is generated by the calculated column asking “Hey, are you similar to me? If so, I want to count you.”

Item 8.png

Pattern 2

item 6.png

Pattern 3

Day Before or Week Before

Item 9.png

Pattern 4

EARLIER in Measures

RunningTotal = CALCULATE(
SUM( 'Item'[Amount] ),
FILTER( ALL('Item') ,
SUMX( FILTER( 'Item', EARLIER( 'Item'[date] ) <= 'Item'[date] ), 
                     'Item'[amount] )
))

 

EARLIER is one of the complex functions in the DAX to understand and we will keep explore it the upcoming posts.

 

Categories: Uncategorized

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s