Traditionally, business intelligence projects tend to use a star (Kimball) or snowflake(Inmon) schema; a design approach that has become the de facto design standard for data warehouses and cubes over the last few decades. Below is an example of a star schema:
The center of the star is called the fact table. It describes the measurements, facts, or metrics of a business process. In this case, the fact table, Sales, contains sales records; one row for each sales transaction. The center of the star is surrounded by dimensions. Each dimension is a descriptive table that describes the attributes of a fact. Here we have, Dates, Customers, Products and Regions, these dimension tables provide more details about the facts.
This arrangement is based on keys inside the tables; for example, the sales table contains ProductKey, and the Product table also contains ProductKey. The ProductKey from the sales table is called a foreign key, and the ProductKey from the Product table a primary key. One single unique product has many different sales for the same product. This is known as a one-to-many relationship.
In a report, the data from the fact table is usually aggregated and sliced by fields from one or more of the dimensions. For example, we might want to look at the sum of Amount by Year and Region. If we visualize how data flows in the schema, we’ll see that the Sales table is filtered to show only values for the year and location selected.
This is the default behavior of filtering in Power BI and is also available in SQL Server Analysis Services Tabular Models and Excel Power Pivot.
If you look at the relationships in the Power BI model, you’ll see arrows on the relationships that indicate how tables will get filtered. The Power BI data model a.k.a Analysis Services tabular model by default only filters from the 1 to the Many side of the relationship. If the relationship is bidirectional (that is, it has two arrows on it), then the filtering happens also from the many-side to the one-side. This is not supported in the Excel Power Pivot and older versions of SSAS tabular models.
It is also important to understand that in the tabular data models, relationships can be created on single columns only. Multiple column relationships are not supported by the engine.
Let’s understand this by taking an example that might help you understand this behavior better.
Figure 1 The Power BI Data Model
Let’s create a matrix visual based on the data model shown in figure 1, with the years from Dates table on the rows and Sum of SalesAmount and you will see the result shown in Figure 2.
Figure 2 The Effect of Filtering Across Multiple Tables in Action
The Row Labels contain the years—that is, a column from the Dates table. Dates Table is on the one-side of the relationship with the FactSales table. So when you put the Sum
of SalesAmount in the matrix visual, the tabular engine filters Sales based on the year.
If you now modify the matrix visual by putting the Color on the rows and adding the Count of Date in the values area, the result is somewhat harder to understand,
You get the value 3652 for all the colors. So what happened. Figure 3 shows that if bidirectional filtering is not active, tables are not getting filtered.
Figure 3 The Unfiltered table
Now let’s make the relationship between the FactSales and Dates table bidirectional and you will get the filtered rows for every color of the product in the Products table as shown in the Figure 4. As you can see, the numbers are now different, reflecting the number of days on which at least one product of the particular color was sold. In technical terms, the filter on the Product propagates to the Dates table using a chain of relationships.
Figure 4 The Bi Directional filtering Effect
So the question is why we are getting the wrong number for Count of Dates when the bidirectional filtering is not on. In fact, it always shows the same value for all the rows—and by the way, this number is the total number of rows in the Dates table.
The reason why the filter coming from the Color column does not propagate to Date is that the relationship between Dates and FactSales has a single arrow, pointing from Dates to FactSales. Thus, even if FactSales has an active filter on it, the filter cannot propagate to Dates, because the type of relationship prevents it.
If you change the relationship between Dates and FactSales to enable bidirectional filtering, then the result will be the one shown in Figure 4.
As shown in Figure 5 If you enable bidirectional filtering, then the Dates table is filtered using the Color column.
Figure 5 Bi Directional Filtering Active
At first sight, it might look as if all the relationships should be defined as
bidirectional, so as to let the filter propagate in any direction and always return meaningful results. However, this is not always the correct way of designing a data model. In fact, depending on the scenario you are working with, you will choose the
correct propagation of relationships.
But business problems are usually never this clear-cut and there are some well-known patterns that are not easily solvable with filters flowing into just one direction or bi directional.
In the next few blogs, I ’ll introduce you to some of those complex scenarios and show you how tabular models handles bidirectional cross-filtering in those situations. We will also learn, how ambiguity in the data model is created if you turn on the bi directional filtering for all the relationships and what should be the solution for those ambiguous data models.