What is data modelling in Power BI?
It’s the process of defining relationships between the different tables within our database. It entails relationship definition, considering cardinality and direction, as well as defining our facts and dimension table. In this blog, we shall look into how to create and design a Power BI model.
Fact table and Dimension table
A fact table stores information about metrics, measurements, or facts related to business entities. It contains two types of information: related to facts and information about foreign keys. A dimension table, on the other hand, contains information about business entities and is usually descriptive in nature.
Relationships and Cardinality in Power BI: In Power BI the directionality of the relationship plays an important role. There are four types of relationships in Power BI.
One to many (1: *) or many to one (*:1): Each value in the column of one table (one side) is related to multiple values in the column of another table. (many side) This is the most common type of directionality in Power BI.
In the image above, we can take a closer look at how such a relationship can be identified in Power BI.
- The (*) star next to the sales table indicates that the sales table is the many sides of the relationship, aka that the sales table has multiple instances of the same area code.
- The (1) one next to the market table indicates that the market table lies on one side of the relationship, meaning that the area code is uniquely identifying each row in the table.
- The arrow between the two tables indicates the directionality of filtering. Since the arrow is facing the sales side of the table, it indicates that the market table is currently filtering the sales side of the table. And the cross-filter direction is set to single only.
One to one (1:1): Each value in each column from one table is uniquely related to a unique value in another column in another table.
Many to many (*): Multiple values in one table can be related to multiple values in another table. It’s uncommon to use this kind of relationship in a data model since it introduces a lot of ambiguity.
Defining Granularity
Granularity defines the level of precision and detail in our data. For example, a sales table may list each customer’s name each time a transaction happens, thus having repeated customer names, whereas a customer table may only have each unique customer listed once. In this case, the sales table is considered to be more granular than the customer table.
Star schema
It’s a type of dimensional model that identifies one table as a fact table and the other tables as dimension tables. The dimension tables are used to filter the fact table. It’s considered to be the most optimized data model for Power BI. A fact table consists of the metrics and measures that we are interested in; in other words, it contains information about business events. A dimension table, on the other hand, contains information about business entities; these are then used to filter on the different measures within our main fact table. For example: A sales table may have information such as unit price, quantity, and revenue. Each row contains information about each individual transaction.
Snowflake Schema
It consists of one fact table that is connected to dimension tables; however, these dimension tables can also be related to each other. It is an extension of a star schema.
Role-playing dimensions
Role-Playing Dimensions in Power BI refer to a single dimension table that can be used multiple times in the same model, typically to serve different roles based on the context of the analysis.
For example, in a sales model, a date dimension can play multiple roles:
- Order Date: Used to track when an order was placed.
- Ship Date: Used to analyze when an order was shipped.
- Delivery Date: Used to understand when the order was delivered.
Instead of creating multiple identical dimension tables for each role, you create one date table and use it in various roles by creating relationships to different columns in the fact table (like order date, ship date, etc.).
However, since there can only be one active relationship between two tables, we need to use the USERELATIONSHIP function in Dax to filter the fact table on the ship date or delivery date (if the active relationship exists between the order date and the date dimension).
Another way to solve this issue is by using calculated tables for each of the ship date and delivery date and then creating active relationships between these new tables and the fact tables.
Editing Relationships
Power BI automatically detects relationships between your tables, but you can modify and create relationships manually using the “Manage Relationships” feature in the Model view. Here, you can add new relationships or edit existing ones. By clicking “Autodetect,” Power BI will automatically identify relationships between your tables. Alternatively, you can create a new relationship by selecting “New Relationship.” This allows you to choose the related tables, define the cardinality, specify the cross-filter direction, and mark the relationship as active if needed.