Microsoft announced the public preview availability of Datamarts in May 2022. When authoring this chapter, Datamarts are still in preview, so using them in production environments is not recommended. The Datamarts capability opens endless possibilities for organizations to achieve their data analytics goals on the Power BI platform. Before we look into the Power BI Datamarts, let us take a step back and understand the meaning of a Datamart.
This article is an excerpt from the book Expert Data Modeling with Power BI, Third Edition by Soheil Bakhshi, a completely updated and revised edition of the bestselling guide to Power BI and data modeling.
What is a Datamart?
A quick search on the Internet provides multiple definitions by technology-leading companies such as IBM, Amazon, and Oracle. They all agree that a Datamart is a subject-oriented subset of a data warehouse focusing on a particular business unit, department, subject area, or business functionality. The Datamart’s data is usually stored in databases containing a moving frame required for data analysis, not the full history of data. For instance, organizations store the historical and current data of various business sectors, such as sales, inventory, marketing, finance, etc., in an enterprise data warehouse. They then create a Datamart for social marketing for the past 5 years. The following diagram shows a simplistic architecture of Datamarts of top of an enterprise data warehouse:
In the preceding diagram, the first ETL arrow (the left arrow) shows the processes of extracting the data from the source systems, transforming and loading it into an enterprise data warehouse that keeps a full history of data. Then we have some other ETL processes to constantly land the past 5 years of data into the Datamarts.
In the real world, the Business Intelligence (BI) or the Information Technology (IT) departments are responsible for implementing and maintaining the ETL processes, the enterprise data warehouse and Datamarts, which often take a reasonably long time to deliver the solution to the end users. This lengthy process usually leads the end-users to try to solve the problem themselves by creating Datamart-like solutions, often created in Excel files stored in their personal space. From a data governance perspective, this is a massive risk to organizations by exposing them to the whole laundry of privacy and security breaches.
Now that we have a better understanding of a Datamart and its challenges, let us look at it through the Power BI lens.
What is Power BI Datamarts?
As mentioned earlier, one of the weaknesses of traditional data warehousing and building datamarts is that they require a lot of knowledge to create and maintain, making the process lengthy and costly. The industry answer to this challenge was self-service BI, which usually suffers from a lack of governance. As we all know, one of the many strengths of Power BI is the ability to package self-service BI capabilities and governance altogether. With the Power BI Datamarts technology, Microsoft provides a governed self-service datamart capability in the Power BI Service. Power BI Datamarts provide no-code/low-code datamart capabilities using Azure SQL Database technology in the background. The Power BI Datamarts support sensitivity labels, endorsement, discovery, and Row-Level Security (RLS), which help protect and manage the data according to the business requirements and compliance needs. Therefore, the whole process of data integration, transformation, preparation, storage, modeling, and visualization is governed by the organization. To recap, Power BI Datamarts can be used in any or all of the following scenarios:
- Creating self-service solutions with fully managed relational database capability, backed by Azure SQL Database without requiring an Azure subscription.
- No-code/low-code experience using a diagram view in the data preparation layer similar to Dataflows.
- Building business-focussed semantic layers in the cloud (the Power BI Service) with data modeling capabilities, such as managing relationships, creating measures, defining incremental refresh, and creating and managing RLS.
Now that we have a good understanding of what Datamarts are in Power BI, let us see what they are not.
Demystifying Power BI Datamart misunderstandings
Whenever we start using a new tool, we first have to understand what it is and in what situations it is meant to be used. The same principle applies to Power BI Datamarts. So far, we have learned what Power BI Datamarts are, but it is also important to know what they are not. This subsection aims to demystify some misunderstandings and misconceptions. A Power BI Datamart is not:
- A replacement for Dataflows. As we learned in the previous section, a Dataflow is a self-service ETL and data preparation layer connecting to various data sources, transforming the data and storing the results in CSV format in Azure Data Lake Gen 2 (ADLS Gen2). A Datamart is a self-service BI solution containing a self-service data preparation (or ETL) layer and a data model (or semantic layer). While the Datamart includes the data preparation layer, the results are stored in an Azure SQL database for different requirements than Dataflows. The Dataflows will still be used across the organization for self-service data preparation and enabling organizational users to reuse the results. Therefore, Datamarts are not a replacement for Dataflows.
- A replacement for datasets. While Datamarts contain datasets, we cannot consider them as a replacement for datasets. As mentioned earlier, there are currently many data modeling restrictions in Datamarts. We will discuss the Datamart’s current restrictions in a separate subsection, but just to name some, we cannot currently create calculated columns or calculated tables in a Datamart. We also cannot define aggregations or define Object-level security.
- A replacement for or an alternative to data warehouses. As discussed earlier, a datamart is a business-centric subset of a data warehouse that does not contain the full history of data. Indeed, data warehouses are the best candidates to be used as a source for a Power BI Datamart; therefore, they are not meant to replace data warehouses.
Lastly, the Datamart Editor is not a replacement for Power BI Desktop. While we can implement data preparation and data model in a Datamart, this capability is not mature enough to replace the need to develop solutions in Power BI Desktop. At the time of writing this book, there are so many restrictions, especially for data modeling, making the Datamart capability far behind Power BI Desktop. Besides, this capability is a Premium feature, while Power BI Desktop is free. Moreover, Power BI Desktop supports much more data sources than Datamarts. Therefore, they are not replacing Power BI Desktop.
In summary, we briefly introduced the idea of Datamarts, and we’ve considered some different aspects that make up a Datamart. We’ve also looked into the power of Datamarts, which includes some of the benefits Datamarts have over traditional warehousing systems. Finally, we demystified some common misconceptions and misunderstandings that plague Datamarts.