We will walk through the basic features of Excel, PowerPivot, Power Query, and PowerBI to better understand these tools and how they can best help your organization. Along the way, we will build our vocabulary, including ETL, Data Types, Formulas, Aggregations, the role of a few languages used in Analysis (SQL, M, DAX), the benefits of visualizations vs. paginated reports, and the power of transactional vs. analytical databases. With a firm understanding of these core concepts and capabilities, you and your team will be better prepared to plan out the proper roadmap for your organization and outsource some or all of it to empower your team.
This article will give you a solid understanding of the core principles of these technologies to better engage in discussions about how best to use these tools to benefit your organization. This article is not to replace a knowledgeable and experienced resource to assist with the development and deployment of your solution. With monthly or even more frequent updates in the toolsets, it would soon be out of date with all the features available.
Visualizations vs. Paginated Reports
To better view transaction data, we can leverage charts and conditional formatting. One of the challenges of working with data is that extracting meaningful information from the data is difficult. Charts help the human eye to gather information from data quickly. Selecting the exemplary diagram makes a difference in understanding the data more effectively.
In the table below, we can see sales listed by customers. A simple table with rows and columns typically prints as a one-plus list called a paginated report (Excel has limited ability for paginated reports). We can prepare a list and provide each page with a header and footer. We would likely use one of Microsoft’s more advanced reporting tools for more complex paginated reports – SQL Server Reporting Services or Power BI premium.
|Customer Locations||Sum of Sales|
|Alpine Ski House||8,696.44|
|A. Datum Corporation||5,018.33|
|City Power & Light||5,350.02|
|Blue Yonder Airlines||5,276.26|
In the chart below, we see the small list of 10 customers presented in a Pie Chart. It is much easier to see which customers contribute the most to sales. We can see in this example Alpine Ski House is the top contributor.
A pie chart helps us see the parts of a whole. The total attributes to a single entry are made clear with illustrations.
A line graph chart tracks changes over time, especially when the changes are minor and within a single group.
Conditional formatting, like charts, helps us quickly make sense of data in a transactional list. With conditional formatting, we can more readily see anomalies as well as groups of similar data. Some common scenarios are to highlight all negative numbers or all numbers above a baseline or threshold.
In the below example, we highlighted any growth rate above 10% green and less than 8% red.
|Alpine Ski House||36.8%||15.7%|
|A. Datum Corporation||10.9%||8.7%|
|City Power & Light||9.9%||8.1%|
|Blue Yonder Airlines||10.2%||8.3%|
Analytical vs. Transactional Reporting
Now let’s familiarize ourselves with the difference between transactional reporting and analytical reports. Common transactional reports are Sales for the Week, Customers by Region, Payments by Bank Account. Each of these reports shows a list of transactions that are available for review. By contrast, an analytical report helps us understand how data has changed over time. For example, what are the average sales per month over the past 12 months? Analytical reports help us see trends, correlations, and sometimes causation.
In the above example, the list of customers is a transactional report. The conditional formatting example containing prior year and current year rates of change could be analytical. We arrived at the numbers using Excel’s PowerPivot. We will learn more about that in the following example.
Analyze Data with Power Pivot
Power Pivot enables us to slice and dice data. If we have ten columns, we can pivot some of those columns to row to get aggregations or totals – such as average, minimum, or maximum.
We ran a transactional report in our accounting software to show the sales for each customer for each year.
|Alpine Ski House||2018||1,521.96|
|A. Datum Corporation||2018||1,112.40|
|City Power & Light||2018||1,202.31|
|Blue Yonder Airlines||2018||1,181.18|
|Alpine Ski House||2019||2,081.96|
|A. Datum Corporation||2019||1,233.40|
|City Power & Light||2019||1,321.31|
|Blue Yonder Airlines||2019||1,302.18|
|Alpine Ski House||2020||2,409.44|
|A. Datum Corporation||2020||1,340.88|
|City Power & Light||2020||1,428.79|
|Blue Yonder Airlines||2020||1,409.66|
We started with a row for each and then pivoted to see the trends over time.
|Sum of Sales||Years|
|Customer Locations||2018||2019||2020||Grand Total|
|A. Datum Corporation||1,112.40||1,233.40||1,340.88||3,686.67|
|Alpine Ski House||1,521.96||2,081.96||2,409.44||6,013.36|
|Blue Yonder Airlines||1,181.18||1,302.18||1,409.66||3,893.01|
|City Power & Light||1,202.31||1,321.31||1,428.79||3,952.42|
An important caveat before we proceed: TechHouse has worked with PowerPivot since its first release, and we have found that some people get it, and some do not. A common challenge occurs when someone in the organization understands PivotTables well but has difficulty seeing that some others on the team do not.
Suppose you have consumers of the data who are not creating the reports. In that case, you may want to consider leveraging the conditional formatting and charts like above to help. Another option is to take advantage of PowerBI, which separates the data design from the information consumption – making it more accessible to a broader user set.
To try out PowerPivot, go to Insert on Excel, then click on “PivotTable.”
Select your data range and click “insert pivot table.” You can insert a new sheet by default or select an existing location. Once there, you will see an empty table to build your pivot. You can select fields from your data set and drag them into the rows and columns to start visualizing your data on the right side. You can add data to the upper left to filter.
PowerQuery – Extract Transform and Load Data (ETL)
ETL is a common acronym in data analysis that refers to Extract Transform and Load. The idea is to grab data from another system, clean it up as needed, and then load it into your reporting tool. With Excel, we can go to the Data tab to extract data from the various Cloud and on-premises locations, including Azure SQL, Dynamics 365, many third-party applications, and more. The transform feature enables us to cleanse the data. Once the data is cleansed, we can load it back into a worksheet.
When you connect to the data, you can then transform it using PowerQuery. PowerQuery enables more advanced calculations leveraging a language called DAX. Even without knowing DAX, we can get some essential common transformations done, for example, changing dates that are strings to a date field or numbers that are strings to number fields. Then select transform and load the data. You can load data to a separate sheet for each data source.
To get started with PowerQuery, click on “get data” to connect to a data source.
Once the data is loaded, you can create visualizations like the charts, and conditional formatting discussed earlier, and you can also analyze the data with PivotTables.
Data loaded into spreadsheets have some limitations, for example, around protection and combining data. Excel has a “protection” feature that allows you to password protect some or all of a sheet. The password has to be managed elsewhere and does not tie into the overall security model of the organization. For example, it is not possible to say only analysts in finance or only owners and executives can modify or see certain parts of the data.
Excel also has limitations for combining data from multiple sources. We can use the “link data” feature to connect data from one sheet to another. xLookup and the related lookup functions can relate data from one data source to another. We can even format columns to look at different data types if we extend our transformations from Excel functions to include PowerQuery’s transformational tools. However, we cannot read data on the fly, aggregate data dynamically for analysis, or leverage more powerful modeling languages like M. To go to that next stage, we need to step outside of Excel and engage with PowerBI.
PowerBI’s Foundation as an Analysis Engine for Real Information
Power BI, sold separately from Microsoft Excel, extends the tools above in a more feature-rich platform built for reporting. Whereas PowerPivot and later PowerQuery were added to Excel to open the Excel Spreadsheet tool into reporting capabilities. PowerBI’s foundation came from Business Intelligence tools, like ProClarity, that Microsoft purchased in 2006. These were built specifically to leverage Microsoft’s Analysis Services, part of Microsoft’s SQL server product focused on data analytics and processing.
It helps to understand why PowerBI is separate from Excel, unlike PowerQuery and Pivot Tables, to understand better the different scenarios they serve. Microsoft released Analysis Server (based on a prior purchased OLAP product) in 2000. In 2006 they bought ProClarity with its comprehensive visualization tools. Out of these investments came Performance Point, which was a server-based portal with advanced visualizations. The ETL tool at this time was another SQL Server product, SSIS or SQL Server Integration Services. In the past few years, all of these capabilities have been made available in the Microsoft Cloud via Azure SQL, Azure Data Factory, and Azure Synapse. But when working on more straightforward projects, leveraging multiple Azure resources is a barrier to entry for many teams without dedicated Azure experts. That is where PowerBI comes in. PowerBI is part of Microsoft’s effort to democratize IT, ensuring everyone can benefit from today’s technological advances, including data analytics.
PowerBI brings some of these capabilities into a single platform for getting started scenarios and allows for connections to the more robust tools when needed. For example, PowerBI Services (the Cloud version) can contain datasets that are up to 1GB. More than that, you will need to connect to an external data source like Azure Analytics Services.
PowerBI provides a desktop tool that is freely downloadable and can connect to a variety of data sources. This free version is intended for analysts to develop visualizations. It includes all the features required for a design to create a report. The visualizations are similar to charts in Excel but far more vivid. The calculation interface uses DAX, like in PowerQuery in Excel, but has more wizards and data design capabilities. These include connecting data sources via 1:M and M:M relationships. Robust data extraction and transformation capabilities, including the M modeling language, are also present.
The free desktop version does not include sharing that information for self-service reporting by others. Part of the PowerBI Cloud is a paid service available per user or capacity model. With the PowerBI Cloud, the author can publish their PowerBI Desktop reports into the Cloud. PowerBI Cloud service includes integrated security with the overall Microsoft tenant, notifications, dashboards, and workspaces. It also provides integration with the broader PowerPlatform Cloud services like PowerAutomate.
You do not have to purchase the Cloud service to use the free desktop tool for reporting. However, in our scenario for this article, we assume the analyst will create reports in PowerBI Desktop and then publish those reports to the PowerBI Cloud service so that other organization members can view them. The desktop tool is used for authoring the reports and dashboards. The Cloud service then provides those reports to users for self-service reporting and analytics.
PowerBI for Reporting Authoring and Self-Service Analytics — Team Roles and Knowledge Requirements.
PowerBI desktop has three layers. The user layer is available through PowerBI Cloud services. The user can log in to the service at powerbi.com or click on the PowerBI icon on their Office 365 portal. Once there, they will see the reports and dashboards made available to them through a series of workspaces. They can subscribe to have reports delivered by email to quickly see the dashboard visualization without even logging in to PowerBI. If they want to explore the data more, they can click on the PowerBI service from the email. Typically, a user will require training on the data provided in the report to understand the author’s intention and some training on the role of filters. The necessary amount of training can be reduced by keeping the charts simple and the data model consistent with the language and experience of the users.
The middle layer is a visualization layer designed for non-IT users to conduct their analysis. It does require analytical skills for all but the most basic reporting. The designer can select which filters, charts, and sorts will be available to the end-users. We often see problems occur when someone who is not knowledgeable or experienced in data design has been asked to create visualizations. We have seen charts that have no programming errors but contain false information because they are based on flawed data models. We call this an implicit error. An error that is inherent in the design but does not display any warnings or error codes. It is essential to have a knowledgeable and experienced resource build out the initial model and maintain that model as changes occur with the data schema or with the variables affecting that data. TechHouse’s Enhancer product can help here by keeping your model up to date and accurate.
Analytical reports are focused on aggregated data. There are wizards available to create standard calculations and aggregations. Here is one of the areas where it is easy to limit yourself without knowledge as the groupings will significantly affect the totals.
The skill sets you are looking for with visualizations include design skills. The designer will have sample data to help them understand the potential for the data and the most helpful visualizations. We recommend a quick start with our EmpowerMyData team to develop the initial dashboard with your organization’s branding and the standard filters and visualizations that your team uses. You can then have a team member use that dashboard as a template for future designs.
It is also good to make the most of the built-in AI tools to identify correlations that may not be obvious.
The back layer connects to the data sources. It is responsible for the ETL – extracting the data from the source file, transforming or cleansing it as needed with M language, and loading it into the analytical data store. In our example, we can connect to the Azure SQL database and two links to Excel spreadsheets in SharePoint online to assist with categorizing that data. We create the connection to each data source and transform the data with various features like removing duplicate rows, filtering the data based on criteria such as data ranges, and more. The individual creating these connections must have a strong understanding of data principles, including the schema and characteristics of the source systems, the unique identifiers of each query, 1 to Many vs. Many to Many relationships, and the appropriate relationships between the resultant questions the visual designers will reference.
How to get started without building out a data science team? We strongly recommend quick starts whenever an organization is introducing new technology. If you are not using PowerPivot extensively today, start with a project that uses PowerPivot or PowerBI, but have an experienced team develop a quick start for you. The process of working through that quick start will further extend your working knowledge of tools, their capabilities, and how they can benefit your organization.