By: Hadi Fadlallah | Updated: 2024-09-26 | Comments (1) | Related: > Power BI
Problem
One of the best business intelligence tools is Microsoft's Power BI, which makes it possible for organizations to interact with, understand, and make decisions using its data. It connects different sources of data (SQL Server on-premises, cloud databases, Salesforce, etc.), changes raw information into reports, apps and dashboards, and enables data analysis on this information. The significance of Power BI lies in its capacity to simplify complex data so that every employee at any level can work with it, from line managers to supervisors to chief executive officers, thereby promoting a culture of well-informed real-time decisions. It is an essential tool for businesses that wish to use data as an asset by utilizing a user-friendly interface and integrating other Microsoft products (Azure, Microsoft Fabric, Excel, Office 365, SharePoint, etc.) with strong security features.
Solution
The goal of this tutorial is to give data analysts practical knowledge on how to leverage Power BI to transform and visualize data through hands-on exploration, making them ready for real-world challenges in managing information.
We'll examine and discuss some of Power BI's most important components and functionalities. We'll start by looking at and navigating through the Power BI Desktop interface. Next, we'll address how to import data from a CSV file. Then, we will learn how to preprocess and handle data quality issues using the Microsoft Power Editor. Finally, we will explore how to create an interactive dashboard to visually analyze COVID-19 data.
A free version of Power BI, Power BI Desktop is a Windows-based application serving as the primary tool for building and designing reports and dashboards. The desktop version can be downloaded directly from the official website: Power BI Desktop Download page.
Report Editor
To create your first report, click on New Report on the Power BI desktop welcome screen.
This will open a new blank report editor. As shown in the image below, there are six main parts of the Power BI report editor:
- View Menu Bar: Allows users to switch between different
Power BI environment views with each view focusing on specific report development
aspects:
- Report View – Primarily creating your reports. In it, you can drag and drop visuals, add interactivity, and rearrange data to generate engaging visualizations.
- Table View – Offers a more detailed look at the actual data underneath, allowing you to investigate raw data in table form. This is helpful for ensuring that the data to be visualized is correct. Moreover, this table allows editing columns' metadata and adding new measures.
- Model View – To establish and edit relationships between tables of your data. The relations may define how tables are linked to one another ensuring accurate graphics and optimal performance of the DAX model.
- Menu Bar: Gives access to different commands and functions.
- Report Body: This is your main work canvas, where the report comes alive; it's an area for adding visuals such as diagrams, charts, tables as well as other components that hold data.
- Visualizations Pane: This is the pane in which you can select and modify various types of visuals to include in the report. Furthermore, this panel allows you to adjust the properties of each visual, such as colors, data labels, and format settings.
- Data Pane: An overview of datasets, as well as fields, that can be used in your report is given in this pane. It will show all the tables, columns, and calculated fields that you can drag into the report or create new visuals. This list is important to select the exact data you intend to analyze visually.
- Report Pages: To structure your report into several pages, use this report pages area. Each page can be dedicated to a different aspect of your analysis or have different sets of visuals displayed on it. This function enhances complex reports, therefore making them less confusing and more navigable.
Connecting to a Data Source
The next step is to connect to a data source. In the menu strip, click on Get data to open its dialog. Here, all available data source types (File, Database, Microsoft Fabric, Power Platform, Azure, Online Services and Other) are listed with connectors.
Or, to only list the commonly used data sources, click on the drop-down menu below the button.
In this tutorial, we will use a subset of the COVID-19 dataset provided by "Our World in Data" as a CSV file. This dataset contains information about the COVID-19 pandemic in Lebanon.
After selecting the CSV file, the data preview dialog appears showing the top rows of the CSV file. Here, the user can change the file encoding and delimiter and choose to import data directly into the data model (click Load) or perform some data preprocessing (click Transform data).
For now, we will load the data directly. Then, we will navigate to the "Table View" in the left menu bar.
As shown below, the Table View is composed of four main parts:
- Data Viewer: Where data is previewed.
- Specialized Top Menu: New choices appear in the top menu bar that allows the user to add measures and columns and modify the table metadata.
- Data Pane: Where data hierarchy and measures are shown.
- Bottom Bar: Where the table name and total row count are displayed
If the user clicks on any column, the column metadata is displayed on the top menu, as shown in the following image. It is important to note that these changes are done within the Power BI data model, not the original data source.
If we click on the Data Model View in the left menu, we can display the tables imported into the current Power BI desktop file. In our tutorial, we will only see one table. The Data Model View is useful when several tables exist, and we need to manage their relationships.
Use Power BI to Transform Data
Now, let's navigate to the home menu bar > Queries > transform data. This will open the Power Query Editor.
Since the COVID-19 counter in this dataset started on 22 January 2020, we will remove all rows before that date. The image below shows that the first 51 rows contain null values in the total cases column.
This is why we'll need to remove the top 51 rows. To delete them, click on Remove Rows in the top menu ribbon.
Then, we need to specify 51 as the number of rows we need to remove. Click OK.
Note: After each transformation, a step is added to the APPLIED STEPS window, which allows the user to undo any specific transformation.
Now, we need to replace the null values in the total deaths and total cases columns with zeros. Right-click on each column header and choose Replace Values...
Let's specify "null" as the value to find and "0" to replace it with. Click OK.
Finally, we need to remove all unnecessary columns in our dataset. Select all the columns needed by holding the "Ctrl" keyboard button and clicking on the columns' headers. Then, right-click and choose Remove Other Columns.
In our tutorial, we kept the following columns: date, new cases, new cases per million, new deaths, new deaths per million, total cases, total cases per million, total deaths, and total deaths per million.
After preprocessing data, click Close & Apply on the top left corner.
This will save the applied steps and reapply them each time we refresh or update the data source in our Power BI model.
Once back to the Report Editor, note that the column count is reduced in the right Data pane.
Building Our First Dashboard
The final phase of this tutorial is adding visualization to our dashboard.
As we are working with a time-series dataset, let's start by adding a line chart. We can simply drag it from the visualization pane and drop it within the report body.
We will notice that an empty visualization is added, as shown below.
We need to configure the X- and Y-axis from the visualization pane after clicking on the added visual. Drag the date field within the X-Axis property and the Total cases field into the Y-Axis property.
Since we need to visualize the evolution of total cases with time, let's disable the date hierarchy from our visual (may be explained in a later article). This can be done by clicking on the drop-down menu in the date field (X-axis property) and choosing "date" rather the "Date Hierarchy".
Note: The values visualized in the line chart are pictured correctly right now.
As noted, the total_cases column is renamed to Sum of total_cases automatically after dragging it into the Y-axis property. To change this name, click on the drop-down menu, then Rename for this visual. We will change the value to "Total cases".
Now, we will expand the line chart to fit the whole report width.
To format the visual colors and font, navigate to the Format your visual tab on the Visualization pane, where many options can be specified, such as the line color, background, font, scale…
Next, we will add four card visuals to our report.
The card visual only has one data property.
Using those four cards, we will visualize the new cases, new deaths, total deaths, and total cases columns.
From the Visualization pane, we can configure the background colors of each card.
Now, it is easier to distinguish between each card in the data visualization.
Note: Power BI dashboards are interactive. So, by clicking on a single data point in the line chart, each card value changes to match the specific date point selected to simplify analysis and decision-making.
Next Steps
- You can refer to many other Microsoft Power BI tips previously published on MSSQLTips.com:
Learn more about Power BI in this 3 hour training course.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2024-09-26