By: Fikrat Azizov | Updated: 2022-07-08 | Comments (2) | Related: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | > Azure Synapse Analytics
Problem
This tip is part of the series of posts dedicated to the building of end-to-end Lakehouse solutions leveraging Azure Synapse Analytics. In the previous posts, we've explored ways to build ETL pipelines to ingest, transform and load data into Lakehouse. In this tip, we'll be exploring the ways to build reporting based on the Lakehouse data.
Solution
The Lakehouse concept we've been discussing so far is based on Delta Lake technology. You can query Delta Lake tables directly using their storage locations. Here's an example query that reads from the SalesOrderHeader Delta table:
select * from delta.`/delta/silver/SalesOrderHeader/`
Building semantic layer on the Lakehouse
Alternatively, you can create a semantic layer over Delta tables that includes creating databases, external tables, views, etc. You can leverage Spark pools or Synapse serverless to create a semantic layer.
In this tip, I'm going to use Synapse serverless to build Lakehouse objects that can further be queried by various reporting tools.
The semantic layer objects can be created using the T-SQL language. Synapse GUI can also be used to generate certain commands. I will demonstrate usage of the Synapse GUI methods here, as that's the easiest method for beginners, in my view. I'll create the following Delta objects:
- Database
- Schema
- External table
Let's open Synapse Studio, select the Data tab, then the Workspace tab, click Add button then select the SQL database command, as follows:
Figure 1
Select the Serverless option, enter the database name and click Create button, as follows:
Figure 2
Once created, you'll see the new EDW database object under the workspace tab:
Figure 3
Next select the Schemas command, followed by the New schema command, as follows:
Figure 4
This will add a template script to create a schema. Let's tweak it to create two schemas, as follows:
CREATE SCHEMA [dim]; GO CREATE SCHEMA [fact]; GO
The table creation requires a few steps like creating file format, data source, etc. You can follow the steps similar to creating schemas using Synapse GUI, in which case you'll need to add the required code manually. However, I'll share an easier path that would help us generate these objects automatically.
Let's, switch to the Linked tab, open the default Azure Data Lake Storage account, navigate to the location of the Gold Delta root folder, select the DimCustomer folder, right-click and select New SQL script, then Create external table commands, as follows:
Figure 5
Select the database, and enter the schema/table name, as follows:
Figure 6
This will generate a SQL script with all necessary commands. Let's change the context to the EDW database, and tweak the schema name in that script, as follows:
Figure 7
Note that the script also contains a select statement to browse the data in the table. Let's run the script and examine the results, as follows:
Figure 8
Now that we've
created a dimension table, let's follow a similar process to create [fact].[FactSalesOrderHeader]
table.
Once you've got the required tables, you can also create views based on those tables, just like you'd do in a relational database.
Creating Power BI dashboards
Let's open the Power BI Desktop and use the Get data command to select the Azure Synapse Analytics connection:
Figure 9
Enter your Synapse workspace connection string in the following format: WORKSPACE-ondemand.sql.azuresynapse.net
Optionally, enter the Delta database name.
Here's my screenshot:
Figure 10
Select and load the tables we've created, as follows:
Figure 11
Once the data is uploaded, the rest should be straightforward. Here's a simple dashboard I've created, that displays total sales by customers:
Figure 12
Next Steps
Read these related articles:
- Common Data Warehouse Development Challenges
- Query Delta Lake files using serverless SQL pool in Azure Synapse Analytics
- Store query results to storage using serverless SQL pool in Azure Synapse Analytics
- Connect to serverless SQL pool with Power BI Professional
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2022-07-08