Using SQL Server Integration Services to Generate Excel Files Based on Criteria

By:   |   Updated: 2019-01-04   |   Comments (5)   |   Related: More > Integration Services Excel


Problem

I have a SQL Server table containing a large dataset and I am interested in creating one Excel file for every product category name using SQL Server Integration Services (SSIS).

Solution

Let's start by taking a look at my sample dataset which is taken from AdventureworksDW2016CTP3. When I count the rows, I have approximately 60K records. The column that I am particularly interested in is 'EnglishProductCategoryName' which contains the Category group Name. This is the column that I am interested in grouping my data by and having SSIS generate an individual file for each category.

query results

On a Distinct Count, I can see that there are 3 Product Categories (Accessories, Bikes and Clothing), which means that I will want 3 Excel files with all of the aggregated Detail Data listed above.

query results

Getting Started with the SSIS Package

I'll start by creating a new SSIS Project in Visual Studio SSDT called 'Using SSIS to Generate Excel Files'.

Next, I'll need to set the Run64BitRuntime Debugging property to 'False' by right-clicking on the SSIS project and navigating to 'Properties'.

create ssis package
ssis package run64bitruntime

This will prevent execution errors later since we are generating files using Excel and need a 32bit runtime.

Secondly, I'll need to download and Install the Microsoft Access Database Engine Redistributable (Note: I selected the 32bit version for my solution).

Designing the SSIS Package

I'll start by adding a new OLE DB data source by right-clicking the connection manager section and specifying my connection as 'AdventureworksDW2016CTP3'.

ssis new ole db connection

Next, I'll create an Excel connection by right clicking on the connection manager and pointing to the folder where my excel file resides:

ssis excel connection

The Excel connection will have to contain the following Expression for the connection string property so that the connection is updated dynamically at run-time. I'll do this by right clicking the newly set up Excel connection and clicking properties.

excel connection properties

Under Properties, navigate to Expressions and click the … icon. Then select Connection String in the Property Expressions Editor and add the user variables below containing the Folder Path and client Id. This will create a dynamic file and folder path that is set at run time based on the query parameter.

ssis expression editor
ssis expression editor

I'll then create the following Variables, by clicking the variables icon in the top right-hand corner:

ssis variables
ssis variables

Next, I'll add an Execute SQL Task to Retrieve the Top Level which will basically contain a distinct count of the Product Category Names. I'll do this by dragging and dropping an Execute SQL Task onto the Control Flow canvas. I'll then double click the Execute SQL Task, set the connection to my AdventureWorksDW2016CTP3 OLE DB Source, click the … near SQL Statement and then enter the Distinct Select statement below.

enter sql query

I'll also add the following Result Set Variable name to the user variable named TopLevelClientID. This is critical because it will store the distinct EnglishProdutCategoryNames from the SQL Query into a variable which will be used in further processing steps of the SSIS package.

execute sql task editor

Once this is completed, I will add a For Each Loop Container to Enumerate the Individual Top Level. From the SSIS toolbox, drag and drop a For Each Loop container to the control flow canvas:

ssis package flow

Double click the For Each Loop container. Under Collection, I made sure to specify the enumerator as a Foreach ADO Enumerator and selected the appropriate User Variable.

ssis foreach loop editor

Under Variable Mappings, I added the User:ClientID variable. This is important because the package will process every ClientID within the For Each loop container which will ensure that when the Excel files are generated, every file will have the correct records in the file that are grouped by and associated with the distinct EnglishProductCategoryNames that we initially specified in our DISTINCT SQL query:

ssis foreach loop editor

Within the For Each Enumerator Loop, I added an Execute SQL Task to Create a new Excel File with the following details for the Connection Type and SQL Statement (Note that the columns must match the OLE DB Source columns exactly for the mapping to be accurate). Additionally, I specified my data types as VARCHAR(255) for standardization and simplicity. This may be altered as you see fit for your specific data source and types.

execute sql task editor
CREATE TABLE 'ProductCategories'
(
   'ProductCategoryName' VARCHAR(255),
   'Model' VARCHAR(255),
   'CustomerKey' VARCHAR(255),
   'Region' VARCHAR(255),
   'Age' VARCHAR(255),
   'IncomeGroup' VARCHAR(255),
   'CalendarYear' VARCHAR(255),
   'FiscalYear' VARCHAR(255),
   'Month' VARCHAR(255),
   'OrderNumber' VARCHAR(255),
   'LineNumber' VARCHAR(255),
   'Quantity' VARCHAR(255),
   'Amount' VARCHAR(255)
)

The last step of the package design is to add a Data Flow task that will create the Excel Files from my initial 60k+ record set. To do this, I will drag and drop an OLEDB Source task from the SSIS Toolbox onto the Data Flow Task canvas. I will also drag and drop an Excel Destination task from the SSIS Toolbox onto the Data Flow Task canvas. I will then connect the OLEDB Source to the Excel Destination.

package flow

Lastly, I will double-click the OLEDB Source Task labeled 'Data_WithSourceInfo_OLEDBSource' and enter the following SQL command as the source with a Where clause =?. The ? is to allow a Parameterized Query. The parameterized query is to allow a type-specific value when replacing the ? with their respective value. In our scenario, the value for 'EnglishProductCategoryName' is the value that we will be passing into this query.

SELECT
   [EnglishProductCategoryName],
   [Model],
   [CustomerKey],
   [Region],
   [Age],
   [IncomeGroup],
   [CalendarYear],
   [FiscalYear],
   [Month],
   [OrderNumber],
   [LineNumber],
   [Quantity],
   [Amount],
FROM [AdventureworksDW2016CTP3].[dbo].[vDMPrep]
WHERE [EnglishProductCategoryName] = ?
ole db source editor

Next, I click Parameters and add the following Input Parameter mappings with the appropriate User Variable as follows:

set query parameters

I then double-click the Excel Destination Task to open it and Select the Excel Sheet 'ProductCategories$':

excel destination editor

Finally, I will map the input columns to the destination. Note that the destination columns will not match the Input Column headers, but they must be mapped in the correct sort order.

excel destination editor mapping

Running the SSIS Package

We've now covered all the design elements of this SSIS package. It's now time to run the package by right-clicking the SSIS Package in the Solution Explorer and clicking 'Execute Package'.

run ssis package

Here is what the completed and successfully executed control flow will look like:

ssis package flow

Exploring the Excel Files

After the package runs successfully, I will now have 3 files in my folder, which was specified in the variables section of the SSIS package.

ssis variables

I now have one Excel file for each product category.

excel files

When I open the Accessories Product Category file, I now see all the records from my original SQL Query grouped by the Product Category.

For example, Accessories has 36,092 records in both the SQL Query as well as the Excel File:

query results
excel file output
Next Steps
  • In this article, we developed an SSIS package that takes a SQL Server table containing a category name and we created an Excel file containing the records for each category by using variables, dynamic connection strings, and for each loop in SSIS.
  • Remember to download the Microsoft Access Database Engine Redistributable to prevent any issues with generating Excel files using SSIS.


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Ron L'Esteve Ron L'Esteve is a trusted information technology thought leader and professional Author residing in Illinois. He brings over 20 years of IT experience and is well-known for his impactful books and article publications on Data & AI Architecture, Engineering, and Cloud Leadership. Ron completed his Master�s in Business Administration and Finance from Loyola University in Chicago. Ron brings deep tec

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2019-01-04

Comments For This Article




Wednesday, May 13, 2020 - 2:37:52 PM - Phil Back To Top (85645)

There is no overall view of what the mappings are.  I can't tell what is in the 'ForEach' box.  This tip seems to be for advanced users.


Friday, March 6, 2020 - 11:29:35 AM - Pranab Kumar Pattnayak Back To Top (84972)

good articale able to generate the excel files


Thursday, October 3, 2019 - 12:23:20 PM - Mario Back To Top (82659)

khlaed mahmoud:

The first Execute SQL Task needs to have Full Result Set as its Result Set property, this resolved the TopLevelID case for me. In my case I had created the excel file with excel data types (longtext, long), the rest worked well for me.


Wednesday, August 28, 2019 - 12:11:52 PM - khaled mahmoud Back To Top (82170)

Hi

I finshed this example but i have error 

Error: Variable "User::ClientID" does not contain a valid data object


Friday, January 4, 2019 - 11:33:28 AM - Salam Back To Top (78629)

 Hi Ron, can you please share the files as I am having some issues for configuring some items. Thanks in advance















get free sql tips
agree to terms