What is an SSIS Package

By:   |   Updated: 2021-04-05   |   Comments   |   Related: More > Integration Services Development


Problem

This tip will teach you what a Microsoft SQL Server Integration Services package is and what you can do with it for data integration projects when working with SQL Server, Oracle, Excel, CSV files or flat files, XML, stored procedures, etc. data sources.

Solution

A general introduction to the SQL Server Integration Services (SSIS) service is given in the tip What is Integration Services. There we explained SSIS is the ETL tool of the SQL Server stack, but in essence it's a workflow orchestrator often used for data warehousing or data integration projects. The unit of work in an SSIS project is a "package". The following screenshot shows a bunch of packages inside an SSIS project:

packages in a project

Why use an SSIS Package?

A single package in an Integration Services Project is supposed to do one single task, such as: load data into a fact table in the data warehouse, transfer data from one server to another or take a backup of database. Some data-related tasks that you want executed. SSIS packages can easily be scheduled to run frequently, it's also possible to create one-shot packages that execute a specific task just once. For example, import some sample data into a database.

Because SSIS packages are visual, it's often easier to quickly create an SSIS package instead of writing code that accomplishes the same task.

sample package

Even though you can create packages that do everything, it's a development best practice to create packages that have one single task. If a package tries to do too many things, it becomes harder to maintain, more difficult to understand and if it fails mid-execution, harder to restart without doing everything from scratch again.

Creating an SSIS Package

To create an SSIS package, you need Visual Studio. In the earlier versions of SSIS, you could download a free shell of Visual Studio if you didn't have a Visual Studio license. If you did have the full version of Visual Studio, you needed to install the business intelligence project templates for developing SSIS, SSAS and SSRS solutions. An historical overview of all the name changes:

  • SQL Server 2005 – 2008R2: Business Intelligence Development Studio (BIDS), either as a shell or as templates for Visual Studio.
  • SQL Server 2010: SQL Server Data Tools for Business Intelligence (SSDT-BI), not to be confused with SSDT for database projects. Either a shell or the templates.
  • SQL Server 2012 – 2016: SQL Server Data Tools, either as a shell or as templates.
  • SQL Server 2017: SQL Server Data Tools. Database projects are included in the installation of Visual Studio. For business intelligence projects, you can either install separate Visual Studio extensions or the stand-alone SSDT installer which installs a shell of Visual Studio.
  • SQL Server 2019: SQL Server Data Tools. Again, database projects are included in the installation of Visual Studio. SSIS, SSAS and SSRS are only available as extensions. This means you need a full version of Visual Studio, there's no shell available. If you only use Visual Studio for the business intelligence projects, you can use the free Community Edition.

You can find more info in SQL Server Integration Services SSIS Versions and Tools. For SSIS 2019 (the latest version at the time of writing), you can installation info in the tip Install SQL Server Integration Services in Visual Studio 2019. Once you have the correct tooling for your version of SQL Server, you can create a new project, as explained in Create a SQL Server Integration Services SSIS Project. A new package will be automatically been added to the project:

solution explorer with new package

The Different Parts of an SSIS Package

Control Flow

The control flow is the most important part of an SSIS package. It's used to configure all the tasks. The control flow is a visual canvas where tasks are connected with each other through precedence constraints. If tasks are not connected to each other, they run in parallel when the package is executed.

When the control flow tab is selected, the SSIS toolbox will contain all the different tasks and containers that can be used:

control flow with ssis toolbox

More information about the control flow can be found in the following tips:

Data Flow

The data flow is a special task in the control flow. This task can load data from a data source in memory, transform the data using various available transformations and then write it to a destination. Since this needs additional configuration, a separate tab is available in the data flow. It's possible to have multiple data flows in one control flow. An SSIS package has only one control flow.

When you go to the data flow tab, the SSIS toolbox will show available sources, transformations and destinations. An example:

ssis data flow example

Keep in mind all data read from the source is read into memory. Some transformations – like the Aggregate and Sort transformation – need all data to be read into memory before they can start outputting rows. This might have a serious performance impact.

You can find more info about the data flow in the following tips:

Parameters and Variables

If you want to make your package more flexible, you can use variables and parameters. The big difference between the two is that parameters are used as input when the package starts executing. They cannot change value during execution, while variables can change value.

Package parameters have their own tab:

package params

Project parameters (which can be reused between different packages) are located in the Solution Explorer window:

project parameters

Variables have their own window:

variables tab

If you cannot find the variables window or tab, you can go to Extensions (in Visual Studio 2019) > SSIS > Variables.

where to find the variables menu

The value of a variable can change due to an expression that has been defined, or because of the result of a task, such as an Execute SQL Task or a ForEach Loop Container.

More information about variables and parameters:

Event Handlers

Event handlers are an option in SSIS to execute additional tasks when a certain type of event occurs. Often the OnError event handler is used to take some actions in case the package fails, such as logging the error into a table or sending an email.

event handlers

The tip Capturing SQL Server Integration Services Package Errors Using OnError Event Handlers has more info about event handlers.

Package Explorer

The package explorer tab has a tree view of all the objects inside the SSIS package:

package explorer

Debugging an SSIS Package

You can start debugging an SSIS package by hitting F5 or clicking the green arrow in the toolbar.

debug package

SSIS will first validate the package and then it will start executing. During execution, a progress tab will be added:

package validation

Once the execution is finished, the tab will be renamed to "Execution Results". While the package is executing, you can put breakpoints on different tasks. When a breakpoint is hit, the execution will pause which will allow you to inspect the state of the package and for example the value of different variables.

breakpoints

It's also possible to set breakpoints inside script tasks and script components. It's not possible to set a breakpoint on a transformation in the data flow, but you can add data viewers. When data flows through a path where a data viewer is set up, a window will pop-up which will allow you to inspect the data that is flowing through.

data viewer

More about debugging:

Logging

When an SSIS package runs, you typically want to log some information about package execution. When did the package start? How long did it execute? Were there any errors or warnings? When you use the SSIS project deployment model, all information is automatically logged in the SSIS catalog (see the tip What is Integration Services for more info). But sometimes you need to log information yourself. You can either use an Execute SQL Task to write some data to a table, or a data flow to write data to a file, but you can also use the integrated logging of the package itself.

configure logs

If you use the package deployment model, this is the only logging option there's available.

To get started, check out these tips:

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Koen Verbeeck Koen Verbeeck is a seasoned business intelligence consultant at AE. He has over a decade of experience with the Microsoft Data Platform in numerous industries. He holds several certifications and is a prolific writer contributing content about SSIS, ADF, SSAS, SSRS, MDS, Power BI, Snowflake and Azure services. He has spoken at PASS, SQLBits, dataMinds Connect and delivers webinars on MSSQLTips.com. Koen has been awarded the Microsoft MVP data platform award for many years.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2021-04-05

Comments For This Article

















get free sql tips
agree to terms