By: Nat Sundar | Updated: 2017-02-14 | Comments | Related: More > Integration Services Control Flow Transformations
Problem
I have a requirement to search for multiple file extensions in a folder. However, in SQL Server Integration Services (SSIS) there is a limitation that the "Foreach File Enumerator" can filter only one type of file. Are there any workarounds to search for multiple file extensions?
Solution
Unfortunately it's a limitation of SSIS and the "Foreach File Enumerator" supports only one type of file.
However this limitation can be overcome with a simple script task and a few lines of C# or VB.NET code. In this tip, I will walk through the solution in detail.
Solution Overview
I developed a simple package to explain the solution in detail. This package has a script task and this script task will loop thru the given folder for multiple file extensions. In addition, the script task will build an array to list the identified files. The details of the files will be stored as an object variable.
A "Foreach Loop" container has been created with "variable Enumerator". This will help us loop thru each lookup file. To showcase the functionality, I have simply added a script task to print the name of the file. So when you run this package, it is expected to search for files with many extensions and print the full path and name.
Solution Details
The below picture represents the package structure at a very high level.
The below variables have been defined in the package:
- DataFolder - Location of the folder where the SSIS package will look for files
- FileFilters - Multiple file extensions or patterns separated by a pipe symbol
- FileList - Internal variable used by the SSIS package to store the list of files (object variable)
- FileName - Internal variable used by the SSIS package to store the value of each file name
SSIS Script Task
The script task will search for the files in the data folder based on the file filters. The variables "DataFolder" and "FileFilters" have been defined as ReadOnly variables for the script task. The script task will search for files and store the list of files in the object variable "FileList". Hence that variable "FileList" has been defined as a ReadWrite variable.
The below picture represents the variable configurations for the script task.
Code Inside the SSIS Script Task
It is mandatory to refer to the below namespaces. This will help us to access the directory and file related functions with ease.
using System.Collections; using System.IO;
The below represents the actual code inside the script task. The code has been written using C#. The same functionality can be achieved by using VB.NET.
// TODO: Add your code here string DataFolder, FileFilters; DataFolder = Dts.Variables["User::DataFolder"].Value.ToString(); FileFilters = Dts.Variables["User::FileFilters"].Value.ToString(); // ArrayList will store multiple file names ArrayList LookupFiles = new ArrayList(); // Delimited file filteres will be extracted and stored as a individual item in a string array string[] strFileFilters = FileFilters.Split('|'); // for each filter find matching file names foreach (string FileFilter in strFileFilters) { // Adding found files in the array list for each file filter LookupFiles.AddRange(Directory.GetFiles(DataFolder, FileFilter, System.IO.Searchoption.AllDirectories)); } //copy the values of multiple files to the object variable. Dts.Variables["User::FileList"].value = LookupFiles; Dts.TaskResult = (int)ScriptResults.Success;
Here is a screenshot of the code as well.
In the first section of the code we are storing the values of the DataFolder and the FileFilters to local variables. As the fileFilter has multiple extensions, all the individual items have to be stored as an array using the split method. From this array the individual items can be looped thru using a foreach loop. Now the individual files can be searched using the Directory.GetFiles method. This method accepts a folder location, file pattern and an option as parameters. This method returns the available files as a string array. This string array can be added to an ArrayList. Finally the array list can be stored in the SSIS object variable. In our example the array list will be stored in the FileList object variable.
Foreach From Variable Enumerator
Now a "Foreach Container" has been used to extract the name of the individual file from the FileList object variable. The enumerator has been set as "Foreach From Variable Enumerator". The object variable FileList has been set in the Enumerator configuration. The "FileName" variable has been set in the variable mappings section. This foreach container will loop thru each item in the FileList object variable and assign the individual file name to the FileName variable.
The below pictures represent the configurations for the "Foreach Loop" container.
Script Task Inside the Foreach Container
A script task has been placed inside the foreach container to print the name of the individual file. The variable FileName will be passed to the script task. The script task will be able to print the filename with the full path using the messagebox.show function.
The below picture represents the configurations for the script task (Inside the Foreach Loop).
The below picture represents the code inside the script task to print the file name.
Execution Results
Now the package can be executed to see the results. The below pictures confirm the solution has correctly identified the files with csv and txt extensions in the given folder.
Summary
I recommend trying this option in your development environment with a reasonable number of data files. If satisfactory, then deploy and test the solution in a test environment.
Next Steps
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2017-02-14