SQL Server Integration Services package to delete files from a Network or Local path based on date

By:   |   Updated: 2013-06-13   |   Comments (10)   |   Related: More > Integration Services Development


Problem

We have a requirement to delete a group of files that are older than the specified number of days from the company file share. This file share stores sensitive clients extracts (produced by DBAs as part of client's SQL Server Agent extract jobs), database backups, scanned emails, etc. Moreover, due to the complex folder hierarchy and delicate nature of the data stored in these files, this task has to be originated from SQL Server. However, due to company security policy, and based on SQL Server security best practices, we blocked access to OLE Automation stored procedures, CLR features, and xp_cmdshell.  Is there any way to accomplish this task without using these features?  Check out this tip to learn more.

Solution

Well, this requirement can be fulfilled quiet easily by using the following SSIS toolbox components: Script Task, Foreach Loop container, and the File System Task. The following are the steps, which you can follow to accomplish this task using SSIS.

Steps to Build the SSIS Package for File Deletion

  • Launch the SQL Server Data Tools, then from the File menu, choose New, and click Project. This will launch the New Project dialog box.
  • In the New Project dialog box, select Integration Services Project from the templates pane. In the Name box, change the default name to DeleteRedundantFiles. Also, specify the project location, and once properly configured, click OK to create this new SSIS project.
  • By default, SQL Server Integration Service Project creates an empty package, named Package.dtsx, which is added to your project.  In the Solution Explorer, right-click on this package and rename it to DeleteRedundantFilesFromNetwork.dtsx.
using the following SSIS toolbox components

1: Defining Package Variables in SSIS

Next, add the following five variables to your SSIS package:

Variable Name: varFileAgeLimit
Variable Data Type: Int32
Description: This variable stores the age limit variable for the file. This variable is used within the Script Task, and accepts both positive and negative values. When a negative value is assigned to this variable, the code within the Script Task searches for files that are older than a specified number of days.  When the value is positive, the code within the Script Task searches the files that are newer than specified number of days.

Variable Name: varFileFQN
Variable Data Type: String
Description: This variable stores the file name with the path. For example: C:\DataFolder\Text.txt or \\MyServer\MyDataFolder\Test.txt.

Variable Name: varFilePattern
Variable Data Type: String
Description: The purpose of this variable is to store file name pattern. For example: *.* or *.bak or DB1_Extract.dat

Variable Name: varNetworkPath
Variable Data Type: String
Description: Stores either the Network location or the Local path, which is the code within the Script Task that will be used to search files in that location and within child directories of this location. For example: C:\MyLocation or \\MyServer\MyLocation.

Variable Name: varFileList
Variable Data Type: Object
Description: Stores the list of files that will be deleted.

2: Defining Package Tasks

Next, add and configure the following SSIS package tasks:

2.1: Configuring "Get File List - ST" Script Task

The first task you need for this solution is the Script Task. To add it to your package, simply drag it from SSIS Toolbox to design surface of the Control Flow tab. Now, on the Control Flow design surface, right-click on the newly added Script Task, then click Rename, and change the name from default name to Get File List - ST

Next, double-click the Script Task, to open the Script Task Editor, and then configure the following properties:

  • Set the ScriptLanguage property to Microsoft Visual Studio C# 2010.
  • Add the varFileAgeLimit, varFilePattern and varNetworkPath package variables in the Script task's ReadOnlyVariables property.
  • Add the varFileList package variable in the Script task's ReadWriteVariables property.
  • Next, click the Edit Script button on the Script Task Editor page. This will open the VSTA integrated development environment (IDE) window. Copy the code below because this will search and return the list of files that are older than a specified number of days in a specified location, and within the child directories of this specified location. Once done, click the OK button on the Script Task Editor to save changes to the Script Task.
 double-click Script Task, to open Script Task Editor

Microsoft Visual Studio C# 2010 Code to Delete Files in the File System

#region Help:  Introduction to the script task
/* The Script Task search files in the specified location and within child directories of the specified location */
#endregion
#region Namespaces
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO; // Import for Directory class
using System.Collections; // Import for ArrayList class
#endregion
namespace ST_e10d8186ebb34debbc31bb734b0e29a5
{
[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : 
 Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
   private string NETWORK_PATH;
   private string FILE_PATTREN;
   private bool isCheckForNewer = true;
   int fileAgeLimit;
   private ArrayList listForEnumerator = new ArrayList();
   private void GetFilesInFolder(string folderPath)
   {
      string[] AllFiles;
      DateTime fileChangeDate;
      TimeSpan fileAge;
      int fileAgeInDays;
      try
      {
         AllFiles = Directory.GetFiles(folderPath, FILE_PATTREN);
         foreach (string fileName in AllFiles)
            {
               fileChangeDate = File.GetLastWriteTime(fileName);
               fileAge = DateTime.Now.Subtract(fileChangeDate);
               fileAgeInDays = fileAge.Days;
               CheckAgeOfFile(fileName, fileAgeInDays); 
            }
         if (Directory.GetDirectories(folderPath).Length > 0)
            {
              foreach (string childFolder in Directory.GetDirectories(folderPath))
               {
                  GetFilesInFolder(childFolder);
               }
            }
      }
      catch (Exception e)
      {
        System.Windows.Forms.MessageBox.Show("Exception caught: " + e.ToString(), "Results",
            MessageBoxButtons.OK, MessageBoxIcon.Error);
      }
   }
   private void CheckAgeOfFile(string fileName, int fileAgeInDays)
   {
     if (isCheckForNewer)
      {
        if (fileAgeInDays <= fileAgeLimit)
          {
             listForEnumerator.Add(fileName);
          }
      }
      else
      {
        if (fileAgeInDays > fileAgeLimit)
          {
              listForEnumerator.Add(fileName);
          }
      }
   }
   public void Main()
   {
     // Initializing class variables with package variables
     fileAgeLimit = (int)(Dts.Variables["User::varFileAgeLimit"].Value);
     NETWORK_PATH = (string)(Dts.Variables["User::varNetworkPath"].Value);
     FILE_PATTREN = (string)(Dts.Variables["User::varFilePattern"].Value); ;
     if (fileAgeLimit < 0)
     {
       isCheckForNewer = false;
     }
     fileAgeLimit = Math.Abs(fileAgeLimit);
     GetFilesInFolder(NETWORK_PATH);
     // Return the list of files to the variable
     // for later use by the Foreach from Variable enumerator.
     Dts.Variables["User::varFileList"].Value = listForEnumerator;
     Dts.TaskResult = (int)ScriptResults.Success;
   }
   #region ScriptResults declaration
   enum ScriptResults
    {
      Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
      Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
    };
    #endregion
  }
}

2.2: Configuring "Get Individual File FQN - FLC" Foreach Loop container

The next task you need is the Foreach Loop container. To add it to your package, drag it from the SSIS Toolbox to design the surface of the Control Flow tab. Now, on the Control Flow design surface, right-click the Foreach Loop container, then click Rename, and change the name from default name to Get Individual File FQN - FLC

Next, double-click the Foreach Loop container, to open the Foreach Loop Editor. On left side of Foreach Loop Editor, click on the Collection, and then configure its properties:

  • Change Enumerator property to Foreach From Variable Enumerator.
  • Specify varFileList package variable as Enumerator configuration variable
Configuring

Now, click on Variable Mappings and select varFileFQN package variable to map to the collection value.

click on Variable Mappings and select varFileFQN package variable

Once done, connect the Script task (Get File List - ST) with the Foreach Loop container (Get Individual File FQN - FLC).

2.3: Configuring "Delete Files on Remote Directory - FST" File System Task

Finally, add the File System Task to the Foreach Loop container (Get Individual File FQN - FLC). To do that, simply drag it from the SSIS Toolbox to the design surface of the Loop container (Get Individual File FQN - FLC). Then, right-click on the newly added File System Task, then click Rename, and change the name from default name to Delete Files on Remote Directory - FST

Next, double-click on the File System Task, to open File System Task Editor, and then configure the following properties on General page:

  • Set Operation property to Delete file.
  • Set IsSourcePathVariable property to True.
  • Specify varFileFQN package variable as SourceVariable.
double-click File System Task, to open File System Task Editor

All done, our package is successfully configured, and it should look similar to figure below:

 our package is successfully configured

Testing

To test the package, simply assign values to the package variables, and then execute the package. For example, I specified the following values to package variables, to delete all files from \\JW02410\Temp\ shared folder that are older than 2 days.

assign values to package variables, and then execute the package

I specified the following values to package variables, to delete all files from \\JW02410\Temp\ shared folder that are older than 2 days

When I executed the package, it deleted all files from this location and within child directory of this location that are older than 2 days.

it deleted all files from this location

within child directory of this location that are older than 2 days
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Basit Farooq Basit Farooq is a Senior Database Administrator and has worked in the IT industry for 11+ years.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2013-06-13

Comments For This Article




Wednesday, August 10, 2022 - 6:21:08 AM - Jatin Back To Top (90368)
Hi Thanks for providing such a detail description of problem.

In case there is no file available then Object variable will be empty and no data will be passed to Foreach container, then in that case loop will fail. Any solution to that?

Wednesday, December 19, 2018 - 12:31:26 AM - Chenthil Back To Top (78525)

Why can't you delete the files inside the script task itself? 

Why we need FELC?


Wednesday, February 1, 2017 - 8:16:12 AM - pavithra Back To Top (45778)

 

 I want to get the file name list which are deleted,how can I achieve that.
Please advise


Monday, June 6, 2016 - 9:15:55 AM - Mark Back To Top (41617)

Excellent article thank-you for sharing...


Wednesday, November 12, 2014 - 10:24:10 AM - David Potter Back To Top (35267)

I have SQL Studio 2008, and SQL Server 2008 R2.. when I copy the code into "Get File List - ST" it is giving me "The binary code for the script is not found." Do you have a C# version that works with 2008?


Thursday, November 21, 2013 - 10:25:42 AM - Raj Back To Top (27571)

 

Getting the follwoing error. Could anyone help me to resolve.

 

File I wouldlike to delete  and typed inside varFileFQN : C:\Temp2\Detele.txt

ve-2E82.tmp
ve-4C04.tmp
ve-4C35.tmp

Error: The object in the variable "User::varFileList" does not contain an enumerator.

Error: The GetEnumerator method of the ForEach Enumerator has failed with error 0x80131509 "(null)". This occurs when the ForEach Enumerator cannot enumerate.


Friday, July 5, 2013 - 2:44:00 PM - Ravi Singh Back To Top (25730)

Getting error after setup:

Error: The object in the variable "User::varFileList" does not contain an enumerator.

Error: The GetEnumerator method of the ForEach Enumerator has failed with error 0x80131509 "(null)". This occurs when the ForEach Enumerator cannot enumerate.


Thursday, June 27, 2013 - 8:00:22 AM - Phil Hegedusich Back To Top (25594)

I've always felt that the prohibitions on xp_cmdshell, OLE and the like could be boiled down to "you're not allowed to perform file system operations from the SQL engine." So how would this approach be any different, or satisfy the "spirit of the law"? It's a specious argument that begins with "if an attacker were to gain access to the instance, they could use X to do Y." Sounds to me like there's a security problem, not a configuration problem.

I manage my file retention chores with a quiet .NET console app that runs from a Windows scheduled task. I don't see the need to involve the SQL engine here.


Tuesday, June 18, 2013 - 2:22:26 PM - Ray Back To Top (25472)

Excellent Article. Really very useful


Monday, June 17, 2013 - 5:29:01 PM - Scott Coleman Back To Top (25460)

If your goal is to delete files, and you're starting with a script task to identify the files, why wouldn't you just delete the file in the script rather than passing string collections around so you can end up at a File System Task?  Just because we can create programs in SSIS by wiring tasks together and filling in properties instead of writing some C# code, doesn't mean that is the correct approach.

First you should define the file age cutoff as a datetime value that can be directly compared to the file datetime attributes, and avoid calculating each file's age in days.  Then your innermost loop can be reduced to: (error handling removed for clarity)

    private void CheckFolder(DirectoryInfo folder)
    {
        foreach (DirectoryInfo subfolder in folder.EnumerateDirectories())
        {
            CheckFolder(subfolder);
        }
        foreach (FileInfo file in folder.EnumerateFiles(filePattern)) 
        {
            if ((file.LastWriteTime < ageLimit) ^ isCheckForNewer)
                file.Delete();
        }
    }
 














get free sql tips
agree to terms