Options to Improve SQL Server Bulk Load Performance

By: Ben Snaidero | Updated: 2014-12-31 | Comments (6) | Related: More > Import and Export

Problem

As a DBA I am always looking for ways to improve SQL Server performance. SQL Server bulk loading data while it seems like a pretty straightforward task does have some different options that can be implemented to both improve the speed at which data is loaded as well as decrease the total amount of resources used to perform the data load. This tip will look at a few of those options.

Solution

In this tip we are going to focus our performance test on options using the BULK INSERT T-SQL command. More information on other methods of doing bulk loads can be found in this tip on Minimally logging bulk load insert into SQL Server.

Sample Table Setup for Bulk Insert Testing

In order to perform this test we are going to need some sample data to load so let's first create a simple table with a few indexes on it and load some sample data into it. The T-SQL code to perform these tasks is below.

-- Create sample table and indexes
CREATE TABLE testtable ([col1] [int] NOT NULL primary key clustered,
                        [col2] [int] NULL,
                        [col3] [int] NULL,
                        [col4] [varchar](50) NULL); 

CREATE INDEX idx_testtable_col2 on testtable (col2 asc);
CREATE INDEX idx_testtable_col3 on testtable (col3 asc);

-- Load sample data into table
DECLARE @val INT
SELECT @val=1
WHILE @val < 5000000
BEGIN  
   INSERT INTO testtable (col1, col2, col3, col4) 
       VALUES (@val,@val % 10,@val,'TEST' + CAST(@val AS VARCHAR))
   SELECT @val=@val+1
END
GO

Since in most bulk load cases you are pulling in data from a flat file let's use the bcp command to move the data we created above out to the file system. To perform this task we will also require a format file. The commands to do these steps are as follows.

-- Create a format file
bcp -S . testdb.dbo.testtable format nul -c -t, -f c:\testtable_formatfile.fmt -T

-- Create sample data file
bcp testdb.dbo.testtable out c:\testtable.dat -c -t, -f c:\testtable_formatfile.fmt -T -S .

Now we just have one last step to perform in order to complete our preparation. We need to empty the table that is in our database so we can reload data into it and to do this we will use the TRUNCATE command. Note: You will need to perform this step before testing each load scenario.

TRUNCATE TABLE testdb.dbo.testtable

Test Scenarios for SQL Server Bulk Insert

For this performance test we will look at the following 4 scenarios. Each scenario builds on the previous by adding a new option which will hopefully speed up performance.

BULK load
BULK load with tablock
BULK load with tablock and drop/recreate indexes
BULK load with tablock and drop/recreate indexes and change recovery model

The following is the code to perform each scenario. Remember as noted above that you will need to TRUNCATE the target table before executing each scenario.

--BULK load
BULK INSERT testdb.dbo.testtable
FROM 'C:\testtable.dat'
WITH (FORMATFILE = 'C:\testtable_formatfile.fmt');


--BULK load with tablock
BULK INSERT testdb.dbo.testtable
FROM 'C:\testtable.dat'
WITH (FORMATFILE = 'C:\testtable_formatfile.fmt', TABLOCK);


--BULK load with tablock and drop/recreate indexes
DROP INDEX testtable.idx_testtable_col2;
DROP INDEX testtable.idx_testtable_col3;

BULK INSERT testdb.dbo.testtable
FROM 'C:\testtable.dat'
WITH (FORMATFILE = 'C:\testtable_formatfile.fmt', TABLOCK);

CREATE INDEX idx_testtable_col2 on testtable (col2 asc);
CREATE INDEX idx_testtable_col3 on testtable (col3 asc);


--BULK load with tablock and drop/recreate indexes and change recovery model
ALTER DATABASE testdb SET RECOVERY SIMPLE;

DROP INDEX testtable.idx_testtable_col2;
DROP INDEX testtable.idx_testtable_col3;

BULK INSERT testdb.dbo.testtable
FROM 'C:\testtable.dat'
WITH (FORMATFILE = 'C:\testtable_formatfile.fmt', TABLOCK);

CREATE INDEX idx_testtable_col2 on testtable (col2 asc);
CREATE INDEX idx_testtable_col3 on testtable (col3 asc);
 
ALTER DATABASE testdb SET RECOVERY FULL;

Test Results

Let's take a look at the results after running each scenario. You can capture your statistics using either SQL Profiler or SET STATISTICS. I used SQL Profiler in this case.

Test Scenario	CPU (ms)	Reads	Writes	Duration (ms)
1	52291	75138341	57704	142978
2	30763	15165138	57572	96844
3	20763	195589	39629	67510
4	53006	195916	39651	62766

Looking at the results above as you would expect the performance gets better with each scenario. The one interesting thing to note (and I tested this multiple times to make sure it was accurate) is that although the last scenario completed faster than any of the other options it did use just as much CPU as the original option and more than options 2 and 3.

Next Steps

Perform similar performance tests using other bulk load methods - SSIS, bcp, etc...
Read more on Other options for importing data into SQL Server

About the author

Ben Snaidero has been a SQL Server and Oracle DBA for over 10 years and focuses on performance tuning.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Article Last Updated: 2014-12-31

Thursday, January 31, 2019 - 9:38:06 AM - David Plaut	Back To Top (78924)
I have found it helpful to disable and rebuild indexes rather than drop/recreate.

Wednesday, January 14, 2015 - 12:41:32 PM - JeremyH	Back To Top (35941)
Any theory as to why the writes didn't really go for test scenario 4? I would expect a lot less writes to the t-log since it should have been minimally logged. Were you only capturing data file writes/reads?

Wednesday, January 14, 2015 - 11:05:53 AM - logitestus	Back To Top (35939)
I noticed that you set the recovery model to SIMPLE why not to BULK_LOGGED?