By: Tim Smith | Updated: 2018-02-19 | Comments (1) | Related: More > Import and Export
Problem
We’re new to the bulk insert command in T-SQL, and we have files that specify new rows and columns without using a common delimiter, like commas or vertical bars. We also sometimes need to bulk insert data from a file without specifying any row or column delimiters. Can we use this command in these cases, or should we come up with a dynamic direct insert by wrapping the data in apostrophes?
Solution
The bulk insert command in SQL Server allows us to configure how we want to parse data from files to fit our data schema. It also provides options to skip a first or last row, in case we receive files whose first or last line or two contain output that is meaningless to us or that does not fit our data structure.
In this tip, we’ll look at a few examples with file formats we’ll seldom see and show how useful this tool can be in SQL Server for obtaining data from delimited files.
Text Files for Input
We’ll create three text files that we will be inserting into a table.
In the first file, we’ll create three lines with the below information. This will be saved as tsp1.txt.
1t2t3t4t5t
0t2t4t6t8t
9t7t5t3t1t
In the second file, we’ll create the file with the below one line. This will be saved as tsp2.txt.
11.13.15.17.19,10.10.12.10.10,25.35.45.55.65
Finally, we’ll create the third file with a block of text in the form of a sentence pasted together at least ten times (you can copy and paste more if you want to test more data). This will be saved as tsp3.txt.
SQL Tables for Data Storage
Now, we’ll create two tables – the first table we’ll use on the first two files and the second table on the last file. The first table will have five tiny integer columns, matching the two files of five integers separated by different characters. These files also separate new rows differently – one file separates new rows with a new line character while the other separates them with a comma. In the first table, each column is named Column plus its ordinal number. In the second table, we have only one column allowing the maximum number of varchar characters.
CREATE TABLE tbIns1(
 ColumnOne TINYINT,
 ColumnTwo TINYINT,
 ColumnThree TINYINT,
 ColumnFour TINYINT,
 ColumnFive TINYINT
)

CREATE TABLE tbIns2(
 ColumnOne VARCHAR(MAX)
)
Using Bulk Insert
With these tables created, we will now insert the first file's data into the first table with five tiny integer columns. We specify that the field terminator is a t and the row terminator is a new line character (0x0a is the hexadecimal code for the line feed character).
BULK INSERT tbIns1
FROM 'E:\ETLFiles\tsp1.txt'
WITH (
 FIELDTERMINATOR = 't'
 ,ROWTERMINATOR = '0x0a'
)

SELECT * FROM tbIns1
We see that the bulk insert code inserted the correct number of rows and columns of data – five columns of tiny integers across three rows. Without specifying a first row of data, T-SQL defaults to starting the insert at the first row. We’ll run the below code, truncating the table first and then specifying the starting row only as an example – note how it skips the first row. This is useful if we have character-separated files with headers in the first row.
TRUNCATE TABLE tbIns1

BULK INSERT tbIns1
FROM 'E:\ETLFiles\tsp1.txt'
WITH (
 FIELDTERMINATOR = 't'
 ,ROWTERMINATOR = '0x0a'
 ,FIRSTROW = 2
)

SELECT * FROM tbIns1
Without specifying the first row, T-SQL will default to inserting everything from the file. This matters because we can specify both the first and last row for files that have header or end-of-file lines with data we want to skip (lines that may throw an error if they don’t match our data schema).
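Both options can appear in one statement. As a sketch of this – assuming a hypothetical version of tsp1.txt that carries both a header line and a trailing footer line we want to skip (the row counts here are illustrative, not from the files above) – we could write:

```sql
-- Hypothetical example: FIRSTROW = 2 skips a header line, and LASTROW = 4
-- stops the insert after the fourth line, leaving out anything after it
-- (such as an end-of-file summary line).
BULK INSERT tbIns1
FROM 'E:\ETLFiles\tsp1.txt'
WITH (
 FIELDTERMINATOR = 't'
 ,ROWTERMINATOR = '0x0a'
 ,FIRSTROW = 2
 ,LASTROW = 4
)
```

If the footer line does not match our column schema, skipping it this way avoids the error we would otherwise see on the insert.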
In the next example, we’ll truncate our table and insert the second file. The second file does not mark new rows with new lines; each row of data is terminated by a comma (so a comma separates the rows), while new columns appear after a period. While these types of files are rare, we will sometimes run into files where a new line is not the row terminator (the end of the row).
TRUNCATE TABLE tbIns1

BULK INSERT tbIns1
FROM 'E:\ETLFiles\tsp2.txt'
WITH (
 FIELDTERMINATOR = '.'
 ,ROWTERMINATOR = ','
)

SELECT * FROM tbIns1
In the final example, we’ll insert the ten sentences into the table with the single maximum-length varchar column. Notice that we don’t specify any delimiter, as we want to insert all the data into one column. If we had a delimiter, like the above examples, we would specify it. In this case, we want all the data in one row, so we specify nothing, and all the data is inserted into that row:
BULK INSERT tbIns2
FROM 'E:\ETLFiles\tsp3.txt'

SELECT * FROM tbIns2
When inserting data and thinking about what separates columns and rows, it’s important to remember that with bulk insert the FIELDTERMINATOR option specifies what separates the columns (also known as fields), while ROWTERMINATOR separates the rows. In the third example, we didn’t separate any rows – in rare cases, we may insert a large block of data that we intend to parse later, or that is designed for output through an application in its existing form (think of a disclaimer or a block of HTML code).
Next Steps
- The most common delimiters in separated files are commas and vertical bars. In rare cases with numeric data, we may see alpha characters separating values because commas appear within the decimal values and must be retained within the column.
- We may run into situations where extra separators appear erroneously, causing our insert to fail. In this tip, we look at a way to find these lines ahead of time, before inserting the data, and remove them. We can choose to investigate them further, or we may discover these are erroneous values.
- Let’s consider that bulk insert is one tool we can use for ETL involving file extraction. We can also use tools like SSIS and C# (or .NET with PowerShell) to extract file data.
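As a sketch of the first point above – assuming a hypothetical file E:\ETLFiles\decimals.txt and staging table tbDecimals (both names are assumptions, not from this tip) – a file of European-formatted numbers, where the comma is the decimal separator, might use a letter as the field terminator instead:

```sql
-- Hypothetical file decimals.txt, one line per row, with 'x' separating fields
-- because commas belong inside the values:
--   1,5x2,25x3,75
-- Assumed staging table uses varchar columns so the decimal commas survive
-- the load untouched:
CREATE TABLE tbDecimals(
 ColumnOne VARCHAR(20),
 ColumnTwo VARCHAR(20),
 ColumnThree VARCHAR(20)
)

BULK INSERT tbDecimals
FROM 'E:\ETLFiles\decimals.txt'
WITH (
 FIELDTERMINATOR = 'x'
 ,ROWTERMINATOR = '0x0a'
)
```

Once staged as varchar, the values can be converted to a numeric type in a follow-up step that replaces the decimal comma with a period.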