SQL Server XML Bulk Loading Example

By:   |   Updated: 2010-03-05   |   Comments (4)   |   Related: 1 | 2 | 3 | 4 | > XML


Problem

The institution I work with receives data files in a variety of formats for import into a central SQL Server 2000 repository. We have a good handle on importing the files; however, some of the providers want to send data using XML files. Is there a simple process I can create to import these files?

Solution

XML is one of the most common mechanisms for data transfer, so it's no surprise that XML files are being used. While being flexible in how it can be developed, it is rigid in its acceptance of data on the receiving end. Thankfully there is a method available to assist with bulk importing XML files-that method is using SQLXMLBulkLoad. SQLXMLBulkLoad is part of SQLXML, which was originally created after the release of SQL Server 2000 to provide a bridge between SQL Server and XML data. In concept bulk loading XML files is fairly simple, although you must be familiar with the structure of XML files, XML schema language, and Visual Basic Scripting. Let's review the steps required to develop an XML Bulk Load strategy.

Get To Know The XML Files
You no doubt already know the structure of your database, as well as the format of the files you've previously uploaded. XML files follow a format similar to HTML in that data is enclosed within a hierarchical tag structure. Since XML technically stands for "eXtensible Markup Language", tags can be named pretty much whatever you want them to be named. But both the file creator and file recipient must use, and be looking for, consistent tag naming. In order for an XML file to be readable it must be what is called "well-formed". That means:

  • Although not required, there is usually a declaration in the first part indicating the XML version and encoding
  • There must be a ROOT tag that encloses the dataset. This tag can simply be called "ROOT" or other meaningful name
  • There are nodes that can be one level or more
  • Each tag must have an opening and closing tag. How you provide the opening and closing tag is up to you and the file creator

As an example we will import an XML file consisting of simple computer inventory data into a SQL Server instance. To house the information we create the following table and XML file:

CREATE TABLE dbo.tbl_ComputerInventory
(
     ComputerName VARCHAR(50)
     , IPAddress VARCHAR(15)
     , MACAddress VARCHAR(17)
     , ComputerType VARCHAR(20)
     , ModelNumber VARCHAR(20)
)
Click here for a sample XML file for the table above

Develop The XML Schema Definition file
The XML Schema Definition file (XSD) is simply a blueprint of the XML file structure and is recommended by the World Wide Web Consortium (W3C) as a standard for defining XML files. It defines the expectations of both the preparer and recipient, so an attempt to validate an improperly formatted XML file against a XSD file will result in failure. This applies to both tag structure as well as data incompatibilities (i.e., an alphabetic character in an integer field).

If you are familiar with XML formats you can create the schema in Notepad; if not you can use XML Notepad 2007 to create the schema file. In XML schema files there are two types of data types: complex and simple. To simplify the concept there is only one difference between simple and complex types: simple types cannot have child elements or attributes, whereas complex types can. In cases where you're bulk-loading XML data you will use the complex type. Also in the file is a declaration that provides information on how the XML data is related to the SQL database. Each document instance is then mapped to a table using the "sql:relation" phrase. In this case the "InventoryItem" document instance is mapped to the "tbl_ComputerInventory" table. Click here for a sample XML Schema Definition file based on the XML files above.

Using Visual Basic Script to Create the SQLXMLBulkLoad Object
Now that we have the XML file to test and the XML Schema file created, the next step is to create the Visual Basic script to bulk load the files. This Visual Basic script can be either a .vbs file or an ActiveX Script Task in a DTS package. The first section of the script is creation of a SQLXMLBulkLoad object. Within the SQLXMLBulkLoad object are a number of parameters that can be set-some mandatory, others optional, depending on the needs of your organization.


set objBL = CreateObject("SQLXMLBulkLoad.SQLXMLBulkload.4.0") objBL.ConnectionString = "provider=SQLOLEDB;data source=CULLENWS01\PRD2008;database=MSSQLTIPS;integrated security=SSPI"
objBL.KeepIdentity=False
objBL.Transaction=True
objBL.ErrorLogFile = "F:\xml\error.xml"
objBL.Execute "F:\xml\ComputerInventorySchema.xml", "F:\xml\ComputerInventory02.xml"
set objBL=Nothing
MsgBox "The SQLXMLBulkLoad process completed successfully",0,"XML Bulk Load Process"

In our case we created a DTS package that is executed by a SQL Server Agent job on a regular basis. The package actually loops each file in the network share and attempts the bulk load on each one.

One interesting thing to note is the dataset below. The first time I executed the bulk load statement, the IPAddress and MACAddress were not imported. Now, no error was thrown during the process, but the bulk process ignored the columns because IPAddress and MACAddress were spelled IPaddress and MACaddress. Once corrected to the actual case sensitive spelling, the import populated all of the information. Yet another reminder that XML is case-sensitive and to keep that in mind when creating the XML Schema Definition file. Also, there is another method for formatting XML files (click here for a sample) where all tags are enclosed in a single line. Although this file format is technically "properly formed", the XML Schema Definition file could not map the information to their respective columns. The bulk load process, however, did not fail-when a query was executed the rows that could not be mapped were NULL.

During testing with live data, I found that the process works for both local files and files located on network shares. For an XML file with over 10,700 rows, and four columns in each row, the average execution time was 1.8 seconds for local files and around 2.6 seconds for files located on network shares.

View of data imported using SQLXMLBulkLoad
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Tim Cullen Tim Cullen has been working in the IT industry since 2003 and currently works as a SQL Server Reports Developer.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2010-03-05

Comments For This Article




Tuesday, February 21, 2012 - 4:48:42 PM - Linda Leslie Back To Top (16120)

We pull the file contents into a varchar(max) field and select and pass to a stored procedure to import the XML since we usually just load scratch tables first and then add/update/delete the active application table. No SSIS and not VB Script. Just SQL.
The file location is stored in a field in our media table when the file is uploaded.

Thanks,
Linda

 


Friday, March 5, 2010 - 11:09:19 AM - timothyrcullen Back To Top (5015)

Hey Mike:

Just out of curiosity, how long does it take for your process to import XML?

Tim


Friday, March 5, 2010 - 9:27:30 AM - mdv3441 Back To Top (5014)

Although a little non-SQL Server, we use MS ACCESS 2007 to import XML then import the ACCESS table, so much easier, XML is a native import.  a simple XML datasource should be added to SQL Server  like it handles a flat file.

 

Mike


Friday, March 5, 2010 - 7:15:39 AM - ChadMiller Back To Top (5011)
BTW -- You can also use Powershell instead of VBScript. I have an example in this article http://www.sqlservercentral.com/articles/powershell/65196/














get free sql tips
agree to terms