By: Nat Sundar | Updated: 2018-03-23 | Comments (8) | Related: More > Integration Services Development
Problem
I would like to import JSON data using SQL Server Integration Services. Are there any default control flow tasks and data flow source adapters available to manage JSON data files in SSIS?
Solution
SQL Server Integration Services does not support JSON natively, so we need to develop a custom source for a data flow task to load JSON file. In this tip, I will walkthrough a method to develop a bespoke source to load JSON files using .Net libraries.
Basic JSON Format
A JSON document must have an object or an array at its root. A JSON object is represented by {} and a JSON array is represented by []. A JSON object can contain another JSON object, a JSON array or string.
The JSON property details will be represented as key/value pairs. These key value sets are separated by using a colon “:” and multiple sets are separated using a comma, “,”.
Understanding JSON file format
The first important step in the process is to understand the format of the JSON. If you are using Notepad++, then the JSON plugins are a good starting point to learn about different JSON formats.
You can install the JSON viewer from here and install JSToolAPP from here. The JSON viewer plugin will help you to understand the format of the JSON. The second plugin JSToolApp will help you to format the JSON. In this tip, I will use Notepad++ and these plugins to showcase the example.
Sample JSON file
Let us assume that we have been supplied a JSON file for orders. This JSON file contain two order details and each order has the details such as order id, customer id and order status.
Now let us open this file in the Notepad++ and select the data contents and press Ctrl + Alt+ Shift + J. This will open the JSON viewer and will display the JSON object model.
From the JSON viewer, it is evident that the JSON file has an array of objects. Each object represents an order. The object attributes are representing the details of an order.
It is observed that the supplied JSON file has an object array within the JSON document. An order has three attributes namely OrderID, CustomerID and OrderStatus.
Developing Custom Data Source for JSON File
As Microsoft has not supplied the default data source for JSON files, we have to develop a custom/bespoke data source using the script component. In this tip, we will use a script component to build a data source for JSON.
Let's add a script component to the data flow task. This script component will read the JSON file and generate the output records with the help of .Net libraries. So, let's configure this script component as a source.
As a first step, let's prepare the script component to generate an order dataset. Let's add the order attributes (OrderID, CustomerID & OrderStatus) as output columns with suitable datatypes as mentioned in the image below.
Now we are ready to write C# code to read the JSON file. We will be using the function in the .Net library System.Web.Extension to read the JSON file. So, to access the function, we need to add this library as a reference. This can be achieved by selecting the project and by right clicking on Add Reference.
In the dialog box, under the .NET tab, look for "System.Web.Extension" library and select it to add it as a reference.
Once done, the library will be visible under the reference section.
To enable us to use the library, we need to add that in the namespace region as shown in the below image.
using System; using System.Data; using Microsoft.SqlServer.Dts.Pipeline.Wrapper; using Microsoft.SqlServer.Dts.Runtime.Wrapper; using System.Collections.Generic; using System.Text; using System.Web.Script.Serialization; using System.IO; using OrderNamespace;
Before we start coding on the C#, let us learn some basics.
Deserialization
Deserialization is a process that helps to transform JSON document to a runtime object. Once the data is available as a runtime object, then it can be parsed by using the .Net libraries.
Now we need to read the JSON file content and deserialize it to convert into runtime object.
Creating an Order Class
We need to create object that can hold the JSON content. So, let's create a class in the C#. This class must have the same structure and properties as the JSON content.
A C# class can be created by selecting the project and click Add > Select Class as mentioned in the image below.
Type the name of the class as "Order".
Now we need to rename the namespace as "OrderNamespace" in the class as mentioned below. Also, we need to create an Order class with three properties (OrderID, CustomerID and OrderStatus).
using System; using System.Collections.Generic; using System.Linq; using System.Text; namespace OrderNamespace { public class Order { public int orderid { get; set; } public int customerID { get; set; } public string orderstatus { get; set; } } }
The name of the properties has to match the name of the attributes mentioned in the JSON file.
Let's assume that the Orders.JSON file is available in the local folder and let's read the file content into a string using File.ReadallText function. The JavaScriptSeralizaer is an object defined in the System.Web.extension assembly that will be used to deserialize the JSON file content. So, create an instance of the object JavaScriptSerializer. The "Deserialize" function in the JavaScriptSerializer object will deserialize and return a runtime object of type "Order". As the Orders JSON file contains an array the Deserialize function will return a List of type "Order".
Once the JSON file content has been read, the list can be iterated through using a for each loop as shown in the picture below.
As there are two order details in the JSON file, the loop will iterate through two times. In the for each loop the order attributes can be read and provide the output to output buffer.
public override void CreateNewOutputRows() { /* Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer". For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput". */ String jsonFileContent = File.ReadAllText(@"E:\WorkArea\Analysis\JSON\Data\Orders.JSON"); JavaScriptSerializer js = new JavaScriptSerializer(); List<Order> orders = js.Deserialize<List<Order>>(jsonFileContent); foreach (Order order in orders) { Output0Buffer.AddRow(); Output0Buffer.OrderID = order.orderid; Output0Buffer.CustomerID = order.customerID; Output0Buffer.OrderStatus = order.orderstatus; } }
Let's add a union all after the script component and add a data viewer to see the actual data. Now let's execute the package and you can see there are two order details available in the data pipeline.
Summary
In this tip, we have learned about importing JSON data using SQL Server Integration Services. Also, we have learned about deserializing JSON content into a JSON runtime object.
Next Steps
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2018-03-23