By: Daniel Farina | Updated: 2015-01-20 | Comments (7) | Related: More > Integration Services Data Flow Transformations
Problem
In my previous tip, I introduced aspects to consider when importing data from a mainframe environment to SQL Server. We have seen how to handle EBCDIC coded files and COBOL’s numeric unpacked or zoned values. In this tip, I will show you how to handle packed numeric values.
Solution
Packed numbers are amongst the hardest data sets to import into a SQL Server database using Integration Services. When a text file contains packed numbers or any other kind of binary data we can’t handle its content as simple text anymore, we need to perform analysis about its content to see if, for example the file contains binary zeros or something else that would make our package end with an error. But as I told you in my previous tip, SSIS offers us the possibility to extend its functionality by adding custom .NET code.
What are Packed Numbers?
A packed number is a type of Binary Coded Decimal (BCD) number that holds two decimal digits per byte in contrary of the Zoned number representation that holds one digit per byte. For example, the number 48 can be represented into a byte as the hex number 0x48. Also, like zoned numbers, packed numbers can have implied decimal digits.
But just like with the zoned numbers, the less significant nibble of the first byte is used to hold the sign by using a hexadecimal C for positive numbers, a D for the negatives and an F for unsigned values. On the next image you can see a representation of the previous statement.
This allowed programmers to economize space by using half of the size for numerical representation. Considering that back in a day the cost for storage and memory was much higher than nowadays, it was a wise decision to implement packed numbers at the expense of increased code complexity.
The Binary Zero Problem
Suppose we have the number 40.06. This number when packed, will be represented as 0x04 0x00 0x6C. Notice that the second byte is a binary zero, also known as a NULL value, which is used as the string terminator. If you take a look at the SSIS documentation about Data Types for the DT_STR type, you will see that Integration Services truncates the string at the occurrence of the first NULL, so we won’t be able to handle this field as a string. Instead we have to use the DT_BYTES data type, but as you will see further on, it is not enough. We have to instruct the SSIS pipeline to handle the input data in binary format.
Setting up the Test Environment
First we are going to create a sample database.
USE [master] GO CREATE DATABASE [SampleDB] CONTAINMENT = NONE ON PRIMARY ( NAME = N'SampleDB_file1', FILENAME = N'E:\MSSQL\SampleDB_1.mdf', SIZE = 128MB , MAXSIZE = UNLIMITED, FILEGROWTH = 64MB) LOG ON ( NAME = N'SampleDB_log_file1', FILENAME = N'E:\MSSQL\SampleDB_1.ldf', SIZE = 64MB, MAXSIZE = 2048GB, FILEGROWTH = 32MB) GO
We need some sample data in a text file to load into our previously created table. On this link you can download a sample text file.
The next is close to what you will see in a COBOL’s Copy File (a file that contains the record definition) for the file of the previous step. I added a brief description about the data types just in case you don’t know about Cobol’s syntax.
**---------------------------------------------- **-- CUSTOMERS DATA **---------------------------------------------- 01 CUSTOMER-RECORD. 05 CUSTOMERNAME PIC X(50). 05 CUSTOMERADDRESS PIC X(50). 05 CUSTOMERCATEGORY PIC 9 COMP-3. 05 CUSTOMERBALANCE PIC 9(8)V99 COMP-3.
A suitable definition for a table to load this file is next.
USE SampleDB GO IF OBJECT_ID('Customers','U') IS NOT NULL BEGIN DROP TABLE Customers END GO CREATE TABLE Customers ( CustomerID INT IDENTITY (1, 1) NOT NULL, CustomerName VARCHAR (50) NOT NULL, CustomerAddress VARCHAR (50) NOT NULL, CustomerCategory TINYINT NOT NULL, CustomerBalance DECIMAL(10,2) NOT NULL PRIMARY KEY CLUSTERED (CustomerID) )
The next step is to create a SSIS project and add a Data Flow task, a Flat File Connection to the file created in the previous step and an OLEDB connection to our sample database as on the next image.
Then setup the flat file connection with a fixed width format, an ANSI code page and select {CR}{LF} as the row delimiter like on the next image.
On the Advanced view we are going to add four columns named “Name”, “Address”, “Category” and “Balance”.
According to the file’s record definition we will configure columns Name and Address with an input and output of 50 and select string as the Data Type. Also we must set the TextQualified option to False. For the Category and Balance fields we need to select DT_BYTES as the Data Type; and 1 and 6 for its lengths respectively. The next image should clarify the steps to follow.
Now we need to drop into our package a Flat File Source, an OLEDB Destination and a Script Component. After dropping the Script Component a window will pop up asking us how we want to use this Script Component. Since we are going to perform a transformation we check the Transformation radio button and click the OK button.
This step is crucial to successfully import this file. It is so important that a whole tip can be written about this. We are going to instruct the SSIS pipeline to handle the imported data for columns Category and Balance as binary data. To do so, just right click on the Flat File Source and select Show Advanced Editor from the drop down list.
After the Advanced Editor window opens, go to the Input and Output Properties Tab. Browse the Inputs and Outputs Tree by expanding the Flat File Source Output and the Output columns branches. Step into the Categories and Balance leaves and on the properties frame set to true the UseBinaryFormat property. The next image will clarify things. Setting this option to true will instruct the SSIS pipeline to handle the columns as binary data.
On the Script Tab select Visual Basic as the Script Language.
The next step is to set up the Inputs and Outputs tab. We have to add four columns into the Output folder tree; those columns will be in where we will return the transformed data to be inserted into our database.
After adding the columns we have to configure the output data types for each of them. According to our file definition, the data types for CustomerName and CustomerAddress are String with a length of 50. Also the data types for CustomerCategory and CustomerBalance are Single Byte Unsigned Integer and Numeric with Precision 10 and Scale 2 respectively. The next image may be used as a guide.
Now, on the Script Tab, press the Edit Script button. Another instance of Visual Studio will arise on which we will paste the next script code.
#Region "Imports" Imports System Imports System.Data Imports System.Math Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper Imports Microsoft.SqlServer.Dts.Runtime.Wrapper #End Region ' This is the class to which to add your code. Do not change the name, attributes, or parent ' of this class. <Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute()> _ <CLSCompliant(False)> _ Public Class ScriptMain Inherits UserComponent Private ASCIItoEBCDIC As String = Chr(&H0) + Chr(&H1) + Chr(&H2) + Chr(&H3) + Chr(&H37) + Chr(&H2D) + Chr(&H2E) + Chr(&H2F) + Chr(&H16) + Chr(&H5) + Chr(&H25) + Chr(&HB) + _ Chr(&HC) + Chr(&HD) + Chr(&HE) + Chr(&HF) + Chr(&H10) + Chr(&H11) + Chr(&H12) + Chr(&H13) + Chr(&H3C) + Chr(&H3D) + Chr(&H32) + Chr(&H26) + _ Chr(&H18) + Chr(&H19) + Chr(&H3F) + Chr(&H27) + Chr(&H1C) + Chr(&H1D) + Chr(&H1E) + Chr(&H1F) + Chr(&H40) + Chr(&H5A) + Chr(&H7F) + Chr(&H7B) + _ Chr(&H5B) + Chr(&H6C) + Chr(&H50) + Chr(&H7D) + Chr(&H4D) + Chr(&H5D) + Chr(&H5C) + Chr(&H4E) + Chr(&H6B) + Chr(&H60) + Chr(&H4B) + Chr(&H61) + _ Chr(&HF0) + Chr(&HF1) + Chr(&HF2) + Chr(&HF3) + Chr(&HF4) + Chr(&HF5) + Chr(&HF6) + Chr(&HF7) + Chr(&HF8) + Chr(&HF9) + Chr(&H7A) + Chr(&H5E) + _ Chr(&H4C) + Chr(&H7E) + Chr(&H6E) + Chr(&H6F) + Chr(&H7C) + Chr(&HC1) + Chr(&HC2) + Chr(&HC3) + Chr(&HC4) + Chr(&HC5) + Chr(&HC6) + Chr(&HC7) + _ Chr(&HC8) + Chr(&HC9) + Chr(&HD1) + Chr(&HD2) + Chr(&HD3) + Chr(&HD4) + Chr(&HD5) + Chr(&HD6) + Chr(&HD7) + Chr(&HD8) + Chr(&HD9) + Chr(&HE2) + _ Chr(&HE3) + Chr(&HE4) + Chr(&HE5) + Chr(&HE6) + Chr(&HE7) + Chr(&HE8) + Chr(&HE9) + Chr(&HAD) + Chr(&HE0) + Chr(&HBD) + Chr(&H5F) + Chr(&H6D) + _ Chr(&H79) + Chr(&H81) + Chr(&H82) + Chr(&H83) + Chr(&H84) + Chr(&H85) + Chr(&H86) + Chr(&H87) + Chr(&H88) + Chr(&H89) + Chr(&H91) + Chr(&H92) + _ Chr(&H93) + Chr(&H94) + Chr(&H95) + Chr(&H96) + Chr(&H97) + Chr(&H98) + Chr(&H99) + Chr(&HA2) + Chr(&HA3) + Chr(&HA4) + Chr(&HA5) + Chr(&HA6) + _ Chr(&HA7) + Chr(&HA8) + Chr(&HA9) + Chr(&HC0) + Chr(&H4F) + Chr(&HD0) + Chr(&HA1) + Chr(&H7) + Chr(&H20) + Chr(&H21) + Chr(&H22) + Chr(&H23) + _ Chr(&H24) + Chr(&H15) + Chr(&H6) + Chr(&H17) + Chr(&H28) + Chr(&H29) + Chr(&H2A) + Chr(&H2B) + Chr(&H2C) + Chr(&H9) + Chr(&HA) + Chr(&H1B) + _ Chr(&H30) + Chr(&H31) + Chr(&H1A) + Chr(&H33) + Chr(&H34) + Chr(&H35) + Chr(&H36) + Chr(&H8) + Chr(&H38) + Chr(&H39) + Chr(&H3A) + Chr(&H3B) + _ Chr(&H4) + Chr(&H14) + Chr(&H3E) + Chr(&HE1) + Chr(&H41) + Chr(&H42) + Chr(&H43) + Chr(&H44) + Chr(&H45) + Chr(&H46) + Chr(&H47) + Chr(&H48) + _ Chr(&H49) + Chr(&H51) + Chr(&H52) + Chr(&H53) + Chr(&H54) + Chr(&H55) + Chr(&H56) + Chr(&H57) + Chr(&H58) + Chr(&H59) + Chr(&H62) + Chr(&H63) + _ Chr(&H64) + Chr(&H65) + Chr(&H66) + Chr(&H67) + Chr(&H68) + Chr(&H69) + Chr(&H70) + Chr(&H71) + Chr(&H72) + Chr(&H73) + Chr(&H74) + Chr(&H75) + _ Chr(&H76) + Chr(&H77) + Chr(&H78) + Chr(&H80) + Chr(&H8A) + Chr(&H8B) + Chr(&H8C) + Chr(&H8D) + Chr(&H8E) + Chr(&H8F) + Chr(&H90) + Chr(&H9A) + _ Chr(&H9B) + Chr(&H9C) + Chr(&H9D) + Chr(&H9E) + Chr(&H9F) + Chr(&HA0) + Chr(&HAA) + Chr(&HAB) + Chr(&HAC) + Chr(&H4A) + Chr(&HAE) + Chr(&HAF) + _ Chr(&HB0) + Chr(&HB1) + Chr(&HB2) + Chr(&HB3) + Chr(&HB4) + Chr(&HB5) + Chr(&HB6) + Chr(&HB7) + Chr(&HB8) + Chr(&HB9) + Chr(&HBA) + Chr(&HBB) + _ Chr(&HBC) + Chr(&H6A) + Chr(&HBE) + Chr(&HBF) + Chr(&HCA) + Chr(&HCB) + Chr(&HCC) + Chr(&HCD) + Chr(&HCE) + Chr(&HCF) + Chr(&HDA) + Chr(&HDB) + _ Chr(&HDC) + Chr(&HDD) + Chr(&HDE) + Chr(&HDF) + Chr(&HEA) + Chr(&HEB) + Chr(&HEC) + Chr(&HED) + Chr(&HEE) + Chr(&HEF) + Chr(&HFA) + Chr(&HFB) + _ Chr(&HFC) + Chr(&HFD) + Chr(&HFE) + Chr(&HFF) Private EBCDICtoASCII As String = Chr(&H0&) + Chr(&H1&) + Chr(&H2&) + Chr(&H3&) + Chr(&H9C&) + Chr(&H9&) + Chr(&H86&) + Chr(&H7F&) + Chr(&H97&) + Chr(&H8D&) + Chr(&H8E&) + _ Chr(&HB&) + Chr(&HC&) + Chr(&HD&) + Chr(&HE&) + Chr(&HF&) + Chr(&H10&) + Chr(&H11&) + Chr(&H12&) + Chr(&H13&) + Chr(&H9D&) + Chr(&H85&) + _ Chr(&H8&) + Chr(&H87&) + Chr(&H18&) + Chr(&H19&) + Chr(&H92&) + Chr(&H8F&) + Chr(&H1C&) + Chr(&H1D&) + Chr(&H1E&) + Chr(&H1F&) + Chr(&H80&) + _ Chr(&H81&) + Chr(&H82&) + Chr(&H83&) + Chr(&H84&) + Chr(&HA&) + Chr(&H17&) + Chr(&H1B&) + Chr(&H88&) + Chr(&H89&) + Chr(&H8A&) + Chr(&H8B&) + _ Chr(&H8C&) + Chr(&H5&) + Chr(&H6&) + Chr(&H7&) + Chr(&H90&) + Chr(&H91&) + Chr(&H16&) + Chr(&H93&) + Chr(&H94&) + Chr(&H95&) + Chr(&H96&) + _ Chr(&H4&) + Chr(&H98&) + Chr(&H99&) + Chr(&H9A&) + Chr(&H9B&) + Chr(&H14&) + Chr(&H15&) + Chr(&H9E&) + Chr(&H1A&) + Chr(&H20&) + Chr(&HA0&) + _ Chr(&HA1&) + Chr(&HA2&) + Chr(&HA3&) + Chr(&HA4&) + Chr(&HA5&) + Chr(&HA6&) + Chr(&HA7&) + Chr(&HA8&) + Chr(&HD5&) + Chr(&H2E&) + Chr(&H3C&) + _ Chr(&H28&) + Chr(&H2B&) + Chr(&H7C&) + Chr(&H26&) + Chr(&HA9&) + Chr(&HAA&) + Chr(&HAB&) + Chr(&HAC&) + Chr(&HAD&) + Chr(&HAE&) + Chr(&HAF&) + _ Chr(&HB0&) + Chr(&HB1&) + Chr(&H21&) + Chr(&H24&) + Chr(&H2A&) + Chr(&H29&) + Chr(&H3B&) + Chr(&H5E&) + Chr(&H2D&) + Chr(&H2F&) + Chr(&HB2&) + _ Chr(&HB3&) + Chr(&HB4&) + Chr(&HB5&) + Chr(&HB6&) + Chr(&HB7&) + Chr(&HB8&) + Chr(&HB9&) + Chr(&HE5&) + Chr(&H2C&) + Chr(&H25&) + Chr(&H5F&) + _ Chr(&H3E&) + Chr(&H3F&) + Chr(&HBA&) + Chr(&HBB&) + Chr(&HBC&) + Chr(&HBD&) + Chr(&HBE&) + Chr(&HBF&) + Chr(&HC0&) + Chr(&HC1&) + Chr(&HC2&) + _ Chr(&H60&) + Chr(&H3A&) + Chr(&H23&) + Chr(&H40&) + Chr(&H27&) + Chr(&H3D&) + Chr(&H22&) + Chr(&HC3&) + Chr(&H61&) + Chr(&H62&) + Chr(&H63&) + _ Chr(&H64&) + Chr(&H65&) + Chr(&H66&) + Chr(&H67&) + Chr(&H68&) + Chr(&H69&) + Chr(&HC4&) + Chr(&HC5&) + Chr(&HC6&) + Chr(&HC7&) + Chr(&HC8&) + _ Chr(&HC9&) + Chr(&HCA&) + Chr(&H6A&) + Chr(&H6B&) + Chr(&H6C&) + Chr(&H6D&) + Chr(&H6E&) + Chr(&H6F&) + Chr(&H70&) + Chr(&H71&) + Chr(&H72&) + _ Chr(&HCB&) + Chr(&HCC&) + Chr(&HCD&) + Chr(&HCE&) + Chr(&HCF&) + Chr(&HD0&) + Chr(&HD1&) + Chr(&H7E&) + Chr(&H73&) + Chr(&H74&) + Chr(&H75&) + _ Chr(&H76&) + Chr(&H77&) + Chr(&H78&) + Chr(&H79&) + Chr(&H7A&) + Chr(&HD2&) + Chr(&HD3&) + Chr(&HD4&) + Chr(&H5B&) + Chr(&HD6&) + Chr(&HD7&) + _ Chr(&HD8&) + Chr(&HD9&) + Chr(&HDA&) + Chr(&HDB&) + Chr(&HDC&) + Chr(&HDD&) + Chr(&HDE&) + Chr(&HDF&) + Chr(&HE0&) + Chr(&HE1&) + Chr(&HE2&) + _ Chr(&HE3&) + Chr(&HE4&) + Chr(&H5D&) + Chr(&HE6&) + Chr(&HE7&) + Chr(&H7B&) + Chr(&H41&) + Chr(&H42&) + Chr(&H43&) + Chr(&H44&) + Chr(&H45&) + _ Chr(&H46&) + Chr(&H47&) + Chr(&H48&) + Chr(&H49&) + Chr(&HE8&) + Chr(&HE9&) + Chr(&HEA&) + Chr(&HEB&) + Chr(&HEC&) + Chr(&HED&) + Chr(&H7D&) + _ Chr(&H4A&) + Chr(&H4B&) + Chr(&H4C&) + Chr(&H4D&) + Chr(&H4E&) + Chr(&H4F&) + Chr(&H50&) + Chr(&H51&) + Chr(&H52&) + Chr(&HEE&) + Chr(&HEF&) + _ Chr(&HF0&) + Chr(&HF1&) + Chr(&HF2&) + Chr(&HF3&) + Chr(&H5C&) + Chr(&H9F&) + Chr(&H53&) + Chr(&H54&) + Chr(&H55&) + Chr(&H56&) + Chr(&H57&) + _ Chr(&H58&) + Chr(&H59&) + Chr(&H5A&) + Chr(&HF4&) + Chr(&HF5&) + Chr(&HF6&) + Chr(&HF7&) + Chr(&HF8&) + Chr(&HF9&) + Chr(&H30&) + Chr(&H31&) + _ Chr(&H32&) + Chr(&H33&) + Chr(&H34&) + Chr(&H35&) + Chr(&H36&) + Chr(&H37&) + Chr(&H38&) + Chr(&H39&) + Chr(&HFA&) + Chr(&HFB&) + Chr(&HFC&) + _ Chr(&HFD&) + Chr(&HFE&) + Chr(&HFF&) Private F0 As Byte = &HF0 ' Bit Mask to get High Order Nibble Private F As Byte = &HF ' Bit Mask to get Low Order Nibble 'This method is called once for every row that passes through the component from Input0. Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer) Row.CustomerName = Translate(Row.Name, EBCDICtoASCII) Row.CustomerAddress = Translate(Row.Address, EBCDICtoASCII) Row.CustomerCategory = Convert.ToByte(unpack(Row.Category, 0)) Row.CustomerBalance = unpack(Row.Balance, 2) End Sub Public Function Translate(ByVal str As String, sConversionTable As String) As String Dim Temp As String, I As Int32 Temp = Space(Len(Str)) For I = 1 To Len(str) Mid(Temp, I, 1) = Mid(sConversionTable, Asc(Mid(str, I, 1)) + 1, 1) Next Translate = Temp End Function Private Function unpack(ByVal bPacked As Byte(), ByVal iDecimalPlaces As Integer) As Decimal Dim i As Integer Dim HiNibble As Byte Dim LoNibble As Byte Dim dResult As Decimal = 0 For i = 0 To bPacked.Length - 2 UnpackNibblesToBytes(bPacked(i), LoNibble, HiNibble) dResult = Convert.ToDecimal(dResult * 10 ^ 2I + HiNibble * 10 + LoNibble) Next UnpackNibblesToBytes(bPacked(bPacked.Length - 1), LoNibble, HiNibble) dResult = dResult * 10 + HiNibble If LoNibble = &HD& Then dResult = -dResult End If unpack = Convert.ToDecimal(dResult * 10 ^ (-iDecimalPlaces)) End Function Private Sub UnpackNibblesToBytes(ByVal InputNibbles As Byte, ByRef LoNibble As Byte, ByRef HiNibble As Byte) LoNibble = (Me.F And InputNibbles) HiNibble = ((Me.F0 And InputNibbles) >> 4) End Sub End Class
How the Script Works
If you have read my previous tip about importing mainframe data, you will remember that the Visual Studio Editor gives us a Class Template to add our code onto it. This template defines a method called Input0_ProcessInputRow that is where we put the code for the data transformation. I included two functions to handle data conversion: Translate and Unpack.
- Translate: Receives a string with the data to be converted and a string holding the conversion table to convert from EBCDIC to ASCII and vice versa and returns the converted string.
- Unpack: Receives an array of bytes containing the packed number and an integer specifying the implied decimal position. The function returns a decimal number. Also the Unpack function uses the UnpackNibblesToBytes helper function to do the math in order to split a byte into its composing nibbles.
Next is a table that reviews the functions and procedures used in this tip.
Function | Description | Parameters | Return Value |
---|---|---|---|
Translate | Converts a string from EBCDIC to ASCII and vice versa. | Str (String): The string to convert. sConversionTable: Lookup table used to map characters from one code page to another. | A string containing the converted string. |
Unpack | Converts a packed number to decimal format. | bPacked (Byte Array): A byte array containing the packed number to be converted to decimal format. iDecimalPlaces (Integer): The number of implied decimal digits for the packed number. | The decimal number representation of the input packed number. |
UnpackNibblesToBytes | Splits a byte into its composing nibbles. | InputByte (Byte): A byte to be splitted into its composing nibbles. LoNibble (ByRef Byte): Used to return the low order nibble. HiNibble (ByRef Byte): Used to return the High order nibble. |
N/A. Since this is not a function it doesn’t returns a value. Instead the nibbles are returned into the ByRef parameters. |
Now we are ready to execute the SSIS package and after it completes we can perform a SELECT to the destination table and you will see that 1000 rows have been loaded into our sample table.
Next Steps
- You can download the SSIS package and the sample text file used for this tip here.
- Read my previous tip about Importing Mainframe Data with SSIS.
- For more information about conversion between data types check out this tip: SQL Server Integration Services Data Type Conversion Testing.
- Browse Integration Services Data Flow Transformations Tips Category for more tips about Data Flow Transformations.
- If you want to enrich your SSIS packages and make them look more professional, take a look at Integration Services Best Practices Tips Category.
- Also check out SQL Server Business Intelligence Tips and Tricks.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2015-01-20