SQL Server Min Max Column Normalization for Data Mining

By:   |   Updated: 2014-12-23   |   Comments   |   Related: > TSQL


Problem

In data mining, we often need to perform min-max normalization on numeric data type columns to prevent one column from skewing or dominating the models produced by the machine learning algorithms.

Solution

In this tip, we will demonstrate how to use T-SQL to perform a min-max normalization from 0.0 to 1.0 inclusive, which can be represented as [0.0, 1.0]. The normalized value equals (the original column value minus the minimum column value) divided by the (maximum column value minus the minimum column value).

We will create a table that contains two data columns, AttributeX and AttributeY.

IF EXISTS (SELECT name FROM sys.tables WHERE name = N'tblExample')
  DROP TABLE dbo.tblExample
go
create table dbo.tblExample
(
  pKey integer identity(1,1) PRIMARY KEY,
  AttributeX decimal(7,3),
  AttributeY decimal(7,3)
)

Next, we will populate this table with 27 rows of data. AttributeY will be 10 times the value of AttributeX.

insert into dbo.tblExample values (24, 240)
insert into dbo.tblExample values (26, 260)
insert into dbo.tblExample values (27, 270)
insert into dbo.tblExample values (27, 270)
insert into dbo.tblExample values (30, 300)
insert into dbo.tblExample values (31, 310)
insert into dbo.tblExample values (31, 310)
insert into dbo.tblExample values (32, 320)
insert into dbo.tblExample values (33, 330)
insert into dbo.tblExample values (33, 330)
insert into dbo.tblExample values (36, 360)
insert into dbo.tblExample values (36, 360)
insert into dbo.tblExample values (36, 360)
insert into dbo.tblExample values (36, 360)
insert into dbo.tblExample values (41, 410)
insert into dbo.tblExample values (44, 440)
insert into dbo.tblExample values (44, 440)
insert into dbo.tblExample values (46, 460)
insert into dbo.tblExample values (46, 460)
insert into dbo.tblExample values (46, 460)
insert into dbo.tblExample values (46, 460)
insert into dbo.tblExample values (47, 470)
insert into dbo.tblExample values (51, 510)
insert into dbo.tblExample values (56, 560)
insert into dbo.tblExample values (57, 570)
insert into dbo.tblExample values (63, 630)
insert into dbo.tblExample values (81, 810)

After inserting the rows, we will select the row count and the minimum and maximum values from our data columns.

select 
count(*) as RowQuantity,
min(AttributeX) as MinimumX,
max(AttributeX) as MaximumX,
min(AttributeY) as MinimumY,
max(AttributeY) as MaximumY
from dbo.tblExample


Data statistics

T-SQL Code for Min Max Normalization in SQL Server

Now we will use the T-SQL code below to perform min-max normalizations on AttributeX and AttributeY.

--Declare the variables
declare @minX as decimal(7,3)
declare @maxX as decimal(7,3)
declare @minY as decimal(7,3)
declare @maxY as decimal(7,3)

--Set the variables
set @minX=(select min(AttributeX) from dbo.tblExample)
set @maxX=(select max(AttributeX) from dbo.tblExample)
set @minY=(select min(AttributeY) from dbo.tblExample)
set @maxY=(select max(AttributeY) from dbo.tblExample)

--Perform the normalization
select 
AttributeX, 
(AttributeX - @minX)/(@maxX - @minX) as NormalizedX,
AttributeY, 
(AttributeY - @minY)/(@maxY - @minY) as NormalizedY 
from dbo.tblExample

The results from the previous query are shown below. Notice how the normalized values for AttributeX and AttributeY are equal on each row, while the original values differ.

Top 10 rows from device B
Next Steps

Normalized values can be inserted into a table and then read into data mining algorithms. You can read more about SQL Server data mining in the following tips from MSSQLTips.com.



sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Dallas Snider Dr. Dallas Snider is an Assistant Professor in the Computer Science Department at the University of West Florida and has 18+ years of SQL experience.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2014-12-23

Comments For This Article

















get free sql tips
agree to terms